public inbox for gentoo-commits@lists.gentoo.org
* [gentoo-commits] proj/portage:master commit in: pym/portage/cache/, pym/portage/repository/
@ 2011-10-14 23:55 Zac Medico
From: Zac Medico @ 2011-10-14 23:55 UTC
  To: gentoo-commits

commit:     1e8870bd45a4e2a9c43e7f112701c6ae84b0fd56
Author:     Brian Harring <ferringb <AT> chromium <DOT> org>
AuthorDate: Fri Oct 14 09:40:00 2011 +0000
Commit:     Zac Medico <zmedico <AT> gentoo <DOT> org>
CommitDate: Fri Oct 14 23:50:20 2011 +0000
URL:        http://git.overlays.gentoo.org/gitweb/?p=proj/portage.git;a=commit;h=1e8870bd

layout.conf: add git friendly pregenerated cache format

Enabled via cache-format = md5-dict in layout.conf.
This format is essentially just flat_hash, validating entries by md5
rather than mtime, and dropping the path component from _eclasses_
entries.
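
As a rough illustration of the description above (not Portage's actual
code): a minimal sketch, assuming a cache class that derives its
validation key from a class attribute in the way the patch's
"_%s_" % (self.validation_chf,) line does. The names CacheSketch,
Md5CacheSketch, file_md5, is_valid and demo are hypothetical, and the
eclass path handling is not modelled.

```python
import hashlib
import os
import tempfile


def file_md5(path):
    """Return the hex md5 digest of a file's contents."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


class CacheSketch(object):
    # Hypothetical stand-in for the flat_hash database class.
    validation_chf = "mtime"

    def validation_key(self):
        # "_mtime_" by default, "_md5_" in the subclass below.
        return "_%s_" % (self.validation_chf,)

    def checksum(self, path):
        if self.validation_chf == "md5":
            return file_md5(path)
        return str(int(os.stat(path).st_mtime))

    def is_valid(self, entry, path):
        # An entry counts as current if its stored value matches the file.
        return entry.get(self.validation_key()) == self.checksum(path)


class Md5CacheSketch(CacheSketch):
    validation_chf = "md5"


def demo():
    """Return (valid before edit, valid after edit) for a temp file."""
    fd, path = tempfile.mkstemp()
    try:
        os.write(fd, b"EAPI=4\n")
        os.close(fd)
        cache = Md5CacheSketch()
        entry = {cache.validation_key(): cache.checksum(path)}
        fresh = cache.is_valid(entry, path)
        with open(path, "ab") as f:
            f.write(b"SLOT=0\n")  # changing the contents changes the md5
        stale = cache.is_valid(entry, path)
    finally:
        os.remove(path)
    return fresh, stale
```

In this sketch the subclass only overrides a class attribute, which is
the same shape as the md5_database class added by the patch.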

From a speed standpoint, md5 validation costs ~16% more than mtime,
timed on a modern Sandy Bridge; specifically, validating 29k nodes takes
~8.8s for flat_md5, while the pms norm is ~7.7s.

That said, the cache is /usable/ in places where the PMS format is not,
such as git checkouts, where mtimes are not preserved; in those cases
it can definitely be a win, since even a partially stale cache is
better than regenerating everything from scratch.
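
To illustrate why a content hash suits a version-controlled tree, here
is a small sketch under stated assumptions: rewrite_like_checkout is a
hypothetical stand-in for what a fresh checkout does to a file (same
bytes, new timestamp), and compare_validators is likewise illustrative
rather than anything taken from Portage.

```python
import hashlib
import os
import tempfile


def md5_of(path):
    """Return the hex md5 digest of a file's contents."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def rewrite_like_checkout(path):
    """Rewrite a file with identical contents and a different mtime."""
    with open(path, "rb") as f:
        data = f.read()
    with open(path, "wb") as f:
        f.write(data)
    st = os.stat(path)
    # Push the mtime forward so the change is visible at 1s resolution.
    os.utime(path, (st.st_atime, st.st_mtime + 100))


def compare_validators():
    """Return (mtime still matches, md5 still matches) after a rewrite."""
    fd, path = tempfile.mkstemp(suffix=".ebuild")
    try:
        os.write(fd, b'EAPI="4"\nSLOT="0"\n')
        os.close(fd)
        before = (int(os.stat(path).st_mtime), md5_of(path))
        rewrite_like_checkout(path)
        after = (int(os.stat(path).st_mtime), md5_of(path))
    finally:
        os.remove(path)
    return before[0] == after[0], before[1] == after[1]
```

Here the mtime comparison fails even though the file is unchanged, while
the md5 comparison still succeeds, so a cache entry keyed on md5 would
remain usable.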
(cherry picked from commit 95ddf97e2f7e7d3f6a072604b2df5f77e9298558)

Change-Id: Ic3561369b7a8be7f86480f339ab1686fddea6dff

---
 pym/portage/cache/flat_hash.py   |    9 +++++++--
 pym/portage/repository/config.py |    6 +++++-
 2 files changed, 12 insertions(+), 3 deletions(-)

diff --git a/pym/portage/cache/flat_hash.py b/pym/portage/cache/flat_hash.py
index b6bc074..2eae9f6 100644
--- a/pym/portage/cache/flat_hash.py
+++ b/pym/portage/cache/flat_hash.py
@@ -31,7 +31,7 @@ class database(fs_template.FsBased):
 			self.label.lstrip(os.path.sep).rstrip(os.path.sep))
 		write_keys = set(self._known_keys)
 		write_keys.add("_eclasses_")
-		write_keys.add("_mtime_")
+		write_keys.add("_%s_" % (self.validation_chf,))
 		self._write_keys = sorted(write_keys)
 		if not self.readonly and not os.path.exists(self.location):
 			self._ensure_dirs()
@@ -69,7 +69,6 @@ class database(fs_template.FsBased):
 			raise cache_errors.CacheCorruption(cpv, e)
 
 	def _setitem(self, cpv, values):
-#		import pdb;pdb.set_trace()
 		s = cpv.rfind("/")
 		fp = os.path.join(self.location,cpv[:s],".update.%i.%s" % (os.getpid(), cpv[s+1:]))
 		try:
@@ -153,3 +152,9 @@ class database(fs_template.FsBased):
 						dirs.append((depth+1, p))
 					continue
 				yield p[len_base+1:]
+
+
+class md5_database(database):
+
+	validation_chf = 'md5'
+	store_eclass_paths = False

diff --git a/pym/portage/repository/config.py b/pym/portage/repository/config.py
index a67e7f1..cf268f8 100644
--- a/pym/portage/repository/config.py
+++ b/pym/portage/repository/config.py
@@ -136,9 +136,13 @@ class RepoConfig(object):
 			format = 'pms'
 		if format == 'pms':
 			from portage.cache.metadata import database
+			name = 'metadata/cache'
+		elif format == 'md5-dict':
+			from portage.cache.flat_hash import md5_database as database
+			name = 'metadata/md5-cache'
 		else:
 			return None
-		return database(self.location, 'metadata/cache',
+		return database(self.location, name,
 			auxdbkeys, readonly=readonly)
 
 	def load_manifest(self, *args, **kwds):
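
The format selection in the config.py hunk can be sketched roughly as
follows. This is a standalone illustration, not the RepoConfig code:
parse_layout_conf and cache_dir_for are hypothetical helpers, the
key = value parsing is deliberately simplistic, and falling back to
"pms" when cache-format is unset is an assumption based on the context
lines of the hunk.

```python
# Directory names taken from the patch; the mapping itself is illustrative.
CACHE_DIRS = {
    "pms": "metadata/cache",
    "md5-dict": "metadata/md5-cache",
}


def parse_layout_conf(text):
    """Parse simple 'key = value' lines, ignoring '#' comments."""
    settings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def cache_dir_for(settings):
    """Return the cache directory for a format, or None if unknown."""
    fmt = settings.get("cache-format", "pms")  # assumed default
    return CACHE_DIRS.get(fmt)


if __name__ == "__main__":
    conf = parse_layout_conf("cache-format = md5-dict\n")
    print(cache_dir_for(conf))
```

As in the patch, an unrecognised format yields None rather than an
error, so the caller can treat the repository as having no pregenerated
cache.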


