* [gentoo-portage-dev] [PATCH] Manifest2 reloaded
@ 2006-03-04 3:50 Marius Mauch
2006-03-15 23:44 ` Marius Mauch
0 siblings, 1 reply; 8+ messages in thread
From: Marius Mauch @ 2006-03-04 3:50 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 1811 bytes --]
So while on my way to FOSDEM I decided to do something useful with the
time and wrote a new manifest2 implementation. This has nothing to do
with the original prototype I posted a while ago, it's been written
completely from scratch.
Basically all functionality (creation, parsing, validation) is
encapsulated in the new portage_manifest.Manifest class, including
compability code to read/write old style digests.
The changes to portage.py only change the digest*() functions to use
this new class instead of handling the task themselves (exception:
digestCheckFiles() which apparently was only used internally by other
digest* functions), they should more or less behave like with the old
code. Any new code however should use the Manifest() class directly
however.
While this patch implements the basic functionality some extra stuff
that was in the old code isn't included yet:
- gpg verification
- FEATURES=autoaddcvs
- FEATURES=cvs (probably obsolete anyway)
- emerge --digest / FEATURES=digest (may or may not work)
The first should be delayed until there is some consensus how the gpg
stuff should work in the future, the others I don't see the use for.
Also I only checked portage.py for changes, so emerge/repoman/... might
still have to be fixed.
Last but not least: I did some basic testing with this and the
important stuff seems to work, but I'm quite sure the code still has a
lot of bugs/issues, and this being a core functionality it needs a
*lot* of testing, so I'd really appreciate if you could all give it a
spin (but do not commit anything to the tree without manually checking
it first).
Marius
--
Public Key at http://www.genone.de/info/gpg-key.pub
In the beginning, there was nothing. And God said, 'Let there be
Light.' And there was still nothing, but you could see a bit better.
[-- Warning: decoded text below may be mangled, UTF-8 assumed --]
[-- Attachment #2: manifest2-prototype-pre5.diff --]
[-- Type: text/x-patch; name=manifest2-prototype-pre5.diff, Size: 32003 bytes --]
diff -ru --exclude=CVS --exclude=.svn -N pym/portage.py.org pym/portage.py
--- pym/portage.py.org 2006-03-04 02:25:20.957635000 +0000
+++ pym/portage.py 2006-03-04 03:12:19.545785750 +0000
@@ -90,6 +90,7 @@
from portage_data import ostype, lchown, userland, secpass, uid, wheelgid, \
portage_uid, portage_gid
+ from portage_manifest import Manifest
import portage_util
from portage_util import atomic_ofstream, dump_traceback, getconfig, grabdict, \
@@ -2049,181 +2050,67 @@
return 0
return 1
-
-def digestCreate(myfiles,basedir,oldDigest={}):
- """Takes a list of files and the directory they are in and returns the
- dict of dict[filename][CHECKSUM_KEY] = hash
- returns None on error."""
- mydigests={}
- for x in myfiles:
- print "<<<",x
- myfile=os.path.normpath(basedir+"///"+x)
- if os.path.exists(myfile):
- if not os.access(myfile, os.R_OK):
- print "!!! Given file does not appear to be readable. Does it exist?"
- print "!!! File:",myfile
- return None
- mydigests[x] = portage_checksum.perform_multiple_checksums(myfile, hashes=portage_const.MANIFEST1_HASH_FUNCTIONS)
- mysize = os.stat(myfile)[stat.ST_SIZE]
- else:
- if x in oldDigest:
- # DeepCopy because we might not have a unique reference.
- mydigests[x] = copy.deepcopy(oldDigest[x])
- mysize = copy.deepcopy(oldDigest[x]["size"])
- else:
- print "!!! We have a source URI, but no file..."
- print "!!! File:",myfile
- return None
-
- if mydigests[x].has_key("size") and (mydigests[x]["size"] != mysize):
- raise portage_exception.DigestException, "Size mismatch during checksums"
- mydigests[x]["size"] = copy.deepcopy(mysize)
- return mydigests
-
-def digestCreateLines(filelist, mydict):
- mylines = []
- mydigests = copy.deepcopy(mydict)
- for myarchive in filelist:
- mysize = mydigests[myarchive]["size"]
- if len(mydigests[myarchive]) == 0:
- raise portage_exception.DigestException, "No generate digest for '%(file)s'" % {"file":myarchive}
- for sumName in mydigests[myarchive].keys():
- if sumName not in portage_checksum.get_valid_checksum_keys():
- continue
- mysum = mydigests[myarchive][sumName]
-
- myline = sumName[:]
- myline += " "+mysum
- myline += " "+myarchive
- myline += " "+str(mysize)
- mylines.append(myline)
- return mylines
-
-def digestgen(myarchives,mysettings,overwrite=1,manifestonly=0):
+def digestgen(myarchives,mysettings,db=None,overwrite=1,manifestonly=0):
"""generates digest file if missing. Assumes all files are available. If
- overwrite=0, the digest will only be created if it doesn't already exist."""
-
- # archive files
- basedir=mysettings["DISTDIR"]+"/"
- digestfn=mysettings["FILESDIR"]+"/digest-"+mysettings["PF"]
-
- # portage files -- p(ortagefiles)basedir
- pbasedir=mysettings["O"]+"/"
- manifestfn=pbasedir+"Manifest"
-
- if not manifestonly:
- if not os.path.isdir(mysettings["FILESDIR"]):
- os.makedirs(mysettings["FILESDIR"])
- mycvstree=cvstree.getentries(pbasedir, recursive=1)
-
- if ("cvs" in features) and os.path.exists(pbasedir+"/CVS"):
- if not cvstree.isadded(mycvstree,"files"):
- if "autoaddcvs" in features:
- print ">>> Auto-adding files/ dir to CVS..."
- spawn("cd "+pbasedir+"; cvs add files",mysettings,free=1)
- else:
- print "--- Warning: files/ is not added to cvs."
-
- if (not overwrite) and os.path.exists(digestfn):
- return 1
-
- print green(">>> Generating the digest file...")
-
- # Track the old digest so we can assume checksums without requiring
- # all files to be downloaded. 'Assuming'
- myolddigest = {}
- if os.path.exists(digestfn):
- myolddigest = digestParseFile(digestfn)
-
- myarchives.sort()
- try:
- mydigests=digestCreate(myarchives, basedir, oldDigest=myolddigest)
- except portage_exception.DigestException, s:
- print "!!!",s
- return 0
- if mydigests==None: # There was a problem, exit with an errorcode.
- return 0
-
- try:
- outfile=open(digestfn, "w+")
- except SystemExit, e:
- raise
- except Exception, e:
- print "!!! Filesystem error skipping generation. (Read-Only?)"
- print "!!!",e
- return 0
- for x in digestCreateLines(myarchives, mydigests):
- outfile.write(x+"\n")
- outfile.close()
- try:
- os.chown(digestfn,os.getuid(),portage_gid)
- os.chmod(digestfn,0664)
- except SystemExit, e:
- raise
- except Exception,e:
- print e
-
- print green(">>> Generating the manifest file...")
- mypfiles=listdir(pbasedir,recursive=1,filesonly=1,ignorecvs=1,EmptyOnError=1)
- mypfiles=cvstree.apply_cvsignore_filter(mypfiles)
- mypfiles.sort()
- for x in ["Manifest"]:
- if x in mypfiles:
- mypfiles.remove(x)
-
- mydigests=digestCreate(mypfiles, pbasedir)
- if mydigests==None: # There was a problem, exit with an errorcode.
- return 0
-
- try:
- outfile=open(manifestfn, "w+")
- except SystemExit, e:
- raise
- except Exception, e:
- print "!!! Filesystem error skipping generation. (Read-Only?)"
- print "!!!",e
- return 0
- for x in digestCreateLines(mypfiles, mydigests):
- outfile.write(x+"\n")
- outfile.close()
- try:
- os.chown(manifestfn,os.getuid(),portage_gid)
- os.chmod(manifestfn,0664)
- except SystemExit, e:
- raise
- except Exception,e:
- print e
-
- if "cvs" in features and os.path.exists(pbasedir+"/CVS"):
- mycvstree=cvstree.getentries(pbasedir, recursive=1)
- myunaddedfiles=""
- if not manifestonly and not cvstree.isadded(mycvstree,digestfn):
- if digestfn[:len(pbasedir)]==pbasedir:
- myunaddedfiles=digestfn[len(pbasedir):]+" "
- else:
- myunaddedfiles=digestfn+" "
- if not cvstree.isadded(mycvstree,manifestfn[len(pbasedir):]):
- if manifestfn[:len(pbasedir)]==pbasedir:
- myunaddedfiles+=manifestfn[len(pbasedir):]+" "
- else:
- myunaddedfiles+=manifestfn
- if myunaddedfiles:
- if "autoaddcvs" in features:
- print blue(">>> Auto-adding digest file(s) to CVS...")
- spawn("cd "+pbasedir+"; cvs add "+myunaddedfiles,mysettings,free=1)
- else:
- print "--- Warning: digests are not yet added into CVS."
- print darkgreen(">>> Computed message digests.")
- print
+ overwrite=0, the digest will only be created if it doesn't already exist.
+ DEPRECATED: this now only is a compability wrapper for
+ portage_manifest.Manifest()"""
+
+ # NOTE: manifestonly is useless with manifest2 and therefore ignored
+ # NOTE: the old code contains a lot of crap that should really be elsewhere
+ # (e.g. cvs stuff should be in ebuild(1) and/or repoman)
+ # TODO: error/exception handling
+
+ if db == None:
+ db = portagetree().dbapi
+
+ mf = Manifest(mysettings["O"], db, mysettings)
+ for f in myarchives:
+ # the whole type evaluation is only for the case that myarchives isn't a
+ # DIST file as create() determines the type on its own
+ mytype = mf.guessType(f)
+ if mytype == "AUX":
+ f = f[5:]
+ elif mytype == None:
+ continue
+ myrealtype = mf.findFile(f)
+ if myrealtype != None:
+ mytype = myrealtype
+ mf.create(assumeDistfileHashes=True)
+ mf.updateFileHashes(mytype, f, checkExisting=False)
+ # NOTE: overwrite=0 is only used by emerge --digest, not sure we wanna keep that
+ if overwrite or not os.path.exists(mf.getFullname()):
+ mf.write(sign=False)
+
return 1
-
-def digestParseFile(myfilename):
+def digestParseFile(myfilename,mysettings=None,db=None):
"""(filename) -- Parses a given file for entries matching:
<checksumkey> <checksum_hex_string> <filename> <filesize>
Ignores lines that don't start with a valid checksum identifier
and returns a dict with the filenames as keys and {checksumkey:checksum}
- as the values."""
+ as the values.
+ DEPRECATED: this function is now only a compability wrapper for
+ portage_manifest.Manifest()."""
+
+ mysplit = myfilename.split(os.sep)
+ if mysplit[-2] == "files" and mysplit[-1].startswith("digest-"):
+ pkgdir = os.sep+os.sep.join(mysplit[:-2])
+ elif mysplit[-1] == "Manifest":
+ pkgdir = os.sep+os.sep.join(mysplit[:-1])
+
+ if db == None:
+ db = portagetree().dbapi
+ if mysettings == None:
+ mysettings = config(clone=settings)
+
+ mf = Manifest(pkgdir, db, mysettings)
+
+ return mf.getDigests()
+
+ #########################################
+ # Old code that's replaced by the above #
+ #########################################
if not os.path.exists(myfilename):
return None
@@ -2257,7 +2144,11 @@
"""(fileslist, digestdict, basedir) -- Takes a list of files and a dict
of their digests and checks the digests against the indicated files in
the basedir given. Returns 1 only if all files exist and match the checksums.
+ DEPRECATED: this function isn't compatible with manifest2, use
+ portage_manifest.Manifest() instead for any digest related tasks.
"""
+ print "!!! use of deprecated function digestCheckFiles(), use portage_manifest instead"""
+ return 0
for x in myfiles:
if not mydigests.has_key(x):
print
@@ -2289,8 +2180,46 @@
return 1
-def digestcheck(myfiles, mysettings, strict=0, justmanifest=0):
- """Verifies checksums. Assumes all files have been downloaded."""
+def digestcheck(myfiles, mysettings, strict=0, justmanifest=0, db=None):
+ """Verifies checksums. Assumes all files have been downloaded.
+ DEPRECATED: this is now only a compability wrapper for
+ portage_manifest.Manifest()."""
+
+ pkgdir = mysettings["O"]
+ if db == None:
+ db = portagetree().dbapi
+ mf = Manifest(pkgdir, db, mysettings)
+ try:
+ if strict:
+ print ">>> checking ebuild checksums",
+ mf.checkTypeHashes("EBUILD")
+ print ":-)"
+ print ">>> checking auxfile checksums",
+ mf.checkTypeHashes("AUX")
+ print ":-)"
+ print ">>> checking miscfile checksums",
+ mf.checkTypeHashes("MISC", ignoreMissingFiles=True)
+ print ":-)"
+ for f in myfiles:
+ if f.startswith("files/"):
+ f = f[5:]
+ print ">>> checking %s checksums" % f,
+ mf.checkFileHashes(mf.findFile(f), f)
+ print ":-)"
+ except portage_exception.DigestException, e:
+ print e.value
+ print red("!!! ")+"Digest verification failed:"
+ print red("!!! ")+" "+e.value[0]
+ print red("!!! ")+"Reason: "+e.value[1]
+ print red("!!! ")+"Got: "+str(e.value[2])
+ print red("!!! ")+"Expected: "+str(e.value[3])
+ return 0
+ return 1
+
+ #########################################
+ # Old code that's replaced by the above #
+ #########################################
+
# archive files
basedir=mysettings["DISTDIR"]+"/"
digestfn=mysettings["FILESDIR"]+"/digest-"+mysettings["PF"]
diff -ru --exclude=CVS --exclude=.svn -N pym/portage.py.orig.org pym/portage.py.orig
--- pym/portage.py.orig.org 2006-02-21 10:01:08.000000000 +0000
+++ pym/portage.py.orig 2006-03-04 00:55:47.000000000 +0000
@@ -4,7 +4,7 @@
# $Id: /var/cvsroot/gentoo-src/portage/pym/portage.py,v 1.524.2.76 2005/05/29 12:40:08 jstubbs Exp $
-VERSION="2.1_pre5"
+VERSION="2.1_pre5-r2"
# ===========================================================================
# START OF IMPORTS -- START OF IMPORTS -- START OF IMPORTS -- START OF IMPORT
@@ -5334,6 +5334,8 @@
return 1
def update_ents(self, update_iter):
+ if len(update_iter) == 0:
+ return
if not self.populated:
self.populate()
diff -ru --exclude=CVS --exclude=.svn -N pym/portage_checksum.py.org pym/portage_checksum.py
--- pym/portage_checksum.py.org 2006-02-12 11:53:04.000000000 +0000
+++ pym/portage_checksum.py 2006-03-04 02:31:40.161333750 +0000
@@ -58,6 +58,11 @@
except ImportError:
pass
+def getsize(filename):
+ size = os.stat(filename).st_size
+ return (size, size)
+hashfunc_map["size"] = getsize
+
# end actual hash functions
prelink_capable = False
@@ -68,7 +73,7 @@
del results
def perform_md5(x, calc_prelink=0):
- return perform_checksum(x, md5hash, calc_prelink)[0]
+ return perform_checksum(x, "MD5", calc_prelink)[0]
def perform_all(x, calc_prelink=0):
mydict = {}
@@ -94,7 +99,7 @@
if x == "size":
continue
elif x in hashfunc_map.keys():
- myhash = perform_checksum(filename, hashfunc_map[x], calc_prelink=calc_prelink)[0]
+ myhash = perform_checksum(filename, x, calc_prelink=calc_prelink)[0]
if mydict[x] != myhash:
if strict:
raise portage_exception.DigestException, "Failed to verify '$(file)s' on checksum type '%(type)s'" % {"file":filename, "type":x}
@@ -118,7 +123,7 @@
return (sum.hexdigest(), size)
-def perform_checksum(filename, hash_function=md5hash, calc_prelink=0):
+def perform_checksum(filename, hashname="MD5", calc_prelink=0):
myfilename = filename[:]
prelink_tmpfile = PRIVATE_PATH+"/prelink-checksum.tmp."+str(os.getpid())
mylock = None
@@ -132,7 +137,9 @@
#portage_util.writemsg(">>> prelink checksum '"+str(filename)+"'.\n")
myfilename=prelink_tmpfile
try:
- myhash, mysize = hash_function(myfilename)
+ if hashname not in hashfunc_map:
+ raise portage_exception.DigestException, hashname+" hash function not available (needs dev-python/pycrypto)"
+ myhash, mysize = hashfunc_map[hashname](myfilename)
except (OSError, IOError), e:
if e.errno == errno.ENOENT:
raise portage_exception.FileNotFound(e)
@@ -155,5 +162,5 @@
for x in hashes:
if x not in hashfunc_map:
raise portage_exception.DigestException, x+" hash function not available (needs dev-python/pycrypto)"
- rVal[x] = perform_checksum(filename, hashfunc_map[x], calc_prelink)[0]
+ rVal[x] = perform_checksum(filename, x, calc_prelink)[0]
return rVal
diff -ru --exclude=CVS --exclude=.svn -N pym/portage_checksum.py.orig.org pym/portage_checksum.py.orig
--- pym/portage_checksum.py.orig.org 1970-01-01 00:00:00.000000000 +0000
+++ pym/portage_checksum.py.orig 2006-03-04 00:55:47.000000000 +0000
@@ -0,0 +1,159 @@
+# portage_checksum.py -- core Portage functionality
+# Copyright 1998-2004 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+# $Id: /var/cvsroot/gentoo-src/portage/pym/portage_checksum.py,v 1.10.2.2 2005/08/10 05:42:03 ferringb Exp $
+
+
+from portage_const import PRIVATE_PATH,PRELINK_BINARY,HASHING_BLOCKSIZE
+import os
+import errno
+import shutil
+import stat
+import portage_exception
+import portage_exec
+import portage_util
+import portage_locks
+import commands
+import sha
+
+
+# actual hash functions first
+
+#dict of all available hash functions
+hashfunc_map = {}
+
+# We _try_ to load this module. If it fails we do the slightly slower fallback.
+try:
+ import fchksum
+
+ def md5hash(filename):
+ return fchksum.fmd5t(filename)
+
+except ImportError:
+ import md5
+ def md5hash(filename):
+ return pyhash(filename, md5)
+hashfunc_map["MD5"] = md5hash
+
+def sha1hash(filename):
+ return pyhash(filename, sha)
+hashfunc_map["SHA1"] = sha1hash
+
+# Keep pycrypto optional for now, there are no internal fallbacks for these
+try:
+ import Crypto.Hash.SHA256
+
+ def sha256hash(filename):
+ return pyhash(filename, Crypto.Hash.SHA256)
+ hashfunc_map["SHA256"] = sha256hash
+except ImportError:
+ pass
+
+try:
+ import Crypto.Hash.RIPEMD
+
+ def rmd160hash(filename):
+ return pyhash(filename, Crypto.Hash.RIPEMD)
+ hashfunc_map["RMD160"] = rmd160hash
+except ImportError:
+ pass
+
+# end actual hash functions
+
+prelink_capable = False
+if os.path.exists(PRELINK_BINARY):
+ results = commands.getstatusoutput(PRELINK_BINARY+" --version > /dev/null 2>&1")
+ if (results[0] >> 8) == 0:
+ prelink_capable=1
+ del results
+
+def perform_md5(x, calc_prelink=0):
+ return perform_checksum(x, md5hash, calc_prelink)[0]
+
+def perform_all(x, calc_prelink=0):
+ mydict = {}
+ for k in hashfunc_map.keys():
+ mydict[k] = perform_checksum(x, hashfunc_map[k], calc_prelink)[0]
+ return mydict
+
+def get_valid_checksum_keys():
+ return hashfunc_map.keys()
+
+def verify_all(filename, mydict, calc_prelink=0, strict=0):
+ # Dict relates to single file only.
+ # returns: (passed,reason)
+ file_is_ok = True
+ reason = "Reason unknown"
+ try:
+ mysize = os.stat(filename)[stat.ST_SIZE]
+ if mydict["size"] != mysize:
+ return False,("Filesize does not match recorded size", mysize, mydict["size"])
+ except OSError, e:
+ return False, str(e)
+ for x in mydict.keys():
+ if x == "size":
+ continue
+ elif x in hashfunc_map.keys():
+ myhash = perform_checksum(filename, hashfunc_map[x], calc_prelink=calc_prelink)[0]
+ if mydict[x] != myhash:
+ if strict:
+ raise portage_exception.DigestException, "Failed to verify '$(file)s' on checksum type '%(type)s'" % {"file":filename, "type":x}
+ else:
+ file_is_ok = False
+ reason = (("Failed on %s verification" % x), myhash,mydict[x])
+ break
+ return file_is_ok,reason
+
+def pyhash(filename, hashobject):
+ f = open(filename, 'rb')
+ blocksize = HASHING_BLOCKSIZE
+ data = f.read(blocksize)
+ size = 0L
+ sum = hashobject.new()
+ while data:
+ sum.update(data)
+ size = size + len(data)
+ data = f.read(blocksize)
+ f.close()
+
+ return (sum.hexdigest(), size)
+
+def perform_checksum(filename, hash_function=md5hash, calc_prelink=0):
+ myfilename = filename[:]
+ prelink_tmpfile = PRIVATE_PATH+"/prelink-checksum.tmp."+str(os.getpid())
+ mylock = None
+
+ if calc_prelink and prelink_capable:
+ mylock = portage_locks.lockfile(prelink_tmpfile, wantnewlockfile=1)
+ # Create non-prelinked temporary file to checksum.
+ # Files rejected by prelink are summed in place.
+ retval=portage_exec.spawn([PRELINK_BINARY,"--undo","-o",prelink_tmpfile,filename],fd_pipes={})
+ if retval==0:
+ #portage_util.writemsg(">>> prelink checksum '"+str(filename)+"'.\n")
+ myfilename=prelink_tmpfile
+ try:
+ myhash, mysize = hash_function(myfilename)
+ except (OSError, IOError), e:
+ if e.errno == errno.ENOENT:
+ raise portage_exception.FileNotFound(e)
+ else:
+ raise e
+ if calc_prelink and prelink_capable:
+ try:
+ os.unlink(prelink_tmpfile)
+ except OSError, oe:
+ if oe.errno == errno.ENOENT:
+ pass
+ else:
+ raise oe
+ portage_locks.unlockfile(mylock)
+
+ return (myhash,mysize)
+
+def perform_multiple_checksums(filename, hashes=["MD5"], calc_prelink=0):
+ rVal = {}
+ for x in hashes:
+ if x not in hashfunc_map:
+ raise portage_exception.DigestException, x+" hash function not available (needs dev-python/pycrypto)"
+ rVal[x] = perform_checksum(filename, hashfunc_map[x], calc_prelink)[0]
+ return rVal
diff -ru --exclude=CVS --exclude=.svn -N pym/portage_const.py.org pym/portage_const.py
--- pym/portage_const.py.org 2006-01-30 02:45:53.000000000 +0000
+++ pym/portage_const.py 2006-03-04 01:50:13.497927000 +0000
@@ -46,10 +46,10 @@
EAPI = 0
HASHING_BLOCKSIZE = 32768
-# Disabling until behaviour when missing the relevant python module is
-# corrected. #116485
MANIFEST1_HASH_FUNCTIONS = ["MD5","SHA256","RMD160"]
+MANIFEST2_HASH_FUNCTIONS = ["SHA1","SHA256","RMD160"]
+MANIFEST2_IDENTIFIERS = ["AUX","MISC","DIST","EBUILD"]
# ===========================================================================
# END OF CONSTANTS -- END OF CONSTANTS -- END OF CONSTANTS -- END OF CONSTANT
# ===========================================================================
diff -ru --exclude=CVS --exclude=.svn -N pym/portage_manifest.py.org pym/portage_manifest.py
--- pym/portage_manifest.py.org 1970-01-01 00:00:00.000000000 +0000
+++ pym/portage_manifest.py 2006-03-04 03:13:25.973937250 +0000
@@ -0,0 +1,314 @@
+import os, sets
+
+import portage, portage_exception, portage_versions, portage_const
+from portage_checksum import *
+from portage_exception import *
+
+class FileNotInManifestException(PortageException):
+ pass
+
+def manifest2AuxfileFilter(filename):
+ filename = filename.strip("/")
+ return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
+
+def manifest2MiscfileFilter(filename):
+ filename = filename.strip("/")
+ return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")
+
+class Manifest(object):
+ def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
+ self.pkgdir = pkgdir+os.sep
+ self.fhashdict = {}
+ self.hashes = hashes
+ self.hashes.append("size")
+ if manifest1_compat:
+ self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
+ self.hashes = sets.Set(self.hashes)
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ self.fhashdict[t] = {}
+ self._read()
+ self.compat = manifest1_compat
+ self.db = db
+ self.mysettings = mysettings
+ if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
+ self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
+ else:
+ self.distdir = mysettings["DISTDIR"]
+
+ def guessType(self, filename):
+ if filename.startswith("files/digest-"):
+ return None
+ if filename.startswith("files/"):
+ return "AUX"
+ elif filename.endswith(".ebuild"):
+ return "EBUILD"
+ elif filename in ["ChangeLog", "metadata.xml"]:
+ return "MISC"
+ else:
+ return "DIST"
+
+ def getFullname(self):
+ return self.pkgdir+"Manifest"
+
+ def getDigests(self):
+ rval = {}
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ rval.update(self.fhashdict[t])
+ return rval
+
+ def _readDigests(self):
+ mycontent = ""
+ for d in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=False):
+ if d.startswith("digest-"):
+ mycontent += open(self.pkgdir+"files"+os.sep+d, "r").read()
+ return mycontent
+
+ def _read(self):
+ if not os.path.exists(self.getFullname()):
+ return
+ fd = open(self.getFullname(), "r")
+ mylines = fd.readlines()
+ fd.close()
+ mylines.extend(self._readDigests().split("\n"))
+ for l in mylines:
+ myname = ""
+ mysplit = l.split()
+ if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS:
+ myname = mysplit[2]
+ mytype = self.guessType(myname)
+ if mytype == "AUX" and myname.startswith("files/"):
+ myname = myname[6:]
+ if mytype == None:
+ continue
+ mysize = int(mysplit[3])
+ myhashes = {mysplit[0]: mysplit[1]}
+ if len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS:
+ mytype = mysplit[0]
+ myname = mysplit[1]
+ mysize = int(mysplit[2])
+ myhashes = dict(zip(mysplit[3::2], mysplit[4::2]))
+ if len(myname) == 0:
+ continue
+ if not self.fhashdict[mytype].has_key(myname):
+ self.fhashdict[mytype][myname] = {}
+ self.fhashdict[mytype][myname].update(myhashes)
+ self.fhashdict[mytype][myname]["size"] = mysize
+
+ def _writeDigests(self):
+ cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
+ rval = []
+ for cpv in cpvlist:
+ dname = self.pkgdir+"files"+os.sep+"digest-"+portage.catsplit(cpv)[1]
+ mylines = []
+ distlist = self._getCpvDistfiles(cpv)
+ for f in self.fhashdict["DIST"].keys():
+ if f in distlist:
+ for h in self.fhashdict["DIST"][f].keys():
+ if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
+ continue
+ myline = " ".join([h, str(self.fhashdict["DIST"][f][h]), f, str(self.fhashdict["DIST"][f]["size"])])
+ mylines.append(myline)
+ fd = open(dname, "w")
+ fd.write("\n".join(mylines))
+ fd.write("\n")
+ fd.close()
+ rval.append(dname)
+ return rval
+
+ def _addDigestsToManifest(self, digests, fd):
+ mylines = []
+ for dname in digests:
+ myhashes = perform_multiple_checksums(dname, portage_const.MANIFEST1_HASH_FUNCTIONS+["size"])
+ for h in myhashes.keys():
+ mylines.append((" ".join([h, str(myhashes[h]), os.path.join("files", os.path.basename(dname)), str(myhashes["size"])])))
+ fd.write("\n".join(mylines))
+ fd.write("\n")
+
+ def _write(self, fd):
+ mylines = []
+ for t in self.fhashdict.keys():
+ for f in self.fhashdict[t].keys():
+ myline = " ".join([t, f, str(self.fhashdict[t][f]["size"])])
+ myhashes = self.fhashdict[t][f]
+ for h in myhashes.keys():
+ if h not in portage_const.MANIFEST2_HASH_FUNCTIONS:
+ continue
+ myline += " "+h+" "+str(myhashes[h])
+ mylines.append(myline)
+ if self.compat and t != "DIST":
+ for h in myhashes.keys():
+ if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
+ continue
+ mylines.append((" ".join([h, str(myhashes[h]), f, str(myhashes["size"])])))
+ fd.write("\n".join(mylines))
+ fd.write("\n")
+
+ def write(self, sign=False):
+ fd = open(self.getFullname(), "w")
+ self._write(fd)
+ if self.compat:
+ digests = self._writeDigests()
+ self._addDigestsToManifest(digests, fd)
+ fd.close()
+ if sign:
+ self.sign()
+
+ def sign(self):
+ raise NotImplementedError()
+
+ def validateSignature(self):
+ raise NotImplementedError()
+
+ def addFile(self, ftype, fname, hashdict=None):
+ if not os.path.exists(self.pkgdir+fname):
+ raise FileNotFound(fname)
+ if not ftype in portage_const.MANIFEST2_IDENTIFIERS:
+ raise InvalidDataType(ftype)
+ self.fhashdict[ftype][fname] = {}
+ if hashdict != None:
+ self.fhashdict[ftype][fname].update(hashdict)
+ if not portage_const.MANIFEST2_REQUIRED_HASH in self.fhashdict[ftype][fname].keys():
+ self.updateFileHashes(ftype, fname)
+
+ def removeFile(self, ftype, fname):
+ del self.fhashdict[ftype][fname]
+
+ def hasFile(self, ftype, fname):
+ return (fname in self.fhashdict[ftype].keys())
+
+ def findFile(self, fname):
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ if fname in self.fhashdict[t]:
+ return t
+ return None
+
+ def create(self, checkExisting=False, assumeDistfileHashes=True):
+ """ Recreate this Manifest from scratch, not using any existing checksums
+ (exception: if assumeDistfileHashes is true then existing DIST checksums are
+ reused if the file doesn't exist in DISTDIR."""
+ if checkExisting:
+ self.checkAllHashes()
+ if assumeDistfileHashes:
+ distfilehashes = self.fhashdict["DIST"]
+ else:
+ distfilehashes = {}
+ self.__init__(self.pkgdir, self.db, self.mysettings, fromScratch=True)
+ for f in portage.listdir(self.pkgdir, filesonly=True, recursive=False):
+ if f.endswith(".ebuild"):
+ mytype = "EBUILD"
+ elif manifest2MiscfileFilter(f):
+ mytype = "MISC"
+ else:
+ continue
+ self.fhashdict[mytype][f] = perform_multiple_checksums(self.pkgdir+f, self.hashes)
+ for f in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=True):
+ if not manifest2AuxfileFilter(f):
+ continue
+ self.fhashdict["AUX"][f] = perform_multiple_checksums(self.pkgdir+"files"+os.sep+f, self.hashes)
+ cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
+ distlist = []
+ for cpv in cpvlist:
+ distlist.extend(self._getCpvDistfiles(cpv))
+ for f in distlist:
+ fname = self.distdir+os.sep+f
+ if os.path.exists(fname):
+ self.fhashdict["DIST"][f] = perform_multiple_checksums(fname, self.hashes)
+ elif assumeDistfileHashes and f in distfilehashes.keys():
+ self.fhashdict["DIST"][f] = distfilehashes[f]
+ else:
+ raise FileNotFound(fname)
+
+ def _getAbsname(self, ftype, fname):
+ if ftype == "DIST":
+ absname = self.distdir+os.sep+fname
+ elif ftype == "AUX":
+ absname = os.sep.join([self.pkgdir, "files", fname])
+ else:
+ absname = self.pkgdir+os.sep+fname
+ return absname
+
+ def checkAllHashes(self, ignoreMissingFiles=False):
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ self.checkTypeHashes(t, ignoreMissingFiles=ignoreMissingFiles)
+
+ def checkTypeHashes(self, idtype, ignoreMissingFiles=False):
+ for f in self.fhashdict[idtype].keys():
+ self.checkFileHashes(idtype, f, ignoreMissing=ignoreMissingFiles)
+
+ def checkFileHashes(self, ftype, fname, ignoreMissing=False):
+ myhashes = self.fhashdict[ftype][fname]
+ ok,reason = verify_all(self._getAbsname(ftype, fname), self.fhashdict[ftype][fname])
+ if not ok:
+ raise DigestException(tuple([self._getAbsname(ftype, fname)]+list(reason)))
+ return ok, reason
+
+ def checkCpvHashes(self, cpv, checkDistfiles=True, onlyDistfiles=False, checkMiscfiles=False):
+ """ check the hashes for all files associated to the given cpv, include all
+ AUX files and optionally all MISC files. """
+ if not onlyDistfiles:
+ self.checkTypeHashes("AUX", ignoreMissingFiles=False)
+ if checkMiscfiles:
+ self.checkTypeHashes("MISC", ignoreMissingFiles=False)
+ ebuildname = portage.catsplit(cpv)[1]+".ebuild"
+ self.checkFileHashes("EBUILD", ebuildname, ignoreMissing=False)
+ if checkDistfiles:
+ if onlyDistfiles:
+ for f in self._getCpvDistfiles(cpv):
+ self.checkFileHashes("DIST", f, ignoreMissing=False)
+
+ def _getCpvDistfiles(self, cpv):
+ """ Get a list of all DIST files associated to the given cpv """
+ return self.db.getfetchlist(cpv, mysettings=self.mysettings, all=True)[1]
+
+ def updateFileHashes(self, ftype, fname, checkExisting=True, ignoreMissing=True):
+ """ Regenerate hashes for the given file """
+ if checkExisting:
+ self.checkFileHashes(fname)
+ if not ignoreMissing and not self.fhashdict[ftype].has_key(fname):
+ raise FileNotInManifestException(fname)
+ if not self.fhashdict[ftype].has_key(fname):
+ self.fhashdict[ftype][fname] = {}
+ myhashes = perform_multiple_checksums(self._getAbsname(ftype, fname), self.hashes)
+ self.fhashdict[ftype][fname].update(myhashes)
+
+ def updateTypeHashes(self, idtype, checkExisting=False, ignoreMissingFiles=True):
+ """ Regenerate all hashes for all files of the given type """
+ for fname in self.fhashdict[idtype].keys():
+ self.updateFileHashes(idtype, fname, checkExisting)
+
+ def updateAllHashes(self, checkExisting=False, ignoreMissingFiles=True):
+ """ Regenerate all hashes for all files in this Manifest. """
+ for ftype in portage_const.MANIFEST2_IDENTIFIERS:
+ self.updateTypeHashes(idtype, fname, checkExisting)
+
+ def updateCpvHashes(self, cpv, ignoreMissingFiles=True):
+ """ Regenerate all hashes associated to the given cpv (includes all AUX and MISC
+ files)."""
+ self.updateTypeHashes("AUX", ignoreMissingFiles=ignoreMissingFiles)
+ self.updateTypeHashes("MISC", ignoreMissingFiles=ignoreMissingFiles)
+ ebuildname = portage.catsplit(cpv)[1]+".ebuild"
+ self.updateFileHashes("EBUILD", ebuildname, ignoreMissingFiles=ignoreMissingFiles)
+ for f in self._getCpvDistfiles(cpv):
+ self.updateFileHashes("DIST", f, ignoreMissingFiles=ignoreMissingFiles)
+
+ def getFileData(self, ftype, fname, key):
+ """ Return the value of a specific (type,filename,key) triple, mainly useful
+ to get the size for distfiles."""
+ return self.fhashdict[ftype][fname][key]
+
+ def getVersions(self):
+ """ Returns a list of manifest versions present in the manifest file. """
+ rVal = []
+ mfname = self.getFullname()
+ if not os.path.exists(mfname):
+ return rVal
+ myfile = open(mfname, "r")
+ lines = myfile.readlines()
+ myfile.close()
+ for l in lines:
+ mysplit = l.split()
+ if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS and not 1 in rVal:
+ rVal.append(1)
+ elif len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS and ((len(mysplit) - 3) % 2) == 0 and not 2 in rVal:
+ rVal.append(2)
+ return rVal
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
2006-03-04 3:50 [gentoo-portage-dev] [PATCH] Manifest2 reloaded Marius Mauch
@ 2006-03-15 23:44 ` Marius Mauch
2006-03-16 5:53 ` Zac Medico
0 siblings, 1 reply; 8+ messages in thread
From: Marius Mauch @ 2006-03-15 23:44 UTC (permalink / raw
To: gentoo-portage-dev
Marius Mauch schrieb:
> The first should be delayed until there is some consensus how the gpg
> stuff should work in the future, the others I don't see the use for.
> Also I only checked portage.py for changes, so emerge/repoman/... might
> still have to be fixed.
> Last but not least: I did some basic testing with this and the
> important stuff seems to work, but I'm quite sure the code still has a
> lot of bugs/issues, and this being a core functionality it needs a
> *lot* of testing, so I'd really appreciate if you could all give it a
> spin (but do not commit anything to the tree without manually checking
> it first).
Does the lack of feedback (only got a reaction from Brian so far) mean
that noone tried it or that it doesn't have any issues?
Marius
--
gentoo-portage-dev@gentoo.org mailing list
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
2006-03-15 23:44 ` Marius Mauch
@ 2006-03-16 5:53 ` Zac Medico
2006-03-16 7:15 ` Brian Harring
0 siblings, 1 reply; 8+ messages in thread
From: Zac Medico @ 2006-03-16 5:53 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 1317 bytes --]
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Marius Mauch wrote:
> Marius Mauch schrieb:
>> The first should be delayed until there is some consensus how the gpg
>> stuff should work in the future, the others I don't see the use for.
>> Also I only checked portage.py for changes, so emerge/repoman/... might
>> still have to be fixed.
>> Last but not least: I did some basic testing with this and the
>> important stuff seems to work, but I'm quite sure the code still has a
>> lot of bugs/issues, and this being a core functionality it needs a
>> *lot* of testing, so I'd really appreciate if you could all give it a
>> spin (but do not commit anything to the tree without manually checking
>> it first).
>
> Does the lack of feedback (only got a reaction from Brian so far) mean
> that noone tried it or that it doesn't have any issues?
The patch applies and seems to work well. At a quick glance the code looks pretty clean and it's nice to migrate more code out of portage.py to a separate module. I've attached a refreshed version of the patch that applies cleanly against current svn (I've made no changes).
Zac
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.2.2 (GNU/Linux)
iD8DBQFEGP1S/ejvha5XGaMRAl/7AJ9cZbjhWtjCz+ac2/tjQNUoivj0twCg7xAG
cYvDbMiqU5HtpNrVk7fs6RM=
=Eqlo
-----END PGP SIGNATURE-----
[-- Attachment #2: manifest2-prototype-pre20060315.patch --]
[-- Type: text/x-patch, Size: 25086 bytes --]
=== added file 'pym/portage_manifest.py'
--- /dev/null
+++ pym/portage_manifest.py
@@ -0,0 +1,314 @@
+import os, sets
+
+import portage, portage_exception, portage_versions, portage_const
+from portage_checksum import *
+from portage_exception import *
+
+class FileNotInManifestException(PortageException):
+ pass
+
+def manifest2AuxfileFilter(filename):
+ filename = filename.strip("/")
+ return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
+
+def manifest2MiscfileFilter(filename):
+ filename = filename.strip("/")
+ return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")
+
+class Manifest(object):
+ def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
+ self.pkgdir = pkgdir+os.sep
+ self.fhashdict = {}
+ self.hashes = hashes
+ self.hashes.append("size")
+ if manifest1_compat:
+ self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
+ self.hashes = sets.Set(self.hashes)
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ self.fhashdict[t] = {}
+ self._read()
+ self.compat = manifest1_compat
+ self.db = db
+ self.mysettings = mysettings
+ if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
+ self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
+ else:
+ self.distdir = mysettings["DISTDIR"]
+
+ def guessType(self, filename):
+ if filename.startswith("files/digest-"):
+ return None
+ if filename.startswith("files/"):
+ return "AUX"
+ elif filename.endswith(".ebuild"):
+ return "EBUILD"
+ elif filename in ["ChangeLog", "metadata.xml"]:
+ return "MISC"
+ else:
+ return "DIST"
+
+ def getFullname(self):
+ return self.pkgdir+"Manifest"
+
+ def getDigests(self):
+ rval = {}
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ rval.update(self.fhashdict[t])
+ return rval
+
+ def _readDigests(self):
+ mycontent = ""
+ for d in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=False):
+ if d.startswith("digest-"):
+ mycontent += open(self.pkgdir+"files"+os.sep+d, "r").read()
+ return mycontent
+
+ def _read(self):
+ if not os.path.exists(self.getFullname()):
+ return
+ fd = open(self.getFullname(), "r")
+ mylines = fd.readlines()
+ fd.close()
+ mylines.extend(self._readDigests().split("\n"))
+ for l in mylines:
+ myname = ""
+ mysplit = l.split()
+ if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS:
+ myname = mysplit[2]
+ mytype = self.guessType(myname)
+ if mytype == "AUX" and myname.startswith("files/"):
+ myname = myname[6:]
+ if mytype == None:
+ continue
+ mysize = int(mysplit[3])
+ myhashes = {mysplit[0]: mysplit[1]}
+ if len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS:
+ mytype = mysplit[0]
+ myname = mysplit[1]
+ mysize = int(mysplit[2])
+ myhashes = dict(zip(mysplit[3::2], mysplit[4::2]))
+ if len(myname) == 0:
+ continue
+ if not self.fhashdict[mytype].has_key(myname):
+ self.fhashdict[mytype][myname] = {}
+ self.fhashdict[mytype][myname].update(myhashes)
+ self.fhashdict[mytype][myname]["size"] = mysize
+
+ def _writeDigests(self):
+ cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
+ rval = []
+ for cpv in cpvlist:
+ dname = self.pkgdir+"files"+os.sep+"digest-"+portage.catsplit(cpv)[1]
+ mylines = []
+ distlist = self._getCpvDistfiles(cpv)
+ for f in self.fhashdict["DIST"].keys():
+ if f in distlist:
+ for h in self.fhashdict["DIST"][f].keys():
+ if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
+ continue
+ myline = " ".join([h, str(self.fhashdict["DIST"][f][h]), f, str(self.fhashdict["DIST"][f]["size"])])
+ mylines.append(myline)
+ fd = open(dname, "w")
+ fd.write("\n".join(mylines))
+ fd.write("\n")
+ fd.close()
+ rval.append(dname)
+ return rval
+
+ def _addDigestsToManifest(self, digests, fd):
+ mylines = []
+ for dname in digests:
+ myhashes = perform_multiple_checksums(dname, portage_const.MANIFEST1_HASH_FUNCTIONS+["size"])
+ for h in myhashes.keys():
+ mylines.append((" ".join([h, str(myhashes[h]), os.path.join("files", os.path.basename(dname)), str(myhashes["size"])])))
+ fd.write("\n".join(mylines))
+ fd.write("\n")
+
+ def _write(self, fd):
+ mylines = []
+ for t in self.fhashdict.keys():
+ for f in self.fhashdict[t].keys():
+ myline = " ".join([t, f, str(self.fhashdict[t][f]["size"])])
+ myhashes = self.fhashdict[t][f]
+ for h in myhashes.keys():
+ if h not in portage_const.MANIFEST2_HASH_FUNCTIONS:
+ continue
+ myline += " "+h+" "+str(myhashes[h])
+ mylines.append(myline)
+ if self.compat and t != "DIST":
+ for h in myhashes.keys():
+ if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
+ continue
+ mylines.append((" ".join([h, str(myhashes[h]), f, str(myhashes["size"])])))
+ fd.write("\n".join(mylines))
+ fd.write("\n")
+
+ def write(self, sign=False):
+ fd = open(self.getFullname(), "w")
+ self._write(fd)
+ if self.compat:
+ digests = self._writeDigests()
+ self._addDigestsToManifest(digests, fd)
+ fd.close()
+ if sign:
+ self.sign()
+
+ def sign(self):
+ raise NotImplementedError()
+
+ def validateSignature(self):
+ raise NotImplementedError()
+
+ def addFile(self, ftype, fname, hashdict=None):
+ if not os.path.exists(self.pkgdir+fname):
+ raise FileNotFound(fname)
+ if not ftype in portage_const.MANIFEST2_IDENTIFIERS:
+ raise InvalidDataType(ftype)
+ self.fhashdict[ftype][fname] = {}
+ if hashdict != None:
+ self.fhashdict[ftype][fname].update(hashdict)
+ if not portage_const.MANIFEST2_REQUIRED_HASH in self.fhashdict[ftype][fname].keys():
+ self.updateFileHashes(ftype, fname)
+
+ def removeFile(self, ftype, fname):
+ del self.fhashdict[ftype][fname]
+
+ def hasFile(self, ftype, fname):
+ return (fname in self.fhashdict[ftype].keys())
+
+ def findFile(self, fname):
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ if fname in self.fhashdict[t]:
+ return t
+ return None
+
+ def create(self, checkExisting=False, assumeDistfileHashes=True):
+ """ Recreate this Manifest from scratch, not using any existing checksums
+ (exception: if assumeDistfileHashes is true then existing DIST checksums are
+ reused if the file doesn't exist in DISTDIR."""
+ if checkExisting:
+ self.checkAllHashes()
+ if assumeDistfileHashes:
+ distfilehashes = self.fhashdict["DIST"]
+ else:
+ distfilehashes = {}
+ self.__init__(self.pkgdir, self.db, self.mysettings, fromScratch=True)
+ for f in portage.listdir(self.pkgdir, filesonly=True, recursive=False):
+ if f.endswith(".ebuild"):
+ mytype = "EBUILD"
+ elif manifest2MiscfileFilter(f):
+ mytype = "MISC"
+ else:
+ continue
+ self.fhashdict[mytype][f] = perform_multiple_checksums(self.pkgdir+f, self.hashes)
+ for f in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=True):
+ if not manifest2AuxfileFilter(f):
+ continue
+ self.fhashdict["AUX"][f] = perform_multiple_checksums(self.pkgdir+"files"+os.sep+f, self.hashes)
+ cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
+ distlist = []
+ for cpv in cpvlist:
+ distlist.extend(self._getCpvDistfiles(cpv))
+ for f in distlist:
+ fname = self.distdir+os.sep+f
+ if os.path.exists(fname):
+ self.fhashdict["DIST"][f] = perform_multiple_checksums(fname, self.hashes)
+ elif assumeDistfileHashes and f in distfilehashes.keys():
+ self.fhashdict["DIST"][f] = distfilehashes[f]
+ else:
+ raise FileNotFound(fname)
+
+ def _getAbsname(self, ftype, fname):
+ if ftype == "DIST":
+ absname = self.distdir+os.sep+fname
+ elif ftype == "AUX":
+ absname = os.sep.join([self.pkgdir, "files", fname])
+ else:
+ absname = self.pkgdir+os.sep+fname
+ return absname
+
+ def checkAllHashes(self, ignoreMissingFiles=False):
+ for t in portage_const.MANIFEST2_IDENTIFIERS:
+ self.checkTypeHashes(t, ignoreMissingFiles=ignoreMissingFiles)
+
+ def checkTypeHashes(self, idtype, ignoreMissingFiles=False):
+ for f in self.fhashdict[idtype].keys():
+ self.checkFileHashes(idtype, f, ignoreMissing=ignoreMissingFiles)
+
+ def checkFileHashes(self, ftype, fname, ignoreMissing=False):
+ myhashes = self.fhashdict[ftype][fname]
+ ok,reason = verify_all(self._getAbsname(ftype, fname), self.fhashdict[ftype][fname])
+ if not ok:
+ raise DigestException(tuple([self._getAbsname(ftype, fname)]+list(reason)))
+ return ok, reason
+
+ def checkCpvHashes(self, cpv, checkDistfiles=True, onlyDistfiles=False, checkMiscfiles=False):
+ """ check the hashes for all files associated to the given cpv, include all
+ AUX files and optionally all MISC files. """
+ if not onlyDistfiles:
+ self.checkTypeHashes("AUX", ignoreMissingFiles=False)
+ if checkMiscfiles:
+ self.checkTypeHashes("MISC", ignoreMissingFiles=False)
+ ebuildname = portage.catsplit(cpv)[1]+".ebuild"
+ self.checkFileHashes("EBUILD", ebuildname, ignoreMissing=False)
+ if checkDistfiles:
+ if onlyDistfiles:
+ for f in self._getCpvDistfiles(cpv):
+ self.checkFileHashes("DIST", f, ignoreMissing=False)
+
+ def _getCpvDistfiles(self, cpv):
+ """ Get a list of all DIST files associated to the given cpv """
+ return self.db.getfetchlist(cpv, mysettings=self.mysettings, all=True)[1]
+
+ def updateFileHashes(self, ftype, fname, checkExisting=True, ignoreMissing=True):
+ """ Regenerate hashes for the given file """
+ if checkExisting:
+ self.checkFileHashes(fname)
+ if not ignoreMissing and not self.fhashdict[ftype].has_key(fname):
+ raise FileNotInManifestException(fname)
+ if not self.fhashdict[ftype].has_key(fname):
+ self.fhashdict[ftype][fname] = {}
+ myhashes = perform_multiple_checksums(self._getAbsname(ftype, fname), self.hashes)
+ self.fhashdict[ftype][fname].update(myhashes)
+
+ def updateTypeHashes(self, idtype, checkExisting=False, ignoreMissingFiles=True):
+ """ Regenerate all hashes for all files of the given type """
+ for fname in self.fhashdict[idtype].keys():
+ self.updateFileHashes(idtype, fname, checkExisting)
+
+ def updateAllHashes(self, checkExisting=False, ignoreMissingFiles=True):
+ """ Regenerate all hashes for all files in this Manifest. """
+ for ftype in portage_const.MANIFEST2_IDENTIFIERS:
+ self.updateTypeHashes(idtype, fname, checkExisting)
+
+ def updateCpvHashes(self, cpv, ignoreMissingFiles=True):
+ """ Regenerate all hashes associated to the given cpv (includes all AUX and MISC
+ files)."""
+ self.updateTypeHashes("AUX", ignoreMissingFiles=ignoreMissingFiles)
+ self.updateTypeHashes("MISC", ignoreMissingFiles=ignoreMissingFiles)
+ ebuildname = portage.catsplit(cpv)[1]+".ebuild"
+ self.updateFileHashes("EBUILD", ebuildname, ignoreMissingFiles=ignoreMissingFiles)
+ for f in self._getCpvDistfiles(cpv):
+ self.updateFileHashes("DIST", f, ignoreMissingFiles=ignoreMissingFiles)
+
+ def getFileData(self, ftype, fname, key):
+ """ Return the value of a specific (type,filename,key) triple, mainly useful
+ to get the size for distfiles."""
+ return self.fhashdict[ftype][fname][key]
+
+ def getVersions(self):
+ """ Returns a list of manifest versions present in the manifest file. """
+ rVal = []
+ mfname = self.getFullname()
+ if not os.path.exists(mfname):
+ return rVal
+ myfile = open(mfname, "r")
+ lines = myfile.readlines()
+ myfile.close()
+ for l in lines:
+ mysplit = l.split()
+ if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS and not 1 in rVal:
+ rVal.append(1)
+ elif len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS and ((len(mysplit) - 3) % 2) == 0 and not 2 in rVal:
+ rVal.append(2)
+ return rVal
=== modified file 'pym/portage.py'
--- pym/portage.py
+++ pym/portage.py
@@ -73,6 +73,7 @@
from portage_data import ostype, lchown, userland, secpass, uid, wheelgid, \
portage_uid, portage_gid
+ from portage_manifest import Manifest
import portage_util
from portage_util import atomic_ofstream, apply_secpass_permissions, \
@@ -2009,181 +2010,67 @@
return 0
return 1
-
-def digestCreate(myfiles,basedir,oldDigest={}):
- """Takes a list of files and the directory they are in and returns the
- dict of dict[filename][CHECKSUM_KEY] = hash
- returns None on error."""
- mydigests={}
- for x in myfiles:
- print "<<<",x
- myfile=os.path.normpath(basedir+"///"+x)
- if os.path.exists(myfile):
- if not os.access(myfile, os.R_OK):
- print "!!! Given file does not appear to be readable. Does it exist?"
- print "!!! File:",myfile
- return None
- mydigests[x] = portage_checksum.perform_multiple_checksums(myfile, hashes=portage_const.MANIFEST1_HASH_FUNCTIONS)
- mysize = os.stat(myfile)[stat.ST_SIZE]
- else:
- if x in oldDigest:
- # DeepCopy because we might not have a unique reference.
- mydigests[x] = copy.deepcopy(oldDigest[x])
- mysize = copy.deepcopy(oldDigest[x]["size"])
- else:
- print "!!! We have a source URI, but no file..."
- print "!!! File:",myfile
- return None
-
- if mydigests[x].has_key("size") and (mydigests[x]["size"] != mysize):
- raise portage_exception.DigestException, "Size mismatch during checksums"
- mydigests[x]["size"] = copy.deepcopy(mysize)
- return mydigests
-
-def digestCreateLines(filelist, mydict):
- mylines = []
- mydigests = copy.deepcopy(mydict)
- for myarchive in filelist:
- mysize = mydigests[myarchive]["size"]
- if len(mydigests[myarchive]) == 0:
- raise portage_exception.DigestException, "No generate digest for '%(file)s'" % {"file":myarchive}
- for sumName in mydigests[myarchive].keys():
- if sumName not in portage_checksum.get_valid_checksum_keys():
- continue
- mysum = mydigests[myarchive][sumName]
-
- myline = sumName[:]
- myline += " "+mysum
- myline += " "+myarchive
- myline += " "+str(mysize)
- mylines.append(myline)
- return mylines
-
-def digestgen(myarchives,mysettings,overwrite=1,manifestonly=0):
+def digestgen(myarchives,mysettings,db=None,overwrite=1,manifestonly=0):
"""generates digest file if missing. Assumes all files are available. If
- overwrite=0, the digest will only be created if it doesn't already exist."""
-
- # archive files
- basedir=mysettings["DISTDIR"]+"/"
- digestfn=mysettings["FILESDIR"]+"/digest-"+mysettings["PF"]
-
- # portage files -- p(ortagefiles)basedir
- pbasedir=mysettings["O"]+"/"
- manifestfn=pbasedir+"Manifest"
-
- if not manifestonly:
- if not os.path.isdir(mysettings["FILESDIR"]):
- os.makedirs(mysettings["FILESDIR"])
- mycvstree=cvstree.getentries(pbasedir, recursive=1)
-
- if ("cvs" in features) and os.path.exists(pbasedir+"/CVS"):
- if not cvstree.isadded(mycvstree,"files"):
- if "autoaddcvs" in features:
- print ">>> Auto-adding files/ dir to CVS..."
- spawn("cd "+pbasedir+"; cvs add files",mysettings,free=1)
- else:
- print "--- Warning: files/ is not added to cvs."
-
- if (not overwrite) and os.path.exists(digestfn):
- return 1
-
- print green(">>> Generating the digest file...")
-
- # Track the old digest so we can assume checksums without requiring
- # all files to be downloaded. 'Assuming'
- myolddigest = {}
- if os.path.exists(digestfn):
- myolddigest = digestParseFile(digestfn)
-
- myarchives.sort()
- try:
- mydigests=digestCreate(myarchives, basedir, oldDigest=myolddigest)
- except portage_exception.DigestException, s:
- print "!!!",s
- return 0
- if mydigests==None: # There was a problem, exit with an errorcode.
- return 0
-
- try:
- outfile=open(digestfn, "w+")
- except SystemExit, e:
- raise
- except Exception, e:
- print "!!! Filesystem error skipping generation. (Read-Only?)"
- print "!!!",e
- return 0
- for x in digestCreateLines(myarchives, mydigests):
- outfile.write(x+"\n")
- outfile.close()
- try:
- os.chown(digestfn,os.getuid(),portage_gid)
- os.chmod(digestfn,0664)
- except SystemExit, e:
- raise
- except Exception,e:
- print e
-
- print green(">>> Generating the manifest file...")
- mypfiles=listdir(pbasedir,recursive=1,filesonly=1,ignorecvs=1,EmptyOnError=1)
- mypfiles=cvstree.apply_cvsignore_filter(mypfiles)
- mypfiles.sort()
- for x in ["Manifest"]:
- if x in mypfiles:
- mypfiles.remove(x)
-
- mydigests=digestCreate(mypfiles, pbasedir)
- if mydigests==None: # There was a problem, exit with an errorcode.
- return 0
-
- try:
- outfile=open(manifestfn, "w+")
- except SystemExit, e:
- raise
- except Exception, e:
- print "!!! Filesystem error skipping generation. (Read-Only?)"
- print "!!!",e
- return 0
- for x in digestCreateLines(mypfiles, mydigests):
- outfile.write(x+"\n")
- outfile.close()
- try:
- os.chown(manifestfn,os.getuid(),portage_gid)
- os.chmod(manifestfn,0664)
- except SystemExit, e:
- raise
- except Exception,e:
- print e
-
- if "cvs" in features and os.path.exists(pbasedir+"/CVS"):
- mycvstree=cvstree.getentries(pbasedir, recursive=1)
- myunaddedfiles=""
- if not manifestonly and not cvstree.isadded(mycvstree,digestfn):
- if digestfn[:len(pbasedir)]==pbasedir:
- myunaddedfiles=digestfn[len(pbasedir):]+" "
- else:
- myunaddedfiles=digestfn+" "
- if not cvstree.isadded(mycvstree,manifestfn[len(pbasedir):]):
- if manifestfn[:len(pbasedir)]==pbasedir:
- myunaddedfiles+=manifestfn[len(pbasedir):]+" "
- else:
- myunaddedfiles+=manifestfn
- if myunaddedfiles:
- if "autoaddcvs" in features:
- print blue(">>> Auto-adding digest file(s) to CVS...")
- spawn("cd "+pbasedir+"; cvs add "+myunaddedfiles,mysettings,free=1)
- else:
- print "--- Warning: digests are not yet added into CVS."
- print darkgreen(">>> Computed message digests.")
- print
+ overwrite=0, the digest will only be created if it doesn't already exist.
+ DEPRECATED: this now only is a compability wrapper for
+ portage_manifest.Manifest()"""
+
+ # NOTE: manifestonly is useless with manifest2 and therefore ignored
+ # NOTE: the old code contains a lot of crap that should really be elsewhere
+ # (e.g. cvs stuff should be in ebuild(1) and/or repoman)
+ # TODO: error/exception handling
+
+ if db == None:
+ db = portagetree().dbapi
+
+ mf = Manifest(mysettings["O"], db, mysettings)
+ for f in myarchives:
+ # the whole type evaluation is only for the case that myarchives isn't a
+ # DIST file as create() determines the type on its own
+ mytype = mf.guessType(f)
+ if mytype == "AUX":
+ f = f[5:]
+ elif mytype == None:
+ continue
+ myrealtype = mf.findFile(f)
+ if myrealtype != None:
+ mytype = myrealtype
+ mf.create(assumeDistfileHashes=True)
+ mf.updateFileHashes(mytype, f, checkExisting=False)
+ # NOTE: overwrite=0 is only used by emerge --digest, not sure we wanna keep that
+ if overwrite or not os.path.exists(mf.getFullname()):
+ mf.write(sign=False)
+
return 1
-
-def digestParseFile(myfilename):
+def digestParseFile(myfilename,mysettings=None,db=None):
"""(filename) -- Parses a given file for entries matching:
<checksumkey> <checksum_hex_string> <filename> <filesize>
Ignores lines that don't start with a valid checksum identifier
and returns a dict with the filenames as keys and {checksumkey:checksum}
- as the values."""
+ as the values.
+ DEPRECATED: this function is now only a compability wrapper for
+ portage_manifest.Manifest()."""
+
+ mysplit = myfilename.split(os.sep)
+ if mysplit[-2] == "files" and mysplit[-1].startswith("digest-"):
+ pkgdir = os.sep+os.sep.join(mysplit[:-2])
+ elif mysplit[-1] == "Manifest":
+ pkgdir = os.sep+os.sep.join(mysplit[:-1])
+
+ if db == None:
+ db = portagetree().dbapi
+ if mysettings == None:
+ mysettings = config(clone=settings)
+
+ mf = Manifest(pkgdir, db, mysettings)
+
+ return mf.getDigests()
+
+ #########################################
+ # Old code that's replaced by the above #
+ #########################################
if not os.path.exists(myfilename):
return None
@@ -2217,7 +2104,11 @@
"""(fileslist, digestdict, basedir) -- Takes a list of files and a dict
of their digests and checks the digests against the indicated files in
the basedir given. Returns 1 only if all files exist and match the checksums.
+ DEPRECATED: this function isn't compatible with manifest2, use
+ portage_manifest.Manifest() instead for any digest related tasks.
"""
+ print "!!! use of deprecated function digestCheckFiles(), use portage_manifest instead"""
+ return 0
for x in myfiles:
if not mydigests.has_key(x):
print
@@ -2249,8 +2140,46 @@
return 1
-def digestcheck(myfiles, mysettings, strict=0, justmanifest=0):
- """Verifies checksums. Assumes all files have been downloaded."""
+def digestcheck(myfiles, mysettings, strict=0, justmanifest=0, db=None):
+ """Verifies checksums. Assumes all files have been downloaded.
+ DEPRECATED: this is now only a compability wrapper for
+ portage_manifest.Manifest()."""
+
+ pkgdir = mysettings["O"]
+ if db == None:
+ db = portagetree().dbapi
+ mf = Manifest(pkgdir, db, mysettings)
+ try:
+ if strict:
+ print ">>> checking ebuild checksums",
+ mf.checkTypeHashes("EBUILD")
+ print ":-)"
+ print ">>> checking auxfile checksums",
+ mf.checkTypeHashes("AUX")
+ print ":-)"
+ print ">>> checking miscfile checksums",
+ mf.checkTypeHashes("MISC", ignoreMissingFiles=True)
+ print ":-)"
+ for f in myfiles:
+ if f.startswith("files/"):
+ f = f[5:]
+ print ">>> checking %s checksums" % f,
+ mf.checkFileHashes(mf.findFile(f), f)
+ print ":-)"
+ except portage_exception.DigestException, e:
+ print e.value
+ print red("!!! ")+"Digest verification failed:"
+ print red("!!! ")+" "+e.value[0]
+ print red("!!! ")+"Reason: "+e.value[1]
+ print red("!!! ")+"Got: "+str(e.value[2])
+ print red("!!! ")+"Expected: "+str(e.value[3])
+ return 0
+ return 1
+
+ #########################################
+ # Old code that's replaced by the above #
+ #########################################
+
# archive files
basedir=mysettings["DISTDIR"]+"/"
digestfn=mysettings["FILESDIR"]+"/digest-"+mysettings["PF"]
=== modified file 'pym/portage_checksum.py'
--- pym/portage_checksum.py
+++ pym/portage_checksum.py
@@ -58,6 +58,11 @@
except ImportError:
pass
+def getsize(filename):
+ size = os.stat(filename).st_size
+ return (size, size)
+hashfunc_map["size"] = getsize
+
# end actual hash functions
prelink_capable = False
@@ -68,7 +73,7 @@
del results
def perform_md5(x, calc_prelink=0):
- return perform_checksum(x, md5hash, calc_prelink)[0]
+ return perform_checksum(x, "MD5", calc_prelink)[0]
def perform_all(x, calc_prelink=0):
mydict = {}
@@ -94,7 +99,7 @@
if x == "size":
continue
elif x in hashfunc_map.keys():
- myhash = perform_checksum(filename, hashfunc_map[x], calc_prelink=calc_prelink)[0]
+ myhash = perform_checksum(filename, x, calc_prelink=calc_prelink)[0]
if mydict[x] != myhash:
if strict:
raise portage_exception.DigestException, "Failed to verify '$(file)s' on checksum type '%(type)s'" % {"file":filename, "type":x}
@@ -118,7 +123,7 @@
return (sum.hexdigest(), size)
-def perform_checksum(filename, hash_function=md5hash, calc_prelink=0):
+def perform_checksum(filename, hashname="MD5", calc_prelink=0):
myfilename = filename[:]
prelink_tmpfile = os.path.join("/", PRIVATE_PATH, "prelink-checksum.tmp." + str(os.getpid()))
mylock = None
@@ -132,7 +137,9 @@
#portage_util.writemsg(">>> prelink checksum '"+str(filename)+"'.\n")
myfilename=prelink_tmpfile
try:
- myhash, mysize = hash_function(myfilename)
+ if hashname not in hashfunc_map:
+ raise portage_exception.DigestException, hashname+" hash function not available (needs dev-python/pycrypto)"
+ myhash, mysize = hashfunc_map[hashname](myfilename)
except (OSError, IOError), e:
if e.errno == errno.ENOENT:
raise portage_exception.FileNotFound(e)
@@ -155,5 +162,5 @@
for x in hashes:
if x not in hashfunc_map:
raise portage_exception.DigestException, x+" hash function not available (needs dev-python/pycrypto)"
- rVal[x] = perform_checksum(filename, hashfunc_map[x], calc_prelink)[0]
+ rVal[x] = perform_checksum(filename, x, calc_prelink)[0]
return rVal
=== modified file 'pym/portage_const.py'
--- pym/portage_const.py
+++ pym/portage_const.py
@@ -47,10 +47,10 @@
EAPI = 0
HASHING_BLOCKSIZE = 32768
-# Disabling until behaviour when missing the relevant python module is
-# corrected. #116485
MANIFEST1_HASH_FUNCTIONS = ["MD5","SHA256","RMD160"]
+MANIFEST2_HASH_FUNCTIONS = ["SHA1","SHA256","RMD160"]
+MANIFEST2_IDENTIFIERS = ["AUX","MISC","DIST","EBUILD"]
# ===========================================================================
# END OF CONSTANTS -- END OF CONSTANTS -- END OF CONSTANTS -- END OF CONSTANT
# ===========================================================================
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
2006-03-16 5:53 ` Zac Medico
@ 2006-03-16 7:15 ` Brian Harring
2006-03-16 7:14 ` Donnie Berkholz
0 siblings, 1 reply; 8+ messages in thread
From: Brian Harring @ 2006-03-16 7:15 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 15258 bytes --]
On Wed, Mar 15, 2006 at 09:53:24PM -0800, Zac Medico wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Marius Mauch wrote:
> > Marius Mauch schrieb:
> >> The first should be delayed until there is some consensus how the gpg
> >> stuff should work in the future, the others I don't see the use for.
> >> Also I only checked portage.py for changes, so emerge/repoman/... might
> >> still have to be fixed.
> >> Last but not least: I did some basic testing with this and the
> >> important stuff seems to work, but I'm quite sure the code still has a
> >> lot of bugs/issues, and this being a core functionality it needs a
> >> *lot* of testing, so I'd really appreciate if you could all give it a
> >> spin (but do not commit anything to the tree without manually checking
> >> it first).
> >
> > Does the lack of feedback (only got a reaction from Brian so far) mean
> > that noone tried it or that it doesn't have any issues?
>
> The patch applies and seems to work well. At a quick glance the code looks pretty clean and it's nice to migrate more code out of portage.py to a separate module. I've attached a refreshed version of the patch that applies cleanly against current svn (I've made no changes).
>
> Zac
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.2.2 (GNU/Linux)
>
> iD8DBQFEGP1S/ejvha5XGaMRAl/7AJ9cZbjhWtjCz+ac2/tjQNUoivj0twCg7xAG
> cYvDbMiqU5HtpNrVk7fs6RM=
> =Eqlo
> -----END PGP SIGNATURE-----
> === added file 'pym/portage_manifest.py'
> --- /dev/null
> +++ pym/portage_manifest.py
> @@ -0,0 +1,314 @@
> +import os, sets
> +
> +import portage, portage_exception, portage_versions, portage_const
> +from portage_checksum import *
> +from portage_exception import *
> +
> +class FileNotInManifestException(PortageException):
> + pass
> +
> +def manifest2AuxfileFilter(filename):
> + filename = filename.strip("/")
> + return not (filename in ["CVS", ".svn"] or filename[:len("digest-")] == "digest-")
> +
> +def manifest2MiscfileFilter(filename):
> + filename = filename.strip("/")
> + return not (filename in ["CVS", ".svn", "files", "Manifest"] or filename[-7:] == ".ebuild")
python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
1000000 loops, best of 3: 0.88 usec per loop
python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
1000000 loops, best of 3: 0.564 usec per loop
Use endswith
oddly, worth noting that startswith differs in this behaviour...
python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
1000000 loops, best of 3: 0.592 usec per loop
python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
1000000 loops, best of 3: 0.842 usec per loop
> +class Manifest(object):
> + def __init__(self, pkgdir, db, mysettings, hashes=portage_const.MANIFEST2_HASH_FUNCTIONS, manifest1_compat=True, fromScratch=False):
> + self.pkgdir = pkgdir+os.sep
rstrip os.sep prior to adding it
> + self.fhashdict = {}
> + self.hashes = hashes
> + self.hashes.append("size")
> + if manifest1_compat:
> + self.hashes.extend(portage_const.MANIFEST1_HASH_FUNCTIONS)
> + self.hashes = sets.Set(self.hashes)
> + for t in portage_const.MANIFEST2_IDENTIFIERS:
> + self.fhashdict[t] = {}
> + self._read()
> + self.compat = manifest1_compat
> + self.db = db
> + self.mysettings = mysettings
> + if mysettings.has_key("PORTAGE_ACTUAL_DISTDIR"):
> + self.distdir = mysettings["PORTAGE_ACTUAL_DISTDIR"]
> + else:
> + self.distdir = mysettings["DISTDIR"]
Why pass in mysettings?
Have the code push it in, manifest shouldn't know about the DISTDIR
key nor PORTAGE_ACTUAL_DISTDIR, should just have a directory to look
in.
> + def guessType(self, filename):
> + if filename.startswith("files/digest-"):
> + return None
> + if filename.startswith("files/"):
if you're intent on using os.sep, might want to correct the two '/'
uses above to use os.path.join/os.path.sep
If concerned about cost, just calculate it once in the class namespace
as a constant.
related, might I suggest converting away from internal strings to a
class level enumeration?
int comparison is faster then string, plus it unbinds the internal
code from the on disk symbols used (eg, just cause on disk is AUX
doesn't mean internally it should be throwing around "AUX").
> + return "AUX"
> + elif filename.endswith(".ebuild"):
> + return "EBUILD"
> + elif filename in ["ChangeLog", "metadata.xml"]:
> + return "MISC"
> + else:
> + return "DIST"
> +
> + def getFullname(self):
> + return self.pkgdir+"Manifest"
Err... move that into the initializer.
If you're concerned folks will screw up the var, use a property to
make it immutable.
Either way, func calls aren't cheap, and that's not really needed :)
> +
> + def getDigests(self):
> + rval = {}
> + for t in portage_const.MANIFEST2_IDENTIFIERS:
> + rval.update(self.fhashdict[t])
> + return rval
> +
> + def _readDigests(self):
> + mycontent = ""
> + for d in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=False):
> + if d.startswith("digest-"):
> + mycontent += open(self.pkgdir+"files"+os.sep+d, "r").read()
> + return mycontent
Any reason to use portage.listdir?
It horks up the import's a bit, bloats the portage.listdir cache
for data that shouldn't be accessed all that often.
> +
> + def _read(self):
> + if not os.path.exists(self.getFullname()):
> + return
> + fd = open(self.getFullname(), "r")
> + mylines = fd.readlines()
> + fd.close()
try:
mylines = open(self.getFullname(), "r").readlines()
except OSError, oe:
if oe.errno == errno.ENOENT:
return
raise
one less stat.
> + mylines.extend(self._readDigests().split("\n"))
> + for l in mylines:
> + myname = ""
> + mysplit = l.split()
> + if len(mysplit) == 4 and mysplit[0] in portage_const.MANIFEST1_HASH_FUNCTIONS:
> + myname = mysplit[2]
> + mytype = self.guessType(myname)
> + if mytype == "AUX" and myname.startswith("files/"):
> + myname = myname[6:]
> + if mytype == None:
if mytype is None
rather then if mytype == None
None is singleton instance, explicit is check is faster (and never
will bite you in the ass, same for True and False)
As stated above, personally I'd convert that over to a class level
dict mapping external symbols to internal.
> + continue
> + mysize = int(mysplit[3])
> + myhashes = {mysplit[0]: mysplit[1]}
> + if len(mysplit) > 4 and mysplit[0] in portage_const.MANIFEST2_IDENTIFIERS:
> + mytype = mysplit[0]
> + myname = mysplit[1]
> + mysize = int(mysplit[2])
> + myhashes = dict(zip(mysplit[3::2], mysplit[4::2]))
> + if len(myname) == 0:
> + continue
> + if not self.fhashdict[mytype].has_key(myname):
> + self.fhashdict[mytype][myname] = {}
self.fhashdict[mytype].setdefault(myname, {})
> + self.fhashdict[mytype][myname].update(myhashes)
> + self.fhashdict[mytype][myname]["size"] = mysize
> +
> + def _writeDigests(self):
> + cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
again, if using os.sep (trying to be os agnostic), nuke the "/" usage
that's kinda fugly also- just chop splice x starting at the len of the
basedir.
> + rval = []
> + for cpv in cpvlist:
> + dname = self.pkgdir+"files"+os.sep+"digest-"+portage.catsplit(cpv)[1]
os.path.join...
> + mylines = []
> + distlist = self._getCpvDistfiles(cpv)
> + for f in self.fhashdict["DIST"].keys():
> + if f in distlist:
> + for h in self.fhashdict["DIST"][f].keys():
> + if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
> + continue
> + myline = " ".join([h, str(self.fhashdict["DIST"][f][h]), f, str(self.fhashdict["DIST"][f]["size"])])
> + mylines.append(myline)
> + fd = open(dname, "w")
> + fd.write("\n".join(mylines))
> + fd.write("\n")
> + fd.close()
rather then building your own list and then flushing it, use
zmedico's atomic_write and write straight to the file obj; forego the
intermediate building of the list and rely on file buffering (file
buffering will occur anyways).
> + rval.append(dname)
> + return rval
> +
> + def _addDigestsToManifest(self, digests, fd):
> + mylines = []
> + for dname in digests:
> + myhashes = perform_multiple_checksums(dname, portage_const.MANIFEST1_HASH_FUNCTIONS+["size"])
> + for h in myhashes.keys():
> + mylines.append((" ".join([h, str(myhashes[h]), os.path.join("files", os.path.basename(dname)), str(myhashes["size"])])))
> + fd.write("\n".join(mylines))
> + fd.write("\n")
Same thing here; just dump to the fd directly
> + def _write(self, fd):
> + mylines = []
> + for t in self.fhashdict.keys():
> + for f in self.fhashdict[t].keys():
> + myline = " ".join([t, f, str(self.fhashdict[t][f]["size"])])
> + myhashes = self.fhashdict[t][f]
> + for h in myhashes.keys():
drop the .keys, just do for h in myhashes.
myhashes will default to iterkeys which is faster, and less code
involved.
> + if h not in portage_const.MANIFEST2_HASH_FUNCTIONS:
> + continue
I'd moved that into a filter/iterfilter of myhashes in the for loop,
but your call
> + myline += " "+h+" "+str(myhashes[h])
> + mylines.append(myline)
> + if self.compat and t != "DIST":
> + for h in myhashes.keys():
> + if h not in portage_const.MANIFEST1_HASH_FUNCTIONS:
> + continue
> + mylines.append((" ".join([h, str(myhashes[h]), f, str(myhashes["size"])])))
> + fd.write("\n".join(mylines))
> + fd.write("\n")
and here...
> +
> + def write(self, sign=False):
> + fd = open(self.getFullname(), "w")
> + self._write(fd)
> + if self.compat:
> + digests = self._writeDigests()
> + self._addDigestsToManifest(digests, fd)
> + fd.close()
> + if sign:
> + self.sign()
> +
> + def sign(self):
> + raise NotImplementedError()
> +
> + def validateSignature(self):
> + raise NotImplementedError()
> +
> + def addFile(self, ftype, fname, hashdict=None):
> + if not os.path.exists(self.pkgdir+fname):
> + raise FileNotFound(fname)
> + if not ftype in portage_const.MANIFEST2_IDENTIFIERS:
> + raise InvalidDataType(ftype)
> + self.fhashdict[ftype][fname] = {}
> + if hashdict != None:
> + self.fhashdict[ftype][fname].update(hashdict)
> + if not portage_const.MANIFEST2_REQUIRED_HASH in self.fhashdict[ftype][fname].keys():
> + self.updateFileHashes(ftype, fname)
extra stat there, would just catch the throw from updatefilehashs
personally.
> +
> + def removeFile(self, ftype, fname):
> + del self.fhashdict[ftype][fname]
> +
> + def hasFile(self, ftype, fname):
> + return (fname in self.fhashdict[ftype].keys())
ick, no. you've forced that to be linear when a return fname in
self.fhashdict[type] can uses __contains__ which is just a hash look
up.
> +
> + def findFile(self, fname):
> + for t in portage_const.MANIFEST2_IDENTIFIERS:
> + if fname in self.fhashdict[t]:
> + return t
> + return None
These funcs really make me think you should just access fhashdict
directly...
> + def create(self, checkExisting=False, assumeDistfileHashes=True):
> + """ Recreate this Manifest from scratch, not using any existing checksums
> + (exception: if assumeDistfileHashes is true then existing DIST checksums are
> + reused if the file doesn't exist in DISTDIR."""
> + if checkExisting:
> + self.checkAllHashes()
> + if assumeDistfileHashes:
> + distfilehashes = self.fhashdict["DIST"]
> + else:
> + distfilehashes = {}
> + self.__init__(self.pkgdir, self.db, self.mysettings, fromScratch=True)
> + for f in portage.listdir(self.pkgdir, filesonly=True, recursive=False):
> + if f.endswith(".ebuild"):
> + mytype = "EBUILD"
> + elif manifest2MiscfileFilter(f):
> + mytype = "MISC"
> + else:
> + continue
> + self.fhashdict[mytype][f] = perform_multiple_checksums(self.pkgdir+f, self.hashes)
> + for f in portage.listdir(self.pkgdir+"files", filesonly=True, recursive=True):
> + if not manifest2AuxfileFilter(f):
> + continue
> + self.fhashdict["AUX"][f] = perform_multiple_checksums(self.pkgdir+"files"+os.sep+f, self.hashes)
> + cpvlist = [self.pkgdir.rstrip("/").split("/")[-2]+"/"+x[:-7] for x in portage.listdir(self.pkgdir) if x.endswith(".ebuild")]
> + distlist = []
> + for cpv in cpvlist:
> + distlist.extend(self._getCpvDistfiles(cpv))
> + for f in distlist:
> + fname = self.distdir+os.sep+f
> + if os.path.exists(fname):
> + self.fhashdict["DIST"][f] = perform_multiple_checksums(fname, self.hashes)
> + elif assumeDistfileHashes and f in distfilehashes.keys():
> + self.fhashdict["DIST"][f] = distfilehashes[f]
> + else:
> + raise FileNotFound(fname)
> +
> + def _getAbsname(self, ftype, fname):
> + if ftype == "DIST":
> + absname = self.distdir+os.sep+fname
os.path.join...
> + elif ftype == "AUX":
> + absname = os.sep.join([self.pkgdir, "files", fname])
os.path.join. that one is just fugly btw ;)
> + else:
> + absname = self.pkgdir+os.sep+fname
same...
> + return absname
> +
> + def checkAllHashes(self, ignoreMissingFiles=False):
> + for t in portage_const.MANIFEST2_IDENTIFIERS:
> + self.checkTypeHashes(t, ignoreMissingFiles=ignoreMissingFiles)
> +
> + def checkTypeHashes(self, idtype, ignoreMissingFiles=False):
> + for f in self.fhashdict[idtype].keys():
> + self.checkFileHashes(idtype, f, ignoreMissing=ignoreMissingFiles)
> +
> + def checkFileHashes(self, ftype, fname, ignoreMissing=False):
> + myhashes = self.fhashdict[ftype][fname]
> + ok,reason = verify_all(self._getAbsname(ftype, fname), self.fhashdict[ftype][fname])
> + if not ok:
> + raise DigestException(tuple([self._getAbsname(ftype, fname)]+list(reason)))
> + return ok, reason
> +
> + def checkCpvHashes(self, cpv, checkDistfiles=True, onlyDistfiles=False, checkMiscfiles=False):
> + """ check the hashes for all files associated to the given cpv, include all
> + AUX files and optionally all MISC files. """
> + if not onlyDistfiles:
> + self.checkTypeHashes("AUX", ignoreMissingFiles=False)
> + if checkMiscfiles:
> + self.checkTypeHashes("MISC", ignoreMissingFiles=False)
> + ebuildname = portage.catsplit(cpv)[1]+".ebuild"
> + self.checkFileHashes("EBUILD", ebuildname, ignoreMissing=False)
> + if checkDistfiles:
> + if onlyDistfiles:
> + for f in self._getCpvDistfiles(cpv):
> + self.checkFileHashes("DIST", f, ignoreMissing=False)
> +
Personally... these funcs seem wrong to me.
Manifest (to me) strikes me as just a data store;
verification/generation of chksums should be external...
your call however.
> + def _getCpvDistfiles(self, cpv):
> + """ Get a list of all DIST files associated to the given cpv """
> + return self.db.getfetchlist(cpv, mysettings=self.mysettings, all=True)[1]
Manifest code knowing about dbapi instances seems really rather icky
to me...
~harring
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
2006-03-16 7:15 ` Brian Harring
@ 2006-03-16 7:14 ` Donnie Berkholz
2006-03-16 7:23 ` Brian Harring
0 siblings, 1 reply; 8+ messages in thread
From: Donnie Berkholz @ 2006-03-16 7:14 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 833 bytes --]
Brian Harring wrote:
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
> 1000000 loops, best of 3: 0.88 usec per loop
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
> 1000000 loops, best of 3: 0.564 usec per loop
> Use endswith
> oddly, worth noting that startswith differs in this behaviour...
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
> 1000000 loops, best of 3: 0.592 usec per loop
> python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
> 1000000 loops, best of 3: 0.842 usec per loop
Um, those both read the same way to me. You just switched the ordering
around, so the (starts|ends)with is on the bottom instead of the top,
but both times (starts|ends)with is longer.
Thanks,
Donnie
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 191 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
2006-03-16 7:14 ` Donnie Berkholz
@ 2006-03-16 7:23 ` Brian Harring
2006-03-16 9:06 ` tvali
0 siblings, 1 reply; 8+ messages in thread
From: Brian Harring @ 2006-03-16 7:23 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 1077 bytes --]
On Wed, Mar 15, 2006 at 11:14:04PM -0800, Donnie Berkholz wrote:
> Brian Harring wrote:
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
> > 1000000 loops, best of 3: 0.88 usec per loop
>
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
> > 1000000 loops, best of 3: 0.564 usec per loop
>
> > Use endswith
>
> > oddly, worth noting that startswith differs in this behaviour...
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
> > 1000000 loops, best of 3: 0.592 usec per loop
>
> > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
> > 1000000 loops, best of 3: 0.842 usec per loop
>
> Um, those both read the same way to me. You just switched the ordering
> around, so the (starts|ends)with is on the bottom instead of the top,
> but both times (starts|ends)with is longer.
This is why crack is bad, mm'kay.
/me lights the pipe and goes back to his corner.
Pardon, just did a quick test and screwed the results ;)
~harring
[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: [gentoo-portage-dev] [PATCH] Manifest2 reloaded
2006-03-16 7:23 ` Brian Harring
@ 2006-03-16 9:06 ` tvali
2006-03-16 9:24 ` Brian Harring
0 siblings, 1 reply; 8+ messages in thread
From: tvali @ 2006-03-16 9:06 UTC (permalink / raw
To: gentoo-portage-dev
Just in case ...what i have to do to test those results
...ps. should i send it here if i have a working c++ class for forking
-- it evolved from that python thought here, which evolved into
general interest to those pipes and interacting with other apps in my
case (which is, as i have understood, important in "unix-like
operating systems") ..therefore i encapsulate it into some generic c++
class to do piping and add some error checking, which would give
simple way to use scripting languages, too. i have still not installed
that c++ ide i like, but anyway, Kate is not so bad :) I have it
almost ready but it seems that i have to do some work now for a while
...do i send it here, too, when done or noone needs such thing
anymore? it just runs some command and gives a simple way to send
messages to its stdin and read its stdout so that interacting with
things like python could be simple, too.
2006/3/16, Brian Harring <ferringb@gmail.com>:
> On Wed, Mar 15, 2006 at 11:14:04PM -0800, Donnie Berkholz wrote:
> > Brian Harring wrote:
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.endswith(".ebuild")'
> > > 1000000 loops, best of 3: 0.88 usec per loop
> >
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[-7:] == ".ebuild"'
> > > 1000000 loops, best of 3: 0.564 usec per loop
> >
> > > Use endswith
> >
> > > oddly, worth noting that startswith differs in this behaviour...
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's[:7] == ".ebuild"'
> > > 1000000 loops, best of 3: 0.592 usec per loop
> >
> > > python -m timeit -s 's="asdf"*400;s+="fdsa.ebuild"' 's.startswith(".ebuild")'
> > > 1000000 loops, best of 3: 0.842 usec per loop
> >
> > Um, those both read the same way to me. You just switched the ordering
> > around, so the (starts|ends)with is on the bottom instead of the top,
> > but both times (starts|ends)with is longer.
>
> This is why crack is bad, mm'kay.
>
> /me lights the pipe and goes back to his corner.
>
> Pardon, just did a quick test and screwed the results ;)
> ~harring
>
>
>
>
--
tvali
(e-mail: "qtvali@gmail.com"; msn: "qtvali@gmail.com";
icq: "317-492-912")
Ühe eesti internetifirma lehel kohtasin tsitaati:
If you don't do it excellently, dont do it at all. Because if it's not
excellent, it won't be profitable or fun, and if you're not in
business for fun or profit, what the hell are you doing here?
Robert Townsend
--
gentoo-portage-dev@gentoo.org mailing list
^ permalink raw reply [flat|nested] 8+ messages in thread
end of thread, other threads:[~2006-03-16 9:24 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-03-04 3:50 [gentoo-portage-dev] [PATCH] Manifest2 reloaded Marius Mauch
2006-03-15 23:44 ` Marius Mauch
2006-03-16 5:53 ` Zac Medico
2006-03-16 7:15 ` Brian Harring
2006-03-16 7:14 ` Donnie Berkholz
2006-03-16 7:23 ` Brian Harring
2006-03-16 9:06 ` tvali
2006-03-16 9:24 ` Brian Harring
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox