From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (pigeon.gentoo.org [208.92.234.80])
    by finch.gentoo.org (Postfix) with ESMTP id 5D0801381F3
    for ; Mon, 17 Jun 2013 21:41:25 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id D95C6E0960;
    Mon, 17 Jun 2013 21:41:20 +0000 (UTC)
Received: from smtp.gentoo.org (smtp.gentoo.org [140.211.166.183])
    (using TLSv1 with cipher AECDH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by pigeon.gentoo.org (Postfix) with ESMTPS id A3CCCE086D
    for ; Mon, 17 Jun 2013 21:41:19 +0000 (UTC)
Received: from [192.168.1.210] (S010600222de111ff.vc.shawcable.net [96.49.5.156])
    (using TLSv1 with cipher ECDHE-RSA-AES256-SHA (256/256 bits))
    (No client certificate requested)
    (Authenticated sender: dolsen)
    by smtp.gentoo.org (Postfix) with ESMTPSA id 649EB33E0CB
    for ; Mon, 17 Jun 2013 21:41:18 +0000 (UTC)
Message-ID: <1371505260.28535.131.camel@big_daddy.dol-sen.ca>
Subject: Re: [gentoo-dev] [RFC] unpacker.eclass extensions
From: Brian Dolbec
To: gentoo-dev@lists.gentoo.org
Date: Mon, 17 Jun 2013 14:41:00 -0700
In-Reply-To: <51BF7372.8040608@gentoo.org>
References: <51BC282E.9020306@mva.name>
    <201306170155.10347.vapier@gentoo.org>
    <51BEB7A9.6090901@flameeyes.eu>
    <201306171208.31369.vapier@gentoo.org>
    <51BF3F5D.7020708@gentoo.org>
    <51BF6F6A.60300@flameeyes.eu>
    <51BF7372.8040608@gentoo.org>
Content-Type: multipart/mixed; boundary="=-ytCbHuKg4k54YqBPqc7S"
X-Mailer: Evolution 3.6.4
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@lists.gentoo.org
Reply-to: gentoo-dev@lists.gentoo.org
Mime-Version: 1.0
X-Archives-Salt: 12a895e1-c72c-44e4-94a5-97b9a01ca946
X-Archives-Hash: b42331ae23d5718b1639f71892ddebc0

--=-ytCbHuKg4k54YqBPqc7S
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 8bit

On Mon, 2013-06-17 at 16:37 -0400, Rick "Zero_Chaos" Farina wrote:
> On
> 06/17/2013 04:19 PM, Diego Elio Pettenò wrote:
> > On 17/06/2013 17:54, Rick "Zero_Chaos" Farina wrote:
> >> I make all my files with "tar cJf"
> >>
> >> zero@ozzie ~ % file /usr/portage/distfiles/gr-osmosdr-0.0.2.tar.xz
> >> /usr/portage/distfiles/gr-osmosdr-0.0.2.tar.xz: XZ compressed data
> >
> > cJ with _current_ tar will generate XZ
> > cJ with _past_ tar could generate lzma
> > xJ with _current_ tar will extract both XZ and lzma
> > xJ with _past_ tar will only extract lzma
> >
> >
> tar for the last several years properly and automatically detects the
> format of the input using just 'x'. This is true for gnu and bsd tar
> which should cover all of gentoo. We should be passing "tar xf" to
> extract things unless we want to hijack the decompression command like
> is possible in portage with BZIP2_COMMAND=lbzip2 in which case we want
> to pass "tar -I $BZIP2_COMMAND -xf" or (not yet in portage) "tar -I pixz
> -xf". Some good examples of how this is handled are in catalyst but
> I'll let dolsen talk about his own work as he is much more qualified
> than I am.
>
> -Zero

Well, I don't code bash, but...

In catalyst, I had a request to add more types of
compression/decompression capabilities. Plus I am also trying to clean
up the messy, poorly structured code.

What I have come up with is a separate class that can be loaded with
either compression or decompression commands that can be configured,
overridden, added to,... It will provide a python interface to perform
the actions requested, keeping the detailed coding central and internal.
It will automatically detect the command to use from the file's
extension: the pre-configured preference is used if it is compatible
with that extension, otherwise a suitable command is determined by the
extension itself. There can also be several predefined ways to do the
same action, in which case the pre-configured preference will take
priority.
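As a rough illustration of the selection logic described above, here is
a minimal sketch. It is not the attached compress.py: the table
DECOMPRESS_MODES and the helpers pick_mode() and build_command() are
hypothetical names made up for this example, and only a handful of tar
invocations are listed.

```python
import os

# Hypothetical table for illustration only: mode -> (argv template, extension).
DECOMPRESS_MODES = {
    "lbzip2": (["tar", "-I", "lbzip2", "-xpf", "{source}", "-C", "{destination}"], "bz2"),
    "bz2": (["tar", "-xpf", "{source}", "-C", "{destination}"], "bz2"),
    "xz": (["tar", "-xpf", "{source}", "-C", "{destination}"], "xz"),
    "pixz": (["tar", "-I", "pixz", "-xpf", "{source}", "-C", "{destination}"], "xz"),
}


def pick_mode(source, preferred=None):
    """Return the preferred mode if it handles the file's extension,
    else the mode named after the extension, else None."""
    ext = os.path.splitext(source)[1].lstrip(".")
    if preferred in DECOMPRESS_MODES and DECOMPRESS_MODES[preferred][1] == ext:
        return preferred
    return ext if ext in DECOMPRESS_MODES else None


def build_command(source, destination, preferred=None):
    """Build the argv list that would extract source into destination."""
    mode = pick_mode(source, preferred)
    if mode is None:
        raise ValueError("no decompression mode for %r" % source)
    template = DECOMPRESS_MODES[mode][0]
    return [arg.format(source=source, destination=destination) for arg in template]
```

Under these assumptions, build_command("stage3.tar.bz2", "/mnt/work",
preferred="lbzip2") yields the "tar -I lbzip2 -xpf" form, while the same
preference on a .tar.xz file falls back to plain "tar -xpf", since the
preference is not compatible with that extension.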

I hadn't gotten into reading this thread much, but did note the
similarity between what I am currently doing and what it was discussing.

What is directly relevant to the tar commands is listed in the
compress_definitions and decompress_definitions at the top of the
attached file.

This python lib can easily be made into a standalone pkg with a cli
interface to perform many common actions without the need for additional
coding in eclasses, package managers, etc. It could provide gentoo with
a common, central, easily extended and updated method of performing
normal compression/decompression.

It is a work in progress, with only preliminary testing done inside my
catalyst rewrite. I have not yet added configuration options and
override preferences.

It is available in my catalyst git repo in my dev space, in the
compress branch. Note there is no master branch there, so a basic clone
will error when it tries to check out a master working copy on
completion of the clone. Just "git checkout compress" afterwards.

http://dev.gentoo.org/~dolsen/catalyst/

I've attached the current compress.py file so you can peruse it and
decide if making this a standalone gentoo project is desired. If not,
it will get merged into the catalyst code base.
--
Brian Dolbec

--=-ytCbHuKg4k54YqBPqc7S
Content-Disposition: attachment; filename="compress.py"
Content-Type: text/x-python; name="compress.py"; charset="UTF-8"
Content-Transfer-Encoding: 7bit


# Maintained in full by:
# Catalyst Team
# Release Engineering Team

'''
compress.py

Utility class to hold and handle all possible compression
and de-compression of files, including rsync transfers.
'''

import os
from collections import namedtuple

from support import cmd


definition_fields = ["func", "cmd", "args", "id", "extension"]
definition_types = [str, str, list, str, str]

'''The definition entries are to follow the definition_types
with the exception of the first entry "Type" which is a mode identifier
for use in the class as a type ID and printable string.'''

compress_definitions = {
    "Type": ["Compression", "Compression definitions loaded"],
    "rsync": ["rsync", "rsync",
        ["-a", "--delete", "%(source)s", "%(destination)s"],
        "RSYNC", None],
    "lbzip2": ["_common", "tar",
        ["-I", "lbzip2", "-cf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "LBZIP2", "tbz2"],
    "tbz2": ["_common", "tar",
        ["-I", "lbzip2", "-cf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "LBZIP2", "tbz2"],
    "bz2": ["_common", "tar",
        ["-cpjf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "BZIP2", "tar.bz2"],
    "tar": ["_common", "tar",
        ["-cpf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "TAR", "tar"],
    "xz": ["_common", "tar",
        ["-cpJf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "XZ", "tar.xz"],
    "pixz": ["_common", "tar",
        ["-I", "pixz", "-cpf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "PIXZ", "xz"],
    "zip": ["_common", "tar",
        ["-cpzf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"],
        "GZIP", "zip"],
    }


decompress_definitions = {
    "Type": ["Decompression", "Decompression definitions loaded"],
    "rsync": ["rsync", "rsync",
        ["-a", "--delete", "%(source)s", "%(destination)s"],
        "RSYNC", None],
    "lbzip2": ["_common", "tar",
        ["-I", "lbzip2", "-xpf", "%(source)s", "-C", "%(destination)s"],
        "LBZIP2", "bz2"],
    "tbz2": ["_common", "tar",
        ["-I", "lbzip2", "-xpf", "%(source)s", "-C", "%(destination)s"],
        "LBZIP2", "tbz2"],
    "bz2": ["_common", "tar",
        ["-xpf", "%(source)s", "-C", "%(destination)s"],
        "BZIP2", "bz2"],
    "tar": ["_common", "tar",
        ["-xpf", "%(source)s", "-C", "%(destination)s"],
        "TAR", "tar"],
    "xz": ["_common", "tar",
        ["-xpf", "%(source)s", "-C", "%(destination)s"],
        "XZ", "xz"],
    "pixz": ["_common", "tar",
        ["-I", "pixz", "-xpf", "%(source)s", "-C", "%(destination)s"],
        "PIXZ", "xz"],
    "zip": ["_common", "tar",
        ["-xpzf", "%(source)s", "-C", "%(destination)s"],
        "GZIP", "zip"],
    "gz": ["_common", "tar",
        ["-xpzf", "%(source)s", "-C", "%(destination)s"],
        "GZIP", "zip"],
    }


'''Configure this here in case it is ever changed.
This is the only edit point required then.'''
extension_separator = '.'


class CompressMap(object):
    '''Class for handling
    Catalyst's compression & decompression of archives'''

    '''fields: list of ordered field names for the (de)compression functions'''
    fields = definition_fields[:]

    def __init__(self, definitions=None, env=None,
            default_mode=None, separator=extension_separator):
        '''Class init

        @param definitions: dictionary of
            Key:[function, cmd, cmd_args, Print/id string, extension]
        @param env: environment to pass to the cmd subprocess
        @param default_mode: optional string, mode to use when none is passed in
        @param separator: string, the filename extension separator
        '''
        if definitions is None:
            definitions = {}
            self.loaded_type = ["None", "No definitions loaded"]
        else:
            self.loaded_type = definitions.pop('Type')
        self.env = env or {}
        self.mode_error = self.loaded_type[0] + \
            " Error: No mode was passed in or automatically detected"
        self._map = {}
        self.extension_separator = separator
        # set some defaults depending on what is being loaded
        if self.loaded_type[0] in ['Compression']:
            self.mode = default_mode or 'tbz2'
            self.compress = self._compress
            self.extract = None
        else:
            self.mode = default_mode or 'auto'
            self.compress = None
            self.extract = self._extract

        # create the (de)compression definition namedtuple classes
        for name in list(definitions):
            obj = namedtuple(name, self.fields)
            obj.__slots__ = ()
            self._map[name] = obj._make(definitions[name])
            del obj

    def _compress(self, infodict=None, filename='', source=None,
            basedir='.', mode=None, auto_extension=False, fatal=True):
        '''Compression function

        @param infodict: optional dictionary of
            the next 4 parameters.
        @param filename: optional string, name of the file to make
        @param source: optional string, path to a directory
        @param basedir: optional string, path to the directory tar
            changes into (-C) before adding source
        @param mode: string, optional mode to use to (de)compress with
        @return boolean
        '''
        if not infodict:
            infodict = self.create_infodict(source, None,
                basedir, filename, mode or self.mode, auto_extension)
        if not infodict['mode']:
            print(self.mode_error)
            return False
        if auto_extension:
            infodict['auto-ext'] = True
        return self._run(infodict, fatal=fatal)

    def _extract(self, infodict=None, source=None, destination=None,
            mode=None, fatal=True):
        '''De-compression function

        @param infodict: optional dictionary of the next 3 parameters.
        @param source: optional string, path to a directory
        @param destination: optional string, path to a directory
        @param mode: string, optional mode to use to (de)compress with
        @return boolean
        '''
        if self.loaded_type[0] not in ["Decompression"]:
            return False
        if not infodict:
            infodict = self.create_infodict(source, destination, mode=mode)
        if infodict['mode'] in [None]:
            infodict['mode'] = self.mode or 'auto'
        if infodict['mode'] in ['auto']:
            infodict['mode'] = self.get_extension(infodict['source'])
            if not infodict['mode']:
                print(self.mode_error)
                return False
        return self._run(infodict, fatal=fatal)

    def _run(self, infodict, fatal=True):
        '''Internal function that runs the designated function

        @param infodict: dictionary as returned by create_infodict()
        @param fatal: boolean, passed through to the designated function
        @return boolean
        '''
        if not self.is_supported(infodict['mode']):
            print("mode: %s is not supported in the current %s definitions"
                % (infodict['mode'], self.loaded_type[1]))
            return False
        try:
            func = getattr(self, self._map[infodict['mode']].func)
            success = func(infodict, fatal)
        except AttributeError:
            print("FAILED to find function '%s'"
                % str(self._map[infodict['mode']].func))
            return False
        #except Exception as e:
            #msg = "Error performing %s %s, " % (mode, self.loaded_type[0]) + \
                #"is the appropriate utility installed on your system?"
            #print msg
            #print "Exception:", e
            #return False
        return success

    def get_extension(self, source):
        '''Extracts the file extension string from the source file

        @param source: string, file path of the file to determine
        @return string: file type extension of the source file,
            without the leading separator
        '''
        #if self.extension_separator not in source:
            #return None
        #return source.rsplit(self.extension_separator, 1)[1]
        # splitext() keeps the leading '.', strip it so the result
        # can be used as a key into the loaded definitions
        return os.path.splitext(source)[1].lstrip(os.extsep)

    def rsync(self, infodict=None, source=None, destination=None,
            mode=None, fatal=True):
        '''Convenience function. Performs an rsync transfer

        @param infodict: optional dictionary of the next 3 parameters.
        @param source: optional string, path to a directory
        @param destination: optional string, path to a directory
        @param mode: string, optional mode to use to (de)compress with
        @return boolean
        '''
        if not infodict:
            if not mode:
                mode = 'rsync'
            infodict = self.create_infodict(source, destination, mode=mode)
        # Call _common() directly: the "rsync" definition names this
        # method as its func, so going through _run() would recurse.
        return self._common(infodict, fatal)

    def _common(self, infodict, fatal=True):
        '''Internal function. Performs commonly supported
        compression or decompression.

        @param infodict: dictionary as returned by create_infodict()
        @param fatal: boolean, passed through to cmd()
        @return boolean
        '''
        if not infodict['mode'] or not self.is_supported(infodict['mode']):
            print("ERROR: CompressMap; %s mode: %s not correctly set!"
                % (self.loaded_type[0], infodict['mode']))
            return False

        #Avoid modifying the source dictionary
        cmdinfo = infodict.copy()

        cmdlist = self._map[cmdinfo['mode']]

        if cmdinfo['auto-ext']:
            cmdinfo['filename'] += self.extension_separator + \
                cmdlist.extension

        # Do the string substitution
        opts = ' '.join(cmdlist.args) % (cmdinfo)
        args = ' '.join([cmdlist.cmd, opts])
        return cmd(args, cmdlist.id, env=self.env, fatal=fatal)

    def create_infodict(self, source, destination=None, basedir=None,
            filename='', mode=None, auto_extension=False):
        '''Puts the source and destination paths into a dictionary
        for use in string substitution in the definitions
        %(source) and %(destination) fields embedded into the commands

        @param source: string, path to a directory
        @param destination: string, path to a directory
        @param filename: optional string
        @return dict:
        '''
        return {
            'source': source,
            'destination': destination,
            'basedir': basedir,
            'filename': filename,
            'mode': mode or self.mode,
            'auto-ext': auto_extension,
            }

    def is_supported(self, mode):
        '''Truth function to test the mode desired is supported
        in the definitions loaded

        @param mode: string, mode to use to (de)compress with
        @return boolean
        '''
        return mode in list(self._map)

    @property
    def available_modes(self):
        '''Convenience function to return the available modes'''
        return list(self._map)

    def best_mode(self, prefered_mode, source):
        '''Compare the prefered_mode's extension with the source extension
        and returns the best choice

        @param prefered_mode: string
        @param source: string, path to the source file
        @return string: best mode to use for the extraction
        '''
        if source.endswith(self._map[prefered_mode].extension):
            return prefered_mode
        return self.get_extension(source)

    def extension(self, mode):
        '''Returns the predetermined extension auto-ext added
        to the filename for compression.

        @param mode: string
        @return string
        '''
        if self.is_supported(mode):
            return self._map[mode].extension
        return ''

--=-ytCbHuKg4k54YqBPqc7S--