public inbox for gentoo-dev@lists.gentoo.org
From: Brian Dolbec <dolsen@gentoo.org>
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [RFC] unpacker.eclass extensions
Date: Mon, 17 Jun 2013 14:41:00 -0700
Message-ID: <1371505260.28535.131.camel@big_daddy.dol-sen.ca>
In-Reply-To: <51BF7372.8040608@gentoo.org>

[-- Attachment #1: Type: text/plain, Size: 3279 bytes --]

On Mon, 2013-06-17 at 16:37 -0400, Rick "Zero_Chaos" Farina wrote:
> On 06/17/2013 04:19 PM, Diego Elio Pettenò wrote:
> > On 17/06/2013 17:54, Rick "Zero_Chaos" Farina wrote:
> >> I make all my files with "tar cJf"
> >>
> >> zero@ozzie ~ % file /usr/portage/distfiles/gr-osmosdr-0.0.2.tar.xz
> >> /usr/portage/distfiles/gr-osmosdr-0.0.2.tar.xz: XZ compressed data
> > 
> > cJ with _current_ tar will generate XZ
> > cJ with _past_ tar could generate lzma
> > xJ with _current_ tar will extract both XZ and lzma
> > zJ with _past_ tar will only extract lzma
> > 
> > 
> tar for the last several years properly and automatically detects the
> format of the input using just 'x'.  This is true for gnu and bsd tar
> which should cover all of gentoo.  We should be passing "tar xf" to
> extract things unless we want to hijack the decompression command like
> is possible in portage with BZIP2_COMMAND=lbzip2 in which case we want
> to pass "tar -I $BZIP2_COMMAND -xf" or (not yet in portage) "tar -I pixz
> -xf".  Some good examples of how this are handled are in catalyst but
> I'll let dolson talk about his own work as he is much more qualified
> than I am.
> 
> -Zero

Well, I don't code bash, but...

In catalyst, I had a request to add more types of
compression/decompression capabilities.  I am also trying to clean
up the messy, poorly structured code.  What I have come up with is a
separate class that can be loaded with either compression or
decompression commands, which can be configured, overridden, or added to.
It will provide a python interface to perform the requested actions,
keeping the detailed coding central and internal.  It will automatically
detect the command to use from the file's extension: the pre-configured
preference if that is compatible with the extension, otherwise a
suitable command determined by the extension itself.  There can also be
several predefined ways to do the same action, in which case the
pre-configured preference takes priority.
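A minimal sketch of that selection rule, not the code from the attached
file: the MODES table and the pick_mode() helper are hypothetical names
made up for illustration, and the table order stands in for "first
suitable command".

```python
import os

# Hypothetical (mode, extension) pairs, listed in fallback order.
# Several modes may handle the same extension.
MODES = [
    ("xz", "xz"),
    ("pixz", "xz"),
    ("bz2", "bz2"),
    ("lbzip2", "bz2"),
    ("tar", "tar"),
]


def pick_mode(source, preferred=None):
    """Return the preferred mode if it can handle the file's extension,
    otherwise the first listed mode that can, or None if nothing fits."""
    ext = os.path.splitext(source)[1].lstrip(".")
    candidates = [mode for mode, mode_ext in MODES if mode_ext == ext]
    if preferred in candidates:
        return preferred
    return candidates[0] if candidates else None
```

With this table, pick_mode("stage3.tar.xz", preferred="pixz") keeps the
configured preference, while an incompatible preference such as "lbzip2"
falls back to the first mode registered for the "xz" extension.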

I hadn't gotten into reading this thread much, but did note the
similarity of what I was currently doing and what it was discussing.

What is directly relevant to the tar commands is listed in the
compression_definitions and decompression_definitions at the top of the
file attached.
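
As a rough illustration of how one of those definition entries becomes a
tar command line, here is a small standalone sketch.  The field names and
the "xz" entry are copied from the attached file; the Definition tuple
name, the build_command() helper and the example paths are assumptions
for illustration only.

```python
from collections import namedtuple

# Same field order as definition_fields in the attached file.
Definition = namedtuple("Definition", ["func", "cmd", "args", "id", "extension"])

# The "xz" decompression entry from the attached definitions.
xz_extract = Definition(
    "_common", "tar",
    ["-xpf", "%(source)s", "-C", "%(destination)s"],
    "XZ", "xz",
)


def build_command(definition, info):
    """Join the argument list and fill in the %(name)s placeholders."""
    opts = " ".join(definition.args) % info
    return " ".join([definition.cmd, opts])


print(build_command(xz_extract, {
    "source": "/tmp/stage3.tar.xz",       # example path, not from the thread
    "destination": "/mnt/gentoo",         # example path, not from the thread
}))
# prints: tar -xpf /tmp/stage3.tar.xz -C /mnt/gentoo
```

Because the command is assembled as a single space-joined string, paths
containing spaces would need quoting before being handed to a shell.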

This python lib can easily be made into a standalone package with a cli
interface to perform many common actions without the need for additional
coding in eclasses, package managers, etc.  It could provide gentoo
with a common, central, easily extended and updated method of performing
normal compression/decompression.
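
No such cli exists yet, so the following is only a sketch of what a thin
wrapper might look like.  The program name, the option names, the
EXTRACT_ARGS table and the plan() helper are all hypothetical; it only
builds the command list and does not run anything.

```python
import argparse

# Hypothetical mode -> tar options table, for illustration only.
EXTRACT_ARGS = {
    "tar": ["-xpf"],
    "bz2": ["-xpf"],
    "xz": ["-xpf"],
    "pixz": ["-I", "pixz", "-xpf"],
}


def build_parser():
    parser = argparse.ArgumentParser(
        prog="decompress-sketch",
        description="Show the tar command that would extract an archive")
    parser.add_argument("source", help="archive to extract")
    parser.add_argument("destination", help="directory to extract into")
    parser.add_argument("--mode", default="auto",
        help="decompression mode, or 'auto' to use the file extension")
    return parser


def plan(argv):
    """Parse argv and return the command (as a list) that would be run."""
    opts = build_parser().parse_args(argv)
    mode = opts.mode
    if mode == "auto":
        # take whatever follows the last '.' as the mode name
        mode = opts.source.rsplit(".", 1)[-1]
    if mode not in EXTRACT_ARGS:
        raise SystemExit("unsupported mode: %s" % mode)
    return ["tar"] + EXTRACT_ARGS[mode] + [opts.source, "-C", opts.destination]


if __name__ == "__main__":
    import sys
    print(" ".join(plan(sys.argv[1:])))
```

Returning a list rather than a joined string keeps the sketch usable with
subprocess without going through a shell.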

It is a work in progress, with only preliminary testing done inside my
catalyst rewrite.  I have not yet added configuration options and
override preferences.

It is available in my catalyst git repo in my dev space, in the
compress branch.  Note there is no master branch there, so a basic clone
will error when it tries to check out a master working copy on completion
of the clone.  Just run "git checkout compress" afterwards.

http://dev.gentoo.org/~dolsen/catalyst/

I've attached the current compress.py file so you can peruse it and decide
whether making this a standalone gentoo project is desired.  If not, it will
get merged into the catalyst code base.
-- 
Brian Dolbec <dolsen@gentoo.org>

[-- Attachment #2: compress.py --]
[-- Type: text/x-python, Size: 10170 bytes --]


# Maintained in full by:
# Catalyst Team <catalyst@gentoo.org>
# Release Engineering Team <releng@gentoo.org>

'''
compress.py

Utility class to hold and handle all possible compression
and de-compression of files.  Including rsync transfers.

'''


import os
from collections import namedtuple

from support import cmd


definition_fields = ["func", "cmd", "args", "id", "extension"]
definition_types  = [ str,    str,   list,   str,  str]

'''The definition entries are to follow the definition_types
with the exception of the first entry "Type" which is a mode identifier
for use in the class as a type ID and printable string.'''
compress_definitions = {
	"Type": ["Compression", "Compression definitions loaded"],
	"rsync"		:["rsync", "rsync", ["-a", "--delete", "%(source)s",  "%(destination)s"], "RSYNC", None],
	"lbzip2"		:["_common", "tar", ["-I", "lbzip2", "-cf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "LBZIP2", "tbz2"],
	"tbz2"		:["_common", "tar", ["-I", "lbzip2", "-cf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "LBZIP2", "tbz2"],
	"bz2"		:["_common", "tar", ["-cpjf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "BZIP2", "tar.bz2"],
	"tar"		:["_common", "tar", ["-cpf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "TAR", "tar"],
	"xz"		:["_common", "tar", ["-cpJf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "XZ", "tar.xz"],
	"pixz"		:["_common", "tar", ["-I", "pixz", "-cpf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "PIXZ", "xz"],
	"zip"		:["_common", "tar", ["-cpzf", "%(filename)s", "-C", "%(basedir)s", "%(source)s"], "GZIP", "zip"],
	}


decompress_definitions = {
	"Type": ["Decompression", "Decompression definitions loaded"],
	"rsync"		:["rsync", "rsync", ["-a", "--delete", "%(source)s", "%(destination)s"], "RSYNC", None],
	"lbzip2"		:["_common", "tar", ["-I", "lbzip2", "-xpf", "%(source)s", "-C", "%(destination)s"], "LBZIP2", "bz2"],
	"tbz2"		:["_common", "tar", ["-I", "lbzip2", "-xpf", "%(source)s", "-C", "%(destination)s"], "LBZIP2", "tbz2"],
	"bz2"		:["_common", "tar", ["-xpf", "%(source)s", "-C", "%(destination)s"], "BZIP2", "bz2"],
	"tar"		:["_common", "tar", ["-xpf", "%(source)s", "-C", "%(destination)s"], "TAR", "tar"],
	"xz"		:["_common", "tar", ["-xpf", "%(source)s", "-C", "%(destination)s"], "XZ", "xz"],
	"pixz"		:["_common", "tar", ["-I", "pixz", "-xpf", "%(source)s", "-C", "%(destination)s"], "PIXZ", "xz"],
	"zip"		:["_common", "tar", ["-xpzf", "%(source)s", "-C", "%(destination)s"], "GZIP", "zip"],
	"gz"		:["_common", "tar", ["-xpzf", "%(source)s", "-C", "%(destination)s"], "GZIP", "zip"],
	}


'''Configure this here in case it is ever changed.
This is the only edit point required then.'''
extension_separator = '.'


class CompressMap(object):
	'''Class for handling
	Catalyst's compression & decompression of archives'''

	'''fields: list of ordered field names for the (de)compression functions'''
	fields = definition_fields[:]


	def __init__(self, definitions=None, env=None,
			default_mode=None, separator=extension_separator):
		'''Class init

		@param definitions: dictionary of
			Key:[function, cmd, cmd_args, Print/id string, extension]
			plus a "Type" entry naming the kind of definitions loaded
		@param env: environment to pass to the cmd subprocess
		@param default_mode: string, mode to use when none is passed in
		@param separator: string, filename extension separator
		'''
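		if definitions is None:
			definitions = {}
			self.loaded_type = ["None", "No definitions loaded"]
		else:
			# work on a copy so the caller's (module level) dict keeps
			# its "Type" entry and can be loaded again
			definitions = definitions.copy()
			self.loaded_type = definitions.pop('Type')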
		self.env = env or {}
		self.mode_error = self.loaded_type[0] + \
			" Error: No mode was passed in or automatically detected"
		self._map = {}
		self.extension_separator = separator
		# set some defaults depending on what is being loaded
		if self.loaded_type[0] in ['Compression']:
			self.mode = default_mode or 'tbz2'
			self.compress = self._compress
			self.extract = None
		else:
			self.mode = default_mode or 'auto'
			self.compress = None
			self.extract = self._extract

		# create the (de)compression definition namedtuple instances
		# (no "del obj" afterwards: it raises NameError when no
		# definitions were loaded and the loop never ran)
		for name in list(definitions):
			obj = namedtuple(name, self.fields)
			self._map[name] = obj._make(definitions[name])


	def _compress(self, infodict=None, filename='', source=None,
			basedir='.', mode=None, auto_extension=False, fatal=True):
		'''Compression function

		@param infodict: optional dictionary of the following parameters.
		@param filename: optional string, name of the file to make
		@param source: optional string, path to a directory
		@param basedir: optional string, directory tar changes to (-C)
		@param mode: string, optional mode to use to compress with
		@param auto_extension: boolean, append the mode's extension
			to filename
		@param fatal: boolean, passed through to cmd()
		@return boolean
		'''
		if not infodict:
			infodict = self.create_infodict(source, None,
				basedir, filename, mode or self.mode, auto_extension)
		if not infodict['mode']:
			print self.mode_error
			return False
		if auto_extension:
			infodict['auto-ext'] = True
		return self._run(infodict, fatal=fatal)


	def _extract(self, infodict=None, source=None, destination=None,
			mode=None, fatal=True):
		'''De-compression function

		@param infodict: optional dictionary of the next 3 parameters.
		@param source: optional string, path to the archive to extract
		@param destination: optional string, path to a directory
		@param mode: string, optional mode to use to decompress with
		@return boolean
		'''
		if self.loaded_type[0] not in ["Decompression"]:
			return False
		if not infodict:
			infodict = self.create_infodict(source, destination, mode=mode)
		if infodict['mode'] in [None]:
			infodict['mode'] = self.mode or 'auto'
		if infodict['mode'] in ['auto']:
			infodict['mode'] = self.get_extension(infodict['source'])
			if not infodict['mode']:
				print self.mode_error
				return False
		return self._run(infodict, fatal=fatal)


	def _run(self, infodict, fatal=True):
		'''Internal function that runs the designated function

		@param infodict: dictionary as returned by create_infodict()
		@param fatal: boolean, passed through to the designated function
		@return boolean
		'''
		if not self.is_supported(infodict['mode']):
			print "mode: %s is not supported in the current %s definitions" \
				% (infodict['mode'], self.loaded_type[1])
			return False
		try:
			func = getattr(self, self._map[infodict['mode']].func)
			success = func(infodict, fatal)
		except AttributeError:
			print "FAILED to find function '%s'" % str(self._map[infodict['mode']].func)
			return False
		#except Exception as e:
			#msg = "Error performing %s %s, " % (mode, self.loaded_type[0]) + \
				#"is the appropriate utility installed on your system?"
			#print msg
			#print "Exception:", e
			#return False
		return success


	def get_extension(self, source):
		'''Extracts the file extension string from the source file

		@param source: string, file path of the file to determine
		@return string: file type extension of the source file,
			without the leading separator ('' if there is none)
		'''
		# os.path.splitext() keeps the leading '.', but the definition
		# keys (e.g. "xz") do not have one, so strip it
		return os.path.splitext(source)[1].lstrip(os.extsep)


	def rsync(self, infodict=None, source=None, destination=None,
			mode=None, fatal=True):
		'''Convenience function. Performs an rsync transfer

		@param infodict: optional dictionary of the next 3 parameters.
		@param source: optional string, path to a directory
		@param destination: optional string, path to a directory
		@param mode: string, optional mode, defaults to 'rsync'
		@return boolean
		'''
		if not infodict:
			if not mode:
				mode = 'rsync'
			infodict = self.create_infodict(source, destination, mode=mode)
		return self._run(infodict, fatal=fatal)


	def _common(self, infodict, fatal=True):
		'''Internal function.  Performs commonly supported
		compression or decompression.

		@param infodict: dictionary as returned by create_infodict()
		@param fatal: boolean, passed through to cmd()
		@return boolean
		'''
		if not infodict['mode'] or not self.is_supported(infodict['mode']):
			print "ERROR: CompressMap; %s mode: %s not correctly set!" \
				% (self.loaded_type[0], infodict['mode'])
			return False

		#Avoid modifying the source dictionary
		cmdinfo = infodict.copy()

		cmdlist = self._map[cmdinfo['mode']]

		if cmdinfo['auto-ext']:
			cmdinfo['filename'] += self.extension_separator + \
				cmdlist.extension

		# Do the string substitution
		opts = ' '.join(cmdlist.args) %(cmdinfo)
		args = ' '.join([cmdlist.cmd, opts])

		return cmd(args, cmdlist.id, env=self.env, fatal=fatal)


	def create_infodict(self, source, destination=None, basedir=None,
			filename='', mode=None, auto_extension=False):
		'''Puts the source and destination paths into a dictionary
		for use in string substitution in the definitions
		%(source)s and %(destination)s fields embedded into the commands

		@param source: string, path to a directory
		@param destination: string, path to a directory
		@param basedir: optional string
		@param filename: optional string
		@param mode: optional string, defaults to self.mode
		@param auto_extension: optional boolean
		@return dict:
		'''
		return {
			'source': source,
			'destination': destination,
			'basedir': basedir,
			'filename': filename,
			'mode': mode or self.mode,
			'auto-ext': auto_extension,
			}


	def is_supported(self, mode):
		'''Truth function to test the mode desired is supported
		in the definitions loaded

		@param mode: string, mode to use to (de)compress with
		@return boolean
		'''
		return mode in self._map


	@property
	def available_modes(self):
		'''Convenience function to return the available modes'''
		return list(self._map)


	def best_mode(self, prefered_mode, source):
		'''Compare the prefered_mode's extension with the source extension
		and returns the best choice

		@param prefered_mode: string
		@param source: string, path to the source file
		@return string: best mode to use for the extraction
		'''
		if source.endswith(self._map[prefered_mode].extension):
			return prefered_mode
		return self.get_extension(source)


	def extension(self, mode):
		'''Returns the predetermined extension auto-ext added
		to the filename for compression.

		@param mode: string
		@return string
		'''
		if self.is_supported(mode):
			return self._map[mode].extension
		return ''
