From: "Martin Mokrejs"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Martin Mokrejs"
Message-ID: <1395589938.b6bc96d05888bcfb2a2ebac3a477663d1915c57c.mmokrejs@gentoo>
Subject: [gentoo-commits] proj/sci:master commit in: sci-biology/biopython/, sci-biology/biopython/files/
X-VCS-Repository: proj/sci
X-VCS-Files: sci-biology/biopython/ChangeLog sci-biology/biopython/biopython-1.62-r3.ebuild sci-biology/biopython/biopython-1.62-r4.ebuild sci-biology/biopython/biopython-1.63-r1.ebuild sci-biology/biopython/biopython-1.63.ebuild sci-biology/biopython/files/SeqRecord.py.patch sci-biology/biopython/files/SffIO_error_in_check_eof.patch sci-biology/biopython/files/adjust-trimpoints.patch sci-biology/biopython/files/biopython-1.51-flex.patch sci-biology/biopython/files/biopython-1.62-SffIO.patch sci-biology/biopython/metadata.xml
X-VCS-Directories: sci-biology/biopython/ sci-biology/biopython/files/
X-VCS-Committer: mmokrejs
X-VCS-Committer-Name: Martin Mokrejs
X-VCS-Revision: b6bc96d05888bcfb2a2ebac3a477663d1915c57c
X-VCS-Branch: master
Date: Sun, 23 Mar 2014 16:00:00 +0000 (UTC)
Precedence: bulk
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-commits@lists.gentoo.org
X-Archives-Salt: ef1a59b8-39eb-49b8-b8f6-85e72dc4cda8
X-Archives-Hash: 929e06b8ba2d350e036eaafd25eb3086

commit:     b6bc96d05888bcfb2a2ebac3a477663d1915c57c
Author:     Martin Mokrejš fold natur cuni cz>
AuthorDate: Sun Mar 23 15:52:18 2014 +0000
Commit:     Martin Mokrejs fold natur cuni cz>
CommitDate: Sun Mar 23 15:52:18 2014 +0000
URL:        http://git.overlays.gentoo.org/gitweb/?p=proj/sci.git;a=commit;h=b6bc96d0

sci-biology/biopython-1.63-r1: version bump and an upstream patch to improve error message

Package-Manager: portage-2.2.7

---
 sci-biology/biopython/ChangeLog                    |  17 +++
 sci-biology/biopython/biopython-1.62-r3.ebuild     |  51 +++++++
 sci-biology/biopython/biopython-1.62-r4.ebuild     |  51 +++++++
 sci-biology/biopython/biopython-1.63-r1.ebuild     |  51 +++++++
 sci-biology/biopython/biopython-1.63.ebuild        |  50 +++++++
 sci-biology/biopython/files/SeqRecord.py.patch     | 148 +++++++++++++++++++++
 .../biopython/files/SffIO_error_in_check_eof.patch |  14 ++
 .../biopython/files/adjust-trimpoints.patch        |  76 +++++++++++
 .../biopython/files/biopython-1.51-flex.patch      |  21 +++
 .../biopython/files/biopython-1.62-SffIO.patch     |  36 +++++
 sci-biology/biopython/metadata.xml                 |   5 +
 11 files changed, 520 insertions(+)

diff --git a/sci-biology/biopython/ChangeLog b/sci-biology/biopython/ChangeLog
new file mode 100644
index 0000000..6dfe5e2
--- /dev/null
+++ b/sci-biology/biopython/ChangeLog
@@ -0,0 +1,17 @@
+# ChangeLog for sci-biology/biopython
+# Copyright 1999-2014 Gentoo Foundation; Distributed under the GPL v2
+# $Header: $
+
+*biopython-1.62-r3 (23 Mar 2014)
+*biopython-1.62-r4 (23 Mar 2014)
+*biopython-1.63-r1 (23 Mar 2014)
+*biopython-1.63 (23 Mar 2014)
+
+  23 Mar 2014; Martin Mokrejs
+  +biopython-1.62-r3.ebuild, +biopython-1.62-r4.ebuild,
+  +biopython-1.63-r1.ebuild, +biopython-1.63.ebuild, +files/SeqRecord.py.patch,
+  +files/SffIO_error_in_check_eof.patch, +files/adjust-trimpoints.patch,
+  +files/biopython-1.51-flex.patch, +files/biopython-1.62-SffIO.patch,
+  +metadata.xml:
+  sci-biology/biopython-1.63-r1: version bump and an upstream patch to improve
+  error message

diff --git a/sci-biology/biopython/biopython-1.62-r3.ebuild b/sci-biology/biopython/biopython-1.62-r3.ebuild
new file mode 100644
index 0000000..1eed5a9
--- /dev/null
+++ b/sci-biology/biopython/biopython-1.62-r3.ebuild
@@ -0,0 +1,51 @@
+# Copyright 1999-2014 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+# $Header: /var/cvsroot/gentoo-x86/sci-biology/biopython/biopython-1.62.ebuild,v 1.1 2013/09/17 16:07:56 jlec Exp $
+
+EAPI=5
+
+PYTHON_COMPAT=( python{2_6,2_7} )
+
+inherit distutils-r1 eutils
+
+DESCRIPTION="Python modules for computational molecular biology"
+HOMEPAGE="http://www.biopython.org/ http://pypi.python.org/pypi/biopython/"
+SRC_URI="http://www.biopython.org/DIST/${P}.tar.gz"
+
+LICENSE="HPND"
+SLOT="0"
+KEYWORDS="~amd64 ~ppc ~x86 ~amd64-linux ~x86-linux"
+IUSE="mysql postgres"
+
+REQUIRED_USE="${PYTHON_REQUIRED_USE}"
+
+RDEPEND="${PYTHON_DEPS}
+	dev-python/matplotlib[${PYTHON_USEDEP}]
+	dev-python/networkx[${PYTHON_USEDEP}]
+	dev-python/numpy[${PYTHON_USEDEP}]
+	dev-python/pygraphviz[${PYTHON_USEDEP}]
+	dev-python/reportlab[${PYTHON_USEDEP}]
+	media-gfx/pydot[${PYTHON_USEDEP}]
+	mysql? ( dev-python/mysql-python[${PYTHON_USEDEP}] )
+	postgres? ( dev-python/psycopg[${PYTHON_USEDEP}] )"
+DEPEND="${RDEPEND}
+	sys-devel/flex"
+
+DOCS=( CONTRIB DEPRECATED NEWS README Doc/. )
+
+src_prepare() {
+	distutils-r1_src_prepare
+	epatch "${FILESDIR}/${PN}-1.62-SffIO.patch"
+}
+
+python_test() {
+	cd Tests || die
+	${PYTHON} run_tests.py || die
+}
+
+python_install_all() {
+	distutils-r1_python_install_all
+
+	dodir /usr/share/${PN}
+	cp -r --preserve=mode Scripts Tests "${ED}"/usr/share/${PN} || die
+}

diff --git a/sci-biology/biopython/biopython-1.62-r4.ebuild b/sci-biology/biopython/biopython-1.62-r4.ebuild
new file mode 100644
index 0000000..1eed5a9
--- /dev/null
+++ b/sci-biology/biopython/biopython-1.62-r4.ebuild
@@ -0,0 +1,51 @@
+# Copyright 1999-2014 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+# $Header: /var/cvsroot/gentoo-x86/sci-biology/biopython/biopython-1.62.ebuild,v 1.1 2013/09/17 16:07:56 jlec Exp $
+
+EAPI=5
+
+PYTHON_COMPAT=( python{2_6,2_7} )
+
+inherit distutils-r1 eutils
+
+DESCRIPTION="Python modules for computational molecular biology"
+HOMEPAGE="http://www.biopython.org/ http://pypi.python.org/pypi/biopython/"
+SRC_URI="http://www.biopython.org/DIST/${P}.tar.gz"
+
+LICENSE="HPND"
+SLOT="0"
+KEYWORDS="~amd64 ~ppc ~x86 ~amd64-linux ~x86-linux"
+IUSE="mysql postgres"
+
+REQUIRED_USE="${PYTHON_REQUIRED_USE}"
+
+RDEPEND="${PYTHON_DEPS}
+	dev-python/matplotlib[${PYTHON_USEDEP}]
+	dev-python/networkx[${PYTHON_USEDEP}]
+	dev-python/numpy[${PYTHON_USEDEP}]
+	dev-python/pygraphviz[${PYTHON_USEDEP}]
+	dev-python/reportlab[${PYTHON_USEDEP}]
+	media-gfx/pydot[${PYTHON_USEDEP}]
+	mysql? ( dev-python/mysql-python[${PYTHON_USEDEP}] )
+	postgres? ( dev-python/psycopg[${PYTHON_USEDEP}] )"
+DEPEND="${RDEPEND}
+	sys-devel/flex"
+
+DOCS=( CONTRIB DEPRECATED NEWS README Doc/. )
+
+src_prepare() {
+	distutils-r1_src_prepare
+	epatch "${FILESDIR}/${PN}-1.62-SffIO.patch"
+}
+
+python_test() {
+	cd Tests || die
+	${PYTHON} run_tests.py || die
+}
+
+python_install_all() {
+	distutils-r1_python_install_all
+
+	dodir /usr/share/${PN}
+	cp -r --preserve=mode Scripts Tests "${ED}"/usr/share/${PN} || die
+}

diff --git a/sci-biology/biopython/biopython-1.63-r1.ebuild b/sci-biology/biopython/biopython-1.63-r1.ebuild
new file mode 100644
index 0000000..aac2bdf
--- /dev/null
+++ b/sci-biology/biopython/biopython-1.63-r1.ebuild
@@ -0,0 +1,51 @@
+# Copyright 1999-2014 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+# $Header: /var/cvsroot/gentoo-x86/sci-biology/biopython/biopython-1.62.ebuild,v 1.1 2013/09/17 16:07:56 jlec Exp $
+
+EAPI=5
+
+PYTHON_COMPAT=( python{2_6,2_7} )
+
+inherit distutils-r1 eutils
+
+DESCRIPTION="Python modules for computational molecular biology"
+HOMEPAGE="http://www.biopython.org/ http://pypi.python.org/pypi/biopython/"
+SRC_URI="http://www.biopython.org/DIST/${P}.tar.gz"
+
+LICENSE="HPND"
+SLOT="0"
+KEYWORDS="~amd64 ~ppc ~x86 ~amd64-linux ~x86-linux"
+IUSE="mysql postgres"
+
+REQUIRED_USE="${PYTHON_REQUIRED_USE}"
+
+RDEPEND="${PYTHON_DEPS}
+	dev-python/matplotlib[${PYTHON_USEDEP}]
+	dev-python/networkx[${PYTHON_USEDEP}]
+	dev-python/numpy[${PYTHON_USEDEP}]
+	dev-python/pygraphviz[${PYTHON_USEDEP}]
+	dev-python/reportlab[${PYTHON_USEDEP}]
+	media-gfx/pydot[${PYTHON_USEDEP}]
+	mysql? ( dev-python/mysql-python[${PYTHON_USEDEP}] )
+	postgres? ( dev-python/psycopg[${PYTHON_USEDEP}] )"
+DEPEND="${RDEPEND}
+	sys-devel/flex"
+
+DOCS=( CONTRIB DEPRECATED NEWS README Doc/. )
+
+src_prepare() {
+	epatch "${FILESDIR}"/SffIO_error_in_check_eof.patch
+	distutils-r1_src_prepare
+}
+
+python_test() {
+	cd Tests || die
+	${PYTHON} run_tests.py || die
+}
+
+python_install_all() {
+	distutils-r1_python_install_all
+
+	dodir /usr/share/${PN}
+	cp -r --preserve=mode Scripts Tests "${ED}"/usr/share/${PN} || die
+}

diff --git a/sci-biology/biopython/biopython-1.63.ebuild b/sci-biology/biopython/biopython-1.63.ebuild
new file mode 100644
index 0000000..5180b33
--- /dev/null
+++ b/sci-biology/biopython/biopython-1.63.ebuild
@@ -0,0 +1,50 @@
+# Copyright 1999-2014 Gentoo Foundation
+# Distributed under the terms of the GNU General Public License v2
+# $Header: /var/cvsroot/gentoo-x86/sci-biology/biopython/biopython-1.62.ebuild,v 1.1 2013/09/17 16:07:56 jlec Exp $
+
+EAPI=5
+
+PYTHON_COMPAT=( python{2_6,2_7} )
+
+inherit distutils-r1 eutils
+
+DESCRIPTION="Python modules for computational molecular biology"
+HOMEPAGE="http://www.biopython.org/ http://pypi.python.org/pypi/biopython/"
+SRC_URI="http://www.biopython.org/DIST/${P}.tar.gz"
+
+LICENSE="HPND"
+SLOT="0"
+KEYWORDS="~amd64 ~ppc ~x86 ~amd64-linux ~x86-linux"
+IUSE="mysql postgres"
+
+REQUIRED_USE="${PYTHON_REQUIRED_USE}"
+
+RDEPEND="${PYTHON_DEPS}
+	dev-python/matplotlib[${PYTHON_USEDEP}]
+	dev-python/networkx[${PYTHON_USEDEP}]
+	dev-python/numpy[${PYTHON_USEDEP}]
+	dev-python/pygraphviz[${PYTHON_USEDEP}]
+	dev-python/reportlab[${PYTHON_USEDEP}]
+	media-gfx/pydot[${PYTHON_USEDEP}]
+	mysql? ( dev-python/mysql-python[${PYTHON_USEDEP}] )
+	postgres? ( dev-python/psycopg[${PYTHON_USEDEP}] )"
+DEPEND="${RDEPEND}
+	sys-devel/flex"
+
+DOCS=( CONTRIB DEPRECATED NEWS README Doc/. )
+
+src_prepare() {
+	distutils-r1_src_prepare
+}
+
+python_test() {
+	cd Tests || die
+	${PYTHON} run_tests.py || die
+}
+
+python_install_all() {
+	distutils-r1_python_install_all
+
+	dodir /usr/share/${PN}
+	cp -r --preserve=mode Scripts Tests "${ED}"/usr/share/${PN} || die
+}

diff --git a/sci-biology/biopython/files/SeqRecord.py.patch b/sci-biology/biopython/files/SeqRecord.py.patch
new file mode 100644
index 0000000..ac3785f
--- /dev/null
+++ b/sci-biology/biopython/files/SeqRecord.py.patch
@@ -0,0 +1,148 @@
+diff --git a/Bio/SeqIO/SffIO.py b/Bio/SeqIO/SffIO.py
+index 1971dba..43b38fd 100644
+--- a/Bio/SeqIO/SffIO.py
++++ b/Bio/SeqIO/SffIO.py
+@@ -539,8 +539,15 @@ _valid_UAN_read_name = re.compile(r'^[a-zA-Z0-9]{14}$')
+ 
+ 
+ def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
+-                         key_sequence, alphabet, trim=False):
+-    """Parse the next read in the file, return data as a SeqRecord (PRIVATE)."""
++                         key_sequence, alphabet, trim=False, interpret_qual_trims=True, interpret_adapter_trims=False):
++    """Parse the next read in the file, return data as a SeqRecord (PRIVATE).
++    Allow user to specify which type of clipping values should be applied
++    while reading the SFF stream. To be backwards compatible, we interpret
++    only the quality-based trim points by default. That results in lower-cased
++    sequences in the low-qual region, regardless what adapter-based clip points
++    say. This should be the desired behavior. More discussion at
++    https://redmine.open-bio.org/issues/3437
++    """
+     #Now on to the reads...
+     #the read header format (fixed part):
+     #read_header_length H
+@@ -589,20 +596,41 @@ def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
+             warnings.warn("Post quality %i byte padding region contained data, SFF data is not broken"
+                           % padding)
+     #Follow Roche and apply most aggressive of qual and adapter clipping.
+-    #Note Roche seems to ignore adapter clip fields when writing SFF,
+-    #and uses just the quality clipping values for any clipping.
+-    clip_left = max(clip_qual_left, clip_adapter_left)
+-    #Right clipping of zero means no clipping
+-    if clip_qual_right:
+-        if clip_adapter_right:
+-            clip_right = min(clip_qual_right, clip_adapter_right)
++    #Note Roche does not use adapter clip fields when writing SFF files
++    #but instead combines the adapter clipping information with quality-based
++    #values and writes the most aggressive combination into clip fields (as
++    #allowed by SFF specs).
++
++    if interpret_qual_trims:
++        if interpret_adapter_trims:
++            clip_left = max(clip_qual_left, clip_adapter_left)
++            #Right clipping of zero means no clipping
++            if clip_qual_right:
++                if clip_adapter_right:
++                    clip_right = min(clip_qual_right, clip_adapter_right)
++                else:
++                    #Typical case with Roche SFF files
++                    clip_right = clip_qual_right
++            elif clip_adapter_right:
++                clip_right = clip_adapter_right
++            else:
++                clip_right = seq_len
+         else:
+-            #Typical case with Roche SFF files
+-            clip_right = clip_qual_right
+-    elif clip_adapter_right:
+-        clip_right = clip_adapter_right
++            clip_left = clip_qual_left
++            if clip_qual_right:
++                clip_right = clip_qual_right
++            else:
++                clip_right = seq_len
++    elif interpret_adapter_trims:
++        clip_left = clip_adapter_left
++        if clip_adapter_right:
++            clip_right = clip_adapter_right
++        else:
++            clip_right = seq_len
+     else:
+-        clip_right = seq_len
++        clip_left = 0
++        clip_right = seq_len
++
+     #Now build a SeqRecord
+     if trim:
+         seq = seq[clip_left:clip_right].upper()
+diff --git a/Bio/SeqRecord.py b/Bio/SeqRecord.py
+index c90e13b..66bdea0 100644
+--- a/Bio/SeqRecord.py
++++ b/Bio/SeqRecord.py
+@@ -14,6 +14,8 @@ __docformat__ = "epytext en" # Simple markup to show doctests nicely
+ # also BioSQL.BioSeq.DBSeq which is the "Database Seq" class)
+ 
+ 
++from Bio.Seq import Seq
++
+ class _RestrictedDict(dict):
+     """Dict which only allows sequences of given length as values (PRIVATE).
+ 
+@@ -76,7 +78,7 @@ class _RestrictedDict(dict):
+         if not hasattr(value, "__len__") or not hasattr(value, "__getitem__") \
+                 or (hasattr(self, "_length") and len(value) != self._length):
+             raise TypeError("We only allow python sequences (lists, tuples or "
+-                            "strings) of length %i." % self._length)
++                            "strings) of length %i whereas you passed an object of length %s." % (self._length, str(len(value))))
+         dict.__setitem__(self, key, value)
+ 
+     def update(self, new_dict):
+@@ -290,10 +292,11 @@ class SeqRecord(object):
+         """)
+ 
+     def _set_seq(self, value):
+-        #TODO - Add a deprecation warning that the seq should be write only?
+-        if self._per_letter_annotations:
+-            #TODO - Make this a warning? Silently empty the dictionary?
+-            raise ValueError("You must empty the letter annotations first!")
++        # we should be much more user friendly and accept even a plain sequence string
++        # and make the Seq or MutableSeq object ourselves
++        if not isinstance(value, Seq):
++            raise ValueError("You must pass a Seq object containing the new sequence instead of just plain string.")
++        else:
+         self._seq = value
+         try:
+             self._per_letter_annotations = _RestrictedDict(length=len(self.seq))
+@@ -696,7 +699,7 @@ class SeqRecord(object):
+         SeqIO.write(self, handle, format_spec)
+         return handle.getvalue()
+ 
+-    def __len__(self):
++    def __len__(self, trim=False, interpret_qual_trims=True, interpret_adapter_trims=False):
+         """Returns the length of the sequence.
+ 
+         For example, using Bio.SeqIO to read in a FASTA nucleotide file:
+@@ -707,6 +710,10 @@ class SeqRecord(object):
+         309
+         >>> len(record.seq)
+         309
++
++        It should be possible to get length of a raw object, of trimmed
++        object by quality or adapter criteria or both, whenever user wants
++        to, not only when data is parsed from input.
+         """
+         return len(self.seq)
+ 
+@@ -725,6 +732,13 @@ class SeqRecord(object):
+         """
+         return True
+ 
++    def apply_trimpoints(self, trim=False, interpret_qual_trims=False, interpret_adapter_trims=False):
++        """We should apply either of the quality-based or adapter-based annotated
++        trim points and return a new, sliced object.
++        """
++        pass
++
++
+     def __add__(self, other):
+         """Add another sequence or string to this sequence.
+ 

diff --git a/sci-biology/biopython/files/SffIO_error_in_check_eof.patch b/sci-biology/biopython/files/SffIO_error_in_check_eof.patch
new file mode 100644
index 0000000..9059604
--- /dev/null
+++ b/sci-biology/biopython/files/SffIO_error_in_check_eof.patch
@@ -0,0 +1,14 @@
+diff --git a/Bio/SeqIO/SffIO.py b/Bio/SeqIO/SffIO.py
+index 2bb0dac..735d55b 100644
+--- a/Bio/SeqIO/SffIO.py
++++ b/Bio/SeqIO/SffIO.py
+@@ -941,7 +941,8 @@ def _check_eof(handle, index_offset, index_length):
+                       BiopythonParserWarning)
+ 
+     offset = handle.tell()
+-    assert offset % 8 == 0
++    assert offset % 8 == 0, \
++        "Wanted offset %i %% 8 = %i to be zero" % (offset, offset % 8)
+     # Should now be at the end of the file...
+     extra = handle.read(4)
+     if extra == _sff:

diff --git a/sci-biology/biopython/files/adjust-trimpoints.patch b/sci-biology/biopython/files/adjust-trimpoints.patch
new file mode 100644
index 0000000..dd6d548
--- /dev/null
+++ b/sci-biology/biopython/files/adjust-trimpoints.patch
@@ -0,0 +1,76 @@
+diff --git a/Bio/SeqIO/SffIO.py b/Bio/SeqIO/SffIO.py
+index 1971dba..43b38fd 100644
+--- a/Bio/SeqIO/SffIO.py
++++ b/Bio/SeqIO/SffIO.py
+@@ -539,8 +539,15 @@ _valid_UAN_read_name = re.compile(r'^[a-zA-Z0-9]{14}$')
+ 
+ 
+ def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
+-                         key_sequence, alphabet, trim=False):
+-    """Parse the next read in the file, return data as a SeqRecord (PRIVATE)."""
++                         key_sequence, alphabet, trim=False, interpret_qual_trims=True, interpret_adapter_trims=False):
++    """Parse the next read in the file, return data as a SeqRecord (PRIVATE).
++    Allow user to specify which type of clipping values should be applied
++    while reading the SFF stream. To be backwards compatible, we interpret
++    only the quality-based trim points by default. That results in lower-cased
++    sequences in the low-qual region, regardless what adapter-based clip points
++    say. This should be the desired behavior. More discussion at
++    https://redmine.open-bio.org/issues/3437
++    """
+     #Now on to the reads...
+     #the read header format (fixed part):
+     #read_header_length H
+@@ -589,20 +596,41 @@ def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
+             warnings.warn("Post quality %i byte padding region contained data, SFF data is not broken"
+                           % padding)
+     #Follow Roche and apply most aggressive of qual and adapter clipping.
+-    #Note Roche seems to ignore adapter clip fields when writing SFF,
+-    #and uses just the quality clipping values for any clipping.
+-    clip_left = max(clip_qual_left, clip_adapter_left)
+-    #Right clipping of zero means no clipping
+-    if clip_qual_right:
+-        if clip_adapter_right:
+-            clip_right = min(clip_qual_right, clip_adapter_right)
++    #Note Roche does not use adapter clip fields when writing SFF files
++    #but instead combines the adapter clipping information with quality-based
++    #values and writes the most aggressive combination into clip fields (as
++    #allowed by SFF specs).
++
++    if interpret_qual_trims:
++        if interpret_adapter_trims:
++            clip_left = max(clip_qual_left, clip_adapter_left)
++            #Right clipping of zero means no clipping
++            if clip_qual_right:
++                if clip_adapter_right:
++                    clip_right = min(clip_qual_right, clip_adapter_right)
++                else:
++                    #Typical case with Roche SFF files
++                    clip_right = clip_qual_right
++            elif clip_adapter_right:
++                clip_right = clip_adapter_right
++            else:
++                clip_right = seq_len
+         else:
+-            #Typical case with Roche SFF files
+-            clip_right = clip_qual_right
+-    elif clip_adapter_right:
+-        clip_right = clip_adapter_right
++            clip_left = clip_qual_left
++            if clip_qual_right:
++                clip_right = clip_qual_right
++            else:
++                clip_right = seq_len
++    elif interpret_adapter_trims:
++        clip_left = clip_adapter_left
++        if clip_adapter_right:
++            clip_right = clip_adapter_right
++        else:
++            clip_right = seq_len
+     else:
+-        clip_right = seq_len
++        clip_left = 0
++        clip_right = seq_len
++
+     #Now build a SeqRecord
+     if trim:
+         seq = seq[clip_left:clip_right].upper()

diff --git a/sci-biology/biopython/files/biopython-1.51-flex.patch b/sci-biology/biopython/files/biopython-1.51-flex.patch
new file mode 100644
index 0000000..afd5094
--- /dev/null
+++ b/sci-biology/biopython/files/biopython-1.51-flex.patch
@@ -0,0 +1,21 @@
+--- setup.py.old	2008-11-25 18:03:16.000000000 +0100
++++ setup.py	2008-11-25 18:04:14.000000000 +0100
+@@ -341,12 +341,12 @@
+               include_dirs=["Bio"]
+               ),
+     #Commented out due to the build dependency on flex, see Bug 2619
+-#    Extension('Bio.PDB.mmCIF.MMCIFlex',
+-#              ['Bio/PDB/mmCIF/lex.yy.c',
+-#               'Bio/PDB/mmCIF/MMCIFlexmodule.c'],
+-#              include_dirs=["Bio"],
+-#              libraries=["fl"]
+-#              ),
++    Extension('Bio.PDB.mmCIF.MMCIFlex',
++              ['Bio/PDB/mmCIF/lex.yy.c',
++               'Bio/PDB/mmCIF/MMCIFlexmodule.c'],
++              include_dirs=["Bio"],
++              libraries=["fl"]
++              ),
+     Extension('Bio.Nexus.cnexus',
+               ['Bio/Nexus/cnexus.c']
+               ),

diff --git a/sci-biology/biopython/files/biopython-1.62-SffIO.patch b/sci-biology/biopython/files/biopython-1.62-SffIO.patch
new file mode 100644
index 0000000..7f2208e
--- /dev/null
+++ b/sci-biology/biopython/files/biopython-1.62-SffIO.patch
@@ -0,0 +1,36 @@
+--- Bio/SeqIO/SffIO.py.ori	2013-09-25 13:28:51.000000000 +0200
++++ Bio/SeqIO/SffIO.py	2013-09-25 13:37:44.000000000 +0200
+@@ -383,7 +383,14 @@
+         if padding:
+             padding = 8 - padding
+             if handle.read(padding).count(_null) != padding:
+-                raise ValueError("Post quality %i byte padding region contained data"
++                import warnings
++                from Bio import BiopythonParserWarning
++                warnings.warn("Your SFF file is valid but post quality %i byte "
++                              "padding region contains UNUSED data. Was the "
++                              "SFF file created by SRA sff-dump >2.1.7 and <2.1.10? "
++                              "It did not clear some internal buffer while writing "
++                              "out new data so that previous values remained in the"
++                              "output unless overwritten by new real values."
+                                  % padding)
+         #print read, name, record_offset
+         yield name, record_offset
+--- Bio/SeqIO/SffIO.py.ori	2013-09-25 14:07:14.000000000 +0200
++++ Bio/SeqIO/SffIO.py	2013-09-25 14:08:59.000000000 +0200
+@@ -596,7 +596,14 @@
+     if padding:
+         padding = 8 - padding
+         if handle.read(padding).count(_null) != padding:
+-            raise ValueError("Post quality %i byte padding region contained data"
++            import warnings
++            from Bio import BiopythonParserWarning
++            warnings.warn("Your SFF file is valid but post quality %i byte "
++                          "padding region contains UNUSED data. Was the "
++                          "SFF file created by SRA sff-dump >2.1.7 and <2.1.10? "
++                          "It did not clear some internal buffer while writing "
++                          "out new data so that previous values remained in the"
++                          "output unless overwritten by new real values."
+                              % padding)
+     #Follow Roche and apply most aggressive of qual and adapter clipping.
+     #Note Roche seems to ignore adapter clip fields when writing SFF,

diff --git a/sci-biology/biopython/metadata.xml b/sci-biology/biopython/metadata.xml
new file mode 100644
index 0000000..f17a827
--- /dev/null
+++ b/sci-biology/biopython/metadata.xml
@@ -0,0 +1,5 @@
+
+
+
+ sci-biology
+