From: "Martin Mokrejs" <mmokrejs@fold.natur.cuni.cz>
To: gentoo-commits@lists.gentoo.org
Subject: [gentoo-commits] proj/sci:master commit in: sci-biology/biopython/, sci-biology/biopython/files/
Date: Fri, 18 Apr 2014 18:23:25 +0000 (UTC) [thread overview]
Message-ID: <1397845280.74035c984818c6829c1f82cbcc1f419ff45f93d0.mmokrejs@gentoo> (raw)
commit: 74035c984818c6829c1f82cbcc1f419ff45f93d0
Author: Martin Mokrejš <mmokrejs <AT> fold <DOT> natur <DOT> cuni <DOT> cz>
AuthorDate: Fri Apr 18 18:21:20 2014 +0000
Commit: Martin Mokrejs <mmokrejs <AT> fold <DOT> natur <DOT> cuni <DOT> cz>
CommitDate: Fri Apr 18 18:21:20 2014 +0000
URL: http://git.overlays.gentoo.org/gitweb/?p=proj/sci.git;a=commit;h=74035c98
sci-biology/biopython: ops, taking back my own testing patches
Package-Manager: portage-2.2.7
---
sci-biology/biopython/ChangeLog | 4 +
sci-biology/biopython/files/SeqRecord.py.patch | 148 ---------------------
.../biopython/files/adjust-trimpoints.patch | 76 -----------
3 files changed, 4 insertions(+), 224 deletions(-)
diff --git a/sci-biology/biopython/ChangeLog b/sci-biology/biopython/ChangeLog
index c326c4a..796f6a9 100644
--- a/sci-biology/biopython/ChangeLog
+++ b/sci-biology/biopython/ChangeLog
@@ -2,6 +2,10 @@
# Copyright 1999-2014 Gentoo Foundation; Distributed under the GPL v2
# $Header: $
+ 18 Apr 2014; Martin Mokrejs <mmokrejs@fold.natur.cuni.cz>
+ -files/SeqRecord.py.patch, -files/adjust-trimpoints.patch:
+ sci-biology/biopython: ops, taking back my own testing patches
+
*biopython-1.62-r3 (18 Apr 2014)
*biopython-1.62-r4 (18 Apr 2014)
diff --git a/sci-biology/biopython/files/SeqRecord.py.patch b/sci-biology/biopython/files/SeqRecord.py.patch
deleted file mode 100644
index ac3785f..0000000
--- a/sci-biology/biopython/files/SeqRecord.py.patch
+++ /dev/null
@@ -1,148 +0,0 @@
-diff --git a/Bio/SeqIO/SffIO.py b/Bio/SeqIO/SffIO.py
-index 1971dba..43b38fd 100644
---- a/Bio/SeqIO/SffIO.py
-+++ b/Bio/SeqIO/SffIO.py
-@@ -539,8 +539,15 @@ _valid_UAN_read_name = re.compile(r'^[a-zA-Z0-9]{14}$')
-
-
- def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
-- key_sequence, alphabet, trim=False):
-- """Parse the next read in the file, return data as a SeqRecord (PRIVATE)."""
-+ key_sequence, alphabet, trim=False, interpret_qual_trims=True, interpret_adapter_trims=False):
-+ """Parse the next read in the file, return data as a SeqRecord (PRIVATE).
-+ Allow user to specify which type of clipping values should be applied
-+ while reading the SFF stream. To be backwards compatible, we interpret
-+ only the quality-based trim points by default. That results in lower-cased
-+ sequences in the low-qual region, regardless what adapter-based clip points
-+ say. This should be the desired behavior. More discussion at
-+ https://redmine.open-bio.org/issues/3437
-+ """
- #Now on to the reads...
- #the read header format (fixed part):
- #read_header_length H
-@@ -589,20 +596,41 @@ def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
- warnings.warn("Post quality %i byte padding region contained data, SFF data is not broken"
- % padding)
- #Follow Roche and apply most aggressive of qual and adapter clipping.
-- #Note Roche seems to ignore adapter clip fields when writing SFF,
-- #and uses just the quality clipping values for any clipping.
-- clip_left = max(clip_qual_left, clip_adapter_left)
-- #Right clipping of zero means no clipping
-- if clip_qual_right:
-- if clip_adapter_right:
-- clip_right = min(clip_qual_right, clip_adapter_right)
-+ #Note Roche does not use adapter clip fields when writing SFF files
-+ #but instead combines the adapter clipping information with quality-based
-+ #values and writes the most aggressive combination into clip fields (as
-+ #allowed by SFF specs).
-+
-+ if interpret_qual_trims:
-+ if interpret_adapter_trims:
-+ clip_left = max(clip_qual_left, clip_adapter_left)
-+ #Right clipping of zero means no clipping
-+ if clip_qual_right:
-+ if clip_adapter_right:
-+ clip_right = min(clip_qual_right, clip_adapter_right)
-+ else:
-+ #Typical case with Roche SFF files
-+ clip_right = clip_qual_right
-+ elif clip_adapter_right:
-+ clip_right = clip_adapter_right
-+ else:
-+ clip_right = seq_len
- else:
-- #Typical case with Roche SFF files
-- clip_right = clip_qual_right
-- elif clip_adapter_right:
-- clip_right = clip_adapter_right
-+ clip_left = clip_qual_left
-+ if clip_qual_right:
-+ clip_right = clip_qual_right
-+ else:
-+ clip_right = seq_len
-+ elif interpret_adapter_trims:
-+ clip_left = clip_adapter_left
-+ if clip_adapter_right:
-+ clip_right = clip_adapter_right
-+ else:
-+ clip_right = seq_len
- else:
-- clip_right = seq_len
-+ clip_left = 0
-+ clip_right = seq_len
-+
- #Now build a SeqRecord
- if trim:
- seq = seq[clip_left:clip_right].upper()
-diff --git a/Bio/SeqRecord.py b/Bio/SeqRecord.py
-index c90e13b..66bdea0 100644
---- a/Bio/SeqRecord.py
-+++ b/Bio/SeqRecord.py
-@@ -14,6 +14,8 @@ __docformat__ = "epytext en" # Simple markup to show doctests nicely
- # also BioSQL.BioSeq.DBSeq which is the "Database Seq" class)
-
-
-+from Bio.Seq import Seq
-+
- class _RestrictedDict(dict):
- """Dict which only allows sequences of given length as values (PRIVATE).
-
-@@ -76,7 +78,7 @@ class _RestrictedDict(dict):
- if not hasattr(value, "__len__") or not hasattr(value, "__getitem__") \
- or (hasattr(self, "_length") and len(value) != self._length):
- raise TypeError("We only allow python sequences (lists, tuples or "
-- "strings) of length %i." % self._length)
-+ "strings) of length %i whereas you passed an object of length %s." % (self._length, str(len(value))))
- dict.__setitem__(self, key, value)
-
- def update(self, new_dict):
-@@ -290,10 +292,11 @@ class SeqRecord(object):
- """)
-
- def _set_seq(self, value):
-- #TODO - Add a deprecation warning that the seq should be write only?
-- if self._per_letter_annotations:
-- #TODO - Make this a warning? Silently empty the dictionary?
-- raise ValueError("You must empty the letter annotations first!")
-+ # we should be much more user friendly and accept even a plain sequence string
-+ # and make the Seq or MutableSeq object ourselves
-+ if not isinstance(value, Seq):
-+ raise ValueError("You must pass a Seq object containing the new sequence instead of just plain string.")
-+ else:
- self._seq = value
- try:
- self._per_letter_annotations = _RestrictedDict(length=len(self.seq))
-@@ -696,7 +699,7 @@ class SeqRecord(object):
- SeqIO.write(self, handle, format_spec)
- return handle.getvalue()
-
-- def __len__(self):
-+ def __len__(self, trim=False, interpret_qual_trims=True, interpret_adapter_trims=False):
- """Returns the length of the sequence.
-
- For example, using Bio.SeqIO to read in a FASTA nucleotide file:
-@@ -707,6 +710,10 @@ class SeqRecord(object):
- 309
- >>> len(record.seq)
- 309
-+
-+ It should be possible to get length of a raw object, of trimmed
-+ object by quality or adapter criteria or both, whenever user wants
-+ to, not only when data is parsed from input.
- """
- return len(self.seq)
-
-@@ -725,6 +732,13 @@ class SeqRecord(object):
- """
- return True
-
-+ def apply_trimpoints(self, trim=False, interpret_qual_trims=False, interpret_adapter_trims=False):
-+ """We should apply either of the quality-based or adapter-based annotated
-+ trim points and return a new, sliced object.
-+ """
-+ pass
-+
-+
- def __add__(self, other):
- """Add another sequence or string to this sequence.
-
diff --git a/sci-biology/biopython/files/adjust-trimpoints.patch b/sci-biology/biopython/files/adjust-trimpoints.patch
deleted file mode 100644
index dd6d548..0000000
--- a/sci-biology/biopython/files/adjust-trimpoints.patch
+++ /dev/null
@@ -1,76 +0,0 @@
-diff --git a/Bio/SeqIO/SffIO.py b/Bio/SeqIO/SffIO.py
-index 1971dba..43b38fd 100644
---- a/Bio/SeqIO/SffIO.py
-+++ b/Bio/SeqIO/SffIO.py
-@@ -539,8 +539,15 @@ _valid_UAN_read_name = re.compile(r'^[a-zA-Z0-9]{14}$')
-
-
- def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
-- key_sequence, alphabet, trim=False):
-- """Parse the next read in the file, return data as a SeqRecord (PRIVATE)."""
-+ key_sequence, alphabet, trim=False, interpret_qual_trims=True, interpret_adapter_trims=False):
-+ """Parse the next read in the file, return data as a SeqRecord (PRIVATE).
-+ Allow user to specify which type of clipping values should be applied
-+ while reading the SFF stream. To be backwards compatible, we interpret
-+ only the quality-based trim points by default. That results in lower-cased
-+ sequences in the low-qual region, regardless what adapter-based clip points
-+ say. This should be the desired behavior. More discussion at
-+ https://redmine.open-bio.org/issues/3437
-+ """
- #Now on to the reads...
- #the read header format (fixed part):
- #read_header_length H
-@@ -589,20 +596,41 @@ def _sff_read_seq_record(handle, number_of_flows_per_read, flow_chars,
- warnings.warn("Post quality %i byte padding region contained data, SFF data is not broken"
- % padding)
- #Follow Roche and apply most aggressive of qual and adapter clipping.
-- #Note Roche seems to ignore adapter clip fields when writing SFF,
-- #and uses just the quality clipping values for any clipping.
-- clip_left = max(clip_qual_left, clip_adapter_left)
-- #Right clipping of zero means no clipping
-- if clip_qual_right:
-- if clip_adapter_right:
-- clip_right = min(clip_qual_right, clip_adapter_right)
-+ #Note Roche does not use adapter clip fields when writing SFF files
-+ #but instead combines the adapter clipping information with quality-based
-+ #values and writes the most aggressive combination into clip fields (as
-+ #allowed by SFF specs).
-+
-+ if interpret_qual_trims:
-+ if interpret_adapter_trims:
-+ clip_left = max(clip_qual_left, clip_adapter_left)
-+ #Right clipping of zero means no clipping
-+ if clip_qual_right:
-+ if clip_adapter_right:
-+ clip_right = min(clip_qual_right, clip_adapter_right)
-+ else:
-+ #Typical case with Roche SFF files
-+ clip_right = clip_qual_right
-+ elif clip_adapter_right:
-+ clip_right = clip_adapter_right
-+ else:
-+ clip_right = seq_len
- else:
-- #Typical case with Roche SFF files
-- clip_right = clip_qual_right
-- elif clip_adapter_right:
-- clip_right = clip_adapter_right
-+ clip_left = clip_qual_left
-+ if clip_qual_right:
-+ clip_right = clip_qual_right
-+ else:
-+ clip_right = seq_len
-+ elif interpret_adapter_trims:
-+ clip_left = clip_adapter_left
-+ if clip_adapter_right:
-+ clip_right = clip_adapter_right
-+ else:
-+ clip_right = seq_len
- else:
-- clip_right = seq_len
-+ clip_left = 0
-+ clip_right = seq_len
-+
- #Now build a SeqRecord
- if trim:
- seq = seq[clip_left:clip_right].upper()
next reply other threads:[~2014-04-18 18:23 UTC|newest]
Thread overview: 10+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-04-18 18:23 Martin Mokrejs [this message]
-- strict thread matches above, loose matches on Subject: below --
2016-07-23 7:25 [gentoo-commits] proj/sci:master commit in: sci-biology/biopython/, sci-biology/biopython/files/ Martin Mokrejs
2014-05-07 10:29 Justin Lecher
2014-04-18 18:14 Martin Mokrejs
2014-03-23 16:09 Martin Mokrejs
2014-03-23 16:00 Martin Mokrejs
2013-12-29 0:31 Justin Lecher
2013-09-25 12:29 Martin Mokrejs
2012-03-21 11:20 Justin Lecher
2011-12-07 15:14 Martin Mokrejs
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1397845280.74035c984818c6829c1f82cbcc1f419ff45f93d0.mmokrejs@gentoo \
--to=mmokrejs@fold.natur.cuni.cz \
--cc=gentoo-commits@lists.gentoo.org \
--cc=gentoo-dev@lists.gentoo.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox