From mboxrd@z Thu Jan 1 00:00:00 1970
From: "Zac Medico"
To: gentoo-commits@lists.gentoo.org
Content-Transfer-Encoding: 8bit
Content-type: text/plain; charset=UTF-8
Reply-To: gentoo-dev@lists.gentoo.org, "Zac Medico"
Message-ID: <1499620683.66df1d045a64f8ad6453d9668cdb66980c128b69.zmedico@gentoo>
Subject: [gentoo-commits] proj/portage:master commit in: pym/_emerge/
X-VCS-Repository: proj/portage
X-VCS-Files: pym/_emerge/search.py
X-VCS-Directories: pym/_emerge/
X-VCS-Committer: zmedico
X-VCS-Committer-Name: Zac Medico
X-VCS-Revision: 66df1d045a64f8ad6453d9668cdb66980c128b69
X-VCS-Branch: master
Date: Sun, 9 Jul 2017 17:20:03 +0000 (UTC)

commit:     66df1d045a64f8ad6453d9668cdb66980c128b69
Author:     Zac Medico
AuthorDate: Sat Jul 8 19:44:40 2017 +0000
Commit:     Zac Medico
CommitDate: Sun Jul 9 17:18:03 2017 +0000
URL:        https://gitweb.gentoo.org/proj/portage.git/commit/?id=66df1d04

fuzzy search: weigh category similarity independently (bug 623648)

Weigh the similarity of category and package names independently,
in order to avoid matching lots of irrelevant packages in the same
category when the package name is much shorter than the category
name.

X-Gentoo-bug: 623648
X-Gentoo-bug-url: https://bugs.gentoo.org/show_bug.cgi?id=623648
Acked-by: Brian Dolbec

 pym/_emerge/search.py | 24 +++++++++++++++++++++---
 1 file changed, 21 insertions(+), 3 deletions(-)

diff --git a/pym/_emerge/search.py b/pym/_emerge/search.py
index 20a0c026e..dc91ad315 100644
--- a/pym/_emerge/search.py
+++ b/pym/_emerge/search.py
@@ -264,15 +264,33 @@ class search(object):
 		if self.fuzzy:
 			fuzzy = True
 			cutoff = float(self.search_similarity) / 100
-			seq_match = difflib.SequenceMatcher()
-			seq_match.set_seq2(self.searchkey.lower())
+			if match_category:
+				# Weigh the similarity of category and package
+				# names independently, in order to avoid matching
+				# lots of irrelevant packages in the same category
+				# when the package name is much shorter than the
+				# category name.
+				part_split = portage.catsplit
+			else:
+				part_split = lambda match_string: (match_string,)
 
-			def fuzzy_search(match_string):
+			part_matchers = []
+			for part in part_split(self.searchkey):
+				seq_match = difflib.SequenceMatcher()
+				seq_match.set_seq2(part.lower())
+				part_matchers.append(seq_match)
+
+			def fuzzy_search_part(seq_match, match_string):
 				seq_match.set_seq1(match_string.lower())
 				return (seq_match.real_quick_ratio() >= cutoff and
 					seq_match.quick_ratio() >= cutoff and
 					seq_match.ratio() >= cutoff)
 
+			def fuzzy_search(match_string):
+				return all(fuzzy_search_part(seq_match, part)
+					for seq_match, part in zip(
+					part_matchers, part_split(match_string)))
+
 		for package in self._cp_all():
 			self._spinner_update()
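
The effect described in the commit message can be reproduced outside of portage with a rough standalone sketch. It is not the code from search.py: catsplit() below is a hypothetical stand-in for portage.catsplit that simply splits at the first slash, whole_match() and part_match() are illustrative names, and both the search key and the candidate are assumed to contain a category. The 0.8 cutoff in the usage note is likewise just an example value.

```python
import difflib


def catsplit(cp):
    # Hypothetical stand-in for portage.catsplit: split "category/package"
    # into its parts at the first slash.
    return cp.split("/", 1)


def similarity(a, b):
    # Case-insensitive difflib ratio between two strings.
    return difflib.SequenceMatcher(None, a.lower(), b.lower()).ratio()


def whole_match(searchkey, candidate, cutoff):
    # One ratio over the full "category/package" string, so a long shared
    # category can dominate a short package name.
    return similarity(candidate, searchkey) >= cutoff


def part_match(searchkey, candidate, cutoff):
    # Category and package name must each reach the cutoff on their own.
    return all(
        similarity(cand_part, key_part) >= cutoff
        for key_part, cand_part in zip(catsplit(searchkey), catsplit(candidate))
    )
```

With a cutoff of 0.8, whole_match("dev-python/py", "dev-python/six", 0.8) is true because the shared "dev-python/" prefix makes up most of both strings, while part_match() with the same arguments is false because "py" and "six" have nothing in common. A near miss in the package name only, such as "dev-python/sixx" against "dev-python/six", still passes part_match().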