From: Zac Medico <zmedico@gentoo.org>
To: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-portage-dev] [PATCH] emerge: add --search-fuzzy and --search-fuzzy-cutoff options (bug 65566)
Date: Thu, 7 Apr 2016 23:21:57 -0700
Message-ID: <57074E05.4030202@gentoo.org>
In-Reply-To: <57022835.2060304@gentoo.org>
On 04/04/2016 01:39 AM, Alexander Berntsen wrote:
> This is a great idea!
Yeah, we should have done this sooner. The search index makes our search
function so much nicer, so that gave me some incentive to continue
improving it.
>
>
> On 04/04/16 07:03, Zac Medico wrote:
>> +.BR "\-\-search\-fuzzy [ y | n ]"
>> +Enable or disable fuzzy search for search actions.
> This is likely a good place to briefly explain what a "fuzzy search"
> is.
Okay, will do.
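Roughly, the manpage text could describe fuzzy search as matching package names that are merely similar to the search term, so that typos still find results. A minimal sketch of the idea, assuming the similarity ratio comes from Python's difflib.SequenceMatcher (the helper name `similarity` is made up for illustration and is not taken from the patch):

```python
import difflib


def similarity(search_term, package_name):
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    # Compare case-insensitively so "Firefox" and "firefox" count as equal.
    matcher = difflib.SequenceMatcher(
        None, search_term.lower(), package_name.lower())
    return matcher.ratio()
```

A misspelled term such as "firefx" still scores high against "firefox", while an unrelated name scores close to zero.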
> Also, I'm not sold on "search-fuzzy" as opposed to "fuzzy-search". Is
> there a particular reasoning for it? Since we don't seem to have a
> standardised "verbs mean this, nouns mean this" anyway, I would use
> the latter phrase.
Okay, that will work for me.
> You also need to document your note on regexes.
Will do.
> Lastly, you also need to document that a fuzzy search is slower than a
> regular search.
Will do.
>> +.TP
>> +.BR "\-\-search\-fuzzy\-cutoff CUTOFF"
>> +Set similarity ratio cutoff (a floating-point number between 0 and 1).
>> +Results with similarity ratios lower than the cutoff are discarded.
>> +This option has no effect unless the \fB\-\-search\-fuzzy\fR option
>> +is enabled.
> This explanation is a bit heavy to read. And I think that using 0 to 1
> isn't very nice. And calling the number "floating point" instead of
> decimal isn't very useful nor nice. How about making it a percentage,
> and describing it simply as a similarity percentage -- "package names
> must be at least N% similar to the search term to appear in search
> results". The option could then be called --search-fuzzy-similarity,
> or (in keeping with the previous suggestion)
> --fuzzy-search-similarity, or -- wait for it -- something similar. ;)
Okay, that will work for me.
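To make the "at least N% similar" wording concrete, here is a small sketch of how a percentage could be applied as a filter. It assumes difflib.SequenceMatcher as the similarity measure; the function name `fuzzy_matches` and the parameter `min_similarity_percent` are hypothetical and only serve the illustration.

```python
import difflib


def fuzzy_matches(search_term, package_names, min_similarity_percent=80):
    """Return names that are at least N% similar, best matches first."""
    # Convert the user-facing percentage into the 0..1 ratio used internally.
    cutoff = min_similarity_percent / 100.0
    scored = []
    for name in package_names:
        ratio = difflib.SequenceMatcher(None, search_term, name).ratio()
        if ratio >= cutoff:
            scored.append((ratio, name))
    # Highest similarity first; ties are broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return [name for _ratio, name in scored]
```

Lowering the percentage widens the result set, which is the behavior the manpage would need to describe.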
> Of course if you agree with this, you'll have to reverse the code to
> represent which results to show, rather than which ones to not show.
Reverse? You want it to measure dissimilarity? Not sure what you mean.
> You should also document here what happens if there's a mistake in the
> input.
>
>> +    "--search-fuzzy-cutoff": {
>> +        "help": "Set similarity ratio cutoff (a floating-point number between 0 and 1)",
>> +        "action": "store"
>> +    },
> See comments above regarding how to explain what this actually does.
Yeah, the N% similar thing.
>> +    if myoptions.search_fuzzy_cutoff:
>> +        try:
>> +            fuzzy_cutoff = float(myoptions.search_fuzzy_cutoff)
>> +        except ValueError:
>> +            fuzzy_cutoff = 0.0
> Is this a reasonable fallback? I guess so... but you need to mention
> it in the manpage, as mentioned.
It's not supposed to be a fallback, but rather a failure path. It
triggers an error message and unsuccessful exit.
>> +
>> +        if fuzzy_cutoff <= 0.0:
>> +            fuzzy_cutoff = None
>> +            if not silent:
>> +                parser.error("Invalid --search-fuzzy-cutoff parameter: '%s'\n" % \
>> +                (myoptions.search_fuzzy_cutoff,))
>> +
>> +        myoptions.search_fuzzy_cutoff = fuzzy_cutoff
>> +
> I also don't understand why the first one is just 0.0, but this one
> is an error. Why aren't both either errors and revert to 0.8 cut-off
> (or 80% similarity) or 0.0/100?
I just want it to fail if the input is invalid.
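The intent is easier to see when both invalid cases end in the same error. The following is only a sketch using an argparse type callback, which is a different mechanism from the one in the patch; the option name `--fuzzy-search-similarity`, the program name, the 80 default, and the percentage range are placeholders based on the suggestions in this thread.

```python
import argparse


def parse_similarity(value):
    """Accept a percentage with 0 < value <= 100; reject anything else."""
    try:
        percent = float(value)
    except ValueError:
        raise argparse.ArgumentTypeError("not a number: %r" % (value,))
    # Non-numeric input and out-of-range numbers both fail, with no fallback.
    if not 0.0 < percent <= 100.0:
        raise argparse.ArgumentTypeError(
            "must be greater than 0 and at most 100: %r" % (value,))
    return percent


parser = argparse.ArgumentParser(prog="emerge-sketch")
parser.add_argument("--fuzzy-search-similarity",
                    type=parse_similarity, default=80.0)
```

With this shape, argparse prints the error message and exits unsuccessfully for any invalid value, so there is nothing resembling a silent fallback to document.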
> And this needs to go in the manpage too.
>
>> + self.fuzzy_cutoff = 0.8 if fuzzy_cutoff is None else fuzzy_cutoff
> See above.
>
>> + fuzzy = False
> Here's an interesting discussion: maybe this should be True? After
> all, it's True in any modern search engine. What do you think?
Yeah, I agree.
>> + # Fuzzy search does not support regular expressions, therefore
>> + # it is disabled for regular expression searches.
> Manpage.
Will do.
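For the manpage note, the rule amounts to something like the sketch below. It assumes that a leading "%" in the search key marks a regular expression search; the function name `fuzzy_applies` is hypothetical and not part of the patch.

```python
def fuzzy_applies(fuzzy_enabled, search_key):
    """Decide whether fuzzy matching should be used for this search key."""
    # Assumption: a leading "%" marks the key as a regular expression.
    is_regex = search_key.startswith("%")
    # Regular expression searches always bypass fuzzy matching.
    return fuzzy_enabled and not is_regex
```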
--
Thanks,
Zac
Thread overview: 5+ messages
2016-04-04 5:03 [gentoo-portage-dev] [PATCH] emerge: add --search-fuzzy and --search-fuzzy-cutoff options (bug 65566) Zac Medico
2016-04-04 8:39 ` Alexander Berntsen
2016-04-08 6:21 ` Zac Medico [this message]
2016-04-08 11:33 ` Alexander Berntsen
2016-07-25 2:58 ` Zac Medico