From: Alexander Berntsen <bernalex@gentoo.org>
To: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-portage-dev] [PATCH] emerge: add --search-fuzzy and --search-fuzzy-cutoff options (bug 65566)
Date: Mon, 4 Apr 2016 10:39:17 +0200
Message-ID: <57022835.2060304@gentoo.org>
In-Reply-To: <1459746182-13420-1-git-send-email-zmedico@gentoo.org>
This is a great idea!
On 04/04/16 07:03, Zac Medico wrote:
> +.BR "\-\-search\-fuzzy [ y | n ]"
> +Enable or disable fuzzy search for search actions.
This is likely a good place to briefly explain what a "fuzzy search"
is.

Also, I'm not sold on "search-fuzzy" as opposed to "fuzzy-search". Is
there a particular reason for it? Since we don't seem to have a
standardised "verbs mean this, nouns mean this" anyway, I would use
the latter phrase.

You also need to document your note on regexes.

Lastly, you also need to document that a fuzzy search is slower than a
regular search.
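For the manpage wording, a small illustration of what "fuzzy" means
here may help. This is only a sketch: it assumes the similarity ratio
is the kind computed by difflib.SequenceMatcher from the Python
standard library, and the function names and package list are made up
for the example rather than taken from the patch.

```python
import difflib


def similarity(term, name):
    """Return a ratio in [0, 1]; 1.0 means the two strings are identical."""
    return difflib.SequenceMatcher(None, term.lower(), name.lower()).ratio()


def fuzzy_search(term, names, cutoff=0.8):
    """Return the names that are at least `cutoff` similar to `term`,
    most similar first."""
    scored = [(similarity(term, name), name) for name in names]
    return [name for score, name in sorted(scored, reverse=True)
            if score >= cutoff]


# Hypothetical package names, for illustration only.
names = ["firefox", "firefox-bin", "thunderbird", "gimp"]

# A plain substring search finds nothing for the misspelt term...
print([name for name in names if "firefx" in name])

# ...while the fuzzy search still finds the closest name.
print(fuzzy_search("firefx", names))
```

Every candidate name has to be scored against the search term, which
is also why a fuzzy search costs more than a plain substring match.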
> +.TP
> +.BR "\-\-search\-fuzzy\-cutoff CUTOFF"
> +Set similarity ratio cutoff (a floating-point number between 0 and 1).
> +Results with similarity ratios lower than the cutoff are discarded.
> +This option has no effect unless the \fB\-\-search\-fuzzy\fR option
> +is enabled.
This explanation is a bit heavy to read. I also think that a 0-to-1
range isn't very nice, and calling the number "floating-point" rather
than decimal is neither useful nor friendly. How about making it a
percentage, and describing it simply as a similarity percentage --
"package names must be at least N% similar to the search term to
appear in search results"? The option could then be called
--search-fuzzy-similarity, or (in keeping with the previous
suggestion) --fuzzy-search-similarity, or -- wait for it -- something
similar. ;)

Of course if you agree with this, you'll have to invert the code so
that it expresses which results to show, rather than which ones to not
show.

You should also document here what happens if there's a mistake in the
input.
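A rough sketch of the "which results to show" formulation, assuming
the option takes a percentage and the matcher still produces ratios
between 0 and 1. The function name and the scores are hypothetical and
only serve to show the direction of the comparison.

```python
def names_to_show(scored_names, min_percent):
    """Keep the names whose similarity ratio is at least `min_percent`
    percent; `scored_names` is an iterable of (name, ratio) pairs."""
    threshold = min_percent / 100.0
    return [name for name, ratio in scored_names if ratio >= threshold]


# Made-up ratios, for illustration only.
scored = [("firefox", 0.92), ("firefox-bin", 0.71), ("gimp", 0.15)]

print(names_to_show(scored, 80))
print(names_to_show(scored, 70))
```

The test reads the same way as the proposed manpage sentence: a name
appears if it is at least N% similar to the search term.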
> +    "--search-fuzzy-cutoff": {
> +        "help": "Set similarity ratio cutoff (a floating-point number between 0 and 1)",
> +        "action": "store"
> +    },
See comments above regarding how to explain what this actually does.
> +    if myoptions.search_fuzzy_cutoff:
> +        try:
> +            fuzzy_cutoff = float(myoptions.search_fuzzy_cutoff)
> +        except ValueError:
> +            fuzzy_cutoff = 0.0
Is this a reasonable fallback? I guess so... but it needs to be
documented in the manpage, as noted above.
> +
> +        if fuzzy_cutoff <= 0.0:
> +            fuzzy_cutoff = None
> +            if not silent:
> +                parser.error("Invalid --search-fuzzy-cutoff parameter: '%s'\n" % \
> +                    (myoptions.search_fuzzy_cutoff,))
> +
> +        myoptions.search_fuzzy_cutoff = fuzzy_cutoff
> +
I also don't understand why the first case just falls back to 0.0
while this one is an error. Why don't both behave the same way: either
both are errors that revert to the 0.8 cut-off (or 80% similarity), or
both fall back to 0.0/100?

And this needs to go in the manpage too.
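One way to make the two cases consistent is to treat unparsable and
out-of-range values alike, as errors. The sketch below uses plain
argparse from the standard library rather than Portage's own option
handling, and the option name, the "emerge-sketch" program name, and
the 80% default are assumptions based on the suggestions above, not
something the patch defines.

```python
import argparse


def similarity_percent(value):
    """argparse type callback: accept a number from 0 to 100 and reject
    everything else with the same kind of error."""
    try:
        percent = float(value)
    except ValueError:
        raise argparse.ArgumentTypeError("not a number: %r" % (value,))
    if not 0.0 <= percent <= 100.0:
        raise argparse.ArgumentTypeError(
            "must be between 0 and 100: %r" % (value,))
    return percent


def build_parser():
    # Hypothetical stand-in for the real emerge option parser.
    parser = argparse.ArgumentParser(prog="emerge-sketch")
    parser.add_argument(
        "--fuzzy-search-similarity",
        type=similarity_percent,
        default=80.0,
        metavar="PERCENT",
        help="show names at least PERCENT%% similar to the search term")
    return parser


args = build_parser().parse_args(["--fuzzy-search-similarity=65"])
print(args.fuzzy_search_similarity)
```

With a type callback, argparse reports the problem and exits for
"abc", "-5" and "150" alike, so the manpage only has to describe one
behaviour for invalid input.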
> + self.fuzzy_cutoff = 0.8 if fuzzy_cutoff is None else fuzzy_cutoff
See above.
> + fuzzy = False
Here's an interesting discussion: maybe this should be True? After
all, it's True in any modern search engine. What do you think?
> + # Fuzzy search does not support regular expressions, therefore
> + # it is disabled for regular expression searches.
Manpage.
--
Alexander
bernalex@gentoo.org
https://secure.plaimi.net/~alexander