* [gentoo-portage-dev] equery: RFC and code review
@ 2009-02-10 10:53 Douglas Anderson
2009-02-12 7:10 ` Alec Warner
0 siblings, 1 reply; 7+ messages in thread
From: Douglas Anderson @ 2009-02-10 10:53 UTC (permalink / raw
To: gentoo-portage-dev
Hi all,
It's been a few months since we first discussed refactoring equery on
this list and I think made pretty good time and I'm reasonably happy
with it. So I'm throwing out to you all. I've tried to keep a running
list of visible changes as I went along, but I'm sure I've missed some
things. You can read that here:
http://code.google.com/p/genscripts/source/browse/equery/trunk/equery/CHANGELOG
My primary goals for the refactorization were:
* Modularize
* Clean up the UI
* Rewrite a more solid option handler
* Eliminate as much duplicated code as possible
* Clean up slop and bitrot, follow clean coding standards and style
guidelines wherever possible
* Break up large functions, write docstrings for all functions
* Get code set for Python 2.6 (tested in 2.5 and 2.6) and prepare
smooth upgrade for Py3k.
* Create a more consistent experience for the user.
This last point is surely where the most contention will stem from, so
let me argue my position. All modules except for list, check and size
display the help message when called without input (ex: equery uses).
This makes sense. It's what most other programs do. It's what emerge
does. It's what eselect does. It's the neighborly thing to do.
However these three exceptions go and start a time-consuming process
which is of very limited usefulness. With a cold cache, each process
takes at least 30 seconds. I imagine that 99 percent of people who
type 'equery list' do it by accident. And who needs to see the size of
every installed package? And checksum every installed package? Almost
useless except to a select few people who know what they're doing.
So I took the liberty of changing the default behavior of these three
modules to be more in line with the rest of equery and the world. Each
of the three displays deprecation warning explaining how to recreate
the old default behavior. I can't imagine anyone objecting outright to
this, considering there will always be more than one version of
gentoolkit in the tree (there are 6 now) so it'd be a long time before
anyone was forced into the new behavior.
That's it. For anyone interested, please read the link above and let
me know if you like it/hate it/want something else done.
For python programmers, I'd appreciate a quick code review and any
comments/criticism whatsoever (I really like criticism).
Thanks in advance,
-Doug
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] equery: RFC and code review
2009-02-10 10:53 [gentoo-portage-dev] equery: RFC and code review Douglas Anderson
@ 2009-02-12 7:10 ` Alec Warner
2009-02-12 8:19 ` Brian Harring
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Alec Warner @ 2009-02-12 7:10 UTC (permalink / raw
To: gentoo-portage-dev
On Tue, Feb 10, 2009 at 2:53 AM, Douglas Anderson <dja@gendja.com> wrote:
> Hi all,
>
> It's been a few months since we first discussed refactoring equery on
> this list and I think made pretty good time and I'm reasonably happy
> with it. So I'm throwing out to you all. I've tried to keep a running
> list of visible changes as I went along, but I'm sure I've missed some
> things. You can read that here:
>
> http://code.google.com/p/genscripts/source/browse/equery/trunk/equery/CHANGELOG
>
> My primary goals for the refactorization were:
> * Modularize
> * Clean up the UI
> * Rewrite a more solid option handler
> * Eliminate as much duplicated code as possible
> * Clean up slop and bitrot, follow clean coding standards and style
> guidelines wherever possible
> * Break up large functions, write docstrings for all functions
> * Get code set for Python 2.6 (tested in 2.5 and 2.6) and prepare
> smooth upgrade for Py3k.
> * Create a more consistent experience for the user.
>
> This last point is surely where the most contention will stem from, so
> let me argue my position. All modules except for list, check and size
> display the help message when called without input (ex: equery uses).
> This makes sense. It's what most other programs do. It's what emerge
> does. It's what eselect does. It's the neighborly thing to do.
>
> However these three exceptions go and start a time-consuming process
> which is of very limited usefulness. With a cold cache, each process
> takes at least 30 seconds. I imagine that 99 percent of people who
> type 'equery list' do it by accident. And who needs to see the size of
> every installed package? And checksum every installed package? Almost
> useless except to a select few people who know what they're doing.
>
> So I took the liberty of changing the default behavior of these three
> modules to be more in line with the rest of equery and the world. Each
> of the three displays deprecation warning explaining how to recreate
> the old default behavior. I can't imagine anyone objecting outright to
> this, considering there will always be more than one version of
> gentoolkit in the tree (there are 6 now) so it'd be a long time before
> anyone was forced into the new behavior.
>
> That's it. For anyone interested, please read the link above and let
> me know if you like it/hate it/want something else done.
>
> For python programmers, I'd appreciate a quick code review and any
> comments/criticism whatsoever (I really like criticism).
You Asked...
I think one of the major things that annoyed me when reading the code
is that you did add a new option parser...and you added it about a
dozen times...(regexp used to basically find the start of the same
code block over and over).
antarus@kyoto ~/code/checkouts/genscripts-read-only/equery $ grep
"opts = .x" *.py | wc -l
13
Now the option parsing is quite complicated so I'm not sure if a
complicated data structure to pass to a generic function is really a
good use of ones time. But it stands out.
None of your files has a copyright notice.
I pray pp.print_error prints to stderr and not stdout; otherwise tends
to be a big nitpick of mine when people write errors to stdout.
belongs.py has this gem:
q.append(('^' if query[0] == '/' else '/') + re.escape(query) + '$')
I realize it works; but honestly, not readable.
append_str = ""
if query[0] == '/' append_str += '^' else append_str += '/'
append_str += re.escape(query) + '$'
q.append(append_str)
Slightly better (if append_str has a shorter name, maybe). I don't
like 1 line if EXPR STATEMENT else STATEMENT but sometimes they are
shorter
Putting a conditional in the function call is horrible looking though.
You use # FOO comments often; sometimes too often. Explaining what
you are doing is typically better noted in a docstring for a function
and not necessarily right next to the code it pertains to. An example
would be meta.py in get_useflags. You have a 4 line comment block
describing what you are doing but it really interferes with the
reading of the code. Functions should typically be short enough that
referring to the docstring is no big deal (and the example I chose is
a very short function). In addition you have a lot of "# do something
obvious here." If it is obvious; you don't need a comment. An
example of this would be meta.py around line 418.
in __init__.py your find_matches function is too long (180 lines or
4.5 of my screen heights). You've already split it up into convenient
sections by comments (which are fine by the way) but you can probably
make subroutines out of some of the subsections. Also most of the
code has nothing to do with finding matches and lots to do with query
preparation.
in __init__.py you have InvalidRegex with a TODO to move it to another
file. It seems like exceptions.py might be a worthy place.
I think there are a number of cases where I'd prefer to see string
interpolation instead of explicit addition:
'%s/%s' % (cat, package)
But thats just me being picky (and you use string interpolation often,
which is good).
Thats what I found in about 20 minutes of looking.
-Alec
>
> Thanks in advance,
> -Doug
>
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] equery: RFC and code review
2009-02-12 7:10 ` Alec Warner
@ 2009-02-12 8:19 ` Brian Harring
2009-02-12 10:01 ` René 'Necoro' Neumann
2009-02-12 10:25 ` Douglas Anderson
2009-02-12 11:27 ` Marijn Schouten (hkBst)
2 siblings, 1 reply; 7+ messages in thread
From: Brian Harring @ 2009-02-12 8:19 UTC (permalink / raw
To: gentoo-portage-dev
[-- Attachment #1: Type: text/plain, Size: 199 bytes --]
On Wed, Feb 11, 2009 at 11:10:43PM -0800, Alec Warner wrote:
> belongs.py has this gem:
> q.append(('^' if query[0] == '/' else '/') + re.escape(query) + '$')
Also a python2.6 only feature...
~brian
[-- Attachment #2: Type: application/pgp-signature, Size: 197 bytes --]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] equery: RFC and code review
2009-02-12 8:19 ` Brian Harring
@ 2009-02-12 10:01 ` René 'Necoro' Neumann
2009-02-12 10:39 ` Douglas Anderson
0 siblings, 1 reply; 7+ messages in thread
From: René 'Necoro' Neumann @ 2009-02-12 10:01 UTC (permalink / raw
To: gentoo-portage-dev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Brian Harring schrieb:
> On Wed, Feb 11, 2009 at 11:10:43PM -0800, Alec Warner wrote:
>> belongs.py has this gem:
>> q.append(('^' if query[0] == '/' else '/') + re.escape(query) + '$')
> Also a python2.6 only feature...
> ~brian
python-2.5 it is ...
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkmT83cACgkQ4UOg/zhYFuAnUQCeJs+JtVjJEs83xbffyEZTV7DE
j/MAnRo0hIynhEZUdrtF2FzJiW1Z3nB7
=zTDf
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] equery: RFC and code review
2009-02-12 10:01 ` René 'Necoro' Neumann
@ 2009-02-12 10:39 ` Douglas Anderson
0 siblings, 0 replies; 7+ messages in thread
From: Douglas Anderson @ 2009-02-12 10:39 UTC (permalink / raw
To: gentoo-portage-dev
On Thu, Feb 12, 2009 at 7:01 PM, René 'Necoro' Neumann <lists@necoro.eu> wrote:
> Hash: SHA1
>
> Brian Harring schrieb:
>> On Wed, Feb 11, 2009 at 11:10:43PM -0800, Alec Warner wrote:
>>> belongs.py has this gem:
>>> q.append(('^' if query[0] == '/' else '/') + re.escape(query) + '$')
>> Also a python2.6 only feature...
>> ~brian
>
> python-2.5 it is ...
Yep, this python ternary syntax was introduced in 2.5. I'm using quite
a few >=2.5 features...
* the new ternary syntax
* passing a tuple to str.startswith
* with statement
* max( key=...)
Again, there are a lot of version of gentoolkit in the tree, so I
didn't foresee this being a problem.
-Doug
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] equery: RFC and code review
2009-02-12 7:10 ` Alec Warner
2009-02-12 8:19 ` Brian Harring
@ 2009-02-12 10:25 ` Douglas Anderson
2009-02-12 11:27 ` Marijn Schouten (hkBst)
2 siblings, 0 replies; 7+ messages in thread
From: Douglas Anderson @ 2009-02-12 10:25 UTC (permalink / raw
To: gentoo-portage-dev
On Thu, Feb 12, 2009 at 4:10 PM, Alec Warner <antarus@gentoo.org> wrote:
> I think one of the major things that annoyed me when reading the code
> is that you did add a new option parser...and you added it about a
> dozen times...(regexp used to basically find the start of the same
> code block over and over).
>
> antarus@kyoto ~/code/checkouts/genscripts-read-only/equery $ grep
> "opts = .x" *.py | wc -l
> 13
>
> Now the option parsing is quite complicated so I'm not sure if a
> complicated data structure to pass to a generic function is really a
> good use of ones time. But it stands out.
Actually, I think you'll find that the amount of duplicated opt
parsing code in each module is really minimal, essentially only that
one line that your regex matched. Each module has its own set of
options and handles those options differently, so putting it all in
__init__ would be a royal mess.
I can see how at a glance it would appear like duplicated code, but
check out the different between normal getopt and gnu_getopt and how I
have those working between __init__ and the modules. There's nothing
being parsed twice. If you can come up with a simpler, cleaner way,
write back with a little more detail. But I just reexamined it and
even tried moving some stuff around, and I think it's about as good as
it gets right now.
>
> None of your files has a copyright notice.
>
There's one in equery.py. The rest of equery will reside in
python/site-packages. Does each one need the same copyright notice? (I
honestly don't know the policy)
> I pray pp.print_error prints to stderr and not stdout; otherwise tends
> to be a big nitpick of mine when people write errors to stdout.
$ grep -A 2 'def print_error' /usr/lib/gentoolkit/pym/gentoolkit/pprinter.py
def print_error(s):
"""Prints an error string to stderr."""
sys.stderr.write(output.red("!!! ") + s + "\n")
Safe :)
> belongs.py has this gem:
> q.append(('^' if query[0] == '/' else '/') + re.escape(query) + '$')
>
> I realize it works; but honestly, not readable.
You're right, that's pretty nasty. But look what it came from!
q = map(lambda x: ((len(x) and x[0] == "/") and "^" or "/") +
re.escape(x) + "$", query)
But I agree that it can be further improved. How's this:
if query.startswith('/'):
query = "^%s$" % re.escape(query)
else:
query = "/%s$" % re.escape(query)
q.append(query)
> You use # FOO comments often; sometimes too often.
Yep, good point. I comment a little excessively, and believe it or not
emeta (now the meta module) was my first ever python script, so it had
lots of "obvious" comments. I'll try to get rid of the silly stuff as
I run into them again.
> in __init__.py your find_matches function is too long (180 lines or
> 4.5 of my screen heights). You've already split it up into convenient
> sections by comments (which are fine by the way) but you can probably
> make subroutines out of some of the subsections.
Good idea. I'll do that.
> in __init__.py you have InvalidRegex with a TODO to move it to another
> file. It seems like exceptions.py might be a worthy place.
Yep, not 100% done yet and modifying more blocks to use exceptions,
and then moving the exceptions out to exceptions.py is on my list of
TODOs.
> I think there are a number of cases where I'd prefer to see string
> interpolation instead of explicit addition:
>
> '%s/%s' % (cat, package)
>
> But thats just me being picky (and you use string interpolation often,
> which is good).
Explicit string catting is also one of my pet peeves, but py-2.6 is
getting a brand new string formatting syntax and "%" style
interpolation will be deprecated in 3k, so I only rewrote strings
where readability was suffering. I'm biding my time... but I
definitely agree with you here.
>
> Thats what I found in about 20 minutes of looking.
>
> -Alec
Muchly appreciated!
-Doug
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [gentoo-portage-dev] equery: RFC and code review
2009-02-12 7:10 ` Alec Warner
2009-02-12 8:19 ` Brian Harring
2009-02-12 10:25 ` Douglas Anderson
@ 2009-02-12 11:27 ` Marijn Schouten (hkBst)
2 siblings, 0 replies; 7+ messages in thread
From: Marijn Schouten (hkBst) @ 2009-02-12 11:27 UTC (permalink / raw
To: gentoo-portage-dev
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Alec Warner wrote:
> On Tue, Feb 10, 2009 at 2:53 AM, Douglas Anderson <dja@gendja.com> wrote:
<snip>
> belongs.py has this gem:
> q.append(('^' if query[0] == '/' else '/') + re.escape(query) + '$')
>
> I realize it works; but honestly, not readable.
>
> append_str = ""
> if query[0] == '/' append_str += '^' else append_str += '/'
> append_str += re.escape(query) + '$'
> q.append(append_str)
>
> Slightly better (if append_str has a shorter name, maybe). I don't
> like 1 line if EXPR STATEMENT else STATEMENT but sometimes they are
> shorter
> Putting a conditional in the function call is horrible looking though.
It's called functional programming. Maybe it takes a little getting used to, but
as your own example shows, it can cut your code base by a significant factor.
Marijn
- --
Sarcasm puts the iron in irony, cynicism the steel.
Marijn Schouten (hkBst), Gentoo Lisp project, Gentoo ML
<http://www.gentoo.org/proj/en/lisp/>, #gentoo-{lisp,ml} on FreeNode
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.9 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iEYEARECAAYFAkmUB54ACgkQp/VmCx0OL2wIOQCfdd0U1TJ/GXxTwvV8vmMpW2u9
hfsAoIXkWvJP8sOGK4NstuXiCAgy5OIf
=qfq4
-----END PGP SIGNATURE-----
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2009-02-12 10:39 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-02-10 10:53 [gentoo-portage-dev] equery: RFC and code review Douglas Anderson
2009-02-12 7:10 ` Alec Warner
2009-02-12 8:19 ` Brian Harring
2009-02-12 10:01 ` René 'Necoro' Neumann
2009-02-12 10:39 ` Douglas Anderson
2009-02-12 10:25 ` Douglas Anderson
2009-02-12 11:27 ` Marijn Schouten (hkBst)
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox