From: "W. Trevor King" <wking@tremily.us>
To: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-portage-dev] [PATCH 3/3] pym/portage/package/ebuild/fetch.py: Factor out _get_uris
Date: Sun, 19 Jan 2014 15:06:29 -0800 [thread overview]
Message-ID: <20140119230629.GV29063@odin.tremily.us> (raw)
In-Reply-To: <CAAr7Pr-vakRAbU5Y8wcHown8JvfE_Y6SjCS+HL+32KiUDm=bLQ@mail.gmail.com>
[-- Attachment #1: Type: text/plain, Size: 7046 bytes --]
On Sun, Jan 19, 2014 at 02:36:43PM -0800, Alec Warner wrote:
> On Sat, Jan 18, 2014 at 7:07 PM, W. Trevor King <wking@tremily.us> wrote:
> > +def _get_file_uri_tuples(uris):
> > + """Return a list of (filename, uri) tuples
> > + """
> >
>
> As mike noted on another thread:
>
> ""Return a list of (filename, uri) tuples."""
I'd replaced this with:
"""
Return a list of (filename, uri) tuples
"""
in v2, but I'll queue the one-line form in v3.
> > +def _expand_mirror(uri, custom_mirrors=(), third_party_mirrors=()):
> > + """Replace the 'mirror://' scheme in the uri
> >
>
> Sentence should end in a period.
Ok. I'll do that for the other summaries as well, and uppercase URI,
in v3.
> > + writemsg(_("Invalid mirror definition in SRC_URI:\n"),
> > + noiselevel=-1)
> > + writemsg(" %s\n" % (uri), noiselevel=-1)
>
>
> Is this a py3k thing? You should be able to write:
>
> writemsg(" %s\n" % uri, noiselevel=-1)
>
> The tuple is only needed for multiple arguments in py2k.
1. I think I just copied and pasted it ;).
2. (uri) is not actually a tuple:
>>> type(('hello'))
<type 'str'>
To get a tuple, you need (uri,).
I'm fine removing the parens in v3.
> > +def _get_uris(uris, settings, custom_mirrors=(), locations=()):
> >
>
> I want you to be careful here. This is bad style when you are
> constructing mutable objects in parameter defaults:
I used tuples instead of lists because they are not mutable (as you
point out later on).
> I'm not sure if I appreciate that more or less than the other form....
>
> def _get_uris(uris, settings, custom_mirrors=None, locations=None):
> if not custom_mirrors:
> custom_mirrors = ()
> if not locations:
> locations = ()
If you want me to do it that way, I'll queue it for v3. I think the
default tuples are more readable. They are certainly shorter.
> Another advisable way to write it is to simply not have default
> arguments, and to force the caller to sync in an empty iterable:
Meh. I don't have strong feelings about this, but custom_mirrors
struck me as something that would not always be required ;).
> > + third_party_mirrors = settings.thirdpartymirrors()
> >
>
> Why pass in all of settings?
We need other stuff too, see v2. The less settings parsing I need to
do in fetch() itself, the happier I am. If that makes the helper
functions a bit uglier, that's fine. Fewer people will have to read
those.
> Think about testing this function.
Working on it for v3.
> Do you really want to try to construct some sort of 'testing'
> settings object, or simply construct a list?
I was going to use a dict for testing, and stub out a
thirdpartymirrors mock for _get_uris.
> > + third_party_mirror_uris = {}
> > + filedict = OrderedDict()
> > + primaryuri_dict = {}
> > + for filename, uri in _get_file_uri_tuples(uris=uris):
> > + if filename not in filedict:
> > + filedict[filename] = [
> > + os.path.join(location, 'distfiles',
> > filename)
> > + for location in locations]
> > + if uri is None:
> > + continue
> > + if uri.startswith('mirror://'):
> > + uris = _expand_mirror(
> >
>
> too many uri / uris variables...
>
> can we called this 'expanded_uris'?
Queued for v3.
> I'm really having trouble tracking what is what. Remember that
> 'uris' is already a parameter to this function. Are you shadowing it
> here, or trying to overwrite it?
Clobbering it. The original value is only used for the
_get_file_uri_tuples call.
> > + uri=uri, custom_mirrors=custom_mirrors,
> > + third_party_mirrors=third_party_mirrors)
> > + filedict[filename].extend(uri for group, uri in
> > uris)
> >
>
> group appears unused in your implicit iterable here.
>
> perhaps:
>
> filedict[filename].extend(uri for _, uri in uris)
Queued for v3.
> > + third_party_mirror_uris.setdefault(filename,
> > []).extend(
> > + uri for group, uri in uris
> > + if group == 'third-party')
> >
>
> I'm curious if this matters. We are iterator over 'uris' twice. Is it
> cheaper to do it once and build the iterables once?
>
> third_party_uris = []
> for group, uri in uris:
> if group == 'third_party':
> third_party_uris.append(uri)
> filedict[filename].append(uri)
>
> I'm guessing the iterable is so small it doesn't matter.
My guess is that the speed gain from list comprehension outweighs the
loss of double iterations, but I agree that it is probably doesn't
matter ;).
> > + if not filedict[filename]:
> > + writemsg(
> > + _("No known mirror by the name:
> > %s\n")
> > + % (mirror,))
> > + else:
> > + if restrict_fetch or force_mirror:
> >
>
> Are these globals or am I missing somethng?
Fixed in v2.
> > + # Only fetch from specific mirrors is
> > allowed.
> > + continue
> > + primaryuris = primaryuri_dict.get(filename)
> > + if primaryuris is None:
> > + primaryuris = []
> > + primaryuri_dict[filename] = primaryuris
> > + primaryuris.append(uri)
> > +
> > + # Order primaryuri_dict values to match that in SRC_URI.
> > + for uris in primaryuri_dict.values():
> > + uris.reverse()
>
> Is there any guaranteed ordering for dict.values()?
No.
> I thought dict order was random (and they seriously make it random
> in modern python versions, to detect bugs.) How does this
> uris.reverse() match the SRC_URI ordering?
No idea (copy-pasted code). I'd be happy to cut this out in v3.
> > +
> > + # Prefer third_party_mirrors over normal mirrors in cases when
> > + # the file does not yet exist on the normal mirrors.
> > + for filename, uris in third_party_mirror_uris.items():
> > + primaryuri_dict.setdefault(filename, []).extend(uris)
> > +
> > + # Now merge primaryuri values into filedict (includes mirrors
> > + # explicitly referenced in SRC_URI).
> > + if "primaryuri" in restrict:
> >
>
> same questoin here about where 'restrict' comes from.
Fixed in v2.
Thanks,
Trevor
--
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]
next prev parent reply other threads:[~2014-01-19 23:06 UTC|newest]
Thread overview: 45+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-01-19 3:07 [gentoo-portage-dev] [PATCH 0/3] Initial fetch() refactoring W. Trevor King
2014-01-19 3:07 ` [gentoo-portage-dev] [PATCH 1/3] pym/portage/package/ebuild/fetch.py: Factor out _get_checksum_failure_max_tries W. Trevor King
2014-01-20 1:26 ` Tom Wijsman
2014-01-20 1:56 ` W. Trevor King
2014-01-19 3:07 ` [gentoo-portage-dev] [PATCH 2/3] pym/portage/package/ebuild/fetch.py: Factor out _get_fetch_resume_size W. Trevor King
2014-01-20 1:41 ` Tom Wijsman
2014-01-20 2:01 ` W. Trevor King
2014-01-20 2:26 ` Tom Wijsman
2014-01-19 3:07 ` [gentoo-portage-dev] [PATCH 3/3] pym/portage/package/ebuild/fetch.py: Factor out _get_uris W. Trevor King
2014-01-19 21:36 ` Sebastian Luther
2014-01-19 21:43 ` W. Trevor King
2014-01-19 22:36 ` Alec Warner
2014-01-19 23:06 ` W. Trevor King [this message]
2014-01-19 23:31 ` W. Trevor King
2014-01-19 20:05 ` [gentoo-portage-dev] [PATCH 0/3] Initial fetch() refactoring Sebastian Luther
2014-01-19 20:58 ` W. Trevor King
2014-01-19 22:14 ` [gentoo-portage-dev] [PATCH v2 " W. Trevor King
2014-01-19 22:14 ` [gentoo-portage-dev] [PATCH v2 1/3] pym/portage/package/ebuild/fetch.py: Factor out _get_checksum_failure_max_tries W. Trevor King
2014-01-19 22:44 ` Alec Warner
2014-01-19 22:45 ` Alec Warner
2014-01-19 22:51 ` W. Trevor King
2014-01-19 22:52 ` Alec Warner
2014-01-19 22:14 ` [gentoo-portage-dev] [PATCH v2 2/3] pym/portage/package/ebuild/fetch.py: Factor out _get_fetch_resume_size W. Trevor King
2014-01-19 22:14 ` [gentoo-portage-dev] [PATCH v2 3/3] pym/portage/package/ebuild/fetch.py: Factor out _get_uris W. Trevor King
2014-01-20 3:26 ` [gentoo-portage-dev] [PATCH v3 0/4] Initial fetch() refactoring W. Trevor King
2014-01-20 3:26 ` [gentoo-portage-dev] [PATCH v3 1/4] pym/portage/package/ebuild/fetch.py: Factor out _get_checksum_failure_max_tries W. Trevor King
2014-01-20 3:26 ` [gentoo-portage-dev] [PATCH v3 2/4] pym/portage/package/ebuild/fetch.py: Factor out _get_fetch_resume_size W. Trevor King
2014-01-20 3:26 ` [gentoo-portage-dev] [PATCH v3 3/4] pym/portage/package/ebuild/fetch.py: Factor out _get_uris W. Trevor King
2014-01-21 2:57 ` [gentoo-portage-dev] " W. Trevor King
2014-01-20 3:26 ` [gentoo-portage-dev] [PATCH v3 4/4] pym/portage/package/ebuild/fetch.py: Flatten conditionals in _get_fetch_resume_size W. Trevor King
2014-01-22 5:35 ` [gentoo-portage-dev] [PATCH v3 0/4] Initial fetch() refactoring Mike Frysinger
2014-01-22 16:10 ` W. Trevor King
2014-01-22 19:00 ` Mike Frysinger
2014-01-27 4:00 ` [gentoo-portage-dev] " W. Trevor King
2014-01-19 21:22 ` [gentoo-portage-dev] [PATCH 0/3] " Sebastian Luther
2014-01-19 22:45 ` Alexander Berntsen
2014-01-19 22:46 ` Alec Warner
2014-01-19 22:50 ` Alexander Berntsen
2014-01-19 22:54 ` Alec Warner
2014-01-19 23:51 ` Alexander Berntsen
2014-01-19 23:53 ` Alec Warner
2014-01-19 23:11 ` W. Trevor King
2014-01-22 5:34 ` Mike Frysinger
2014-01-19 21:39 ` Sebastian Luther
2014-01-19 22:46 ` W. Trevor King
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20140119230629.GV29063@odin.tremily.us \
--to=wking@tremily.us \
--cc=gentoo-portage-dev@lists.gentoo.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox