public inbox for gentoo-portage-dev@lists.gentoo.org
 help / color / mirror / Atom feed
From: "W. Trevor King" <wking@tremily.us>
To: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-portage-dev] [PATCH 3/3] pym/portage/package/ebuild/fetch.py: Factor out _get_uris
Date: Sun, 19 Jan 2014 15:06:29 -0800	[thread overview]
Message-ID: <20140119230629.GV29063@odin.tremily.us> (raw)
In-Reply-To: <CAAr7Pr-vakRAbU5Y8wcHown8JvfE_Y6SjCS+HL+32KiUDm=bLQ@mail.gmail.com>

[-- Attachment #1: Type: text/plain, Size: 7046 bytes --]

On Sun, Jan 19, 2014 at 02:36:43PM -0800, Alec Warner wrote:
> On Sat, Jan 18, 2014 at 7:07 PM, W. Trevor King <wking@tremily.us> wrote:
> > +def _get_file_uri_tuples(uris):
> > +       """Return a list of (filename, uri) tuples
> > +       """
> >
> 
> As mike noted on another thread:
> 
> ""Return a list of (filename, uri) tuples."""

I'd replaced this with:

  """
  Return a list of (filename, uri) tuples
  """

in v2, but I'll queue the one-line form in v3.

> > +def _expand_mirror(uri, custom_mirrors=(), third_party_mirrors=()):
> > +       """Replace the 'mirror://' scheme in the uri
> >
> 
> Sentence should end in a period.

Ok.  I'll do that for the other summaries as well, and uppercase URI,
in v3.

> > +               writemsg(_("Invalid mirror definition in SRC_URI:\n"),
> > +                        noiselevel=-1)
> > +               writemsg("  %s\n" % (uri), noiselevel=-1)
> 
> 
> Is this a py3k thing? You should be able to write:
> 
> writemsg("  %s\n" % uri, noiselevel=-1)
> 
> The tuple is only needed for multiple arguments in py2k.

1. I think I just copied and pasted it ;).
2. (uri) is not actually a tuple:

     >>> type(('hello'))
     <type 'str'>

   To get a tuple, you need (uri,).

I'm fine removing the parens in v3.

> > +def _get_uris(uris, settings, custom_mirrors=(), locations=()):
> >
> 
> I want you to be careful here. This is bad style when you are
> constructing mutable objects in parameter defaults:

I used tuples instead of lists because they are not mutable (as you
point out later on).

> I'm not sure if I appreciate that more or less than the other form....
> 
> def _get_uris(uris, settings, custom_mirrors=None, locations=None):
>   if not custom_mirrors:
>     custom_mirrors = ()
>   if not locations:
>     locations = ()

If you want me to do it that way, I'll queue it for v3.  I think the
default tuples are more readable.  They are certainly shorter.

> Another advisable way to write it is to simply not have default
> arguments, and to force the caller to sync in an empty iterable:

Meh.  I don't have strong feelings about this, but custom_mirrors
struck me as something that would not always be required ;).

> > +       third_party_mirrors = settings.thirdpartymirrors()
> >
> 
> Why pass in all of settings?

We need other stuff too, see v2.  The less settings parsing I need to
do in fetch() itself, the happier I am.  If that makes the helper
functions a bit uglier, that's fine.  Fewer people will have to read
those.

> Think about testing this function.

Working on it for v3.

> Do you really want to try to construct some sort of 'testing'
> settings object, or simply construct a list?

I was going to use a dict for testing, and stub out a
thirdpartymirrors mock for _get_uris.
> > +       third_party_mirror_uris = {}
> > +       filedict = OrderedDict()
> > +       primaryuri_dict = {}
> > +       for filename, uri in _get_file_uri_tuples(uris=uris):
> > +               if filename not in filedict:
> > +                       filedict[filename] = [
> > +                               os.path.join(location, 'distfiles',
> > filename)
> > +                               for location in locations]
> > +               if uri is None:
> > +                       continue
> > +               if uri.startswith('mirror://'):
> > +                       uris = _expand_mirror(
> >
> 
> too many uri / uris variables...
> 
> can we called this 'expanded_uris'?

Queued for v3.

> I'm really having trouble tracking what is what. Remember that
> 'uris' is already a parameter to this function. Are you shadowing it
> here, or trying to overwrite it?

Clobbering it.  The original value is only used for the
_get_file_uri_tuples call.

> > +                               uri=uri, custom_mirrors=custom_mirrors,
> > +                               third_party_mirrors=third_party_mirrors)
> > +                       filedict[filename].extend(uri for group, uri in
> > uris)
> >
> 
> group appears unused in your implicit iterable here.
> 
> perhaps:
> 
> filedict[filename].extend(uri for _, uri in uris)

Queued for v3.

> > +                       third_party_mirror_uris.setdefault(filename,
> > []).extend(
> > +                               uri for group, uri in uris
> > +                               if group == 'third-party')
> >
> 
> I'm curious if this matters. We are iterator over 'uris' twice. Is it
> cheaper to do it once and build the iterables once?
> 
> third_party_uris = []
> for group, uri in uris:
>   if group == 'third_party':
>     third_party_uris.append(uri)
>   filedict[filename].append(uri)
> 
> I'm guessing the iterable is so small it doesn't matter.

My guess is that the speed gain from list comprehension outweighs the
loss of double iterations, but I agree that it is probably doesn't
matter ;).

> > +                       if not filedict[filename]:
> > +                               writemsg(
> > +                                       _("No known mirror by the name:
> > %s\n")
> > +                                       % (mirror,))
> > +               else:
> > +                       if restrict_fetch or force_mirror:
> >
> 
> Are these globals or am I missing somethng?

Fixed in v2.


> > +                               # Only fetch from specific mirrors is
> > allowed.
> > +                               continue
> > +                       primaryuris = primaryuri_dict.get(filename)
> > +                       if primaryuris is None:
> > +                               primaryuris = []
> > +                               primaryuri_dict[filename] = primaryuris
> > +                       primaryuris.append(uri)
> > +
> > +       # Order primaryuri_dict values to match that in SRC_URI.
> > +       for uris in primaryuri_dict.values():
> > +               uris.reverse()
> 
> Is there any guaranteed ordering for dict.values()?

No.

> I thought dict order was random (and they seriously make it random
> in modern python versions, to detect bugs.) How does this
> uris.reverse() match the SRC_URI ordering?

No idea (copy-pasted code).  I'd be happy to cut this out in v3.

> > +
> > +       # Prefer third_party_mirrors over normal mirrors in cases when
> > +       # the file does not yet exist on the normal mirrors.
> > +       for filename, uris in third_party_mirror_uris.items():
> > +               primaryuri_dict.setdefault(filename, []).extend(uris)
> > +
> > +       # Now merge primaryuri values into filedict (includes mirrors
> > +       # explicitly referenced in SRC_URI).
> > +       if "primaryuri" in restrict:
> >
> 
> same questoin here about where 'restrict' comes from.

Fixed in v2.

Thanks,
Trevor

-- 
This email may be signed or encrypted with GnuPG (http://www.gnupg.org).
For more information, see http://en.wikipedia.org/wiki/Pretty_Good_Privacy

[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

  reply	other threads:[~2014-01-19 23:06 UTC|newest]

Thread overview: 45+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-01-19  3:07 [gentoo-portage-dev] [PATCH 0/3] Initial fetch() refactoring W. Trevor King
2014-01-19  3:07 ` [gentoo-portage-dev] [PATCH 1/3] pym/portage/package/ebuild/fetch.py: Factor out _get_checksum_failure_max_tries W. Trevor King
2014-01-20  1:26   ` Tom Wijsman
2014-01-20  1:56     ` W. Trevor King
2014-01-19  3:07 ` [gentoo-portage-dev] [PATCH 2/3] pym/portage/package/ebuild/fetch.py: Factor out _get_fetch_resume_size W. Trevor King
2014-01-20  1:41   ` Tom Wijsman
2014-01-20  2:01     ` W. Trevor King
2014-01-20  2:26       ` Tom Wijsman
2014-01-19  3:07 ` [gentoo-portage-dev] [PATCH 3/3] pym/portage/package/ebuild/fetch.py: Factor out _get_uris W. Trevor King
2014-01-19 21:36   ` Sebastian Luther
2014-01-19 21:43     ` W. Trevor King
2014-01-19 22:36   ` Alec Warner
2014-01-19 23:06     ` W. Trevor King [this message]
2014-01-19 23:31       ` W. Trevor King
2014-01-19 20:05 ` [gentoo-portage-dev] [PATCH 0/3] Initial fetch() refactoring Sebastian Luther
2014-01-19 20:58   ` W. Trevor King
2014-01-19 22:14     ` [gentoo-portage-dev] [PATCH v2 " W. Trevor King
2014-01-19 22:14       ` [gentoo-portage-dev] [PATCH v2 1/3] pym/portage/package/ebuild/fetch.py: Factor out _get_checksum_failure_max_tries W. Trevor King
2014-01-19 22:44         ` Alec Warner
2014-01-19 22:45           ` Alec Warner
2014-01-19 22:51             ` W. Trevor King
2014-01-19 22:52               ` Alec Warner
2014-01-19 22:14       ` [gentoo-portage-dev] [PATCH v2 2/3] pym/portage/package/ebuild/fetch.py: Factor out _get_fetch_resume_size W. Trevor King
2014-01-19 22:14       ` [gentoo-portage-dev] [PATCH v2 3/3] pym/portage/package/ebuild/fetch.py: Factor out _get_uris W. Trevor King
2014-01-20  3:26       ` [gentoo-portage-dev] [PATCH v3 0/4] Initial fetch() refactoring W. Trevor King
2014-01-20  3:26         ` [gentoo-portage-dev] [PATCH v3 1/4] pym/portage/package/ebuild/fetch.py: Factor out _get_checksum_failure_max_tries W. Trevor King
2014-01-20  3:26         ` [gentoo-portage-dev] [PATCH v3 2/4] pym/portage/package/ebuild/fetch.py: Factor out _get_fetch_resume_size W. Trevor King
2014-01-20  3:26         ` [gentoo-portage-dev] [PATCH v3 3/4] pym/portage/package/ebuild/fetch.py: Factor out _get_uris W. Trevor King
2014-01-21  2:57           ` [gentoo-portage-dev] " W. Trevor King
2014-01-20  3:26         ` [gentoo-portage-dev] [PATCH v3 4/4] pym/portage/package/ebuild/fetch.py: Flatten conditionals in _get_fetch_resume_size W. Trevor King
2014-01-22  5:35         ` [gentoo-portage-dev] [PATCH v3 0/4] Initial fetch() refactoring Mike Frysinger
2014-01-22 16:10           ` W. Trevor King
2014-01-22 19:00             ` Mike Frysinger
2014-01-27  4:00         ` [gentoo-portage-dev] " W. Trevor King
2014-01-19 21:22 ` [gentoo-portage-dev] [PATCH 0/3] " Sebastian Luther
2014-01-19 22:45   ` Alexander Berntsen
2014-01-19 22:46     ` Alec Warner
2014-01-19 22:50       ` Alexander Berntsen
2014-01-19 22:54         ` Alec Warner
2014-01-19 23:51           ` Alexander Berntsen
2014-01-19 23:53             ` Alec Warner
2014-01-19 23:11       ` W. Trevor King
2014-01-22  5:34       ` Mike Frysinger
2014-01-19 21:39 ` Sebastian Luther
2014-01-19 22:46   ` W. Trevor King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20140119230629.GV29063@odin.tremily.us \
    --to=wking@tremily.us \
    --cc=gentoo-portage-dev@lists.gentoo.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox