From: "Emma Strubell" <emma.strubell@gmail.com>
To: gentoo-portage-dev@lists.gentoo.org
Subject: Re: [gentoo-portage-dev] Re: search functionality in emerge
Date: Mon, 1 Dec 2008 21:23:04 -0500
Message-ID: <5a8c638a0812011823x3fc3c3eesc0aa73566d6bc838@mail.gmail.com>
In-Reply-To: <cea53e3c0812011620w94e8847vb3777d2b05832ded@mail.gmail.com>
Yes, yes, I know, you're right :]
And thanks a bunch for the outline! About the compression, I agree that it
would be a good idea, but I don't know how to implement it. Not that it
would be difficult... I'm guessing there's a gzip module for Python that
would make it pretty straightforward? I think I'm getting ahead of myself,
though. I haven't even implemented the suffix tree yet!
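
For what it's worth, Python's standard library does ship a gzip module. A
minimal sketch of a compressed round trip follows; it assumes the index is a
plain dict that can be stored as JSON, and the names save_index and load_index
are made up for illustration (it also uses present-day Python 3 syntax):

```python
import gzip
import json


def save_index(index, path):
    """Write the index dict to a gzip-compressed JSON file."""
    with gzip.open(path, "wt", encoding="utf-8") as f:
        json.dump(index, f)


def load_index(path):
    """Read the index dict back from a gzip-compressed JSON file."""
    with gzip.open(path, "rt", encoding="utf-8") as f:
        return json.load(f)
```

gzip.open returns a file-like object, so the rest of the code does not need
to know that the data on disk is compressed.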
Emma
On Mon, Dec 1, 2008 at 7:20 PM, Tambet <qtvali@gmail.com> wrote:
> 2008/12/2 Emma Strubell <emma.strubell@gmail.com>
>
>> True, true. Like I said, I don't really use overlays, so excuse my
>> ignorance.
>>
>
> Do you know the order of doing things?
>
> Rules of Optimization:
>
> - Rule 1: Don't do it.
> - Rule 2 (for experts only): Don't do it yet.
>
> What this actually means: functionality comes first. Readability comes
> next. Optimization comes last. Unless you are creating a fancy 3D engine
> for a kung fu game.
>
> If you are going to exclude overlays, you are removing functionality - and,
> indeed, absolutely has-to-be-there functionality, because no one would
> intuitively expect a search function to search only one subset of packages,
> however reasonable this subset would be. So you can't, just can't, add this
> package into the portage base - you could only write yet another external
> search package for portage.
>
> I looked at this code a bit:
> Portage's "__init__.py" contains the comment "# search functionality".
> After this comment, there is a nice and simple search class.
> It also contains the method "def action_sync(...)", which contains the
> synchronization stuff.
>
> Now, the search class is initialized by setting up 3 databases - porttree,
> bintree and vartree, whatever those are. Those will be in the self._dbs
> array and porttree will be in self._portdb.
>
> It contains some more methods:
> _findname(...) will return the result of self._portdb.findname(...) with
> the same parameters, or None if it does not exist.
> The other methods do similar things - each wraps one method or another.
> execute will do the real search...
> Now - "for package in self.portdb.cp_all()" is important here ...it
> currently loops over the whole portage tree. All kinds of matching will be
> done inside.
> self.portdb obviously points to porttree.py (unless it points to a fake
> tree).
> cp_all will take all porttrees and do a simple file search inside. This
> method should contain an optional index search.
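
To make the cost of that "simple file search" concrete, here is a rough
sketch of a directory walk over several tree roots. It is not Portage's
actual cp_all: scan_trees is a hypothetical name, and the sketch treats every
non-hidden directory as a category, which is a simplification.

```python
import os


def scan_trees(porttrees):
    """Return sorted "category/package" names found under the tree roots."""
    found = set()
    for tree in porttrees:
        if not os.path.isdir(tree):
            continue
        for category in os.listdir(tree):
            cat_dir = os.path.join(tree, category)
            # Simplification: any non-hidden directory counts as a category.
            if category.startswith(".") or not os.path.isdir(cat_dir):
                continue
            for package in os.listdir(cat_dir):
                if package.startswith("."):
                    continue
                if os.path.isdir(os.path.join(cat_dir, package)):
                    # A set removes duplicates when an overlay repeats a package.
                    found.add(category + "/" + package)
    return sorted(found)
```

Every search pays for two levels of os.listdir per tree, which is the work
an index would avoid.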
>
> self.porttrees = [self.porttree_root] + \
>     [os.path.realpath(t) for t in self.mysettings["PORTDIR_OVERLAY"].split()]
>
> So, self.porttrees contains a list of trees - the first of them is the
> root, the others are overlays.
>
> Now, what you have to do will not be any harder just because of having
> overlay search, too.
>
> You have to create a method def cp_index(self), which will return a
> dictionary containing package names as keys. For oroot... will be
> "self.porttrees[1:]", not "self.porttrees" - this will only search
> overlays. d = {} will be replaced with d = self.cp_index(). If the index is
> not there, the old version will be used (thus, you have to make an internal
> porttrees variable, which contains all trees or all except the first).
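
One way to read that outline, as a sketch only: the index covers the main
tree, overlays are still scanned live, and a missing index falls back to
scanning everything. TreeDB and its scan callable are hypothetical stand-ins,
not Portage classes.

```python
class TreeDB:
    """Hypothetical stand-in for the porttree database object."""

    def __init__(self, porttrees, scan, index=None):
        self.porttrees = porttrees  # first entry is the main tree, rest are overlays
        self._scan = scan           # callable(list_of_trees) -> iterable of "cat/pkg"
        self._index = index         # dict keyed by "cat/pkg", or None if no index

    def cp_index(self):
        """Return a dict keyed by package name, or None if there is no index."""
        if self._index is None:
            return None
        return dict(self._index)

    def cp_all(self):
        d = self.cp_index()
        if d is None:
            # No index: old behaviour, scan every tree.
            d = {}
            trees = self.porttrees
        else:
            # The index answers for the main tree; only overlays need a scan.
            trees = self.porttrees[1:]
        for cp in self._scan(trees):
            d[cp] = None
        return sorted(d)
```

The internal trees variable is the "all or all except the first" choice
described above.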
>
> The other methods used by search are xmatch and aux_get - the first is
> used several times and the last one is used to get the description. You
> have to cache the results of those specific queries and make them use your
> cache - as you can see, those parts of portage are already able to use
> overlays. Thus, you have to put your code again at the beginning of those
> functions - create index_xmatch and index_aux_get methods, then make those
> functions call them and return their results unless those are None (or
> something else in case None is already a legal result) - if they return
> None, the old code will run and do its job. If the index is not created,
> the result is None. In the index_** methods, just check if the query is one
> you can answer and, if it is, answer it.
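
A sketch of that "try the index first, else run the old code" pattern. It
uses a private sentinel instead of None, for the case mentioned above where
None could be a legal result. CachedDB, _MISS and the simplified
aux_get(cpv, keys) signature are assumptions for illustration, not Portage's
real API.

```python
_MISS = object()  # "the index cannot answer"; distinct from a legal None result


class CachedDB:
    """Hypothetical wrapper that puts an index lookup in front of a slow query."""

    def __init__(self, slow_aux_get, index=None):
        self._slow_aux_get = slow_aux_get  # the existing, slow lookup
        self._index = index                # dict {(cpv, key): value}, or None

    def index_aux_get(self, cpv, keys):
        """Answer from the index if every requested key is present."""
        if self._index is None:
            return _MISS
        try:
            return [self._index[(cpv, k)] for k in keys]
        except KeyError:
            return _MISS

    def aux_get(self, cpv, keys):
        result = self.index_aux_get(cpv, keys)
        if result is not _MISS:
            return result
        # Index missing or incomplete: fall through to the old code path.
        return self._slow_aux_get(cpv, keys)
```

The same wrapper shape would apply to an index_xmatch method.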
>
> Obviously, the simplest way to create your index is to delete the index,
> then use those same methods to query for all the necessary information -
> and the fastest way would be to add index updating directly into sync,
> which you could do later.
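
The "delete it, then query everything again" approach could look roughly
like this. The function name, the JSON file format and the two callables are
assumptions; in Portage the callables would be whatever existing methods
supply package names and descriptions.

```python
import json
import os


def rebuild_index(path, cp_all, get_description):
    """Delete any old index, then rebuild it by querying the slow methods."""
    if os.path.exists(path):
        os.remove(path)  # with no index present, the queries use the old code
    index = {cp: get_description(cp) for cp in cp_all()}
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(index, f)
    os.replace(tmp, path)  # rename into place so readers never see a partial file
    return index
```

Hooking this into sync later would only change when it is called, not how
it works.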
>
> Please, also, make commands to turn the index on and off (the last one
> should also delete it to save disk space). The default should be off until
> it's fast, small and reliable. Also notice that if the index is kept on the
> hard drive, it might be faster if it's compressed (gz, for example) -
> decompressing takes more processing power but less time than reading it
> fully out.
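
A small sketch of the on/off switch, where "on" simply means the index file
exists and "off" deletes it. IndexSwitch and its build callable are
hypothetical; how emerge would expose these as commands is left open.

```python
import os


class IndexSwitch:
    """Hypothetical on/off control; the index is off until turned on."""

    def __init__(self, path, build):
        self.path = path
        self._build = build  # callable(path) that writes the index file

    @property
    def enabled(self):
        return os.path.exists(self.path)

    def turn_on(self):
        if not self.enabled:
            self._build(self.path)

    def turn_off(self):
        # Turning the index off also deletes it, to save disk space.
        try:
            os.remove(self.path)
        except FileNotFoundError:
            pass
```

Because the state is just the presence of the file, a fresh install starts
with the index off.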
>
> Good luck!
>
>>> Emma Strubell wrote:
>>> > 2) does anyone really need to search an overlay anyway?
>>>
>>> Of course. Take large (semi-)official overlays like sunrise. They can
>>> easily be seen as a second portage tree.
>>>
>>> On Mon, Dec 1, 2008 at 5:17 PM, René 'Necoro' Neumann <lists@necoro.eu> wrote:
>>
>>
>