public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] Portage metadata
@ 2003-02-27 10:23 robbat2
  2003-02-27 15:26 ` Yannick Koehler
  0 siblings, 1 reply; 3+ messages in thread
From: robbat2 @ 2003-02-27 10:23 UTC
  To: gentoo-dev


I've been looking into writing some code to get useful information out
of the portage system. Things like lists of packages using a given
USE flag and so forth. (see bug 16331 for something I wrote already)

For now I plan to stick to parsing all the ebuilds and data in the
tree directly, despite how slow that is.

The data in PORTDIR/metadata/cache looks promising, but I can't find
any document that defines its format.
It also appears to be a duplicate of /var/cache/edb/dep/ for the most
part.

Digging around Google, the only thing I found is this page, and only in
the Google cache:
http://www.google.ca/search?q=cache:Y_c08C4PKQwC:www.gentoo.org/~karltk/projects/munchie/submission-guide.html+gentoo+ebuild+submission+guide&hl=en&ie=UTF-8
It appears to be slightly out of date, though.

1. Is there any defined format to the metadata?
2. For later speedups, is anybody looking into using actual database
formats for the data? (GDBM/BDB/NDBM/etc.)

Once I get a little more time on my hands from school, I'd like to prove
my mettle as a developer to the gentoo community and join up as a
developer.
(For those interested in the meantime:
http://www.orbis-terrarum.net/?l=people.robbat2.resume )

-- 
Robin Hugh Johnson
E-Mail     : robbat2@orbis-terrarum.net
Home Page  : http://www.orbis-terrarum.net/?l=people.robbat2
ICQ#       : 30269588 or 41961639
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85



* Re: [gentoo-dev] Portage metadata
  2003-02-27 10:23 [gentoo-dev] Portage metadata robbat2
@ 2003-02-27 15:26 ` Yannick Koehler
  2003-02-28  7:33   ` Nick Jones
  0 siblings, 1 reply; 3+ messages in thread
From: Yannick Koehler @ 2003-02-27 15:26 UTC
  To: gentoo-dev

On February 27, 2003 05:23 am, robbat2@orbis-terrarum.net wrote:
> 1. Is there any defined format to the metadata?

Each line is one DB field.

The field names are defined inside portage.py:

auxdbkeys=['DEPEND','RDEPEND','SLOT','SRC_URI','RESTRICT','HOMEPAGE','LICENSE','DESCRIPTION','KEYWORDS','INHERITED','IUSE','CDEPEND','PDEPEND']
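A minimal sketch of reading one cache entry under that description. It
assumes (this is not confirmed in the thread) that the lines of an entry
appear in the same order as the auxdbkeys list; parse_cache_entry is a
hypothetical helper name, and the code is written for a current Python
rather than the one portage.py used at the time.

```python
# Field names copied from the auxdbkeys list quoted above.
AUXDBKEYS = ['DEPEND', 'RDEPEND', 'SLOT', 'SRC_URI', 'RESTRICT', 'HOMEPAGE',
             'LICENSE', 'DESCRIPTION', 'KEYWORDS', 'INHERITED', 'IUSE',
             'CDEPEND', 'PDEPEND']


def parse_cache_entry(text):
    """Map each line of one cache entry to the auxdbkeys name at the
    same position.

    Assumption: line N of the entry holds the value for AUXDBKEYS[N].
    """
    lines = text.split('\n')
    # Pad short entries with empty strings so every key is present.
    lines += [''] * (len(AUXDBKEYS) - len(lines))
    # zip() stops at the shorter sequence, so extra lines are ignored.
    return dict(zip(AUXDBKEYS, lines))
```

Calling parse_cache_entry(open(path).read())['IUSE'].split() would then
give the USE flags recorded for that ebuild, if the ordering assumption
holds.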

> 2. For later speedups, is anybody looking into using actual database
> formats for the data? (GDBM/BDB/NDBM/etc.)

Python's internal DB is cached in memory and is quite fast. For certain
tasks a real DB would be faster, but most of the time the Python DB seems
to kick ass unless you do a description search.

In the past I wrote a bash script that generates an XML file from
/var/db/pkg.  If you can import that 4 MB file into a DB, you could run
some queries, time them, and see whether things get faster.
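The timing experiment could be sketched roughly as below. This is only
an illustration: it uses Python's sqlite3 module as a stand-in for "a
DB" (not one of the formats named in the question), the XML import step
is skipped because the script's output format isn't shown here, and the
sample records are made up. All function names are hypothetical.

```python
import sqlite3
import time


def build_db(records):
    """Load (name, description) pairs into an in-memory SQLite table."""
    conn = sqlite3.connect(':memory:')
    conn.execute('CREATE TABLE pkg (name TEXT, description TEXT)')
    conn.executemany('INSERT INTO pkg VALUES (?, ?)', records)
    conn.commit()
    return conn


def search_db(conn, word):
    """Description search done by the database.

    Assumes word contains no LIKE wildcards (% or _).
    """
    cur = conn.execute(
        'SELECT name FROM pkg WHERE description LIKE ? ORDER BY name',
        ('%' + word + '%',))
    return [row[0] for row in cur]


def search_memory(records, word):
    """The same search as a plain scan over an in-memory list."""
    word = word.lower()
    return sorted(name for name, desc in records if word in desc.lower())


def timed(func, *args):
    """Return (result, elapsed seconds) for one call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


if __name__ == '__main__':
    # Made-up sample data, only there to give the timers something to do.
    records = [('app-misc/pkg%d' % i, 'Example description number %d' % i)
               for i in range(50000)]
    conn = build_db(records)
    _, db_secs = timed(search_db, conn, 'number 4999')
    _, mem_secs = timed(search_memory, records, 'number 4999')
    print('sqlite: %.4fs  in-memory scan: %.4fs' % (db_secs, mem_secs))
```

Numbers from a run like this only compare the two lookups themselves;
they say nothing about the cost of keeping the database in step with
the tree.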

-- 

Yannick Koehler
 

--
gentoo-dev@gentoo.org mailing list



* Re: [gentoo-dev] Portage metadata
  2003-02-27 15:26 ` Yannick Koehler
@ 2003-02-28  7:33   ` Nick Jones
  0 siblings, 0 replies; 3+ messages in thread
From: Nick Jones @ 2003-02-28  7:33 UTC
  To: Yannick Koehler; +Cc: gentoo-dev

>> 2. For later speedups, is anybody looking into using actual database
>> formats for the data? (GDBM/BDB/NDBM/etc.)
> 
> Python's internal DB is cached in memory and is quite fast. For certain
> tasks a real DB would be faster, but most of the time the Python DB seems
> to kick ass unless you do a description search.

Have you used a 2.0.47-series Portage? A description search takes
~20-60 seconds depending on your box. That includes printing all the
descriptions, checking versions, etc.

This of course assumes you're syncing via rsync with 'emerge sync' and
you aren't annihilating or damaging /var/cache/edb/dep in some fashion.

> 
> In the past I wrote a bash script that generates an XML file from
> /var/db/pkg.  If you can import that 4 MB file into a DB, you could run
> some queries, time them, and see whether things get faster.

A database is not going to be that great of a speedup, as it incurs the
overhead of the DB itself. Portage's DB is an on-demand,
one-file-per-ebuild cache that is trivial to parse, as it's one line per
field. Ensuring that the DB is correct would take just as long, if not
longer, than using the server-side cache (metadata/cache), as
'emerge sync' already does.
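To illustrate how little parsing that takes, here is a rough sketch of
the "packages using a given USE flag" query from the start of the
thread, run straight over the cache files. It assumes one file per
ebuild laid out as <category>/<package-version> under the cache
directory, and that the IUSE line sits at the position given by the
auxdbkeys list quoted earlier in the thread; both are assumptions, and
packages_with_use_flag is a hypothetical name.

```python
import os

# Zero-based position of 'IUSE' in the auxdbkeys list quoted earlier
# in the thread (assumed to match the line order in each cache file).
IUSE_LINE = 10


def packages_with_use_flag(cache_dir, flag):
    """Return sorted 'category/package-version' names whose IUSE line
    lists the given flag."""
    matches = []
    for root, _dirs, files in os.walk(cache_dir):
        for name in files:
            path = os.path.join(root, name)
            with open(path, encoding='utf-8', errors='replace') as fh:
                lines = fh.read().split('\n')
            # Entries too short to have an IUSE line count as "no flags".
            iuse = lines[IUSE_LINE].split() if len(lines) > IUSE_LINE else []
            if flag in iuse:
                rel = os.path.relpath(path, cache_dir)
                matches.append(rel.replace(os.sep, '/'))
    return sorted(matches)
```

For example, packages_with_use_flag('/var/cache/edb/dep', 'ssl') would
scan the local cache mentioned above, provided the layout assumption
holds on that system.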

--NJ

