From: Brian Harring <ferringb@gmail.com>
To: gentoo-portage-dev@lists.gentoo.org
Subject: sql based cache [was Re: [gentoo-portage-dev] Few things, which imho would make portage better]
Date: Tue, 14 Mar 2006 16:29:38 -0800
Message-ID: <20060315002938.GD10744@nightcrawler.had1.or.comcast.net>
In-Reply-To: <cea53e3c0603140652u2d089b72x@mail.gmail.com>
On Tue, Mar 14, 2006 at 04:52:14PM +0200, tvali wrote:
> > You're talking about the cache, take a look at the cache subsystem and
> > write a mysql module for it. This will never become a default though (we
> > would get killed if portage starts to depend on mysql).
>
> I think that it should not become default as mysql module, but if it
> is working, it should become default as "portable" sql module.
>
> # emerge sqlite pysqlite
>
> I haven't used sqlite, but it seems to be small and usable. I think
> that it should start with it.
>
> I think that portage should *support* sql by default, but of course it
> should not be default before it's clear that many people like it and
> use it. What is imho more important is how to make one usable
> interface, which would cover both fs and sql portage db's so that
> development didn't go into two products.
See the restrictions framework I've started-
http://gentooexperimental.org/~ferringb/blog/archives/2005-07.html#e2005-07-13T01_21_42.txt
http://gentooexperimental.org/~ferring/bzr/pkgcore/dev-notes/framework/restrictions
Short version is that converting to sql internally sucks badly since
you'll have to parse (ad hoc) sql statements for any file based
backend. Using sql directly in portage requires encapsulating the sql
code so that rdbms syntax differences (replace comes to mind) can be
worked around...
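A minimal sketch of that kind of encapsulation, not code from portage: the class names, the `cache` table and the `store` helper are all made up for illustration. The only backend-specific piece is the "replace" statement; a backend without one falls back to delete-then-insert.

```python
import sqlite3


class SqliteDialect:
    """Hypothetical dialect: SQLite can replace a row in one statement."""
    upsert = "INSERT OR REPLACE INTO cache (cpv, depend) VALUES (?, ?)"


class PortableDialect:
    """Hypothetical dialect for an engine with no replace statement."""
    upsert = None


def store(conn, dialect, cpv, depend):
    """Write one cache row, hiding the syntax difference from the caller."""
    if dialect.upsert is not None:
        conn.execute(dialect.upsert, (cpv, depend))
    else:
        # Emulate replace with plain SQL that any rdbms should accept.
        conn.execute("DELETE FROM cache WHERE cpv = ?", (cpv,))
        conn.execute(
            "INSERT INTO cache (cpv, depend) VALUES (?, ?)", (cpv, depend)
        )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (cpv TEXT PRIMARY KEY, depend TEXT)")

# Storing the same key through both code paths leaves exactly one row.
for dialect in (SqliteDialect, PortableDialect):
    store(conn, dialect, "dev-util/bsdiff-4.3", "app-arch/bzip2")

rows = conn.execute("SELECT cpv, depend FROM cache").fetchall()
```

Callers only ever see `store()`, so the syntax difference stays inside the backend module.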
Re: rdbms being faster than an on disk file db... it's only faster in
certain cases.
With properly designed/coded backends, an RDBMS is _only_ faster than a
local file db when it's returning N records.
As to why adding rdbms into stable is a bad idea right now, the
problem is in querying; you _could_ add a sql backend (pretty easy,
2.1 ships with a sql_template and sqlite backend from my earlier
work), but it'll actually be slower. Portage does cache lookups
individually; want the data for all bsdiff versions? portage does
thus-
    keys = []
    # cp_list() returns every version (cpv) of one category/package
    for x in portdb.cp_list("dev-util/bsdiff"):
        keys.append(portdb.aux_get(x, ["DEPEND"]))
Each lookup is a separate call- there is no way to leverage rdbms
speed for N record return if the calling api is (effectively) single
row queries.
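To make the difference concrete, here is a rough sketch against an in-memory sqlite database. The `cache` table layout, the `query` wrapper and the two helper functions are assumptions for illustration, not portage's actual cache schema or api; the point is only the number of statements issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cache (cpv TEXT PRIMARY KEY, cp TEXT, depend TEXT)")
conn.executemany(
    "INSERT INTO cache VALUES (?, ?, ?)",
    [
        ("dev-util/bsdiff-4.2", "dev-util/bsdiff", "app-arch/bzip2"),
        ("dev-util/bsdiff-4.3", "dev-util/bsdiff", "app-arch/bzip2"),
    ],
)
conn.commit()

stats = {"queries": 0}


def query(sql, params):
    """Run one statement and count it."""
    stats["queries"] += 1
    return conn.execute(sql, params).fetchall()


def per_version(cpvs):
    """Single-row style: one statement per package version."""
    return [
        query("SELECT depend FROM cache WHERE cpv = ?", (cpv,))[0][0]
        for cpv in cpvs
    ]


def batched(cp):
    """List style: one statement returns every version of the package."""
    rows = query("SELECT depend FROM cache WHERE cp = ? ORDER BY cpv", (cp,))
    return [row[0] for row in rows]


single = per_version(["dev-util/bsdiff-4.2", "dev-util/bsdiff-4.3"])
single_queries = stats["queries"]

stats["queries"] = 0
batch = batched("dev-util/bsdiff")
batch_queries = stats["queries"]
```

Both paths return the same data; the single-row path needs one statement per version, the batched path needs one in total.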
To fully leverage an rdbms backend, portage calls need to be restructured
so that they deal in lists instead of individual elements- for example,
under the rewrite
    repository.match(atom("dev-util/bsdiff"))
Via that (and the restriction framework it uses) the api calls are
designed so that rdbms can shine; instead of N calls, the
repository/cache backend can convert the restrictions into a sql
statement and run _one_ search.
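As a rough illustration of that conversion (this is not the pkgcore restriction api; `StrMatch`, `And`, `to_sql` and the `packages` table are hypothetical stand-ins), a small restriction tree can be walked once to produce a WHERE clause plus bound parameters:

```python
import sqlite3
from dataclasses import dataclass


@dataclass
class StrMatch:
    """Hypothetical restriction: the named column must equal value."""
    column: str
    value: str


@dataclass
class And:
    """Hypothetical restriction: every child restriction must match."""
    parts: tuple


# Column names are interpolated into the SQL text, so whitelist them.
ALLOWED_COLUMNS = {"category", "package", "version"}


def to_sql(restriction):
    """Return (where_clause, params) for a restriction tree."""
    if isinstance(restriction, StrMatch):
        if restriction.column not in ALLOWED_COLUMNS:
            raise ValueError("unknown column: %s" % restriction.column)
        return "%s = ?" % restriction.column, [restriction.value]
    if isinstance(restriction, And):
        clauses, params = [], []
        for part in restriction.parts:
            clause, part_params = to_sql(part)
            clauses.append("(%s)" % clause)
            params.extend(part_params)
        return " AND ".join(clauses), params
    raise TypeError("unsupported restriction: %r" % (restriction,))


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE packages (category TEXT, package TEXT, version TEXT)")
conn.executemany(
    "INSERT INTO packages VALUES (?, ?, ?)",
    [
        ("dev-util", "bsdiff", "4.2"),
        ("dev-util", "bsdiff", "4.3"),
        ("app-arch", "bzip2", "1.0.3"),
    ],
)

# Roughly what matching the atom dev-util/bsdiff could boil down to.
restriction = And((StrMatch("category", "dev-util"), StrMatch("package", "bsdiff")))
where, params = to_sql(restriction)
rows = conn.execute(
    "SELECT version FROM packages WHERE %s ORDER BY version" % where, params
).fetchall()
versions = [row[0] for row in rows]
```

A file based backend would walk the same restriction tree and apply it in python instead, which is why the restrictions, not sql, are the common interface.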
Finally... rdbms still has problems. If the repository isn't 'frozen'
(i.e., it can regen its metadata, as all portage trees in stable
currently can) you cannot rely on the cache backend for anything beyond
random access lookups.
Why?
Say the cache holds dev-util/bsdiff-4.2 and dev-util/bsdiff-4.3, but not
dev-util/bsdiff-4.4, which exists in the tree. If you hand the search off
to the cache backend, it'll
return just those two, when it should return all 3.
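The same scenario as a toy sketch. The `tree` list, the `cache` dict and the `regen` callback are stand-ins for the real repository and cache objects, and the prefix test is a deliberate simplification of real atom matching.

```python
# What the (unfrozen) tree actually contains.
tree = [
    "dev-util/bsdiff-4.2",
    "dev-util/bsdiff-4.3",
    "dev-util/bsdiff-4.4",
]

# The cache has not been regenerated since 4.4 was added.
cache = {
    "dev-util/bsdiff-4.2": {"DEPEND": "app-arch/bzip2"},
    "dev-util/bsdiff-4.3": {"DEPEND": "app-arch/bzip2"},
}


def match_via_cache(cp):
    """Search handed entirely to the cache: silently misses uncached versions."""
    return sorted(cpv for cpv in cache if cpv.startswith(cp + "-"))


def match_via_tree(cp, regen):
    """Tree decides what exists; cache is used for random access lookups only."""
    found = {}
    for cpv in tree:
        if not cpv.startswith(cp + "-"):
            continue
        if cpv not in cache:
            cache[cpv] = regen(cpv)  # regenerate the missing metadata
        found[cpv] = cache[cpv]
    return found


from_cache = match_via_cache("dev-util/bsdiff")
from_tree = match_via_tree(
    "dev-util/bsdiff", lambda cpv: {"DEPEND": "app-arch/bzip2"}
)
```

Only the tree-driven version sees 4.4, and it still pays for one lookup per version, which is the limitation described above.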
~harring