* [gentoo-soc] Project Grumpy - report #3
@ 2010-07-04 13:03 Priit Laes
2010-07-04 19:37 ` Petteri Räty
2010-07-06 15:19 ` Donnie Berkholz
0 siblings, 2 replies; 3+ messages in thread
From: Priit Laes @ 2010-07-04 13:03 UTC
To: gentoo-soc; +Cc: leio, ferringb
This is progress report #3 for Project Grumpy.
Since report #2 there has been a big change of focus in the course of
development: we decided to drop our beloved and also greatly hated NoSQL
approach (MongoDB) and instead go forward using a regular RDBMS, which in
our case is good old PostgreSQL.
Although there were some compelling arguments for MongoDB (ease of use being
my favourite), the biggest nail in its coffin was the lack of "support" for
it from Gentoo's infra team. For them it was just another application they
would have to take care of, and around the interwebs there are lots of
'MongoDB ate my data' reports on how error-prone MongoDB actually is
(although the data volumes in most of these cases were so high that I cannot
really imagine Grumpy running into these problems). But I can really
understand their concerns. Besides, if you take a look at the list of commits
in MongoDB's official development repository [1], you can see why people are
a bit concerned ;)
[1] http://github.com/mongodb/mongo/commits
Therefore we switched over to PostgreSQL, using SQLAlchemy as the glue layer
between the database and the application. SQLAlchemy is a blessing because
with its object-relational mapper you do not actually have to write any SQL
(just take a peek at the 'grumpy_sync' utility).
Progress so far
===============
So far I have implemented a portage -> database sync utility that is used to
keep the database in sync with the portage tree. Although it seems to handle
most of the various portage quirks (like package moves via
'profiles/updates'), it might still run into issues in some corner cases, and
error recovery is minimal: it is currently designed to crash with a
RuntimeError when it detects something out of the ordinary.
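To illustrate the package-move handling and the crash-early policy, here is a
minimal sketch. It is not the actual 'grumpy_sync' code: it assumes move
entries use the 'move old-cat/old-pkg new-cat/new-pkg' line format found in
the tree's update files, the function names are hypothetical, and a plain
dict stands in for the database.

```python
def parse_moves(lines):
    """Yield (old, new) package names from 'move' entries.

    Blank lines, comments and 'slotmove' entries are skipped; anything
    else is treated as unexpected and aborts the run.
    """
    for lineno, raw in enumerate(lines, 1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if fields[0] == "slotmove":
            continue  # slot changes are not modelled in this sketch
        if fields[0] != "move" or len(fields) != 3:
            raise RuntimeError(
                "line %d: unexpected update entry: %r" % (lineno, line))
        yield fields[1], fields[2]


def apply_moves(packages, moves):
    """Rename keys in 'packages' (a dict standing in for database rows)."""
    for old, new in moves:
        if old not in packages:
            continue  # never imported, or already moved on an earlier run
        if new in packages:
            raise RuntimeError("move target already exists: %s" % new)
        packages[new] = packages.pop(old)
    return packages


if __name__ == "__main__":
    # Made-up package names, for illustration only.
    updates = ["# sample update file", "move app-misc/foo app-text/foo"]
    print(apply_moves({"app-misc/foo": {"versions": ["1.0"]}},
                      parse_moves(updates)))
```

Raising RuntimeError on anything unrecognised keeps the sketch in line with
the behaviour described above: better to stop than to import bad data.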
Of course, the data model is far from complete - there is no proper handling
of keywords, and I do not even store ebuild depends, rdepends and licenses in
the database - mainly because I currently don't have any use cases for these.
The syncer can be found under the 'utils' directory in the project directory.
Future plans
============
As the model and controller are ready, the next stop is to write a
rudimentary web app for browsing portage contents, so people can finally see
that I actually haven't slacked all this time.. :)
Also, during the portage import I noticed some really simple QA issues like
invalid herd names in 'metadata.xml'. The plan is to write a 'herdcheck'
plugin and implement database storage for these QA issues. And as I cannot
let just anyone write to the database, I need to implement an API to let
plugins interact with the app.
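A herd check along these lines could look roughly like the sketch below. It
is only an illustration, not the planned plugin: the function names are
hypothetical, and it assumes that valid herds are listed as
<herd><name>...</name></herd> entries in a herds.xml document and that a
package's 'metadata.xml' names its herds in top-level <herd> elements.

```python
import xml.etree.ElementTree as ET


def valid_herds(herds_xml):
    """Return the set of herd names declared in a herds.xml document."""
    root = ET.fromstring(herds_xml)
    return {name.text.strip()
            for name in root.findall("./herd/name") if name.text}


def check_herds(metadata_xml, known):
    """Return herd names from one metadata.xml that are not in 'known'."""
    root = ET.fromstring(metadata_xml)
    found = [herd.text.strip() for herd in root.findall("herd") if herd.text]
    return [herd for herd in found if herd not in known]


if __name__ == "__main__":
    # Tiny hand-written samples, not real tree data.
    herds = "<herds><herd><name>gnome</name></herd></herds>"
    metadata = "<pkgmetadata><herd>gnomee</herd></pkgmetadata>"
    print(check_herds(metadata, valid_herds(herds)))
```

A real check would also have to allow for special values such as 'no-herd'
and would report its findings through the plugin API instead of printing
them.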
Having an API means that I can start integrating with other QA tools out
there, mainly tinderbox.
And finally, testing. I currently have a simple doctesting and auditing (via
PyFlakes) framework in place, but general unit testing is still missing.
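For readers unfamiliar with doctests, this is the general idea, shown on a
made-up helper ('split_cp' is hypothetical and not taken from the Grumpy
sources): the examples in the docstring double as tests and are run by the
standard 'doctest' module.

```python
import doctest


def split_cp(cp):
    """Split a 'category/package' string into its two parts.

    >>> split_cp("dev-lang/python")
    ('dev-lang', 'python')
    >>> split_cp("python")
    Traceback (most recent call last):
        ...
    RuntimeError: not a category/package name: 'python'
    """
    category, sep, package = cp.partition("/")
    if not sep or not category or not package or "/" in package:
        raise RuntimeError("not a category/package name: %r" % cp)
    return category, package


if __name__ == "__main__":
    # Runs every docstring example in this module; silent when all pass.
    doctest.testmod()
```

Doctests like this are cheap to write, but they only cover the cases spelled
out in docstrings, which is why proper unit tests are still needed.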
As you can see, I'm lagging a bit behind my proposed timeline - I still
haven't actually started looking into how to create the 30-day stabilisation
and upstream version checkers, but hopefully I can pick up speed because I
can now say that I have passed the biggest hurdle.. :)
And I have also dropped my 'secret agenda' of documenting my experience with
NoSQL databases as a series of articles written during this project...
Project info
============
The Git repository for Grumpy is available at [2].
[2] http://git.overlays.gentoo.org/gitweb/?p=proj/grumpy.git;a=summary
The project's semi-official IRC channel is #gentoo-grumpy on the Freenode
network; if you run into trouble when testing this project, just ping me
with a message.
PS. Bonus points for those who noticed that I dropped 'weekly' ;)
* Re: [gentoo-soc] Project Grumpy - report #3
From: Petteri Räty @ 2010-07-04 19:37 UTC
To: gentoo-soc
On 07/04/2010 04:03 PM, Priit Laes wrote:
> Also, during portage import I noticed some really simple QA issues like
> invalid herd names in 'metadata.xml'. Plan is to write a 'herdcheck' plugin
> and implement database storage for these QA issues. And as I cannot let
> anyone to simply write to database, I need to implement API to let plugins
> interact with app.
>
I would like to see the herd check being part of repoman.
Petteri
* Re: [gentoo-soc] Project Grumpy - report #3
From: Donnie Berkholz @ 2010-07-06 15:19 UTC
To: gentoo-soc; +Cc: leio, ferringb
On 16:03 Sun 04 Jul , Priit Laes wrote:
> Now, since report two, there has been a big change of focus in the
> course of development, which means that we decided to drop our beloved
> and also greatly hated NoSQL approach (MongoDB) and instead go forward
> using regular RDBMS which in our case is good old PostgreSQL.
I'm happy to hear that. There is significant value in using common tools
so that future contributors don't have an increased barrier to entry.
Although there may be a tool perfectly suited to every job, if you're
using tons of different tools, then nobody else will ever be able to
pick them all up to maintain or contribute to the code. Same thing goes
for using obscure language features; in my experience, people tend not
to consider future maintenance or contributions by people besides
themselves.
--
Thanks,
Donnie
Donnie Berkholz
Admin, Summer of Code
Gentoo Linux
Blog: http://dberkholz.wordpress.com