* [gentoo-qa] [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-06-26 12:51 UTC
To: gentoo-qa; +Cc: gentoo-soc

So another (if a bit late) status update for Tree-wide collision checking and files database is coming. I don't plan on making any major architectural changes from this point on (I will update the docs on soc.gentooexperimental.org during the weekend). We have matchbox as the master server and tinderboxes as compile slaves. The previously mentioned binary host is not implemented at all yet, since we want to get to actually compiling stuff as soon as possible and speed is a bit further down the list for now.

We have a basic database model ready for storing information collected by tinderboxes (doc/ddl.sql - it is a dump of the postgresql database; the model is on the gentooexperimental web). A few changes are not included there yet, such as a tinderbox slave table with information about the slaves. There will definitely be more changes to the ddl as we go, but hopefully nothing major.

I hit a few minor issues with creating the chroot for compilation. The whole process goes like this:

(not chrooted yet)
* We get information about use flags/dependencies etc. for the package
* Call an external shell script to prepare the chroot and mount proc and dev
* chroot and call portage.doebuild(...)

Now the external shell script I created uses an official stage file to create the base chroot, then rsyncs /usr/portage into the chroot. From this point on, further customization of the BASE chroot is possible. The issue is that we need to have the same version of portage in BASE_CHROOT as we have on the tinderbox, otherwise things can get really ugly. The chroot preparation script will therefore see some changes. I am looking into options for making sure that everything is set up correctly.
One easy possibility is to manually change BASE_CHROOT after the basic setup by the script. A better solution is to integrate catalyst into chroot creation.

Now it's one big puzzle with one bit missing here, one bit missing there. But it's slowly starting to come together. Fortunately I have tried most things as small POCs and I am starting to see light at the end of the tunnel (pretty far away but visible).

P.S. In case it's not so obvious, the repository is here:
git://git.overlays.gentoo.org/proj/collagen.git

--
Stanislav Ochotnicky
Working for Gentoo Linux http://www.gentoo.org
Implementing Tree-wide collision checking and provided files database
http://soc.gentooexperimental.org/projects/show/collision-database
Blog: http://inputvalidation.blogspot.com/search/label/gsoc
jabber: sochotnicky@gmail.com
icq: 74274152
PGP: https://dl.getdropbox.com/u/165616/sochotnicky-key.asc
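The chroot preparation steps described in this report could be sketched roughly as below. This is only an illustration under assumptions: the function names, the stage file name and the chroot path are hypothetical and not taken from the collagen repository, and the commands are only built as argument lists, never executed.

```python
import shlex


def chroot_prep_commands(stage_tarball, chroot_dir, portage_dir="/usr/portage"):
    """Return the commands for preparing a build chroot, in order.

    Nothing is executed here; the caller decides how (and as which user)
    to run them. All names and paths are hypothetical.
    """
    return [
        # Unpack the official stage file to create the base chroot.
        ["tar", "xjpf", stage_tarball, "-C", chroot_dir],
        # Copy the portage tree into the chroot.
        ["rsync", "-a", portage_dir + "/", chroot_dir + portage_dir + "/"],
        # Mount proc and dev so builds inside the chroot behave normally.
        ["mount", "-t", "proc", "proc", chroot_dir + "/proc"],
        ["mount", "-o", "bind", "/dev", chroot_dir + "/dev"],
    ]


def portage_versions_match(host_version, chroot_version):
    """True when the tinderbox and the chroot report the same portage version."""
    return host_version.strip() == chroot_version.strip()


commands = chroot_prep_commands("stage3.tar.bz2", "/var/tmp/base_chroot")
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

Comparing the two version strings before chrooting is one cheap way to catch the portage mismatch mentioned above; how the versions are obtained is left open here.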
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-07-03 15:41 UTC
To: gentoo-qa; +Cc: gentoo-soc

I decided to post my next status report as a reply to my previous post, so that no unnecessary threads are created.

Over the course of the last week the bulk of my work was around installing dependencies, logging and fixing bugs in the chroot creation script.

Installing dependencies is a bit hacky right now, since I am using emerge.emerge_main() to install them. This means that I don't have to repeat the work of emerge and search the dep tree etc. I'll show this simple hack on the following ascii-non-art package hierarchy:

A->B1->C1
|   |
|    ->C2
|
|->B2->C3

Package A is the one we want to install; B1 and B2 are its dependencies (we can read them from the ebuild of package A). We could walk the hierarchy, but it may not be necessary since we were asked only to try and compile package A. Therefore, to install packages B1 and B2 we actually ask emerge (and it will resolve deps for us). Then we install package A ourselves (by using portage.doebuild). If it fails, then something is probably wrong in the ebuild for package A.

Creation of the chroot environment for package installation had a lot of bugfixes too. It's still not as good as it should be, and there is always a need to manually "synchronize" the internal version of emerge with the one outside the chroot. I now use a lot of -o bind for mounting subdirectories in the chroot, which speeds things up quite a bit.

As far as logging is concerned, I am using the standard Python logging module, nothing fancy.
But it works, and compile machines can now report errors in a more human-readable form, not just build.log. For example:

"Unable to emerge package A-1.2.3 with deps B1-2.0,B2-2.0"

I also had to add to the data that is transferred between matchbox and tinderboxes, since I realized that otherwise I would not be able to fill in information that needs to be present when inserting data into the database. This part is not finished yet, so the tinderbox currently sends no data to matchbox. This regression should be fixed today.

On that note, what I plan to work on during the weekend/next week:
* start inserting data into the database (therefore actually create the app->db layer)
* now that we really have functioning compilation/dep resolving, try to install more packages. Therefore create a list of 10-20 packages. dev-util/git (and its subversion[-dso] dep) and the postfix/sendmail blockers come to mind. If you have more ideas for combinations that usually cause problems, I'd love some input
* improve logging in a few places

--
Stanislav Ochotnicky
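The dependency hack and the human-readable error report from this message can be illustrated with a small sketch. It is not the collagen code: `DIRECT_DEPS`, `failure_message`, `build` and the version numbers for the C packages are made up for the example, and the two callables stand in for the emerge and portage.doebuild calls.

```python
import logging

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
log = logging.getLogger("tinderbox")

# Hypothetical dependency data mirroring the A/B/C hierarchy from the report.
DIRECT_DEPS = {
    "A-1.2.3": ["B1-2.0", "B2-2.0"],
    "B1-2.0": ["C1-1.0", "C2-1.0"],
    "B2-2.0": ["C3-1.0"],
}


def failure_message(package, deps):
    """Format the error in the style quoted in the report."""
    return "Unable to emerge package %s with deps %s" % (package, ",".join(deps))


def build(package, install_deps, compile_package):
    """Install direct deps via the resolver, then compile the package itself.

    install_deps and compile_package are stand-ins (assumptions) for the
    emerge and doebuild steps; each returns True on success.
    """
    deps = DIRECT_DEPS.get(package, [])
    # Only the direct deps are passed on; the resolver pulls in C1..C3 itself.
    if not install_deps(deps) or not compile_package(package):
        log.error(failure_message(package, deps))
        return False
    log.info("Built %s", package)
    return True


# Simulate a run where the deps install fine but package A fails to compile.
result = build("A-1.2.3", lambda deps: True, lambda pkg: False)
```

Because the deps were installed by the resolver beforehand, a failure at the last step points at the ebuild of package A rather than at its dependencies.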
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-07-09 21:36 UTC
To: gentoo-qa; +Cc: gentoo-soc

Heya everyone,

another (almost) week went by, so here is another status report.

As I stated in my last report, one of the key goals for this week was the db layer for storing information retrieved by tinderboxes. I was looking into using various ORM frameworks. It was suggested to me to try Django and I thought "Hey, that's not even an ORM framework, but a web framework". Well, one part of my project is creating a web interface for the database at a later stage. So in the spirit of not doing the same thing twice I looked into using the ORM part of Django. And guess what? It is doable, and a basic implementation is in the devel branch of my repo.

There were certain caveats of course. Django is designed to work for web applications, not as a general purpose ORM framework. So when using its ORM part without the rest of Django, I have to take care of DB exceptions and rollback of transactions myself. I soon realized I was doing the same thing in every db function I was writing, so I ended up writing a decorator in Python (finally had a reason! :-) ). It looks something like this:

--- CODE
# Assumed import: both helpers lived in django.db in the Django 1.x
# releases of that time.
from django.db import reset_queries, _rollback_on_exception

def dbquery(f):
    """Reset Django's query log, run f, and roll back if f raises."""
    def decor(*args, **kwargs):
        reset_queries()
        try:
            return f(*args, **kwargs)
        except Exception:
            _rollback_on_exception()
            raise  # re-raise, keeping the original traceback
    return decor

@dbquery
def add_package(...)
--- CODE

This way we can be sure that failed transactions are rolled back.

Because I am using Django to generate SQL now, the original database schema that I committed to the repository some time ago is now deprecated. We can generate the database (and initial data) by using the django-admin syncdb command now.
This approach seems fairly good so far, since everything was set up by code that fits on one screen. I only wish using only a small part of Django was less painful.

And now comes the big part: actually populating the database with some meaningful data. I did some work on that part. Last week there were some modifications to the protocol I was using between Matchbox and Tinderboxes. Most of the changes touched code dealing with log/environment collection and the fact that we have been compiling inside a chroot environment.

I also added support for packages that require certain use flags enabled/disabled for their dependencies. A good example is dev-util/git (requires dev-util/subversion[-dso]). For git this is however only an RDEPEND (runtime dependency), so compilation doesn't depend on it.

Since we already have a db for storing results, and the support is there, let's compile some packages! For next week I plan to finally add proper testing to the mix. Instead of compiling fortune-mod over and over, Matchbox will also ask for other packages to be compiled. This will bring out some more problems that will need to be fixed, I am sure. At least that's the idea...

--
Stanislav Ochotnicky
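The use flag requirement on a dependency mentioned in this report (subversion[-dso]) can be taken apart with a few lines of Python. This is a simplified sketch, not portage's own parser: `parse_use_dep` is a hypothetical helper that only understands the plain `flag` and `-flag` forms, without versions or conditional use dependencies.

```python
import re

# package part, optionally followed by one [flag,-flag,...] group
_ATOM_RE = re.compile(r"^(?P<package>[^\[\]]+)(?:\[(?P<flags>[^\[\]]*)\])?$")


def parse_use_dep(atom):
    """Split 'cat/pkg[flag,-flag]' into (package, enabled, disabled)."""
    match = _ATOM_RE.match(atom.strip())
    if match is None:
        raise ValueError("unrecognised atom: %r" % atom)
    enabled, disabled = [], []
    for flag in (match.group("flags") or "").split(","):
        flag = flag.strip()
        if not flag:
            continue
        if flag.startswith("-"):
            disabled.append(flag[1:])  # '-dso' means dso must be off
        else:
            enabled.append(flag)
    return match.group("package"), enabled, disabled


print(parse_use_dep("dev-util/subversion[-dso]"))
```

A tinderbox could use the two flag lists to decide how the dependency has to be built before the package that needs it is compiled.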
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-07-17 13:39 UTC
To: gentoo-qa; +Cc: gentoo-soc

YAWR (Yet Another Weekly Report) is here and yet again I am feeling like an intruder on these quiet lists :-)

So what was going on over the last week? Well, not that much, since I had a visitor for the weekend and the first day of the week. The bulk of the work went into making testing easier, documenting installation procedures and creating startup scripts for collagen components. We also started compiling more packages and collecting build errors/contents of these packages. Right now a lot of errors come from problems with collagen itself (for example it managed to unmerge sed from the chroot, not a great idea :-) ). Now we are slowly entering the stage where most of the work will be directed towards fixing stuff up, so that the errors that remain are real ebuild problems.

I'd like to apologize to Andrey Kislyuk, my mentor, for not doing this sooner. I realized too late that "we want to start compiling packages as soon as possible" really meant "AS SOON AS POSSIBLE". Thanks for being patient.

--
Stanislav Ochotnicky
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-07-25 22:50 UTC
To: gentoo-qa; +Cc: gentoo-soc

Hi everyone,

first, thanks for the responses to my last email. I was kind of joking with the intruder part, but it still is nice to have feedback.

Now to the good news. We had some nice results this week: the first discovered ebuild error. Apparently lxde-base/lxsession was missing intltool in DEPEND. Apart from that everything went as planned. I was focusing on fixing errors with collagen and the building of packages. To speed up testing we started building/using binary packages (with --usepkg --buildpkg). This will have to be improved later when we start playing with use flags more, but for now it will do. I also fixed the problem with unmerging system packages (collagen now skips unmerging of packages in the "system" set). The rest of the changes went into improving the information we get when compiling packages (debugging info mostly).

I guess this will be it for now. Plans for the following week are as follows:
* fix a few more outstanding bugs bugging me :-)
* compile even more packages and start filling the database up
* categorize at least a few ebuild problems

See ya later alligators,

--
Stanislav Ochotnicky
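The two changes from this report, building with binary packages and never unmerging anything from the "system" set, might look roughly like this. It is a sketch under assumptions: `emerge_command` and `unmergeable` are hypothetical helper names, and the package lists are examples rather than real tinderbox state.

```python
def emerge_command(atoms):
    """argv for building packages while reusing and producing binary packages."""
    return ["emerge", "--usepkg", "--buildpkg"] + list(atoms)


def unmergeable(installed, system_set):
    """Packages that may be unmerged after a build: everything outside 'system'."""
    protected = set(system_set)
    return [pkg for pkg in installed if pkg not in protected]


# Example data (assumed): sed belongs to the system set, fortune-mod does not.
installed = ["sys-apps/sed", "games-misc/fortune-mod"]
system_set = ["sys-apps/sed"]

print(emerge_command(["games-misc/fortune-mod"]))
print(unmergeable(installed, system_set))
```

Filtering against the system set before cleanup is what prevents accidents like the unmerged sed from the previous report.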
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-07-31 9:44 UTC
To: gentoo-qa; +Cc: gentoo-soc

Wow, another week behind me. And quite a productive one if you ask me. I recently pushed yesterday's changes to the public repo (so far only the devel branch, which might get rebased, so tread carefully :-) ). There is a lot of cleanup work to be done, but I can say that we have a working base now.

So what exactly was going on over the last week? I fixed a bunch of bugs (and the two remaining ones I plan to fix today). You can see more information about that in the redmine bug tracker on gentooexperimental.org. Once I fix the remaining bugs I plan to add more to the bugtracker :-) Some work went into making tinderboxes able to recover from problems so that they can run without supervision. All around, exception handling and error logging are not perfect but a lot better than a week ago.

I also started filling up the database with information yesterday. It went even smoother than I expected; I only had a few typos in my code :-) All in all it took about 2 hours to make it all work.

I have three more ebuild candidates for fixing (not confirmed yet):
* dev-java/kaffe doesn't list x11-libs/libXtst in DEPEND
* x11-libs/libXaw is missing x11-libs/libXext and x11-proto/xextproto in DEPEND
* games-roguelike/tome ebuilds all have a typo. Can you spot it? I'll give you a short example so you don't have to look it up:

DEPEND="${REDEPEND}
	x11-misc/makedepend"

For a moment I actually thought that portage had some bug because it didn't return the proper DEPEND packages... until I saw that typo later on. Maybe this could be checked in repoman somehow?
So moving on to the plan for today and the next week:
* report bugs for the mentioned ebuilds
* fix the main bugs remaining in collagen
* start to refactor code to make it prettier for a later audit :-)
* start creating a web interface for the file database using Django

So long and thanks for all the fish,

--
Stanislav Ochotnicky
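The ${REDEPEND} typo from this report suggests a simple check: flag any variable that is referenced inside a dependency variable but never assigned in the ebuild. The sketch below is only an illustration of that idea, not an existing repoman check; `undefined_dep_refs`, the small allowlist of package-manager-provided variables and the sample ebuild text (including its RDEPEND value) are assumptions.

```python
import re

# Variables provided by the package manager rather than assigned in the
# ebuild (partial list, assumed sufficient for this illustration).
PROVIDED = {"P", "PN", "PV", "PR", "PVR", "PF"}

_ASSIGN_RE = re.compile(r"^\s*([A-Za-z_][A-Za-z0-9_]*)=", re.MULTILINE)
_DEP_RE = re.compile(r'^\s*(?:DEPEND|RDEPEND|PDEPEND)="([^"]*)"', re.MULTILINE)
_REF_RE = re.compile(r"\$\{(\w+)\}")


def undefined_dep_refs(ebuild_text):
    """Names referenced in *DEPEND values that the ebuild never assigns."""
    assigned = set(_ASSIGN_RE.findall(ebuild_text))
    suspects = []
    for value in _DEP_RE.findall(ebuild_text):
        for name in _REF_RE.findall(value):
            if name not in assigned and name not in PROVIDED:
                suspects.append(name)
    return suspects


SAMPLE = '''RDEPEND="sys-libs/ncurses"
DEPEND="${REDEPEND}
	x11-misc/makedepend"
'''

print(undefined_dep_refs(SAMPLE))
```

Variables set by eclasses would produce false positives with this approach, so a real check would need a larger allowlist or actual sourcing of the ebuild.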
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-08-07 8:12 UTC
To: gentoo-qa

And here I am with another weekly report. This week has been mostly about integration, deployment arrangements and web development. I also fixed the two main remaining bugs, that is:
* installation of packages even when KEYWORDS didn't have ARCH or ~ARCH in them
* nested use dependencies with '||' caused problems

The web development is quite easy now, although I admit that my templates are more or less bare html django templates without any fancy ajax or similar web 2.0 stuff :-) For now we can easily:
* list the contents of a certain package version (actually one compiled instance of that package version). If there was an error compiling, we can see build logs etc. instead of contents.
* list only packages that were problematic (first a category list, then a package list)
* search for which packages a certain path is in

Monday is the suggested "pencils down" date, but since I haven't deployed collagen yet this will apparently not be our pencils down. For the last week(end) I plan to polish up collagen, especially the documentation about workarounds and various hacks, so that we will all know what the current limits are (there are quite a few, but they are concentrated in a few places that can be easily improved).
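The first of the two fixed bugs comes down to checking KEYWORDS before a package is installed. A minimal sketch of such a check follows; `is_keyworded` is a hypothetical helper, and it deliberately ignores special tokens such as `*`, `~*` and `-*`.

```python
def is_keyworded(keywords, arch, accept_testing=True):
    """True if KEYWORDS lists `arch` (stable) or, optionally, `~arch` (testing)."""
    tokens = keywords.split()
    if arch in tokens:
        return True
    return accept_testing and ("~" + arch) in tokens


print(is_keyworded("~amd64 x86", "amd64"))
print(is_keyworded("-amd64 ppc", "amd64"))
```

A package for which this returns False on the tinderbox's ARCH would simply be skipped instead of being compiled.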
Enjoy your weekend,

--
Stanislav Ochotnicky
* [gentoo-qa] Re: [GSoC-status] Collagen - database schema and further changes
From: Stanislav Ochotnicky @ 2009-08-15 21:10 UTC
To: gentoo-soc; +Cc: gentoo-qa

My final (GSoC) report on collagen is here.

First, what was going on this past week... I've focused on adding documentation (docstrings, comments) and a little bit of refactoring. Then there were quite a few modifications to simplify installation, and now I can say that a simple:

# python setup.py install

plus a few configuration steps afterwards will do the whole installation procedure. Alternatively it's possible to use the bundled ebuild instead of the setup.py script. This was actually the first time I've created a setup.py script, and I have to admit that for simple stuff (like collagen) it's quite easy to create it from scratch in a few minutes.

So to the main summary of the project: what works, what doesn't really work but is planned for the post-GSoC era, etc.

We have a working end-to-end system for automatic distributed compilation of packages from the portage tree, with information stored in a database. The information is either the contents of packages or, in case of compilation failure, build.log, environment, emerge --info and the application log for good measure. We were able to catch a few bugs in ebuilds already, even with limited resources (meaning I was mostly compiling in a virtualbox on one of my machines). Even if QA like this was not originally the main goal of collagen, it might later turn out to be exactly that. Bug hunting monster :-)

As I mentioned in last week's report, the web interface for this was done in django and I've implemented 3 basic use-cases for showing data in the database. If you've worked with Django before then you know it's quite easy to add more functionality.
At least one more important use-case to be implemented is "is this package colliding with something else?" or even better "show me groups of packages colliding with each other". The data is there and the database schema is able to support these use cases, so this will definitely be implemented (although presumably after GSoC).

This is where we come to the "what now" part. Collagen as a whole didn't get a lot of testing yet, and I'd love to try it on more than one virtualbox machine. There are a lot of places where performance could be improved (read: where we would not install already installed packages and such). Another unimplemented idea was using remote binary hosts. However, I really think that this would make much more sense if it was implemented together with improved binary support for portage. Then there is authentication of tinderboxes to matchbox for the communication between them. Right now a machine able to connect to the matchbox server could probably run python code as the user running matchbox. I haven't tried, but the python documentation for the pickle module is pretty clear about the insecurity of this approach (I was assured this won't be a problem after auth is done though).

All in all collagen is far from being a finished, polished piece of software right now. But at least in my opinion it already showed that it could be useful, and improvements can now be done in quite a modular way. One exception is of course the database schema, but even changes there are nothing to be afraid of (I personally tried, it's possible without destroying the data :-) ). Expect a few more updates on the soc.gentooexperimental bugtracker/documents.

Now let me say a big thank you to my mentor weaver and the whole Gentoo team. A few examples:
* weaver - for changing our meeting time quite a few times because of me, and for the patience.
* zmedico - for being portage king, always ready to help out and even explain 'why', not just how
* robbat2|na - for becoming the base unit of problem solving recently.
* .* - for answering quite a few of my questions and giving me ideas. Most of you probably never noticed that you helped me (sometimes even I didn't notice at the time :-) ).

I hope I'll work with you to further improve collagen and, after that, other Gentoo projects. I am not going anywhere yet, as long as you won't chase me away with a stick :-) For me this is not really over. I'll chill out a little bit of course, but if at all possible I'd like to continue contributing.

--
Stanislav Ochotnicky
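The still unimplemented "show me groups of packages colliding with each other" use-case from this report boils down to inverting the stored package contents. The sketch below uses plain dictionaries instead of the Django models; the function names and the sample contents are hypothetical and not real data from the collagen database.

```python
from collections import defaultdict


def owners_by_path(contents):
    """Invert {package: [paths]} into {path: sorted packages providing it}."""
    owners = defaultdict(set)
    for package, paths in contents.items():
        for path in paths:
            owners[path].add(package)
    return {path: sorted(pkgs) for path, pkgs in owners.items()}


def collisions(contents):
    """Paths claimed by more than one package, with the packages involved."""
    return {
        path: pkgs
        for path, pkgs in owners_by_path(contents).items()
        if len(pkgs) > 1
    }


# Hypothetical package contents, not real data from the collagen database.
CONTENTS = {
    "mail-mta/postfix": ["/usr/sbin/sendmail", "/usr/sbin/postfix"],
    "mail-mta/sendmail": ["/usr/sbin/sendmail"],
    "games-misc/fortune-mod": ["/usr/bin/fortune"],
}

print(collisions(CONTENTS))
```

The same inverted mapping also answers the existing "which packages is this path in" search, so both use-cases could share one query.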