* [gentoo-dev] Inviting you to project "PackageMap"
@ 2009-06-12  7:42 Sebastian Pipping
From: Sebastian Pipping @ 2009-06-12  7:42 UTC (permalink / raw
  To: PackageKit users and developers list; +Cc: gentoo-dev

Hello!


Quick (re-)introduction:  My task for Gentoo/Google Summer of Code 2009
is to give Gentoo a Debian popcon equivalent, a tool to collect
statistics on "what package is installed how often".  To achieve this
goal I'm extending Smolt (a tool currently doing similar things with
hardware information) by fine-tunable software stats gathering.


The plan we have for Smolt is to make it cross-distro, not just fit
Gentoo or Fedora.  One point where the consequences and benefits of such
an approach can be seen clearly is with

  counting packages from different distros into the same buckets.

What do I mean by that?  Debian's Git is counted together with Gentoo's
Git and Fedora's Git, and so on.  With packages counted across distros
we can suddenly answer questions that we currently cannot answer, among them

 - What globally popular packages are missing in distro X?
   Let's say we don't have a package for product P.  Do other distros
   have one?  They do, maybe we need one, too?  They don't, maybe P is
   not that important then?

 - Approximately how many users are running program X in total?
   Not just on Ubuntu or Arch - all across Linux, BSD, Solaris!

 - Does distro X have 10 times the packages of Y or is it just
   different splitting?
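
As a rough sketch of the first question: assume each distro's package
list has already been reduced to a set of shared bucket names.  The
distro contents and the bucket name "foo" below are made up purely for
illustration, and missing_in() is a hypothetical helper, not part of any
existing tool.

```python
# Hypothetical data: the buckets each distro has at least one package for.
buckets_by_distro = {
    "gentoo": {"git", "vim", "bash"},
    "debian": {"git", "vim", "bash", "foo"},
    "fedora": {"git", "bash", "foo"},
}


def missing_in(distro, buckets_by_distro):
    """Map each bucket that `distro` lacks to the number of other
    distros that do have a package for it."""
    own = buckets_by_distro[distro]
    counts = {}
    for other, buckets in buckets_by_distro.items():
        if other == distro:
            continue
        # Set difference: buckets the other distro has and we do not.
        for bucket in buckets - own:
            counts[bucket] = counts.get(bucket, 0) + 1
    return counts


print(missing_in("gentoo", buckets_by_distro))
```

A bucket that many other distros carry is a candidate for packaging; one
that nobody carries is probably less important.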

To count into the same bucket we use global identifiers for the
"products" that fall out of a package.  Gentoo package "dev-util/git"
can produce product "cpe://a:git:git", Debian's "git-core" can, too.
That string is a CPE URI [1], a concept close to package naming
in Java.  This "intermediate language" allows us to relate package names
from distro X with those of distro Y and answer various questions from
that data.
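
A minimal sketch of counting into the same bucket, assuming a simple
in-memory lookup table.  The table PRODUCT_OF, the function
count_products() and the report format are assumptions made for this
example and do not reflect the actual PackageMap data layout; only the
two Git entries are taken from the text above.

```python
from collections import Counter

# Hypothetical mapping from (distro, package name) to a global product
# identifier.  The two Git entries are the examples given in this mail.
PRODUCT_OF = {
    ("gentoo", "dev-util/git"): "cpe://a:git:git",
    ("debian", "git-core"): "cpe://a:git:git",
}


def count_products(reports):
    """Count install reports per product; unmapped packages are skipped."""
    totals = Counter()
    for distro, package in reports:
        product = PRODUCT_OF.get((distro, package))
        if product is not None:
            totals[product] += 1
    return totals


# Made-up install reports, one (distro, package) pair per installation.
reports = [
    ("gentoo", "dev-util/git"),
    ("debian", "git-core"),
    ("debian", "git-core"),
    ("debian", "some-unmapped-package"),
]
totals = count_products(reports)
print(totals["cpe://a:git:git"])
```

Both package names land in the same bucket, so the Git total covers the
Gentoo and the Debian installations alike.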

To do such mapping we need code (or a "service") that does the mapping
for us and a base of collected data that the service can operate on.
Together, these two make up project "PackageMap".

I have started populating the database with packages (currently 312
in number) made from information extracted from the Gentoo tree
and the National Vulnerability Database.  The latter holds many CPEs.
Let me state clearly that PackageMap is not about Gentoo in particular.
Sure, the initial data has lots of Gentoo in it but the whole point of
the project is to get information and people from different distros
together.

To see what these 312 package maps look like at the moment, it is best
to click through the database folder yourself:
http://git.goodpoint.de/?p=packagemap.git;a=tree;f=database

Also, there are a RELAX NG schema and a DTD for validation, more
documentation than I usually write, and a few scripts:
http://git.goodpoint.de/?p=packagemap.git;a=tree

  By now I hope you have gained interest in what this can become.
  Your active participation is highly appreciated.
  A few minutes from everyone can make a huge difference here.
  If you want write access to the repo - mail me: sebastian@pipping.org.

Please have a look at the Git repository linked above and ask questions.
I propose keeping the Gentoo-related discussion on gentoo-dev and
everything else on the PackageKit list.  I hope that works out well.

Thanks for reading up to this point.



Sebastian



PS: I'm aware "hartwork.org" might not make a good longterm location for
    DTDs, XML namespaces and such for a cross-distro project.  Any ideas
    where to put them best?

[1] http://cpe.mitre.org/




