public inbox for gentoo-scm@lists.gentoo.org
* [gentoo-scm] Git Conversion Validation
@ 2012-10-07 22:23 Rich Freeman
  2012-10-07 22:37 ` Peter Stuge
From: Rich Freeman @ 2012-10-07 22:23 UTC
  To: gentoo-scm

FYI - I started a repository of my git validation work at:
git://github.com/rich0/gitvalidate.git

I'm starting on the git side first.  I'm taking all my data directly
from the git executables and plan to do the same for cvs - if they
output the same content we should be OK.  I did some testing, and I
think my code should handle Unicode output if git generates it.

The git repository has 1259922 commits, and it takes 50.5 seconds to
walk the list of commits to produce a list of trees and their commit
info.
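
A minimal sketch of what such a walk could look like with a single git
process.  This is not the code in the gitvalidate repository; the format
string, the CommitInfo record, and both function names are assumptions
made for illustration.

```python
import subprocess
from typing import Iterable, Iterator, NamedTuple


class CommitInfo(NamedTuple):
    """Hypothetical record type; not taken from gitvalidate."""
    commit: str
    tree: str
    timestamp: int  # committer date, seconds since the epoch
    author: str


# NUL-separated fields: commit hash, tree hash, committer time, author name.
LOG_FORMAT = "%H%x00%T%x00%ct%x00%an"


def parse_commit_lines(lines: Iterable[str]) -> Iterator[CommitInfo]:
    """Turn lines produced with LOG_FORMAT into CommitInfo records."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        commit, tree, ts, author = line.split("\x00", 3)
        yield CommitInfo(commit, tree, int(ts), author)


def walk_commits(repo_path: str) -> Iterator[CommitInfo]:
    """Stream every commit reachable from any ref using one git process."""
    cmd = ["git", "log", "--all", "--format=" + LOG_FORMAT]
    with subprocess.Popen(cmd, cwd=repo_path, stdout=subprocess.PIPE,
                          encoding="utf-8", errors="replace") as proc:
        yield from parse_commit_lines(proc.stdout)
    if proc.returncode != 0:
        raise subprocess.CalledProcessError(proc.returncode, cmd)
```

Streaming from one long-running "git log" keeps the process count at one
regardless of how many commits the repository holds.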

Next step is to iteratively perform the map / reduce algorithm I
outlined earlier to get a per-file history similar to what cvs
captures.

Contributions welcome.  The main issue I'm finding is the overhead of
spawning git processes to do the work.  Although it means more work in
theory, I might just have git-ls-tree recurse the trees to cut down on
the number of subprocesses, and then do the extra
sorting/de-duplication in Python.  I'm trying to avoid Python
implementations of git, since those might expose us to bugs.
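
As a rough illustration of that trade-off, the sketch below runs one
recursive "git ls-tree -r -z" per tree and de-duplicates the results in
Python.  The function names and the (path, blob) key are assumptions
for the example, not what gitvalidate actually does.

```python
import subprocess
from typing import Iterable, Iterator, List, Tuple

Entry = Tuple[str, str, str, str]  # (mode, type, object hash, path)


def parse_ls_tree(raw: bytes) -> Iterator[Entry]:
    """Parse `git ls-tree -r -z` output: '<mode> <type> <hash>\\t<path>\\0'."""
    for record in raw.split(b"\x00"):
        if not record:
            continue
        meta, path = record.split(b"\t", 1)
        mode, objtype, sha = meta.split(b" ")
        yield (mode.decode(), objtype.decode(), sha.decode(),
               path.decode("utf-8", "replace"))


def list_tree(repo_path: str, tree: str) -> List[Entry]:
    """One git process per tree, however deep the tree is."""
    result = subprocess.run(["git", "ls-tree", "-r", "-z", tree],
                            cwd=repo_path, stdout=subprocess.PIPE, check=True)
    return list(parse_ls_tree(result.stdout))


def unique_blobs(trees: Iterable[Iterable[Entry]]) -> Iterator[Tuple[str, str]]:
    """Yield each (path, blob hash) pair once, in first-seen order."""
    seen = set()
    for entries in trees:
        for _mode, objtype, sha, path in entries:
            key = (path, sha)
            if objtype == "blob" and key not in seen:
                seen.add(key)
                yield key
```

Recursing in git means unchanged subtrees are listed again for every
commit, which is the extra work mentioned above; the set in
unique_blobs is what absorbs it on the Python side.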

Rich


