public inbox for gentoo-scm@lists.gentoo.org
From: Rich Freeman <rich0@gentoo.org>
To: gentoo-scm@lists.gentoo.org
Subject: Re: [gentoo-scm] Git Conversion Validation
Date: Sun, 7 Oct 2012 22:11:22 -0400
Message-ID: <CAGfcS_mzVug9mjtWyCgihxPkzB6VXKBAh0CbCgXc+mOPDff4ng@mail.gmail.com>
In-Reply-To: <20121007223749.19796.qmail@stuge.se>

On Sun, Oct 7, 2012 at 6:37 PM, Peter Stuge <peter@stuge.se> wrote:
> Rich Freeman wrote:
>> I'm trying to avoid using git implementations in python since that
>> might expose us to bugs.
>
> Take a look at libgit2+pygit2.

Well, my goal was to stick to the output of the official commands,
figuring that this is essentially the standard to go by.  My
understanding is that subtle problems with character encodings and
such were found in past conversion efforts.  If the conversion program
is modifying unusual characters, I want to avoid the verification
program making the same mistake and thereby obscuring the problem.
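
To illustrate the concern, here is a minimal sketch (not the actual
verification code; the byte strings and function names are made up for
the example) of how a verifier that decodes text leniently can hide a
corruption that a byte-for-byte comparison would catch:

```python
import hashlib


def same_bytes(a: bytes, b: bytes) -> bool:
    """Compare digests of the raw bytes; nothing is ever decoded."""
    return hashlib.sha1(a).digest() == hashlib.sha1(b).digest()


def same_after_lossy_decode(a: bytes, b: bytes) -> bool:
    """What a careless verifier might do: decode leniently, then compare.

    Every byte sequence that is invalid UTF-8 turns into U+FFFD, so
    different broken inputs can end up looking identical.
    """
    return a.decode("utf-8", errors="replace") == b.decode("utf-8", errors="replace")


# Hypothetical data: a Latin-1 e-acute in the source, and a different
# invalid byte produced by an imaginary buggy converter.
original = b"Andr\xe9"
converted = b"Andr\xff"

lossy_says_equal = same_after_lossy_decode(original, converted)
bytes_say_equal = same_bytes(original, converted)
```

Here the lenient comparison reports a match while the byte comparison
does not, which is exactly the kind of masking to avoid.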

That said, spawning git several million times is looking to be REALLY
slow, so I think I might bite the bullet and use a library.  It seems
like pygit2 is designed to use unicode for everything.
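
For comparison, one way to keep the official git binary as the
reference without one process per object would be a single long-lived
`git cat-file --batch`, which reads object names on stdin and answers
with `<name> <type> <size>`, a newline, the raw contents, and another
newline.  This is only a sketch under that assumption, not what the
repository does; `read_batch_record` and `cat_objects` are names
invented here.

```python
import io
import subprocess
from typing import BinaryIO, Iterable, Iterator, Tuple

Record = Tuple[bytes, bytes, bytes]  # (object name, type, raw contents)


def read_batch_record(stream: BinaryIO) -> Record:
    """Read one `git cat-file --batch` record, keeping everything as bytes."""
    header = stream.readline().rstrip(b"\n")
    if not header:
        raise EOFError("no more records")
    if header.endswith(b" missing"):
        raise KeyError(header)
    name, obj_type, size = header.split(b" ")
    body = stream.read(int(size))
    stream.read(1)  # consume the newline git appends after the contents
    return name, obj_type, body


def cat_objects(repo_path: str, names: Iterable[str]) -> Iterator[Record]:
    """Yield records for many objects from ONE git process."""
    proc = subprocess.Popen(
        ["git", "-C", repo_path, "cat-file", "--batch"],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
    )
    try:
        for name in names:
            proc.stdin.write(name.encode("ascii") + b"\n")
            proc.stdin.flush()
            yield read_batch_record(proc.stdout)
    finally:
        proc.stdin.close()
        proc.wait()


# The parser can be exercised without git, using a canned byte stream
# (the object names below are placeholders, not real hashes).
canned = io.BytesIO(b"1111 blob 6\ncaf\xe9\n!\n2222 commit 2\nhi\n")
first = read_batch_record(canned)
second = read_batch_record(canned)
```

Since the contents are read by size rather than by line, odd bytes and
embedded newlines pass through untouched.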

And of course the risk that pygit2/etc has bugs really isn't
necessarily greater than the risk that my own stuff has bugs (though
knowing my intended use I can probably minimize the ones that count -
the logic really is simple).

The repository currently contains what should be a working
implementation (though it doesn't write the final list out to disk).
It is just WAY too slow to run in full (hence the command line
parameter to limit the number of commits examined).

Pretty busy for a few days, but I'll convert the git spawning and
output parsing to pygit2 calls.  As an added bonus I don't have to
deal with the fact that git just LOVES to mangle its output to be
pleasing to eyes and less so to robots.

Rich

