From: Rich Freeman
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: Attic (cvs) -> ???(git)
Date: Thu, 25 Feb 2016 09:31:46 -0500

On Mon, Feb 22, 2016 at 4:49 PM, James wrote:
> Rich Freeman writes:
>
>> If I were doing anything too
>> crazy with all this I'd probably use the python git module.
>
> dev-python/git-python ??? Any others or related docs/howtos/examples?
>

I used pygit2, but there are a few different implementations and
plenty of docs online in general.

Here is an example program that runs through a history and dumps a
list of commits and their metadata in csv format:
https://github.com/rich0/gitvalidate/blob/master/gitdump/parsetrees.py

There are some other scripts that retrieve blobs and manipulate them
in the same directory. This was part of the validation of the git
migration, which uses a map-reduce algorithm to diff every single
commit in a git history and identify all file revisions. That creates
a cvs-like per-file history, which can then be compared with the
results obtained from parsing a cvs repository for the same
information. The only single-threaded step in the process is walking
the list of commits - all the diffs can be run in parallel.

I doubt you need anything quite so fancy.
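
For a rough idea of what a commit-metadata dump of that kind can look
like, here is a minimal sketch. It is not the linked script, just an
illustration written for this message. It assumes pygit2 is installed;
the function names, the column choice and the decision to follow only
the first parent of each commit are all assumptions made for the
example.

```python
import csv
from datetime import datetime, timezone

# Column choice is an assumption for this sketch, not the linked script's format.
HEADER = ["commit", "author", "email", "date_utc", "parents", "summary"]


def walk_first_parent(commit):
    """Yield commit, then its first parent, and so on back to the root.

    Only parents[0] is followed, so side branches of a merge are never visited.
    """
    while commit is not None:
        yield commit
        parents = commit.parents
        commit = parents[0] if parents else None


def commit_row(commit):
    """Flatten one commit's metadata into a list of CSV fields."""
    when = datetime.fromtimestamp(commit.commit_time, tz=timezone.utc)
    lines = commit.message.splitlines()
    return [
        str(commit.id),
        commit.author.name,
        commit.author.email,
        when.isoformat(),
        len(commit.parents),  # anything above 1 is a merge commit
        lines[0] if lines else "",
    ]


def dump_csv(commits, out):
    """Write a header row, then one CSV row per commit, to the file-like out."""
    writer = csv.writer(out)
    writer.writerow(HEADER)
    for commit in commits:
        writer.writerow(commit_row(commit))


def dump_repo(repo_path, out):
    """Dump the first-parent history of HEAD for the repository at repo_path."""
    import pygit2  # third-party; imported lazily so the helpers above work without it

    repo = pygit2.Repository(repo_path)
    head = repo[repo.head.target]
    dump_csv(walk_first_parent(head), out)
```

Calling dump_repo(".", sys.stdout) from inside a checkout should print
one row per commit, newest first. The "parents" column is only there
so that merge commits can be spotted; the walk itself ignores every
parent after the first.
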
As you can see from the parsetrees.py script, pulling metadata out of
commits and walking through parents is pretty easy. My example doesn't
account for merge commits, since there weren't any in the cvs->git
migration. Obviously walking commits with merges will get a lot
messier.

--
Rich