From: Rich Freeman <rich0@gentoo.org>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Attic (cvs) -> ???(git)
Date: Mon, 22 Feb 2016 16:05:56 -0500
Message-ID: <CAGfcS_n+Q=yp9F=fGX8hP_CT2MT-dsvkvMS30pXrLemyoggErA@mail.gmail.com>
In-Reply-To: <loom.20160222T203336-415@post.gmane.org>

On Mon, Feb 22, 2016 at 2:49 PM, James <wireless@tampabay.rr.com> wrote:
>
> So using wget to fetch {package/files} from the gentoo attic was/is a reliable
> exercise to build things removed from the tree, into one's
> /usr/local/portage tree. It still works, but I'm guessing there is now a
> "github_way" to do this.

cvs was file-oriented, and git is not, so I'll warn you up front that
finding deleted files is a bit of a PITA.

> A Fully automated script? I could not find a
> gentoo wiki page to such effect, nor any other suggested pathway, but surely
> it exists? I'm guessing Gentoo's cvs_attic is deprecated now and thus
> subject to removal in the near future?

I don't think anybody has plans to get rid of the old cvs, but nothing
new goes into it, so it is frozen in time as of the migration date.

> I'd be excited to see a specific example, using git on one of the
> recently removed packages "app-misc/multimon" [1], which was just removed
> from the portage tree, unless someone wants to use another example. That
> package is about radio packets.

The approach to take depends on whether you want to find EVERYTHING
that has ever been deleted, or a specific thing, but ultimately it
comes down to looking at the full log.  Lots of good stuff can be
found here:
https://stackoverflow.com/questions/953481/find-and-restore-a-deleted-file-in-a-git-repository

The way I'd do it is to run "git log --diff-filter=D --summary" and
search for multimon.  That gives you the ID of the commit it was
removed in.  Then you want to check out the commit before it.

In this case doing that search will yield:
commit 760e17fcbac1b8c04a96ab08306dbcc644131dfb
Author: Pacho Ramos <pacho@gentoo.org>
Date:   Sat Feb 20 12:49:31 2016

    Remove masked for removal packages

...
 delete mode 100644 app-misc/multimon/Manifest
 delete mode 100644 app-misc/multimon/files/multimon-1.0-flags.patch
 delete mode 100644 app-misc/multimon/files/multimon-1.0-includes.patch
 delete mode 100644 app-misc/multimon/files/multimon-1.0-prll.patch
 delete mode 100644 app-misc/multimon/metadata.xml
 delete mode 100644 app-misc/multimon/multimon-1.0-r2.ebuild
 delete mode 100644 app-misc/multimon/multimon-1.0-r3.ebuild
...
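
If you would rather not search that output by eye, a minimal Python
sketch along these lines could pull out the commit ID for you.  The
function name find_deletions is my own invention for illustration; the
sketch relies only on the "commit <id>" and " delete mode <mode> <path>"
line shapes shown above.

```python
import re

# Commit header lines start at column 0; commit message lines are indented,
# so a message that happens to contain the word "commit" will not match.
COMMIT_RE = re.compile(r"^commit ([0-9a-f]{40})$")
# --summary prints deletions as " delete mode <mode> <path>".
DELETE_RE = re.compile(r"^ delete mode \d{6} (.+)$")


def find_deletions(log_text, needle):
    """Map commit ID -> list of deleted paths that contain `needle`.

    `log_text` is expected to be the output of
    `git log --diff-filter=D --summary`.
    """
    hits = {}
    current = None
    for line in log_text.splitlines():
        match = COMMIT_RE.match(line)
        if match:
            current = match.group(1)
            continue
        match = DELETE_RE.match(line)
        if match and current and needle in match.group(1):
            hits.setdefault(current, []).append(match.group(1))
    return hits
```

Feeding it the log text and the needle "app-misc/multimon/" should give
back the removal commit along with the files it deleted.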

Then what you want to do is check out the previous commit:
git checkout "760e17fcbac1b8c04a96ab08306dbcc644131dfb^1"

Now you're looking at the repo containing the last known state of
those files, so you can just copy them or cat them from the directory
tree:
cat app-misc/multimon/multimon-1.0-r3.ebuild
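
If you don't want to move your working tree to an old commit, git can
also print a file as it existed in a given revision via the
"<rev>:<path>" syntax of git show.  A rough sketch of wrapping that in
Python follows; the helper names parent_blob_spec and show_deleted_file
are hypothetical, and it assumes git is on your PATH and repo_dir points
inside a clone of the repository.

```python
import subprocess


def parent_blob_spec(removal_commit, path):
    """Build '<commit>^:<path>', naming `path` in the commit's first parent."""
    return f"{removal_commit}^:{path}"


def show_deleted_file(removal_commit, path, repo_dir="."):
    """Return the content of `path` as of the commit before its removal."""
    result = subprocess.run(
        ["git", "show", parent_blob_spec(removal_commit, path)],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise if git reports an error (bad commit, bad path)
    )
    return result.stdout
```

Called with the commit ID above and
"app-misc/multimon/multimon-1.0-r3.ebuild", this should return the same
text as the cat command, without touching the checked-out tree.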

You could build all that into a script.  If I were doing anything too
crazy with all this I'd probably use the Python git module.  Then
you'll get all your query results in collections and such instead of
having to parse all that output.  If you do want to parse it, you can
control the output format of git log.
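
As one example of controlling the output, --format together with
--name-only reduces each commit to a marker line followed by bare
paths, which is simpler to parse than the default log.  This is only a
sketch: the "@@" marker and the function names are arbitrary choices of
mine, and git may quote paths that contain unusual characters.

```python
import subprocess

# Arbitrary prefix (an assumption) that makes commit lines easy to tell
# apart from path lines.
MARKER = "@@"

GIT_LOG_CMD = [
    "git", "log",
    "--diff-filter=D",        # only report deleted files
    "--name-only",            # list bare paths, no mode/summary text
    f"--format={MARKER}%H",   # one line per commit: marker + full hash
]


def parse_name_only(log_text):
    """Return [(commit_id, [deleted paths])] in the order git printed them."""
    commits = []
    for line in log_text.splitlines():
        if line.startswith(MARKER):
            commits.append((line[len(MARKER):], []))
        elif line and commits:
            # Non-blank line after a commit marker: a deleted path.
            commits[-1][1].append(line)
    return commits


def deleted_files(repo_dir="."):
    """Run git log in `repo_dir` and parse the result."""
    out = subprocess.run(
        GIT_LOG_CMD, cwd=repo_dir,
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_name_only(out)
```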

I will say that deleted files are one of those things that aren't as
pretty in git.  There is no deleted-state flag on a file that you can
search by - deleted files are identified by their presence in one
commit and absence in the next.  In fact, to identify them I'd think
that git-log basically has to diff the trees of every commit in the
history.  That isn't as bad as it sounds, as each unchanged directory
is shared with the previous commit COW-style - so if only one
subdirectory contains changes, only that directory needs to be walked
to find what those differences are, and so on.  The structure of our
repository leads to a relatively well-balanced tree with fairly few
levels, which is a good case for git.  When I did the git validation I
had to walk all of it, and by doing it smartly in parallel you can get
it done remarkably quickly even in Python (considering we have 2M
commits, which is 2M*<num-files-in-portage> files you could have to
diff in the brute-force approach).
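
To illustrate why the shared directories keep this cheap, here is a toy
model in Python.  It is not git's code or object format: a directory is
just a dict mapping names to either a file's content hash (a str) or a
nested dict, and tree_id and deleted_paths are made-up names.  The walk
descends only into subdirectories whose IDs differ, which is the
shortcut described above.

```python
import hashlib


def tree_id(tree):
    """Hash a directory from its entries, so equal subtrees get equal IDs.

    git stores such IDs with each tree; this toy recomputes them on demand.
    """
    digest = hashlib.sha1()
    for name in sorted(tree):
        entry = tree[name]
        if isinstance(entry, dict):
            digest.update(f"tree {name} {tree_id(entry)}\n".encode())
        else:
            digest.update(f"blob {name} {entry}\n".encode())
    return digest.hexdigest()


def deleted_paths(old, new, prefix=""):
    """Yield file paths present in `old` but absent from `new`."""
    if tree_id(old) == tree_id(new):
        return  # identical subtree: nothing below here can differ
    for name, entry in old.items():
        path = prefix + name
        other = new.get(name)
        if isinstance(entry, dict):
            # If the directory is gone (or became a file), every file in it
            # counts as deleted, so compare against an empty directory.
            counterpart = other if isinstance(other, dict) else {}
            yield from deleted_paths(entry, counterpart, path + "/")
        elif not isinstance(other, str):
            yield path  # file is missing, or was replaced by a directory
```

With two snapshots that differ only under app-misc/multimon, the walk
never looks inside any other category or package directory.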

-- 
Rich

