Message-ID: <4BD47770.8050308@gentoo.org>
Date: Sun, 25 Apr 2010 19:10:08 +0200
From: Angelo Arrifano
Organization: Gentoo Linux Foundation
Reply-to: gentoo-dev@lists.gentoo.org
To: gentoo-dev@lists.gentoo.org
Subject: Re: [gentoo-dev] [RFC][NEW] Utility to find orphaned files
References: <4BD42501.9070505@gentoo.org> <20100425103426.66855395@xdune.lan>
In-Reply-To: <20100425103426.66855395@xdune.lan>

On 25-04-2010 17:34, Yuri Vasilevski wrote:
> Hello,
>
> On Sun, 25 Apr 2010 13:18:25 +0200
> Angelo Arrifano wrote:
>
>> Hello developers developers and developers,
>>
>> Ever wondered how much crap is left in your X-years old Gentoo box?
>>
>> I just developed a python utility to efficiently find orphaned files
>> in the system. By orphaned files I mean the files that are present on
>> system directories and don't belong to any installed package.
>>
>> The package builds a virtual filesystem (cache) on the RAM using
>> python hash tables. Then it uses the cache to find the ownership of
>> files inside user-specified dirs.
>>
>> Building the cache takes less than 10 seconds here in a system with
>> 1366 installed packages.
>>
>> This is not intended to be a finished program yet, I'm looking forward
>> for your constructive commentaries.
>
> There is a tool that does that, qfile from app-portage/portage-utils.
> Check the "-o, --orphans * List orphan files" option.
>
> It's not as straight forward as it could be, as it checks only for
> files specified as arguments or read from file.
>
> But you can trivially use it like:
> # find /dir/you/want/to/check/for/orphans | qfile -o -f -
>
> Best,
> Yuri.
>

Based on the comments so far, I'll try to turn my PoC into a better tool.

My primary objective is to make this some kind of disk cleanup utility
for Gentoo boxens. I don't expect Gentoo systems to be *that* polluted,
but sometimes we all have to do ugly things to fix broken systems real
fast, if you know what I mean.

Other things have come to mind as well, like using the stored hashes to
check the integrity of system files (as in security).

My next steps for this utility will be:
* Follow harring's suggestion and use the available PM API.
* Make the application handle symlinks, so that the output becomes more
  informative.
* Store the generated cache on disk and regenerate it only when needed.

Regards,
- Angelo
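For readers following the thread, here is a minimal sketch of the cache-then-lookup idea described above. It is not the PoC's actual code. It assumes ownership is read straight from the CONTENTS files of the installed-package database (normally /var/db/pkg), whereas the thread suggests going through the package manager API instead. The function names and the `root` parameter are hypothetical, and symlinked directories are not resolved.

```python
import os


def parse_contents_line(line):
    """Return the path recorded on one CONTENTS line, or None if unrecognised."""
    kind, _, rest = line.rstrip("\n").partition(" ")
    if kind == "dir":
        # dir <path>
        return rest or None
    if kind == "obj":
        # obj <path> <md5> <mtime>; split from the right, the path may hold spaces
        parts = rest.rsplit(" ", 2)
        return parts[0] if len(parts) == 3 else None
    if kind == "sym":
        # sym <path> -> <target> <mtime>
        path, sep, _ = rest.partition(" -> ")
        return path if sep else None
    return None


def build_owned_set(vdb_root):
    """Collect every path owned by an installed package into a set (hash table)."""
    owned = set()
    for dirpath, _dirnames, filenames in os.walk(vdb_root):
        if "CONTENTS" not in filenames:
            continue
        contents = os.path.join(dirpath, "CONTENTS")
        with open(contents, encoding="utf-8", errors="replace") as fh:
            for line in fh:
                path = parse_contents_line(line)
                if path:
                    owned.add(path)
    return owned


def find_orphans(owned, root, dirs):
    """Yield files under the given dirs that no package owns.

    `root` is the filesystem root being scanned ("/" on a live system);
    paths are compared relative to it. Only file entries are checked, and
    os.walk does not descend into symlinked directories by default.
    """
    for top in dirs:
        start = os.path.join(root, top.lstrip("/"))
        for dirpath, _dirnames, filenames in os.walk(start):
            for name in filenames:
                full = os.path.join(dirpath, name)
                logical = "/" + os.path.relpath(full, root)
                if logical not in owned:
                    yield logical
```

On a live system this would be called roughly as `find_orphans(build_owned_set("/var/db/pkg"), "/", ["/usr/lib"])`. Each membership test against the set takes constant time on average, which is why building the cache once and then walking the directories can be fast. The `find ... | qfile -o -f -` pipeline quoted above remains the existing alternative.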