Date: Thu, 25 Apr 2013 18:17:10 -0700
Reply-to: gentoo-soc@lists.gentoo.org
Subject: Re: [gentoo-soc] rfc: reducing the time of "Calculating
 dependencies" phase project.
From: Zac Medico
To: gentoo-soc@lists.gentoo.org

On Thu, Apr 25, 2013 at 11:58 AM, Александр Берсенев wrote:
> Hello,
>
> my name is Alexander Bersenev, I am postgraduate of Institute of Mathematics
> and Mechanics(Russia).

Hello, it's nice to meet you.

> I want to propose a project for GSoC 2013 and ask what do you think about
> it.
>
> In short: I want to reduce the "Calculating dependencies" phase of emerge.
>
> On my notebook "emerge -pv bash" command takes 40 secs to calculate a deps.
> If I launch it again, it take about 40 secs again(a have a lot of RAM, so
> there was no HDD usage).

A few things to note:

1) It will make a big difference if there is a bash version upgrade,
or if the bash USE flags have changed. This is due to the
--complete-graph-if-new-use and --complete-graph-if-new-ver options,
which are enabled by default. This behavior serves to protect
reverse-dependencies from being broken.

2) Portage assumes that the portage tree can be modified between each
emerge invocation. This assumption is necessary for development
situations, but it has the disadvantage of introducing some extra
overhead (comparing checksums of ebuilds and eclasses to the checksums
found in the corresponding md5-cache entries). It would be possible to
have an alternative "frozen tree" mode of operation which assumes that
the portage tree can _not_ be modified between emerge invocations, and
this mode would be better suited to non-development situations.

3) Putting the portage tree on squashfs can help in some situations,
since it allows the whole tree to easily fit into RAM and be accessed
quickly.

> Of course, quick cprofile profiling showed no places to optimize because
> such optimizations already have been made.
>
> The main idea is add some caching layers(more high-level, than in
> /usr/portage/metadata/md5-cache/). The main goal is to find and eliminate
> repeated computations between "emerge" runs.
>
> As part of work I plan to examine approaches of other pkg managers(yum,
> aptitude).
>
> I heard from Donnie Berkholz in IRC about pkgcore project. He said it works
> faster in practice. But it has some problems with EAPI5 support.
>
> What is better: actualize a pkgcore code or try to dig into portage? Or it is
> the bad ideas at all?

I suspect that pkgcore may already have a "frozen tree" mode, among
other optimizations. However, it's not very useful until EAPI 5
support is completed.

Adding "frozen tree" support to portage might be a nice enhancement,
but I'm not sure how much of a performance increase it would yield.
The --complete-graph-* options that I've mentioned introduce a large
amount of overhead that could easily overshadow any performance
increase that a "frozen tree" optimization would give you.
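
To make the "frozen tree" idea a bit more concrete, here is a rough Python sketch of the trade-off. It is not portage code: the function names (file_md5, parse_cache_entry, load_metadata) and the frozen flag are hypothetical, and the cache format is reduced to plain KEY=VALUE lines with an _md5_ field, loosely modeled on md5-cache entries. Eclass checksums, which portage also compares, are left out for brevity.

```python
import hashlib


def file_md5(path):
    """Return the hex MD5 digest of the file at `path`."""
    with open(path, "rb") as f:
        return hashlib.md5(f.read()).hexdigest()


def parse_cache_entry(text):
    """Parse KEY=VALUE lines into a dict (simplified, assumed cache format)."""
    entry = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:
            entry[key] = value
    return entry


def load_metadata(ebuild_path, cache_text, frozen=False):
    """Return cached metadata, or None if the cache entry looks stale.

    With frozen=True the tree is assumed unchanged since the cache was
    generated, so the ebuild is never read or hashed (hypothetical mode).
    """
    entry = parse_cache_entry(cache_text)
    if frozen:
        return entry
    # Default behavior: validate the entry against the ebuild on disk.
    if entry.get("_md5_") != file_md5(ebuild_path):
        return None
    return entry
```

In the default mode every lookup costs one extra file read and hash per ebuild; the frozen mode skips that, at the price of returning stale metadata if the tree was in fact edited. Whether the saving is noticeable next to the --complete-graph-* overhead would have to be measured.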