From: Dan Armak <danarmak@gentoo.org>
To: gentoo-portage-dev@lists.gentoo.org
Cc: kde@gentoo.org, Simone Gotti <simone.gotti@email.it>, stuart@gentoo.org
Subject: [gentoo-portage-dev] Port of Stuart's confcache for .51_rc7
Date: Thu, 7 Oct 2004 23:57:44 +0200
Message-ID: <200410072357.52578.danarmak@gentoo.org>


[-- Attachment #1.1: Type: text/plain, Size: 1730 bytes --]

Hi all,

I ported Stuart's confcache patch from portage .50 to .51_rc7 (hope cvs head 
isn't too different...). I also made a few changes and fixes - see attached 
changelog if you used the orig patch.

What this does: it keeps a central cache of the results of autoconf-generated 
configure scripts in $PORTAGE_TMPDIR (/var/tmp). When econf is called, it 
checks the cache's validity against a list of files & md5sums. If any of the 
files has changed, the entire cache is discarded and configure starts from an 
empty one.
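
A minimal sketch of that validity check, for illustration only. The helper 
name cache_is_valid is hypothetical and not part of the patch; the return 
codes mirror confcache_checksums in the attachment (0 = still valid, 
1 = something changed, 2 = no list yet).

```shell
#!/bin/bash
# Sketch only: cache_is_valid is a hypothetical helper, not part of the patch.
# $1 - list file holding one "md5sum path" entry per line
cache_is_valid() {
	local list="$1" sum file new_sum

	# no list at all: treat as a first run
	[ -f "$list" ] || return 2

	# "read -r sum file" puts the rest of the line into $file,
	# so paths containing spaces survive
	while read -r sum file ; do
		new_sum="$(md5sum "$file" 2>/dev/null | awk '{ print $1 }')"
		# a changed or missing file invalidates the whole cache
		[ "$new_sum" = "$sum" ] || return 1
	done < "$list"

	return 0
}
```

A caller would throw the cached settings away on 1 and start from an empty 
cache file, and simply skip the copy step on 2.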

During configure's run, the sandbox logs all files accessed. A small patch 
to libsandbox.c, unchanged from Stuart's version, makes this possible. 
Afterwards the file/md5sum list is updated from that log, and the logged 
entries are stripped out of the sandbox log again so that portage doesn't 
report them as violations.
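
Roughly, the two log-handling steps look like the sketch below. It assumes 
the log line layout written by the libsandbox.c patch ("func:" padded with 
spaces, then the path). The helper names read_paths and strip_read_entries 
are hypothetical, and the exclusions are anchored at the start of the path, 
which is stricter than the unanchored greps in the patch.

```shell
#!/bin/bash
# Sketch only: both helpers are hypothetical, not part of the patch.

# $1 - sandbox log; prints the unique paths configure read,
# skipping temporary and device files
read_paths() {
	sed -n -e 's/^open_rd: *//p' "$1" \
		| grep -v -e '^/tmp' -e '^/var/tmp' -e '^/dev' \
		| sort -u
}

# $1 - sandbox log; prints the log without the entries that
# SANDBOX_LOGREAD added, leaving everything else in place
strip_read_entries() {
	grep -v -e '^open_rd:' -e '^execve:' -e '^opendir:' "$1"
}
```

Note that strip_read_entries prints nothing and returns 1 when no other 
entries are left, which is the case where the log file can be removed.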

Ebuild-side: FEATURES+=confcache enables this (sandbox has to be in FEATURES 
as well, and RESTRICT=noconfcache turns it off for a package). Either use 
econf, or call confcache_start and confcache_stop before and after your 
configure run. confcache_start prepends --cache-file=$CONF_CACHE to 
$EXTRA_ECONF; pass that to configure.
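
For an ebuild or eclass that runs configure itself, the explicit form might 
look like this sketch. The src_compile body and the --prefix argument are 
only an illustration. The two fallback definitions are stand-ins so the 
snippet runs outside portage, and /tmp/example.cache is a made-up path; 
inside the patched ebuild.sh the real functions are used instead.

```shell
#!/bin/bash
# Stand-ins for use outside portage only; the real functions come from
# the patched ebuild.sh. /tmp/example.cache is a made-up path.
type confcache_start >/dev/null 2>&1 || confcache_start() {
	EXTRA_ECONF="--cache-file=/tmp/example.cache $EXTRA_ECONF"
}
type confcache_stop >/dev/null 2>&1 || confcache_stop() { :; }

src_compile() {
	confcache_start

	# EXTRA_ECONF now carries --cache-file=...; hand it to configure
	# (a real ebuild would use "|| die" here)
	./configure --prefix=/usr ${EXTRA_ECONF} || return 1

	confcache_stop
}
```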

Embedded people: I've been asked to keep a separate cache for every 
combination of $CHOST, $CBUILD, $THOST. Please confirm that's how you want 
it.

Myself, I need it to speed up the upcoming split kde ebuilds (eg separate 
ebuilds for konqueror etc rather than one big kdebase ebuild).

Stuart, I hope you're ok with my changes. Basically they just consist of 
updating the patch for the new version's ebuild.sh and fixing a couple of 
bugs.

Comments welcome (my first patch to portage, hope it works :-). Credit goes to 
Stuart, really.

-- 
Dan Armak
Gentoo Linux developer (KDE)
Matan, Israel
Public GPG key: http://dev.gentoo.org/~danarmak/danarmak-gpg-public.key
Fingerprint: DD70 DBF9 E3D4 6CB9 2FDD  0069 508D 9143 8D5F 8951

[-- Attachment #1.2: confcache-portage-2.0.51_rc7.patch --]
[-- Type: text/x-diff, Size: 9206 bytes --]

--- portage-2.0.51_rc7/bin/ebuild.sh	2004-09-21 04:08:05.000000000 +0300
+++ portage-2.0.51_rc7.new/bin/ebuild.sh	2004-10-07 21:34:15.000000000 +0200
@@ -65,6 +65,15 @@
 export SANDBOX_ON="0"
 
 # sandbox support functions; defined prior to profile.bashrc srcing, since the profile might need to add a default exception (/usr/lib64/conftest fex, bug #60147)
+addlogread()
+{
+	if [ "$1" = "1" ]; then
+		export SANDBOX_LOGREAD="1"
+	else
+		unset SANDBOX_LOGREAD
+	fi
+}
+
 addread()
 {
 	export SANDBOX_READ="$SANDBOX_READ:$1"
@@ -389,7 +398,256 @@
 	done
 }
 
+CONFCACHE_MD5SUM="$PORTAGE_TMPDIR/configure-$CBUILD-$CHOST-$THOST.md5sums"
+CONFCACHE_SETTINGS="$PORTAGE_TMPDIR/configure-$CBUILD-$CHOST-$THOST.cache"
+
+# $1 - file to write the configure cache out to
+
+confcache_prepare () {
+	
+	echo ">>> Preparing configure cache"
+
+	# step 1 - make sure that none of the files in our list of deps
+	# have changed since we last built something
+
+	addwrite "$CONFCACHE_MD5SUM"
+	confcache_checksums
+
+	case "$?" in
+		2)
+			return
+			;;
+		1)
+			echo ">>> Configure deps have changed; using an empty cache"
+			rm "$CONFCACHE_MD5SUM"
+			return
+			;;
+	esac
+
+	# step 2 - now we copy the existing cache into the temporary file
+
+	cat $CONFCACHE_SETTINGS > $1
+}
+
+confcache_checksums () {
+
+	# if any of them have changed, we throw the whole cache away
+	# a future version, written in python, will be able to do
+	# per config-line deps
+
+	if [ ! -f "$CONFCACHE_MD5SUM" ]; then
+		echo ">>> You have no md5sum cache file; assuming this is your first time"
+		return 2
+	fi
+
+	while read x ; do
+		sum="`echo $x | cut -d ' ' -f 1`"
+		file="`echo $x | cut -d ' ' -f 2-`"
+
+		new_sum="`md5sum \"$file\" | awk '{ print $1 }'`"
+		if [ "$new_sum" != "$sum" ]; then
+			return 1
+		fi
+	done < $CONFCACHE_MD5SUM
+
+	return 0
+}
+
+# $1           - file containing the updated configure cache
+# $SANDBOX_LOG - file containing the sandbox log
+
+confcache_update () {
+
+	# special cases
+
+	if [ ! -f "$1" ]; then
+		echo "confcache_update(): unable to find temporary cache $1"
+		return
+	fi
+
+	echo ">>> Updating global configure cache from $1"
+
+	addwrite "$CONFCACHE_SETTINGS"
+	addwrite "$CONFCACHE_MD5SUM"
+
+	# step 1 - replace our global configure cache
+
+	cp -f "$1" "$CONFCACHE_SETTINGS"
+
+	# step 2 - make a list of files from the sandbox log
+
+	if [ ! -f "$SANDBOX_LOG" ]; then
+		# nothing we can do - let's bail
+		echo "confcache_update: sandbox log not found"
+		return
+	fi
+
+	echo ">>> Reading list of files used by configure"
+	if [ -n "$CCACHE_DIR" ]; then
+		local REMOVE_CCACHE_DIR="$CCACHE_DIR"
+	else
+		local REMOVE_CCACHE_DIR="/root/.ccache"
+	fi
+	files="`grep open_rd $SANDBOX_LOG | sed -e 's/^open_rd: \+//' | grep -v /tmp | grep -v /var/tmp | grep -v $REMOVE_CCACHE_DIR | grep -v /dev | sort | uniq`"
+
+	# step 3 - add each file to the global md5 cache
+	#
+	# yes, this is a bit slow, but relying on egrep to search the file
+	# is risky.  Sooner or later, some spanner package will rely on
+	# a file that contains spaces.
+
+	# special case - do we *have* a CONFCACHE_MD5SUM file atm?
+
+	if [ ! -f "$CONFCACHE_MD5SUM" ]; then
+
+		echo ">>> No md5sum cache found; populating for first time"
+
+		# create the file
+		touch $CONFCACHE_MD5SUM
+
+		# populate it
+
+		OLD_IFS="$IFS"
+		NEW_IFS="^M"
+		# IFS="$NEW_IFS"
+
+		for x in $files ; do
+			IFS="$OLD_IFS"
+
+			if [ ! -f "$x" ] ; then
+				continue
+			fi
+
+			newsum="`md5sum \"$x\" | awk '{ print $1 }'`"
+			echo "$newsum $x" >> $CONFCACHE_MD5SUM
+
+			# IFS="$NEW_IFS"
+		done
+		IFS="$OLD_IFS"
+	fi
+
+	# if we don't
+
+	OLD_IFS="$IFS"
+	NEW_IFS="
+"
+
+	IFS="$NEW_IFS"
+
+	echo ">>> Updating md5sum cache"
+
+	for x in $files ; do
+		if [ ! -f "$x" ]; then
+			continue
+		fi
+
+		infile=0
+		while read y ; do
+			IFS="$OLD_IFS"
+		
+			sum="`echo $y | cut -d ' ' -f 1`"
+			file="`echo $y | cut -d ' ' -f 2-`"
+
+			if [ "$file" = "$x" ]; then
+				infile=1
+				break
+			fi
+			IFS="$NEW_IFS"
+		done < "$CONFCACHE_MD5SUM"
+
+		IFS="$OLD_IFS"
+
+		if [ "$infile" = "0" ]; then
+			newsum="`md5sum \"$x\" | awk '{ print $1 }'`"
+			echo "$newsum $x" >> $CONFCACHE_MD5SUM
+		fi
+
+		IFS="$NEW_IFS"
+	done
+
+	IFS="$OLD_IFS"
+}
+
+confcache_start() {
+	# global configure cache - stuart@gentoo.org
+	#
+	# if FEATURES="sandbox confcache" are set, we maintain a global
+	# cache of results from previous configure statements
+	#
+	# this global cache should benefit all machines, but especially
+	# multi-processor boxes
+
+	if [ "${FEATURES//*confcache*/true}" = "true" -a "${FEATURES//*sandbox*/true}" = "true" -a "${RESTRICT//*noconfcache*/true}" != "true" ]; then
+		CONF_CACHE="`/bin/tempfile`"
+		echo ">>> Temporary configure cache file is $CONF_CACHE"
+		addwrite "$CONF_CACHE"
+		confcache_prepare "$CONF_CACHE"
+		EXTRA_ECONF="--cache-file=$CONF_CACHE $EXTRA_ECONF"
+		addlogread 1
+
+		# mark these variables read-only, so that they can't be
+		# changed to, say, /etc/passwd
+
+		# typeset -r CONF_CACHE
+		# typeset -r READLOG
+	else
+		echo "!!! Not using global configure cache"
+	fi
+}
+
+confcache_stop() {
+	if [ "${FEATURES//*confcache*/true}" = "true" -a "${FEATURES//*sandbox*/true}" = "true" -a "${RESTRICT//*noconfcache*/true}" != "true" ]; then
+		addlogread 0
+		confcache_update "$CONF_CACHE"
+		[ -f "$CONF_CACHE" ] && rm -f "$CONF_CACHE"
+
+		# now we need to clean up the log file, otherwise portage
+		# will have a fit ;-)
+		#
+		# here, we remove all the entries that the SANDBOX_LOGREAD
+		# feature will have added
+		#
+		# any entries that we don't remove are sandbox violations
+		# by definition
+		echo ">>> Fixing sandbox log"
+		if [ -f "$SANDBOX_LOG" ]; then
+			SANDBOX_DIRS="`echo $SANDBOX_WRITE | tr ':' ' '`"
+
+			sed -i -e "s/^open_rd:.*//" \
+					-e "s/^execve:.*//" \
+					-e "s/^opendir:.*//" $SANDBOX_LOG
+			for x in $SANDBOX_DIRS ; do
+				sed -i -e "s/^open_wr:   ${x//\//\\/}.*//" \
+					-e "s/^unlink:    ${x//\//\\/}.*//" \
+					-e "s/^rename:    ${x//\//\\/}.*//" \
+					-e "s/^chmod:     ${x//\//\\/}.*//" \
+					-e "s/^chown:     ${x//\//\\/}.*//" \
+					-e "s/^symlink:   ${x//\//\\/}.*//" \
+					-e "s/^rmdir:     ${x//\//\\/}.*//" \
+					-e "s/^mkdir:     ${x//\//\\/}.*//" $SANDBOX_LOG
+			done
+
+			# now we remove the empty lines
+
+			TMPFILE="`tempfile`"
+			egrep -v '^$' $SANDBOX_LOG > $TMPFILE
+			cat $TMPFILE > $SANDBOX_LOG
+			rm $TMPFILE
+
+			# if the file is empty, we can now remove the file
+			#
+			# this test may break on arches that do not use GNU ls
+	
+			if [ "`ls -l $SANDBOX_LOG | awk '{ print $5 }'`" = "0" ]; then
+				rm -f $SANDBOX_LOG
+			fi
+		fi
+	fi
+}
+
 econf() {
+	confcache_start
+	
 	if [ -x ./configure ]; then
 		if hasq autoconfig $FEATURES && ! hasq autoconfig $RESTRICT; then
 			if [ -e /usr/share/gnuconfig/ -a -x /bin/basename ]; then
@@ -440,8 +698,12 @@
 			${EXTRA_ECONF} \
 			"$@" || die "econf failed"
 	else
+		[ -n "$CONF_CACHE" ] && rm -f "$CONF_CACHE"
+		[ -n "$READLOG" ] && rm -f "$READLOG" 
 		die "no configure script found"
 	fi
+
+	confcache_stop
 }
 
 einstall() {
--- portage-2.0.51_rc7/src/sandbox-1.1/libsandbox.c	2004-08-02 21:49:31.000000000 +0300
+++ portage-2.0.51_rc7.new/src/sandbox-1.1/libsandbox.c	2004-10-07 15:27:53.000000000 +0200
@@ -1193,14 +1193,14 @@
 	debug_log_env = getenv("SANDBOX_DEBUG");
 	debug_log_path = getenv("SANDBOX_DEBUG_LOG");
 
-	if (((NULL == log_path) ||
-			 (0 != strncmp(absolute_path, log_path, strlen(log_path)))) &&
+	if (((NULL != log_path) ||
+			 (0 == strncmp(absolute_path, log_path, strlen(log_path)))) &&
 			((NULL == debug_log_env) ||
 			 (NULL == debug_log_path) ||
 			 (0 != strncmp(absolute_path, debug_log_path, strlen(debug_log_path))))
-			&& (0 == check_access(sbcontext, func, absolute_path))
 			) {
-		if (1 == sbcontext->show_access_violation) {
+
+		if ((0 == check_access(sbcontext, func, absolute_path)) && (1 == sbcontext->show_access_violation)) {
 			fprintf(stderr,
 							"\e[31;01mACCESS DENIED\033[0m  %s:%*s%s\n",
 							func, (int) (10 - strlen(func)), "", absolute_path);
@@ -1226,9 +1226,33 @@
 					}
 				}
 			}
+
+			result = 0;
 		}
+		else if (getenv("SANDBOX_LOGREAD") != NULL)
+		{
+			/* create the log message */
+
+			sprintf (
+				buffer, 
+				"%s:%*s%s\n", 
+				func, 
+				(int) (10 - strlen(func)), 
+				"",
+				absolute_path
+			);
+
+			/* fprintf(stderr, buffer); */
+
+			/* log the event to the log file anyway */
+			log_file = true_open(log_path, O_APPEND | O_WRONLY | O_CREAT, S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
+			if (log_file >= 0) {
+				write(log_file, buffer, strlen(buffer));
+				close(log_file);
+			}
 
-		result = 0;
+			result = 1;
+		}
 	} else if (NULL != debug_log_env) {
 		if (NULL != debug_log_path) {
 			if (0 != strncmp(absolute_path, debug_log_path, strlen(debug_log_path))) {

[-- Attachment #1.3: confcache.changelog --]
[-- Type: text/plain, Size: 989 bytes --]

20041007, danarmak:
	* ported stuart's orig patch to portage .51_rc7
	* change: move the confcache code in econf into new functions confcache_start(),
		confcache_stop(), which are called at the beginning and end of econf().
		this way ebuilds that don't use econf (eg kde.eclass) can still use confcache explicitly.
	* fix: confcache_stop: also remove allowed chown()s from sandbox log
	* fix: confcache_stop: fix sed invocations
	* optimize: confcache_stop: convert many sed invocations into one invocation with many scripts
		(can be further reduced to egrep()s, but i'm lazy and this seems fast enough)
	* change: move cache to PORTAGE_TMPDIR. /etc/portage is obviously the wrong place, and carpaski
		advises not using /var/cache/edb like ferringb did as long as the implementation is in bash
		and runs under the sandbox.
	* change: store separate cache per combination of $CHOST, $CBUILD, $THOST, as in ferringb's version;
		the embedded people are said to want this

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]
