public inbox for gentoo-portage-dev@lists.gentoo.org
 help / color / mirror / Atom feed
* [gentoo-portage-dev] [Bug 104705] emerge doesn't print complete error message
@ 2005-10-23  6:45 Jason Stubbs
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-23  6:45 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 640 bytes --]

Commented on the bug due to reasoning behind this patch. Essentially, SIGTERM 
is sent to "tee", a WNOHANG waitpid() is performed followed by SIGKILL if it 
hasn't exited. So if tee doesn't exit immediately upon getting the SIGTERM, 
its buffers won't get a chance to get to disk due to it being killed. This 
patch simply adds a 1 second window between the SIGTERM and the SIGKILL.

One question; is the waitpid(x,0) necessary in the case where SIGKILL wasn't 
sent? Is waitpid(x,os.NOHANG) enough to clean up the zombie when SIGTERM 
succeeds? If so, the waitpid(x,0) could be indented into the "if not 
timeout:" block.

--
Jason Stubbs

[-- Attachment #2: sigterm-timeout.patch --]
[-- Type: text/x-diff, Size: 822 bytes --]

Index: pym/portage_exec.py
===================================================================
--- pym/portage_exec.py	(revision 2150)
+++ pym/portage_exec.py	(working copy)
@@ -4,7 +4,7 @@
 # $Id: /var/cvsroot/gentoo-src/portage/pym/portage_exec.py,v 1.13.2.4 2005/04/17 09:01:56 jstubbs Exp $
 
 
-import os,types,atexit,string,stat
+import os,types,atexit,string,stat,time
 import signal
 import portage_data
 import portage_util
@@ -182,7 +182,13 @@
 			for x in mypid[0:-1]:
 				try:
 					os.kill(x,signal.SIGTERM)
-					if os.waitpid(x,os.WNOHANG)[1] == 0:
+					timeout = 100
+					while timeout:
+						if os.waitpid(x,os.WNOHANG)[1] != 0:
+							break
+						time.sleep(0.01)
+						timeout -= 1
+					if not timeout:
 						# feisty bugger, still alive.
 						os.kill(x,signal.SIGKILL)
 					os.waitpid(x,0)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message)
  2005-10-23  6:45 [gentoo-portage-dev] [Bug 104705] emerge doesn't print complete error message Jason Stubbs
@ 2005-10-29 10:34 ` Jason Stubbs
  2005-10-29 10:41   ` [gentoo-portage-dev] [1/7] portage_exec cleanups Jason Stubbs
                     ` (8 more replies)
  0 siblings, 9 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:34 UTC (permalink / raw
  To: gentoo-portage-dev

On Sunday 23 October 2005 15:45, Jason Stubbs wrote:
> Commented on the bug due to reasoning behind this patch. Essentially,
> SIGTERM is sent to "tee", a WNOHANG waitpid() is performed followed by
> SIGKILL if it hasn't exited. So if tee doesn't exit immediately upon
> getting the SIGTERM, its buffers won't get a chance to get to disk due to
> it being killed. This patch simply adds a 1 second window between the
> SIGTERM and the SIGKILL.

Per discussion on the bug, the problem is actually that tee shouldn't be sent 
a SIGTERM or SIGKILL at all. The decided solution was to wait on tee rather 
ebuild. I went a bit further than that though and cleaned up most of spawn().

--
Jason Stubbs
-- 
gentoo-portage-dev@gentoo.org mailing list



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [1/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
@ 2005-10-29 10:41   ` Jason Stubbs
  2005-10-29 10:44   ` [gentoo-portage-dev] [2/7] " Jason Stubbs
                     ` (7 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:41 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 339 bytes --]

[NOTE: this is targetted for .54]

Wait on the last listener in the pipe chain. In the standard logging case of 
`ebuild | tee`, ebuild will close its pipe when it finishes up, which will 
then cause tee to close. tee's exit status will be 0 (unless there's a 
problem with tee) after which the exit status of ebuild is read and returned.

[-- Attachment #2: 01-wait-order.patch --]
[-- Type: text/x-diff, Size: 631 bytes --]

--- portage_exec.py.orig	2005-10-29 18:48:49.000000000 +0900
+++ portage_exec.py	2005-10-29 18:49:58.000000000 +0900
@@ -177,9 +177,9 @@
 		spawned_pids.append(mypid[-1])
 		return mypid
 	while len(mypid):
-		retval=os.waitpid(mypid[-1],0)[1]
+		retval=os.waitpid(mypid.pop(0),0)[1]
 		if retval != 0:
-			for x in mypid[0:-1]:
+			for x in mypid:
 				try:
 					os.kill(x,signal.SIGTERM)
 					if os.waitpid(x,os.WNOHANG)[1] == 0:
@@ -197,8 +197,6 @@
 				return (retval >> 8) # return exit code
 			else:
 				return ((retval & 0xff) << 8) # interrupted by signal
-		else:
-			mypid.pop(-1)
 	return 0
 
 def find_binary(myc):

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [2/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
  2005-10-29 10:41   ` [gentoo-portage-dev] [1/7] portage_exec cleanups Jason Stubbs
@ 2005-10-29 10:44   ` Jason Stubbs
  2005-10-29 10:49   ` [gentoo-portage-dev] [3/7] " Jason Stubbs
                     ` (6 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:44 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 293 bytes --]

When a process in the pipe chain is found to be dead, check if each remaining 
process is the chain is dead already before sending it a SIGKILL. Sending of 
SIGTERM was dropped due to it having no affect unless a pause is inserted 
between sending the TERM signal and sending the KILL signal.

[-- Attachment #2: 02-only-kill-if-not-dead.patch --]
[-- Type: text/x-diff, Size: 643 bytes --]

--- portage_exec.py.orig	2005-10-29 18:51:00.000000000 +0900
+++ portage_exec.py	2005-10-29 18:53:47.000000000 +0900
@@ -180,15 +180,9 @@
 		retval=os.waitpid(mypid.pop(0),0)[1]
 		if retval != 0:
 			for x in mypid:
-				try:
-					os.kill(x,signal.SIGTERM)
-					if os.waitpid(x,os.WNOHANG)[1] == 0:
-						# feisty bugger, still alive.
-						os.kill(x,signal.SIGKILL)
+				if os.waitpid(x,os.WNOHANG) == (0,0):
+					os.kill(x,signal.SIGKILL)
 					os.waitpid(x,0)
-				except OSError, oe:
-					if oe.errno not in (10,3):
-						raise oe
 			
 			# at this point we've killed all other kid pids generated via this call.
 			# return now.

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [3/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
  2005-10-29 10:41   ` [gentoo-portage-dev] [1/7] portage_exec cleanups Jason Stubbs
  2005-10-29 10:44   ` [gentoo-portage-dev] [2/7] " Jason Stubbs
@ 2005-10-29 10:49   ` Jason Stubbs
  2005-10-29 14:22     ` Brian Harring
  2005-10-29 10:52   ` [gentoo-portage-dev] [4/7] " Jason Stubbs
                     ` (5 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:49 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 184 bytes --]

This patch simplifies the moving around and closing of file descriptors when 
setting up the pipe. The addition of fd_unused to the parameter list is in 
preparation of a later patch.

[-- Attachment #2: 03-fd_pipes-switching.patch --]
[-- Type: text/x-diff, Size: 2270 bytes --]

--- portage_exec.py.orig	2005-10-29 18:54:02.000000000 +0900
+++ portage_exec.py	2005-10-29 19:07:21.000000000 +0900
@@ -66,7 +66,7 @@
 	return spawn(args,uid=uid,opt_name=opt_name,**keywords)
 
 # base spawn function
-def spawn(mycommand,env={},opt_name=None,fd_pipes=None,returnpid=False,uid=None,gid=None,groups=None,umask=None,logfile=None,path_lookup=True):
+def spawn(mycommand,env={},opt_name=None,fd_pipes=None,fd_unused=None,returnpid=False,uid=None,gid=None,groups=None,umask=None,logfile=None,path_lookup=True):
 	if type(mycommand)==types.StringType:
 		mycommand=mycommand.split()
 	myc = mycommand[0]
@@ -76,7 +76,12 @@
 		myc = find_binary(myc)
 		if myc == None:
 			return None
-		
+
+	if not fd_pipes:
+		fd_pipes = {0:0, 1:1, 2:2}
+	if not fd_unused:
+		fd_unused = []
+
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
@@ -88,46 +93,22 @@
 				return (retval >> 8) # exit code
 			else:
 				return ((retval & 0xff) << 8) # signal
-		if not fd_pipes:
-			fd_pipes={}
-			fd_pipes[0] = 0
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
-		
+
 	if not opt_name:
 		opt_name = mycommand[0]
 	myargs=[opt_name]
 	myargs.extend(mycommand[1:])
 	mypid.append(os.fork())
 	if mypid[-1] == 0:
-		# this may look ugly, but basically it moves file descriptors around to ensure no
-		# handles that are needed are accidentally closed during the final dup2 calls.
-		trg_fd=[]
-		if type(fd_pipes)==types.DictType:
-			src_fd=[]
-			k=fd_pipes.keys()
-			k.sort()
-			for x in k:
-				trg_fd.append(x)
-				src_fd.append(fd_pipes[x])
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] == src_fd[x]:
-					continue
-				if trg_fd[x] in src_fd[x+1:]:
-					new=os.dup2(trg_fd[x],max(src_fd) + 1)
-					os.close(trg_fd[x])
-					try:
-						while True: 
-							src_fd[s.index(trg_fd[x])]=new
-					except SystemExit, e:
-						raise
-					except:
-						pass
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] != src_fd[x]:
-					os.dup2(src_fd[x], trg_fd[x])
-		else:
-			trg_fd=[0,1,2]
+		for x in fd_pipes:
+			if x != fd_pipes[x]:
+				os.dup2(fd_pipes[x], x)
+			if fd_pipes[x] not in fd_pipes and fd_pipes[x] not in fd_unused:
+				fd_unused.append(fd_pipes[x])
+		for x in fd_unused:
+			os.close(x)
 		for x in range(0,max_fd_limit):
 			if x not in trg_fd:
 				try: 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [4/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
                     ` (2 preceding siblings ...)
  2005-10-29 10:49   ` [gentoo-portage-dev] [3/7] " Jason Stubbs
@ 2005-10-29 10:52   ` Jason Stubbs
  2005-10-29 11:03     ` Jason Stubbs
  2005-10-29 10:56   ` [gentoo-portage-dev] [5/7] " Jason Stubbs
                     ` (4 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:52 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 381 bytes --]

Remove the exit status check from the logging's setup code and allow it to be 
handled by the final process monitoring loop. This adds slight overhead of 
unnecessarily setting up ebuild when tee is broken but simplifies the code.

Closing of the read end of the pipe is also moved to just after tee has been 
spawned so that it is not incorrectly left open in the ebuild process.

[-- Attachment #2: 04-logging.patch --]
[-- Type: text/x-diff, Size: 776 bytes --]

--- portage_exec.py.orig	2005-10-29 19:08:32.000000000 +0900
+++ portage_exec.py	2005-10-29 19:11:35.000000000 +0900
@@ -85,14 +85,8 @@
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
-		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:1,2:2}))
-		retval=os.waitpid(mypid[-1],os.WNOHANG)[1]
-		if retval != 0:
-			# he's dead jim.
-			if (retval & 0xff)==0:
-				return (retval >> 8) # exit code
-			else:
-				return ((retval & 0xff) << 8) # signal
+		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]},fd_unused=[pw]))
+		os.close(pr)
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
 
@@ -150,7 +144,6 @@
 		return # should never get reached
 
 	if logfile:
-		os.close(pr)
 		os.close(pw)
 	
 	if returnpid:

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
                     ` (3 preceding siblings ...)
  2005-10-29 10:52   ` [gentoo-portage-dev] [4/7] " Jason Stubbs
@ 2005-10-29 10:56   ` Jason Stubbs
  2005-10-29 15:25     ` Jason Stubbs
  2005-10-29 10:57   ` [gentoo-portage-dev] [6/7] " Jason Stubbs
                     ` (3 subsequent siblings)
  8 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:56 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 418 bytes --]

All file descriptors opened by spawn() are now closed in the correct places. 
The code that closes all unnecessary descriptors via brute force is no longer 
required.

I found that a file descriptor was being left open and passed around. However, 
it was definately not created in spawn(). It seems to me though that code 
outside of spawn should take care of closing (or telling spawn to close) any 
unnecessary FDs.

[-- Attachment #2: 05-remove-bulk-fd-closing.patch --]
[-- Type: text/x-diff, Size: 764 bytes --]

--- portage_exec.py.orig	2005-10-29 19:12:51.000000000 +0900
+++ portage_exec.py	2005-10-29 19:15:15.000000000 +0900
@@ -9,15 +9,6 @@
 import portage_data
 import portage_util
 
-try:
-	import resource
-	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
-except SystemExit, e:
-	raise
-except:
-	# hokay, no resource module.
-	max_fd_limit=256
-
 spawned_pids = []
 def cleanup():
 	global spawned_pids
@@ -103,14 +94,6 @@
 				fd_unused.append(fd_pipes[x])
 		for x in fd_unused:
 			os.close(x)
-		for x in range(0,max_fd_limit):
-			if x not in trg_fd:
-				try: 
-					os.close(x)
-				except SystemExit, e:
-					raise
-				except:
-					pass
 		# note this order must be preserved- can't change gid/groups if you change uid first.
 		if gid:
 			os.setgid(gid)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [6/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
                     ` (4 preceding siblings ...)
  2005-10-29 10:56   ` [gentoo-portage-dev] [5/7] " Jason Stubbs
@ 2005-10-29 10:57   ` Jason Stubbs
  2005-10-29 11:01   ` [gentoo-portage-dev] [7/7] " Jason Stubbs
                     ` (2 subsequent siblings)
  8 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 10:57 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 97 bytes --]

Remove the try/except from cleanup() and only send SIGKILL to anything that 
isn't dead already.

[-- Attachment #2: 06-remove-cleanup-tryexcept.patch --]
[-- Type: text/x-diff, Size: 493 bytes --]

--- portage_exec.py.orig	2005-10-29 19:15:42.000000000 +0900
+++ portage_exec.py	2005-10-29 19:17:35.000000000 +0900
@@ -14,12 +14,9 @@
 	global spawned_pids
 	while spawned_pids:
 		pid = spawned_pids.pop()
-		try:
-			os.kill(pid,SIGKILL)
-		except SystemExit, e:
-			raise
-		except:
-			pass
+		if os.waitpid(pid,os.WNOHANG) == (0,0):
+			os.kill(pid,signal.SIGKILL)
+			os.waitpid(pid,0)
 atexit.register(cleanup)
 
 from portage_const import BASH_BINARY,SANDBOX_BINARY,SANDBOX_PIDS_FILE

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [7/7] portage_exec cleanups
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
                     ` (5 preceding siblings ...)
  2005-10-29 10:57   ` [gentoo-portage-dev] [6/7] " Jason Stubbs
@ 2005-10-29 11:01   ` Jason Stubbs
  2005-10-29 17:32   ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
  2005-10-30  7:03   ` Jason Stubbs
  8 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 11:01 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 396 bytes --]

Remove the unused code block prior to execve'ing and all the superfluous 
exits. To be specific, if os.execve succeeds then python is gone. If it 
doesn't succeed then an exception will be raised. The previous code raised 
the exception meaning that python would exit (on the child thread side) and 
portage exit handlers would be called. The os._exit(1) and friends weren't 
ever called at all.

[-- Attachment #2: 07-execvp-failure-handling.patch --]
[-- Type: text/x-diff, Size: 880 bytes --]

--- portage_exec.py.orig	2005-10-29 19:21:38.000000000 +0900
+++ portage_exec.py	2005-10-29 19:22:13.000000000 +0900
@@ -100,28 +100,12 @@
 			os.setuid(uid)
 		if umask:
 			os.umask(umask)
-		try:
-			# XXX: We would do this to stop ebuild.sh from getting any
-			# XXX: output, and consequently, we'd get to handle the sigINT.
-			#os.close(sys.stdin.fileno())
-			pass
-		except SystemExit, e:
-			raise
-		except:
-			pass
 
 		try:
-			#print "execing", myc, myargs
 			os.execve(myc,myargs,env)
-		except SystemExit, e:
-			raise
 		except Exception, e:
-			raise str(e)+":\n   "+myc+" "+string.join(myargs)
-		# If the execve fails, we need to report it, and exit
-		# *carefully* --- report error here
-		os._exit(1)
-		sys.exit(1)
-		return # should never get reached
+			print str(e)+":\n   "+myc+" "+string.join(myargs)
+			os._exit(1)
 
 	if logfile:
 		os.close(pw)

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [4/7] portage_exec cleanups
  2005-10-29 10:52   ` [gentoo-portage-dev] [4/7] " Jason Stubbs
@ 2005-10-29 11:03     ` Jason Stubbs
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 11:03 UTC (permalink / raw
  To: gentoo-portage-dev

On Saturday 29 October 2005 19:52, Jason Stubbs wrote:
> Remove the exit status check from the logging's setup code and allow it to
> be handled by the final process monitoring loop. This adds slight overhead
> of unnecessarily setting up ebuild when tee is broken but simplifies the
> code.
>
> Closing of the read end of the pipe is also moved to just after tee has
> been spawned so that it is not incorrectly left open in the ebuild process.

Forgot to mention here that the write end of the pipe is passed to tee's 
spawn() so that tee's process doesn't keep the write pipe left open causing 
it to never exit.
-- 
gentoo-portage-dev@gentoo.org mailing list



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [3/7] portage_exec cleanups
  2005-10-29 10:49   ` [gentoo-portage-dev] [3/7] " Jason Stubbs
@ 2005-10-29 14:22     ` Brian Harring
  2005-10-29 15:06       ` Jason Stubbs
  0 siblings, 1 reply; 24+ messages in thread
From: Brian Harring @ 2005-10-29 14:22 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 3557 bytes --]

On Sat, Oct 29, 2005 at 07:49:49PM +0900, Jason Stubbs wrote:
> This patch simplifies the moving around and closing of file descriptors when 
> setting up the pipe. The addition of fd_unused to the parameter list is in 
> preparation of a later patch.

> +++ portage_exec.py	2005-10-29 19:07:21.000000000 +0900
> -		# this may look ugly, but basically it moves file descriptors around to ensure no
> -		# handles that are needed are accidentally closed during the final dup2 calls.
> -		trg_fd=[]
> -		if type(fd_pipes)==types.DictType:
> -			src_fd=[]
> -			k=fd_pipes.keys()
> -			k.sort()
> -			for x in k:
> -				trg_fd.append(x)
> -				src_fd.append(fd_pipes[x])
> -			for x in range(0,len(trg_fd)):
> -				if trg_fd[x] == src_fd[x]:
> -					continue
> -				if trg_fd[x] in src_fd[x+1:]:
> -					new=os.dup2(trg_fd[x],max(src_fd) + 1)
> -					os.close(trg_fd[x])
> -					try:
> -						while True: 
> -							src_fd[s.index(trg_fd[x])]=new
> -					except SystemExit, e:
> -						raise
> -					except:
> -						pass
> -			for x in range(0,len(trg_fd)):
> -				if trg_fd[x] != src_fd[x]:
> -					os.dup2(src_fd[x], trg_fd[x])
> -		else:
> -			trg_fd=[0,1,2]

Above *does* need reworking, knew that when I wrote it- but the reason 
it's so damn ugly is to cover against dup2 calls stomping handles that 
are needed.  Must be a collect, then re-arrange algo, not single pass

fd_pipes = [1:2, 2:1]

> +		for x in fd_pipes:
> +			if x != fd_pipes[x]:
> +				os.dup2(fd_pipes[x], x)
> +			if fd_pipes[x] not in fd_pipes and fd_pipes[x] not in fd_unused:
> +				fd_unused.append(fd_pipes[x])
> +		for x in fd_unused:
> +			os.close(x)

Replacement code will fail with the simple stdout <-> stderr switch.
dup2 will set the actual fd 2 handle to fd 1, but *prior* to 
duplicating fd 2 to a throw away handle so it remains open, end result 
is that fd 2 is now fd 1, so the 2 -> 1 move does nothing.

Code above will result in 1 == 1, 2 == 1, rather then 1 == 2, 2 == 1

Which is a regression, although existing code _probably_ won't choke 
on it- the protection I added made it robust so that it handles the 
corner case without tripping up any dev writing the code, rather then 
silently tieing the fd's together.


fd_unused doesn't cut it either also, or at the very least introduces 
an ass biter hidden behaviour into spawn- consider

fd_pipes={}

Previous code *would* close all fd's, and execute the process as such.  
With fd_unused addition, it won't anymore- meaning at the very least a 
chunk of the regen code will need updating (closes stdin last I 
looked).  Very least, all callers of spawn with fd_pipes != {0:0, 1:1, 
2:2} need verification, since the handles they were having explicitly 
closed prior, are now left open.

With the fd_unused route, there is no way clean way to have all fd's 
closed.  You don't know what fd's are open when you spawn (nor is 
there any way to know, beyond testing each potential fd, or overriding 
the file builtin); going the fd_unused route means fd's the process 
shouldn't have access to, will have access to.

Mind you the fd_unused comments are about this patch and the one +2 
down the set.

So... regressions, both in fd_pipes handling, and in open handles 
leaking to the spawned processes

The original code is ugly, and knowing a bit more this time around I'd 
write it a bit differently, that said the protections/behaviour 
originally in it should be left- it's more robust, and is expected 
now.
~harring

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [3/7] portage_exec cleanups
  2005-10-29 14:22     ` Brian Harring
@ 2005-10-29 15:06       ` Jason Stubbs
  2005-10-29 15:17         ` Brian Harring
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 15:06 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 2534 bytes --]

On Saturday 29 October 2005 23:22, Brian Harring wrote:
> On Sat, Oct 29, 2005 at 07:49:49PM +0900, Jason Stubbs wrote:
> > This patch simplifies the moving around and closing of file descriptors
> > when setting up the pipe. The addition of fd_unused to the parameter list
> > is in preparation of a later patch.
> >
> > +++ portage_exec.py	2005-10-29 19:07:21.000000000 +0900
> > -		# this may look ugly, but basically it moves file descriptors around
> > to ensure no -		# handles that are needed are accidentally closed during
> > the final dup2 calls. -		trg_fd=[]
> > -		if type(fd_pipes)==types.DictType:
> > -			src_fd=[]
> > -			k=fd_pipes.keys()
> > -			k.sort()
> > -			for x in k:
> > -				trg_fd.append(x)
> > -				src_fd.append(fd_pipes[x])
> > -			for x in range(0,len(trg_fd)):
> > -				if trg_fd[x] == src_fd[x]:
> > -					continue
> > -				if trg_fd[x] in src_fd[x+1:]:
> > -					new=os.dup2(trg_fd[x],max(src_fd) + 1)
> > -					os.close(trg_fd[x])
> > -					try:
> > -						while True:
> > -							src_fd[s.index(trg_fd[x])]=new

"s" doesn't exist. Whatever this code was intending to do, it's not doing it.

> > -					except SystemExit, e:
> > -						raise
> > -					except:
> > -						pass
> > -			for x in range(0,len(trg_fd)):
> > -				if trg_fd[x] != src_fd[x]:
> > -					os.dup2(src_fd[x], trg_fd[x])
> > -		else:
> > -			trg_fd=[0,1,2]
>
> Above *does* need reworking, knew that when I wrote it- but the reason
> it's so damn ugly is to cover against dup2 calls stomping handles that
> are needed.  Must be a collect, then re-arrange algo, not single pass
>
> fd_pipes = [1:2, 2:1]
>
> > +		for x in fd_pipes:
> > +			if x != fd_pipes[x]:
> > +				os.dup2(fd_pipes[x], x)
> > +			if fd_pipes[x] not in fd_pipes and fd_pipes[x] not in fd_unused:
> > +				fd_unused.append(fd_pipes[x])
> > +		for x in fd_unused:
> > +			os.close(x)
>
> Replacement code will fail with the simple stdout <-> stderr switch.
> dup2 will set the actual fd 2 handle to fd 1, but *prior* to
> duplicating fd 2 to a throw away handle so it remains open, end result
> is that fd 2 is now fd 1, so the 2 -> 1 move does nothing.
>
> Code above will result in 1 == 1, 2 == 1, rather then 1 == 2, 2 == 1
>
> Which is a regression, although existing code _probably_ won't choke
> on it- the protection I added made it robust so that it handles the
> corner case without tripping up any dev writing the code, rather then
> silently tieing the fd's together.

Yep, missed that case. Respun and left the fd_unused for later.

--
Jason Stubbs

[-- Attachment #2: 03-fd_pipes-switching-take2.patch --]
[-- Type: text/x-diff, Size: 1721 bytes --]

--- portage_exec.py.orig	2005-10-30 00:04:12.000000000 +0900
+++ portage_exec.py	2005-10-30 00:03:49.000000000 +0900
@@ -76,7 +76,10 @@
 		myc = find_binary(myc)
 		if myc == None:
 			return None
-		
+
+	if fd_pipes is None:
+		fd_pipes = {0:0, 1:1, 2:2}
+
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
@@ -88,9 +91,6 @@
 				return (retval >> 8) # exit code
 			else:
 				return ((retval & 0xff) << 8) # signal
-		if not fd_pipes:
-			fd_pipes={}
-			fd_pipes[0] = 0
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
 		
@@ -100,34 +100,16 @@
 	myargs.extend(mycommand[1:])
 	mypid.append(os.fork())
 	if mypid[-1] == 0:
-		# this may look ugly, but basically it moves file descriptors around to ensure no
-		# handles that are needed are accidentally closed during the final dup2 calls.
-		trg_fd=[]
-		if type(fd_pipes)==types.DictType:
-			src_fd=[]
-			k=fd_pipes.keys()
-			k.sort()
-			for x in k:
-				trg_fd.append(x)
-				src_fd.append(fd_pipes[x])
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] == src_fd[x]:
-					continue
-				if trg_fd[x] in src_fd[x+1:]:
-					new=os.dup2(trg_fd[x],max(src_fd) + 1)
-					os.close(trg_fd[x])
-					try:
-						while True: 
-							src_fd[s.index(trg_fd[x])]=new
-					except SystemExit, e:
-						raise
-					except:
-						pass
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] != src_fd[x]:
-					os.dup2(src_fd[x], trg_fd[x])
-		else:
-			trg_fd=[0,1,2]
+		my_pipes = {}
+		close_pipes = {}
+		for x in fd_pipes:
+			my_pipes[x] = os.dup(fd_pipes[x])
+			close_pipes[fd_pipes[x]] = True
+		for x in close_pipes:
+			os.close(x)
+		for x in my_pipes:
+			os.dup2(my_pipes[x], x)
+			os.close(my_pipes[x])
 		for x in range(0,max_fd_limit):
 			if x not in trg_fd:
 				try: 

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [3/7] portage_exec cleanups
  2005-10-29 15:06       ` Jason Stubbs
@ 2005-10-29 15:17         ` Brian Harring
  0 siblings, 0 replies; 24+ messages in thread
From: Brian Harring @ 2005-10-29 15:17 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 538 bytes --]

On Sun, Oct 30, 2005 at 12:06:10AM +0900, Jason Stubbs wrote:
> +		my_pipes = {}
> +		close_pipes = {}
> +		for x in fd_pipes:
> +			my_pipes[x] = os.dup(fd_pipes[x])
> +			close_pipes[fd_pipes[x]] = True
> +		for x in close_pipes:
> +			os.close(x)
> +		for x in my_pipes:
> +			os.dup2(my_pipes[x], x)
> +			os.close(my_pipes[x])

Works for me- might be worth tracking what was closed to avoid trying 
to close those fd's during the "force all unused fd's closed" run, 
from a speed standpoint (less syscalls).
~harring

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 10:56   ` [gentoo-portage-dev] [5/7] " Jason Stubbs
@ 2005-10-29 15:25     ` Jason Stubbs
  2005-10-29 16:04       ` Brian Harring
  0 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 15:25 UTC (permalink / raw
  To: gentoo-portage-dev

On Saturday 29 October 2005 19:56, Jason Stubbs wrote:
> All file descriptors opened by spawn() are now closed in the correct
> places. The code that closes all unnecessary descriptors via brute force is
> no longer required.
>
> I found that a file descriptor was being left open and passed around.
> However, it was definately not created in spawn(). It seems to me though
> that code outside of spawn should take care of closing (or telling spawn to
> close) any unnecessary FDs.


On Saturday 29 October 2005 23:22, Brian Harring wrote:
> fd_unused doesn't cut it either also, or at the very least introduces
> an ass biter hidden behaviour into spawn- consider
>
> fd_pipes={}
>
> Previous code *would* close all fd's, and execute the process as such.
> With fd_unused addition, it won't anymore- meaning at the very least a
> chunk of the regen code will need updating (closes stdin last I
> looked).  Very least, all callers of spawn with fd_pipes != {0:0, 1:1,
> 2:2} need verification, since the handles they were having explicitly
> closed prior, are now left open.

Not explicitly closed. Implicitly closed.

> With the fd_unused route, there is no way clean way to have all fd's
> closed.  You don't know what fd's are open when you spawn (nor is
> there any way to know, beyond testing each potential fd, or overriding
> the file builtin); going the fd_unused route means fd's the process
> shouldn't have access to, will have access to.

This is a difference of viewpoint, I think. Having unknown fds left open by 
the calling code seems unclean to me. Iterating through 100s or 1000s of fds 
and attempting to close them just in case the caller was sloppy seems equally 
unclean.

By the way, this method does not allow having a pipe open to a child process 
other than stdin, stdout or stderr.

> Mind you the fd_unused comments are about this patch and the one +2
> down the set.
>
> So... regressions, both in fd_pipes handling, and in open handles
> leaking to the spawned processes

fd_pipes handling is now addressed. The open handles leaking is an issue, but 
where the issue stems from is yet to be resolved.

> The original code is ugly, and knowing a bit more this time around I'd
> write it a bit differently, that said the protections/behaviour
> originally in it should be left- it's more robust, and is expected
> now.

I disagree here. Presently, other than the execve call, every try/except block 
in portage_exec.py has bugs that are hidden by the "protections".

--
Jason Stubbs
-- 
gentoo-portage-dev@gentoo.org mailing list



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 15:25     ` Jason Stubbs
@ 2005-10-29 16:04       ` Brian Harring
  2005-10-29 16:40         ` Jason Stubbs
  0 siblings, 1 reply; 24+ messages in thread
From: Brian Harring @ 2005-10-29 16:04 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 4665 bytes --]

On Sun, Oct 30, 2005 at 12:25:23AM +0900, Jason Stubbs wrote:
> On Saturday 29 October 2005 19:56, Jason Stubbs wrote:
> > All file descriptors opened by spawn() are now closed in the correct
> > places. The code that closes all unnecessary descriptors via brute force is
> > no longer required.
> >
> > I found that a file descriptor was being left open and passed around.
> > However, it was definately not created in spawn(). It seems to me though
> > that code outside of spawn should take care of closing (or telling spawn to
> > close) any unnecessary FDs.
> 
> 
> On Saturday 29 October 2005 23:22, Brian Harring wrote:
> > With the fd_unused route, there is no way clean way to have all fd's
> > closed.  You don't know what fd's are open when you spawn (nor is
> > there any way to know, beyond testing each potential fd, or overriding
> > the file builtin); going the fd_unused route means fd's the process
> > shouldn't have access to, will have access to.
> 
> This is a difference of viewpoint, I think. Having unknown fds left open by 
> the calling code seems unclean to me. Iterating through 100s or 1000s of fds 
> and attempting to close them just in case the caller was sloppy seems equally 
> unclean.

Arbitrary example as to why the brute force is required

class user_source:
	def __init__(self):
		self.fd = open("/etc/passwd", "r")
	def __iter__(self):
		return self
	def next(self):
		l = self.fd.next()
		return l.split(":",1)[0]

from portage_exec import spawn_bash
def process_users(users):
	for x in users:
		spawn_bash("echo %s" % x, fd_pipes={1:1, 2:2})

if __name__ == "__main__":
	process_users(user_source())

I'd have to modify the process_users func to handle passing in a 
potentially unneeded list of fd's to close- if users were a list, I'm 
cluttering up process_user's signature specializing for once potential 
source, since the process_users function is designed to take an 
iterable object (any iterable object).

Example above doesn't cover the rather nasty cases where there is N 
func calls involved between where the fd is opened, and the actual 
spawn call.  The worst scenario is where there is >1 thread running- 
you're stuck using globals there, which is pretty ugly.

It's pretty much best practice to close what you don't explicitly need 
when handling total control over to another process, good daemonizing code 
does the same thing due to the reasons I raised above.


> By the way, this method does not allow having a pipe open to a child process 
> other than stdin, stdout or stderr.

It does; the portage_exec that is in stable is the modified version I 
created for ebd, which uses this exact functionality.

line #153 of trunk/pym/ebuild.py, example call-

self.pid = spawn_func(self.ebd+" daemonize", fd_pipes={0:0, 1:1, 2:2, 
	3:cread, 4:dwrite}, returnpid=True,env=env, *args, **spawn_opts)[0]

Diffing between stable and trunk portage_exec, no changes affect fd 
handling (not surprising, the code is nasty without your reworking), 
so please clarify.

Addendum though... where the implicit fd closing breaks down is in 
usage of spawn_func, which is a trunk func only, used for implementing 
parallel-fetch, due to issues with insuring that the fetching process 
isn't screwing with the normal sequential buildplan execution (nice 
way of saying it was something of a hack).  Plus side, seperate 
process, it's not bound by the gil, so it sidestepped that trickery.


> > The original code is ugly, and knowing a bit more this time around I'd
> > write it a bit differently, that said the protections/behaviour
> > originally in it should be left- it's more robust, and is expected
> > now.
> 
> I disagree here. Presently, other than the execve call, every try/except block 
> in portage_exec.py has bugs that are hidden by the "protections".

Original in this case was referencing the fd_pipes additions, not the 
true original spawn func :)

Most of that code could be converted over to OSError catches anyways, 
although doing so requires testing the piss out of it to make sure no 
regressions are introduced (shouldn't be possible, but don't care to 
screw up the subsystem).

Looking through the 2.4 subprocess module (which came along after the 
mods in question), adding an option controlling the fd cleansing would 
make sense, as long as the default option is to close the fds.

That's subjective, but the examples I gave above show (imo at least) 
why this is needed; it's anal and *will* expose bugs rather then 
letting them lie, but that's better from where I sit.
~harring

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 16:04       ` Brian Harring
@ 2005-10-29 16:40         ` Jason Stubbs
  2005-10-29 17:17           ` Jason Stubbs
                             ` (2 more replies)
  0 siblings, 3 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 16:40 UTC (permalink / raw
  To: gentoo-portage-dev

On Sunday 30 October 2005 01:04, Brian Harring wrote:
> On Sun, Oct 30, 2005 at 12:25:23AM +0900, Jason Stubbs wrote:
> > By the way, this method does not allow having a pipe open to a child
> > process other than stdin, stdout or stderr.
>
> It does; the portage_exec that is in stable is the modified version I
> created for ebd, which uses this exact functionality.
>
> line #153 of trunk/pym/ebuild.py, example call-
>
> self.pid = spawn_func(self.ebd+" daemonize", fd_pipes={0:0, 1:1, 2:2,
> 	3:cread, 4:dwrite}, returnpid=True,env=env, *args, **spawn_opts)[0]

Yep. Oversight on my part.

> > > The original code is ugly, and knowing a bit more this time around I'd
> > > write it a bit differently, that said the protections/behaviour
> > > originally in it should be left- it's more robust, and is expected
> > > now.
> >
> > I disagree here. Presently, other than the execve call, every try/except
> > block in portage_exec.py has bugs that are hidden by the "protections".
>
> Original in this case was referencing the fd_pipes additions, not the
> true original spawn func :)

Even with the fd_pipes, the try/except block in there covers a bug that is hit 
every time it is entered.

> Looking through the 2.4 subprocess module (which came along after the
> mods in question), adding an option controlling the fd cleansing would
> make sense, as long as the default option is to close the fds.

Given the threading comment, I'd be agreeable to this. However, the spawn of 
tee would have to specifically close the write side of the pipe when unknown 
FDs aren't closed. Is having two extra paramaters, close_fds (a list) and 
close_all_fds (a bool), to spawn okay?

--
Jason Stubbs
-- 
gentoo-portage-dev@gentoo.org mailing list



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 16:40         ` Jason Stubbs
@ 2005-10-29 17:17           ` Jason Stubbs
  2005-10-29 17:17           ` Brian Harring
  2005-10-29 17:20           ` Jason Stubbs
  2 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 17:17 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 587 bytes --]

On Sunday 30 October 2005 01:40, Jason Stubbs wrote:
> Given the threading comment, I'd be agreeable to this. However, the spawn
> of tee would have to specifically close the write side of the pipe when
> unknown FDs aren't closed. Is having two extra paramaters, close_fds (a
> list) and close_all_fds (a bool), to spawn okay?

Okay. I'll back down on the closing/leaving unknown fds debate. The attached 
patch removes the usage of unused_fds introduced by 4/7, removes the 
unnecessary closing in the revised 3/7 and fixes two bugs in the detection of 
max_fd_limit.

--
Jason Stubbs

[-- Attachment #2: 05-bulk-fd-closing-take2.patch --]
[-- Type: text/x-diff, Size: 1383 bytes --]

--- portage_exec.py.orig	2005-10-30 02:05:50.000000000 +0900
+++ portage_exec.py	2005-10-30 02:08:03.000000000 +0900
@@ -11,11 +11,8 @@
 
 try:
 	import resource
-	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
-except SystemExit, e:
-	raise
-except:
-	# hokay, no resource module.
+	max_fd_limit=resource.getrlimit(resource.RLIMIT_NOFILE)[1]
+except ImportError:
 	max_fd_limit=256
 
 spawned_pids = []
@@ -83,7 +80,7 @@
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
-		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]},fd_unused=[pw]))
+		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]}))
 		os.close(pr)
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
@@ -95,22 +92,15 @@
 	mypid.append(os.fork())
 	if mypid[-1] == 0:
 		my_pipes = {}
-		close_pipes = {}
 		for x in fd_pipes:
 			my_pipes[x] = os.dup(fd_pipes[x])
-			close_pipes[fd_pipes[x]] = True
-		for x in close_pipes:
-			os.close(x)
 		for x in my_pipes:
 			os.dup2(my_pipes[x], x)
-			os.close(my_pipes[x])
-		for x in range(0,max_fd_limit):
+		for x in range(max_fd_limit):
 			if x not in trg_fd:
-				try: 
+				try:
 					os.close(x)
-				except SystemExit, e:
-					raise
-				except:
+				except OSError:
 					pass
 		# note this order must be preserved- can't change gid/groups if you change uid first.
 		if gid:

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 16:40         ` Jason Stubbs
  2005-10-29 17:17           ` Jason Stubbs
@ 2005-10-29 17:17           ` Brian Harring
  2005-10-29 17:28             ` Jason Stubbs
  2005-10-29 17:20           ` Jason Stubbs
  2 siblings, 1 reply; 24+ messages in thread
From: Brian Harring @ 2005-10-29 17:17 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 1612 bytes --]

On Sun, Oct 30, 2005 at 01:40:41AM +0900, Jason Stubbs wrote:
> Even with the fd_pipes, the try/except block in there covers a bug that is hit 
> every time it is entered.
*cough* really doesn't surprise me. :)

> > Looking through the 2.4 subprocess module (which came along after the
> > mods in question), adding an option controlling the fd cleansing would
> > make sense, as long as the default option is to close the fds.
> 
> Given the threading comment, I'd be agreeable to this. However, the spawn of 
> tee would have to specifically close the write side of the pipe when unknown 
> FDs aren't closed. Is having two extra paramaters, close_fds (a list) and 
> close_all_fds (a bool), to spawn okay?

It's needed for any logged calls, doesn't mean I like it though.  How 
about this, 
def spawn(...close_fds=True...):
	if close_fds is True:
		# kill all that aren't marked for open
	elif close_fds:
		# must be an iterable of what to kill
	else:
		#whatever code required to keep it from nuking file handles

Don't really like that one much myself, but it would allow for the 
logging to use the existing structure, and one less option on the 
overgrown spawn* prototype(s). 


sidenote, since max_fd_limit needs to survive, tag 
 try:
	import resource
-	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
+	max_fd_limit=resource.getrlimit(resource.RLIMIT_NOFILE)
 except SystemExit, e:
	raise

in while you're at it, since that's obviously broke.  Current code 
would default to range(256), but default is 1024 fds for gentoo boxes 
(so that _is_ a hole).
~harring

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 16:40         ` Jason Stubbs
  2005-10-29 17:17           ` Jason Stubbs
  2005-10-29 17:17           ` Brian Harring
@ 2005-10-29 17:20           ` Jason Stubbs
  2005-10-29 17:28             ` Brian Harring
  2 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 17:20 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 124 bytes --]

Trying to alter patches that have been split up is a PITA. Missed a var rename 
in the patch just sent. :/

--
Jason Stubbs

[-- Attachment #2: 05-bulk-fd-closing-take2.patch --]
[-- Type: text/x-diff, Size: 1409 bytes --]

--- portage_exec.py.orig	2005-10-30 02:05:50.000000000 +0900
+++ portage_exec.py	2005-10-30 02:18:24.000000000 +0900
@@ -11,11 +11,8 @@
 
 try:
 	import resource
-	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
-except SystemExit, e:
-	raise
-except:
-	# hokay, no resource module.
+	max_fd_limit=resource.getrlimit(resource.RLIMIT_NOFILE)[1]
+except ImportError:
 	max_fd_limit=256
 
 spawned_pids = []
@@ -83,7 +80,7 @@
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
-		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]},fd_unused=[pw]))
+		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]}))
 		os.close(pr)
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
@@ -95,22 +92,15 @@
 	mypid.append(os.fork())
 	if mypid[-1] == 0:
 		my_pipes = {}
-		close_pipes = {}
 		for x in fd_pipes:
 			my_pipes[x] = os.dup(fd_pipes[x])
-			close_pipes[fd_pipes[x]] = True
-		for x in close_pipes:
-			os.close(x)
 		for x in my_pipes:
 			os.dup2(my_pipes[x], x)
-			os.close(my_pipes[x])
-		for x in range(0,max_fd_limit):
-			if x not in trg_fd:
-				try: 
+		for x in range(max_fd_limit):
+			if x not in my_pipes:
+				try:
 					os.close(x)
-				except SystemExit, e:
-					raise
-				except:
+				except OSError:
 					pass
 		# note this order must be preserved- can't change gid/groups if you change uid first.
 		if gid:

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 17:20           ` Jason Stubbs
@ 2005-10-29 17:28             ` Brian Harring
  0 siblings, 0 replies; 24+ messages in thread
From: Brian Harring @ 2005-10-29 17:28 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 719 bytes --]

On Sun, Oct 30, 2005 at 02:20:02AM +0900, Jason Stubbs wrote:
> Trying to alter patches that have been split up is a PITA. Missed a var rename 
> in the patch just sent. :/
Heh, nice timing on the max_fd_limit, beat me in sending it by 46 
seconds :)

Personally... for a change like what you're introducing into 
portage_exec, I prefer a single patch.

Chunked patched make sense depending on a somewhat subjective 
grouping; in this case, familiar with the underlying code so 
interpretting it all is possible, but if it were any other source I 
would've resorted to applying the series, then diffing between svn and 
the changes.

Your call really, since you're generating the patch(es).

~harring

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [5/7] portage_exec cleanups
  2005-10-29 17:17           ` Brian Harring
@ 2005-10-29 17:28             ` Jason Stubbs
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 17:28 UTC (permalink / raw
  To: gentoo-portage-dev

On Sunday 30 October 2005 02:17, Brian Harring wrote:
> On Sun, Oct 30, 2005 at 01:40:41AM +0900, Jason Stubbs wrote:
> > Even with the fd_pipes, the try/except block in there covers a bug that
> > is hit every time it is entered.
>
> *cough* really doesn't surprise me. :)
>
> > > Looking through the 2.4 subprocess module (which came along after the
> > > mods in question), adding an option controlling the fd cleansing would
> > > make sense, as long as the default option is to close the fds.
> >
> > Given the threading comment, I'd be agreeable to this. However, the spawn
> > of tee would have to specifically close the write side of the pipe when
> > unknown FDs aren't closed. Is having two extra paramaters, close_fds (a
> > list) and close_all_fds (a bool), to spawn okay?
>
> It's needed for any logged calls, doesn't mean I like it though.  How
> about this,
> def spawn(...close_fds=True...):
> 	if close_fds is True:
> 		# kill all that aren't marked for open
> 	elif close_fds:
> 		# must be an iterable of what to kill
> 	else:
> 		#whatever code required to keep it from nuking file handles
>
> Don't really like that one much myself, but it would allow for the
> logging to use the existing structure, and one less option on the
> overgrown spawn* prototype(s).

Yep, this sort of messyness is why I chose to back down. ;)

> sidenote, since max_fd_limit needs to survive, tag
>  try:
> 	import resource
> -	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
> +	max_fd_limit=resource.getrlimit(resource.RLIMIT_NOFILE)
>  except SystemExit, e:
> 	raise
>
> in while you're at it, since that's obviously broke.  Current code
> would default to range(256), but default is 1024 fds for gentoo boxes
> (so that _is_ a hole).

Got that one already, but I didn't want to point out too many bugs. ;)
(There's still one more you missed there, btw; getrlimit() returns a tuple)

--
Jason Stubbs
-- 
gentoo-portage-dev@gentoo.org mailing list



^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message)
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
                     ` (6 preceding siblings ...)
  2005-10-29 11:01   ` [gentoo-portage-dev] [7/7] " Jason Stubbs
@ 2005-10-29 17:32   ` Jason Stubbs
  2005-10-29 17:38     ` Jason Stubbs
  2005-10-30  7:03   ` Jason Stubbs
  8 siblings, 1 reply; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 17:32 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 1023 bytes --]

On Saturday 29 October 2005 19:34, Jason Stubbs wrote:
> On Sunday 23 October 2005 15:45, Jason Stubbs wrote:
> > Commented on the bug due to reasoning behind this patch. Essentially,
> > SIGTERM is sent to "tee", a WNOHANG waitpid() is performed followed by
> > SIGKILL if it hasn't exited. So if tee doesn't exit immediately upon
> > getting the SIGTERM, its buffers won't get a chance to get to disk due to
> > it being killed. This patch simply adds a 1 second window between the
> > SIGTERM and the SIGKILL.
>
> Per discussion on the bug, the problem is actually that tee shouldn't be
> sent a SIGTERM or SIGKILL at all. The decided solution was to wait on tee
> rather ebuild. I went a bit further than that though and cleaned up most of
> spawn().

Here's the full patch after incorporating feedback. It should be fairly easy 
to follow given the other explanations. The only change beyond what's already 
been gone over is that spawn() is now removing pids from spawned_pids as it 
cleans them up.

--
Jason Stubbs

[-- Attachment #2: exec-cleanup.patch --]
[-- Type: text/x-diff, Size: 11789 bytes --]

Index: portage_util.py
===================================================================
--- portage_util.py	(revision 2199)
+++ portage_util.py	(working copy)
@@ -455,5 +455,15 @@
 	return mya
 
 
+def dump_traceback(msg):
+	try:
+		import sys, traceback
+		writemsg("\n====================================\n")
+		writemsg("Warning: %s\n" % msg)
+		for line in traceback.format_list(traceback.extract_stack()[:-1]):
+			writemsg(line)
+		writemsg("Please file a bug for %s\n" % sys.argv[0])
+		writemsg("====================================\n\n")
+	except:
+		pass
 
-
Index: portage.py
===================================================================
--- portage.py	(revision 2199)
+++ portage.py	(working copy)
@@ -89,7 +89,7 @@
 	import portage_util
 	from portage_util import grab_multiple, grabdict, grabdict_package, grabfile, grabfile_package, \
 		grabints, map_dictlist_vals, pickle_read, pickle_write, stack_dictlist, stack_dicts, stack_lists, \
-		unique_array, varexpand, writedict, writeints, writemsg, getconfig
+		unique_array, varexpand, writedict, writeints, writemsg, getconfig, dump_traceback
 	import portage_exception
 	import portage_gpg
 	import portage_locks
@@ -2352,9 +2352,13 @@
 	return str(eapi).strip() == str(portage_const.EAPI).strip()
 
 
-def doebuild(myebuild,mydo,myroot,mysettings,debug=0,listonly=0,fetchonly=0,cleanup=0,dbkey=None,use_cache=1,fetchall=0,tree="porttree"):
+def doebuild(myebuild,mydo,myroot,mysettings,debug=0,listonly=0,fetchonly=0,cleanup=0,dbkey=None,use_cache=1,fetchall=0,tree=None):
 	global db, actionmap_deps
 
+	if not tree:
+		dump_traceback("tree not specified to doebuild")
+		tree = "porttree"
+
 	ebuild_path = os.path.abspath(myebuild)
 	pkg_dir     = os.path.dirname(ebuild_path)
 
@@ -2759,12 +2763,12 @@
 			print "!!! mydo=qmerge, but install phase hasn't been ran"
 			sys.exit(1)
 		#qmerge is specifically not supposed to do a runtime dep check
-		return merge(mysettings["CATEGORY"],mysettings["PF"],mysettings["D"],mysettings["BUILDDIR"]+"/build-info",myroot,mysettings,myebuild=mysettings["EBUILD"])
+		return merge(mysettings["CATEGORY"],mysettings["PF"],mysettings["D"],mysettings["BUILDDIR"]+"/build-info",myroot,mysettings,myebuild=mysettings["EBUILD"],mytree=tree)
 	elif mydo=="merge":
 		retval=spawnebuild("install",actionmap,mysettings,debug,alwaysdep=1,logfile=logfile)
 		if retval:
 			return retval
-		return merge(mysettings["CATEGORY"],mysettings["PF"],mysettings["D"],mysettings["BUILDDIR"]+"/build-info",myroot,mysettings,myebuild=mysettings["EBUILD"])
+		return merge(mysettings["CATEGORY"],mysettings["PF"],mysettings["D"],mysettings["BUILDDIR"]+"/build-info",myroot,mysettings,myebuild=mysettings["EBUILD"],mytree=tree)
 	else:
 		print "!!! Unknown mydo:",mydo
 		sys.exit(1)
@@ -2937,12 +2941,12 @@
 
 	return newmtime
 
-def merge(mycat,mypkg,pkgloc,infloc,myroot,mysettings,myebuild=None):
-	mylink=dblink(mycat,mypkg,myroot,mysettings)
+def merge(mycat,mypkg,pkgloc,infloc,myroot,mysettings,myebuild=None,mytree=None):
+	mylink=dblink(mycat,mypkg,myroot,mysettings,treetype=mytree)
 	return mylink.merge(pkgloc,infloc,myroot,myebuild)
 
 def unmerge(cat,pkg,myroot,mysettings,mytrimworld=1):
-	mylink=dblink(cat,pkg,myroot,mysettings)
+	mylink=dblink(cat,pkg,myroot,mysettings,treetype="vartree")
 	if mylink.exists():
 		mylink.unmerge(trimworld=mytrimworld,cleanup=1)
 	mylink.delete()
@@ -5393,7 +5397,7 @@
 					writemsg("Uncaught handled exception: %(exception)s\n" % {"exception":str(e)})
 					raise
 
-			myret=doebuild(myebuild,"depend","/",self.mysettings,dbkey=mydbkey)
+			myret=doebuild(myebuild,"depend","/",self.mysettings,dbkey=mydbkey,tree="porttree")
 			if myret:
 				portage_locks.unlockfile(mylock)
 				self.lock_held = 0
@@ -6027,12 +6031,13 @@
 
 class dblink:
 	"this class provides an interface to the standard text package database"
-	def __init__(self,cat,pkg,myroot,mysettings):
+	def __init__(self,cat,pkg,myroot,mysettings,treetype=None):
 		"create a dblink object for cat/pkg.  This dblink entry may or may not exist"
 		self.cat     = cat
 		self.pkg     = pkg
 		self.mycpv   = self.cat+"/"+self.pkg
 		self.mysplit = pkgsplit(self.mycpv)
+		self.treetype = treetype
 
 		self.dbroot   = os.path.normpath(myroot+VDB_PATH)
 		self.dbcatdir = self.dbroot+"/"+cat
@@ -6213,7 +6218,7 @@
 
 		#do prerm script
 		if myebuildpath and os.path.exists(myebuildpath):
-			a=doebuild(myebuildpath,"prerm",self.myroot,self.settings,cleanup=cleanup,use_cache=0,tree="vartree")
+			a=doebuild(myebuildpath,"prerm",self.myroot,self.settings,cleanup=cleanup,use_cache=0,tree=self.treetype)
 			# XXX: Decide how to handle failures here.
 			if a != 0:
 				writemsg("!!! FAILED prerm: "+str(a)+"\n")
@@ -6419,7 +6424,7 @@
 		if myebuildpath and os.path.exists(myebuildpath):
 			# XXX: This should be the old config, not the current one.
 			# XXX: Use vardbapi to load up env vars.
-			a=doebuild(myebuildpath,"postrm",self.myroot,self.settings,use_cache=0,tree="vartree")
+			a=doebuild(myebuildpath,"postrm",self.myroot,self.settings,use_cache=0,tree=self.treetype)
 			# XXX: Decide how to handle failures here.
 			if a != 0:
 				writemsg("!!! FAILED postrm: "+str(a)+"\n")
@@ -6541,9 +6546,9 @@
 		if myebuild:
 			# if we are merging a new ebuild, use *its* pre/postinst rather than using the one in /var/db/pkg
 			# (if any).
-			a=doebuild(myebuild,"preinst",root,self.settings,cleanup=cleanup,use_cache=0)
+			a=doebuild(myebuild,"preinst",root,self.settings,cleanup=cleanup,use_cache=0,tree=self.treetype)
 		else:
-			a=doebuild(inforoot+"/"+self.pkg+".ebuild","preinst",root,self.settings,cleanup=cleanup,use_cache=0)
+			a=doebuild(inforoot+"/"+self.pkg+".ebuild","preinst",root,self.settings,cleanup=cleanup,use_cache=0,tree=self.treetype)
 
 		# XXX: Decide how to handle failures here.
 		if a != 0:
@@ -6658,9 +6663,9 @@
 		if myebuild:
 			# if we are merging a new ebuild, use *its* pre/postinst rather than using the one in /var/db/pkg
 			# (if any).
-			a=doebuild(myebuild,"postinst",root,self.settings,use_cache=0)
+			a=doebuild(myebuild,"postinst",root,self.settings,use_cache=0,tree=self.treetype)
 		else:
-			a=doebuild(inforoot+"/"+self.pkg+".ebuild","postinst",root,self.settings,use_cache=0)
+			a=doebuild(inforoot+"/"+self.pkg+".ebuild","postinst",root,self.settings,use_cache=0,tree=self.treetype)
 
 		# XXX: Decide how to handle failures here.
 		if a != 0:
@@ -7074,7 +7079,7 @@
 	# the merge takes care of pre/postinst and old instance
 	# auto-unmerge, virtual/provides updates, etc.
 	mysettings.load_infodir(infloc)
-	mylink=dblink(mycat,mypkg,myroot,mysettings)
+	mylink=dblink(mycat,mypkg,myroot,mysettings,treetype="bintree")
 	mylink.merge(pkgloc,infloc,myroot,myebuild,cleanup=1)
 
 	if not os.path.exists(infloc+"/RDEPEND"):
Index: portage_exec.py
===================================================================
--- portage_exec.py	(revision 2199)
+++ portage_exec.py	(working copy)
@@ -11,11 +11,8 @@
 
 try:
 	import resource
-	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
-except SystemExit, e:
-	raise
-except:
-	# hokay, no resource module.
+	max_fd_limit=resource.getrlimit(resource.RLIMIT_NOFILE)[1]
+except ImportError:
 	max_fd_limit=256
 
 spawned_pids = []
@@ -23,12 +20,9 @@
 	global spawned_pids
 	while spawned_pids:
 		pid = spawned_pids.pop()
-		try:
-			os.kill(pid,SIGKILL)
-		except SystemExit, e:
-			raise
-		except:
-			pass
+		if os.waitpid(pid,os.WNOHANG) == (0,0):
+			os.kill(pid,signal.SIGKILL)
+			os.waitpid(pid,0)
 atexit.register(cleanup)
 
 from portage_const import BASH_BINARY,SANDBOX_BINARY,SANDBOX_PIDS_FILE
@@ -67,6 +61,7 @@
 
 # base spawn function
 def spawn(mycommand,env={},opt_name=None,fd_pipes=None,returnpid=False,uid=None,gid=None,groups=None,umask=None,logfile=None,path_lookup=True):
+	global spawned_pids
 	if type(mycommand)==types.StringType:
 		mycommand=mycommand.split()
 	myc = mycommand[0]
@@ -76,21 +71,15 @@
 		myc = find_binary(myc)
 		if myc == None:
 			return None
-		
+
+	if fd_pipes is None:
+		fd_pipes = {0:0, 1:1, 2:2}
+
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
-		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:1,2:2}))
-		retval=os.waitpid(mypid[-1],os.WNOHANG)[1]
-		if retval != 0:
-			# he's dead jim.
-			if (retval & 0xff)==0:
-				return (retval >> 8) # exit code
-			else:
-				return ((retval & 0xff) << 8) # signal
-		if not fd_pipes:
-			fd_pipes={}
-			fd_pipes[0] = 0
+		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]}))
+		os.close(pr)
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
 		
@@ -100,41 +89,16 @@
 	myargs.extend(mycommand[1:])
 	mypid.append(os.fork())
 	if mypid[-1] == 0:
-		# this may look ugly, but basically it moves file descriptors around to ensure no
-		# handles that are needed are accidentally closed during the final dup2 calls.
-		trg_fd=[]
-		if type(fd_pipes)==types.DictType:
-			src_fd=[]
-			k=fd_pipes.keys()
-			k.sort()
-			for x in k:
-				trg_fd.append(x)
-				src_fd.append(fd_pipes[x])
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] == src_fd[x]:
-					continue
-				if trg_fd[x] in src_fd[x+1:]:
-					new=os.dup2(trg_fd[x],max(src_fd) + 1)
-					os.close(trg_fd[x])
-					try:
-						while True: 
-							src_fd[s.index(trg_fd[x])]=new
-					except SystemExit, e:
-						raise
-					except:
-						pass
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] != src_fd[x]:
-					os.dup2(src_fd[x], trg_fd[x])
-		else:
-			trg_fd=[0,1,2]
-		for x in range(0,max_fd_limit):
-			if x not in trg_fd:
-				try: 
+		my_pipes = {}
+		for x in fd_pipes:
+			my_pipes[x] = os.dup(fd_pipes[x])
+		for x in my_pipes:
+			os.dup2(my_pipes[x], x)
+		for x in range(max_fd_limit):
+			if x not in my_pipes:
+				try:
 					os.close(x)
-				except SystemExit, e:
-					raise
-				except:
+				except OSError:
 					pass
 		# note this order must be preserved- can't change gid/groups if you change uid first.
 		if gid:
@@ -145,50 +109,29 @@
 			os.setuid(uid)
 		if umask:
 			os.umask(umask)
-		try:
-			# XXX: We would do this to stop ebuild.sh from getting any
-			# XXX: output, and consequently, we'd get to handle the sigINT.
-			#os.close(sys.stdin.fileno())
-			pass
-		except SystemExit, e:
-			raise
-		except:
-			pass
 
 		try:
-			#print "execing", myc, myargs
 			os.execve(myc,myargs,env)
-		except SystemExit, e:
-			raise
 		except Exception, e:
-			raise str(e)+":\n   "+myc+" "+string.join(myargs)
-		# If the execve fails, we need to report it, and exit
-		# *carefully* --- report error here
-		os._exit(1)
-		sys.exit(1)
-		return # should never get reached
+			print str(e)+":\n   "+myc+" "+string.join(myargs)
+			os._exit(1)
 
 	if logfile:
-		os.close(pr)
 		os.close(pw)
 	
 	if returnpid:
-		global spawned_pids
 		spawned_pids.append(mypid[-1])
 		return mypid
 	while len(mypid):
-		retval=os.waitpid(mypid[-1],0)[1]
+		pid = mypid.pop(0)
+		retval=os.waitpid(pid,0)[1]
+		if pid in spawned_pids:
+			spawned_pids.remove(pid)
 		if retval != 0:
-			for x in mypid[0:-1]:
-				try:
-					os.kill(x,signal.SIGTERM)
-					if os.waitpid(x,os.WNOHANG)[1] == 0:
-						# feisty bugger, still alive.
-						os.kill(x,signal.SIGKILL)
+			for x in mypid:
+				if os.waitpid(x,os.WNOHANG) == (0,0):
+					os.kill(x,signal.SIGKILL)
 					os.waitpid(x,0)
-				except OSError, oe:
-					if oe.errno not in (10,3):
-						raise oe
 			
 			# at this point we've killed all other kid pids generated via this call.
 			# return now.
@@ -197,8 +140,6 @@
 				return (retval >> 8) # return exit code
 			else:
 				return ((retval & 0xff) << 8) # interrupted by signal
-		else:
-			mypid.pop(-1)
 	return 0
 
 def find_binary(myc):
@@ -211,5 +152,3 @@
 			return "%s/%s" % (x,myc)
 
 	return None
-
-

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message)
  2005-10-29 17:32   ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
@ 2005-10-29 17:38     ` Jason Stubbs
  0 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-29 17:38 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 1473 bytes --]

On Sunday 30 October 2005 02:32, Jason Stubbs wrote:
> On Saturday 29 October 2005 19:34, Jason Stubbs wrote:
> > On Sunday 23 October 2005 15:45, Jason Stubbs wrote:
> > > Commented on the bug due to reasoning behind this patch. Essentially,
> > > SIGTERM is sent to "tee", a WNOHANG waitpid() is performed followed by
> > > SIGKILL if it hasn't exited. So if tee doesn't exit immediately upon
> > > getting the SIGTERM, its buffers won't get a chance to get to disk due
> > > to it being killed. This patch simply adds a 1 second window between
> > > the SIGTERM and the SIGKILL.
> >
> > Per discussion on the bug, the problem is actually that tee shouldn't be
> > sent a SIGTERM or SIGKILL at all. The decided solution was to wait on tee
> > rather ebuild. I went a bit further than that though and cleaned up most
> > of spawn().
>
> Here's the full patch after incorporating feedback. It should be fairly
> easy to follow given the other explanations. The only change beyond what's
> already been gone over is that spawn() is now removing pids from
> spawned_pids as it cleans them up.

02:30.. Brain is starting to stop working. Here's a patch of _just_ 
portage_exec.py. With regard to splitting up patches, I agree it's much 
easier to parse a complete patch when familiar with the code. When one isn't 
familiar with the code however... Splitting it up a bit makes explaining the 
whys easier too - as well as keeping discussion threads seperate.

--
Jason Stubbs

[-- Attachment #2: exec-cleanup.patch --]
[-- Type: text/x-diff, Size: 4994 bytes --]

Index: pym/portage_exec.py
===================================================================
--- pym/portage_exec.py	(revision 2199)
+++ pym/portage_exec.py	(working copy)
@@ -11,11 +11,8 @@
 
 try:
 	import resource
-	max_fd_limit=resource.getrlimit(RLIMIT_NOFILE)
-except SystemExit, e:
-	raise
-except:
-	# hokay, no resource module.
+	max_fd_limit=resource.getrlimit(resource.RLIMIT_NOFILE)[1]
+except ImportError:
 	max_fd_limit=256
 
 spawned_pids = []
@@ -23,12 +20,9 @@
 	global spawned_pids
 	while spawned_pids:
 		pid = spawned_pids.pop()
-		try:
-			os.kill(pid,SIGKILL)
-		except SystemExit, e:
-			raise
-		except:
-			pass
+		if os.waitpid(pid,os.WNOHANG) == (0,0):
+			os.kill(pid,signal.SIGKILL)
+			os.waitpid(pid,0)
 atexit.register(cleanup)
 
 from portage_const import BASH_BINARY,SANDBOX_BINARY,SANDBOX_PIDS_FILE
@@ -67,6 +61,7 @@
 
 # base spawn function
 def spawn(mycommand,env={},opt_name=None,fd_pipes=None,returnpid=False,uid=None,gid=None,groups=None,umask=None,logfile=None,path_lookup=True):
+	global spawned_pids
 	if type(mycommand)==types.StringType:
 		mycommand=mycommand.split()
 	myc = mycommand[0]
@@ -76,21 +71,15 @@
 		myc = find_binary(myc)
 		if myc == None:
 			return None
-		
+
+	if fd_pipes is None:
+		fd_pipes = {0:0, 1:1, 2:2}
+
 	mypid=[]
 	if logfile:
 		pr,pw=os.pipe()
-		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:1,2:2}))
-		retval=os.waitpid(mypid[-1],os.WNOHANG)[1]
-		if retval != 0:
-			# he's dead jim.
-			if (retval & 0xff)==0:
-				return (retval >> 8) # exit code
-			else:
-				return ((retval & 0xff) << 8) # signal
-		if not fd_pipes:
-			fd_pipes={}
-			fd_pipes[0] = 0
+		mypid.extend(spawn(('tee','-i','-a',logfile),returnpid=True,fd_pipes={0:pr,1:fd_pipes[1],2:fd_pipes[2]}))
+		os.close(pr)
 		fd_pipes[1]=pw
 		fd_pipes[2]=pw
 		
@@ -100,41 +89,16 @@
 	myargs.extend(mycommand[1:])
 	mypid.append(os.fork())
 	if mypid[-1] == 0:
-		# this may look ugly, but basically it moves file descriptors around to ensure no
-		# handles that are needed are accidentally closed during the final dup2 calls.
-		trg_fd=[]
-		if type(fd_pipes)==types.DictType:
-			src_fd=[]
-			k=fd_pipes.keys()
-			k.sort()
-			for x in k:
-				trg_fd.append(x)
-				src_fd.append(fd_pipes[x])
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] == src_fd[x]:
-					continue
-				if trg_fd[x] in src_fd[x+1:]:
-					new=os.dup2(trg_fd[x],max(src_fd) + 1)
-					os.close(trg_fd[x])
-					try:
-						while True: 
-							src_fd[s.index(trg_fd[x])]=new
-					except SystemExit, e:
-						raise
-					except:
-						pass
-			for x in range(0,len(trg_fd)):
-				if trg_fd[x] != src_fd[x]:
-					os.dup2(src_fd[x], trg_fd[x])
-		else:
-			trg_fd=[0,1,2]
-		for x in range(0,max_fd_limit):
-			if x not in trg_fd:
-				try: 
+		my_pipes = {}
+		for x in fd_pipes:
+			my_pipes[x] = os.dup(fd_pipes[x])
+		for x in my_pipes:
+			os.dup2(my_pipes[x], x)
+		for x in range(max_fd_limit):
+			if x not in my_pipes:
+				try:
 					os.close(x)
-				except SystemExit, e:
-					raise
-				except:
+				except OSError:
 					pass
 		# note this order must be preserved- can't change gid/groups if you change uid first.
 		if gid:
@@ -145,50 +109,29 @@
 			os.setuid(uid)
 		if umask:
 			os.umask(umask)
-		try:
-			# XXX: We would do this to stop ebuild.sh from getting any
-			# XXX: output, and consequently, we'd get to handle the sigINT.
-			#os.close(sys.stdin.fileno())
-			pass
-		except SystemExit, e:
-			raise
-		except:
-			pass
 
 		try:
-			#print "execing", myc, myargs
 			os.execve(myc,myargs,env)
-		except SystemExit, e:
-			raise
 		except Exception, e:
-			raise str(e)+":\n   "+myc+" "+string.join(myargs)
-		# If the execve fails, we need to report it, and exit
-		# *carefully* --- report error here
-		os._exit(1)
-		sys.exit(1)
-		return # should never get reached
+			print str(e)+":\n   "+myc+" "+string.join(myargs)
+			os._exit(1)
 
 	if logfile:
-		os.close(pr)
 		os.close(pw)
 	
 	if returnpid:
-		global spawned_pids
 		spawned_pids.append(mypid[-1])
 		return mypid
 	while len(mypid):
-		retval=os.waitpid(mypid[-1],0)[1]
+		pid = mypid.pop(0)
+		retval=os.waitpid(pid,0)[1]
+		if pid in spawned_pids:
+			spawned_pids.remove(pid)
 		if retval != 0:
-			for x in mypid[0:-1]:
-				try:
-					os.kill(x,signal.SIGTERM)
-					if os.waitpid(x,os.WNOHANG)[1] == 0:
-						# feisty bugger, still alive.
-						os.kill(x,signal.SIGKILL)
+			for x in mypid:
+				if os.waitpid(x,os.WNOHANG) == (0,0):
+					os.kill(x,signal.SIGKILL)
 					os.waitpid(x,0)
-				except OSError, oe:
-					if oe.errno not in (10,3):
-						raise oe
 			
 			# at this point we've killed all other kid pids generated via this call.
 			# return now.
@@ -197,8 +140,6 @@
 				return (retval >> 8) # return exit code
 			else:
 				return ((retval & 0xff) << 8) # interrupted by signal
-		else:
-			mypid.pop(-1)
 	return 0
 
 def find_binary(myc):
@@ -211,5 +152,3 @@
 			return "%s/%s" % (x,myc)
 
 	return None
-
-

^ permalink raw reply	[flat|nested] 24+ messages in thread

* Re: [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message)
  2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
                     ` (7 preceding siblings ...)
  2005-10-29 17:32   ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
@ 2005-10-30  7:03   ` Jason Stubbs
  8 siblings, 0 replies; 24+ messages in thread
From: Jason Stubbs @ 2005-10-30  7:03 UTC (permalink / raw
  To: gentoo-portage-dev

[-- Attachment #1: Type: text/plain, Size: 2454 bytes --]

On Saturday 29 October 2005 19:34, Jason Stubbs wrote:
> On Sunday 23 October 2005 15:45, Jason Stubbs wrote:
> > Commented on the bug due to reasoning behind this patch. Essentially,
> > SIGTERM is sent to "tee", a WNOHANG waitpid() is performed followed by
> > SIGKILL if it hasn't exited. So if tee doesn't exit immediately upon
> > getting the SIGTERM, its buffers won't get a chance to get to disk due to
> > it being killed. This patch simply adds a 1 second window between the
> > SIGTERM and the SIGKILL.
>
> Per discussion on the bug, the problem is actually that tee shouldn't be
> sent a SIGTERM or SIGKILL at all. The decided solution was to wait on tee
> rather ebuild. I went a bit further than that though and cleaned up most of
> spawn().

Ok, I'm done. The diff is twice the size of the file itself, so I've just 
attached the file. Beyond the previous patch, it's mostly comment additions 
and whitespace/style changes but there are a few edits beyond that:

* Removed unused imports
* Removed the setting of BASH_ENV from spawn_bash
* Removed the SANDBOX_PID_FILE code from spawn_sandbox
* Removed the unnecessary setting of uid in spawn_sandbox
* Added back exception handling to cleanup() as registered pids may
  be cleaned up outside of spawn()
* Changed cleanup() to send a SIGTERM rather than SIGKILL
* Added os.path.isfile() checks to the can-execute tests so that directories
  aren't incorrectly detected as executable
* Changed pathname lookup failure return from None to -1 because
  equatable-to-False returns from spawn() usually indicate success
* Raise a ValueError if a logfile is specified but stdout/stderr have not
* Moved the os.fork()==0 code into a separate function so that it can be
  easily wrapped with a try/except
* Changed the exception reporting to use sys.stderr.write() rather than print
* Moved opt_name and myargs setup into the os.fork()==0 branch as it's not
  used outside of it
* Add/remove spawned pids to/from spawned_pids regardless of whether a
  returnpid was specified
* Changed spawn() to send a SIGTERM rather than SIGKILL when the pipe fails.
* Renamed myc to binary
* Renamed mypid to mypids

Out of those changes, the SIGKILL -> SIGTERM will probably be questioned. 
SIGKILL just seems too heavy handed - especially in the cleanup() function. 
The main thing that worries me is that anything that has write buffers open 
will likely corrupt whatever it was writing to.

--
Jason Stubbs

[-- Attachment #2: portage_exec.py --]
[-- Type: application/x-python, Size: 6078 bytes --]

^ permalink raw reply	[flat|nested] 24+ messages in thread

end of thread, other threads:[~2005-10-30  7:03 UTC | newest]

Thread overview: 24+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-10-23  6:45 [gentoo-portage-dev] [Bug 104705] emerge doesn't print complete error message Jason Stubbs
2005-10-29 10:34 ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
2005-10-29 10:41   ` [gentoo-portage-dev] [1/7] portage_exec cleanups Jason Stubbs
2005-10-29 10:44   ` [gentoo-portage-dev] [2/7] " Jason Stubbs
2005-10-29 10:49   ` [gentoo-portage-dev] [3/7] " Jason Stubbs
2005-10-29 14:22     ` Brian Harring
2005-10-29 15:06       ` Jason Stubbs
2005-10-29 15:17         ` Brian Harring
2005-10-29 10:52   ` [gentoo-portage-dev] [4/7] " Jason Stubbs
2005-10-29 11:03     ` Jason Stubbs
2005-10-29 10:56   ` [gentoo-portage-dev] [5/7] " Jason Stubbs
2005-10-29 15:25     ` Jason Stubbs
2005-10-29 16:04       ` Brian Harring
2005-10-29 16:40         ` Jason Stubbs
2005-10-29 17:17           ` Jason Stubbs
2005-10-29 17:17           ` Brian Harring
2005-10-29 17:28             ` Jason Stubbs
2005-10-29 17:20           ` Jason Stubbs
2005-10-29 17:28             ` Brian Harring
2005-10-29 10:57   ` [gentoo-portage-dev] [6/7] " Jason Stubbs
2005-10-29 11:01   ` [gentoo-portage-dev] [7/7] " Jason Stubbs
2005-10-29 17:32   ` [gentoo-portage-dev] [0/7] portage_exec cleanups (WAS: [Bug 104705] emerge doesn't print complete error message) Jason Stubbs
2005-10-29 17:38     ` Jason Stubbs
2005-10-30  7:03   ` Jason Stubbs

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox