public inbox for gentoo-dev@lists.gentoo.org
* [gentoo-dev] Racing in init scripts
@ 2003-10-15  1:36 Eric Sammer
  2003-10-15  8:21 ` Paul de Vrieze
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Eric Sammer @ 2003-10-15  1:36 UTC
  To: gentoo-dev

While not exactly a "show stopper," I find that a number of the init 
scripts (namely apache, bind, and maybe cyrus, IIRC) race too quickly 
during a 'restart.'

For instance, my nameservers (all gentoo) with more than ~12 zones take 
a second to stop listening on the interfaces and close the logs. Before 
they stop, init has already tried to start named again. You see the problem.

This leads to failed starts which always makes me panic for a second 
before I realize it just ran too fast. Throwing a 'sleep' in there 
doesn't always seem like a good idea because the shutdown time is 
variable and putting a 'sleep 5' between start and stop makes things 
annoying.

Just food for thought, I suppose. Anyone else notice this? (FWIW, I've 
seen this on almost every other linux distro out there as well, but 
fixing it wouldn't keep me up at night.)

TIA.
-- 
Eric Sammer
eric@ineoconcepts.com
http://www.ineoconcepts.com


--
gentoo-dev@gentoo.org mailing list



* Re: [gentoo-dev] Racing in init scripts
  2003-10-15  1:36 [gentoo-dev] Racing in init scripts Eric Sammer
@ 2003-10-15  8:21 ` Paul de Vrieze
  2003-10-15  8:42   ` Eric Sammer
  2003-10-15 13:01 ` Stroller
  2003-10-15 22:11 ` [gentoo-dev] " Charlie C
  2 siblings, 1 reply; 7+ messages in thread
From: Paul de Vrieze @ 2003-10-15  8:21 UTC
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 15 October 2003 03:36, Eric Sammer wrote:
> While not exactly a "show stopper," I find that a number of the init
> scripts (namely apache, bind, and maybe cyrus, IIRC) race too quickly
> during a 'restart.'
>
> For instance, my nameservers (all gentoo) with more than ~12 zones take
> a second to stop listening on the interfaces and close the logs. Before
> they stop, init has already tried to start named again. You see the
> problem.
>
> This leads to failed starts which always makes me panic for a second
> before I realize it just ran too fast. Throwing a 'sleep' in there
> doesn't always seem like a good idea because the shutdown time is
> variable and putting a 'sleep 5' between start and stop makes things
> annoying.
>
> Just food for thought, I suppose. Anyone else notice this? (FWIW, I've
> seen this on almost every other linux distro out there as well, but
> fixing it wouldn't keep me up at night.)

In general that status part of the init scripts deserves some fixing too. The 
problem is that this status checks whether the service is supposed to be 
running, not whether it is actually running. If we were to implement a status 
function we could use that for the restart too (only start when the status 
returns that it is actually not running anymore, or after a predetermined 
time has passed, say 10 seconds)

Paul

- -- 
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/jQOFbKx5DBjWFdsRAvlMAKC1LTX2b3bsQ3U1LbEk4kLzoPevyACgpicZ
h6fTTxUdDdKZMtd5ccTCUhs=
=4QMB
-----END PGP SIGNATURE-----



* Re: [gentoo-dev] Racing in init scripts
  2003-10-15  8:21 ` Paul de Vrieze
@ 2003-10-15  8:42   ` Eric Sammer
  2003-10-15 13:11     ` Corvus Corax
  0 siblings, 1 reply; 7+ messages in thread
From: Eric Sammer @ 2003-10-15  8:42 UTC
  To: gentoo-dev

Paul de Vrieze wrote:
> In general that status part of the init scripts deserves some fixing too. The 
> problem is that this status checks whether the service is supposed to be 
> running, not whether it is actually running.

That combined with the race aspect creates the need for 'zap' - 
something that should arguably never be needed. It would be nice to see 
that go away (the need for it, I mean).

> If we were to implement a status 
> function we could use that for the restart too (only start when the status 
> returns that it is actually not running anymore, or after a predetermined 
> time has passed, say 10 seconds)

That would be nice. Of course, this is very application-centric due to 
the numerous methods of defining "running" - a pid file, a process being 
up, etc.

I suppose if it were to be abstracted by a function call where the app 
specific work can be done (as I believe it to be now), it would be 
feasible. I think it's a matter of it bothering enough people before it 
gets looked at. The problem exists more so in potentia (unless people 
are restarting things automagically) than in normal day to day operation.
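
As a rough illustration of the idea discussed above (start again only once the service is really gone, or after a fixed timeout), here is a minimal sketch in plain sh. It assumes "running" is defined by a pid file; the function names is_running and wait_for_stop are hypothetical and are not part of any existing init script. The 10 second default comes from the suggestion quoted above.

```shell
# Hypothetical helpers, sketched for illustration only.
# Assumes the service writes a pid file and that the caller is root
# (kill -0 fails for processes owned by another user).

# is_running PIDFILE: succeed only if the pid file names a live process.
is_running() {
    pidfile=$1
    [ -r "$pidfile" ] || return 1
    pid=$(cat "$pidfile" 2>/dev/null) || return 1
    [ -n "$pid" ] || return 1
    # Signal 0 sends nothing; it only checks that the process exists.
    kill -0 "$pid" 2>/dev/null
}

# wait_for_stop PIDFILE [TIMEOUT]: poll once per second until the process
# is gone. Returns 1 if it is still alive after TIMEOUT seconds (default 10).
wait_for_stop() {
    pidfile=$1
    timeout=${2:-10}
    waited=0
    while is_running "$pidfile"; do
        [ "$waited" -ge "$timeout" ] && return 1
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}
```

A restart could then call the stop step, run wait_for_stop on the service's pid file, and only then call the start step. A service that defines "running" some other way would need its own is_running.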

I did find one instance where this is problematic: When using logrotate 
to handle log files, some apps need to be restarted to create and start 
logging again. In this case, one would be shooting themselves in the 
foot by using the init scripts in their current incarnation - your 
services wouldn't come back up. This would be A Very Bad Thing(tm).

-- 
Eric Sammer
eric@ineoconcepts.com
http://www.ineoconcepts.com



* Re: [gentoo-dev] Racing in init scripts
  2003-10-15  1:36 [gentoo-dev] Racing in init scripts Eric Sammer
  2003-10-15  8:21 ` Paul de Vrieze
@ 2003-10-15 13:01 ` Stroller
  2003-10-15 22:11 ` [gentoo-dev] " Charlie C
  2 siblings, 0 replies; 7+ messages in thread
From: Stroller @ 2003-10-15 13:01 UTC
  To: gentoo-dev


On 15 Oct 2003, at 2:36 am, Eric Sammer wrote:

> While not exactly a "show stopper," I find that a number of the init 
> scripts (namely apache, bind, and maybe cyrus, IIRC) race too quickly 
> during a 'restart.'... Anyone else notice this?

Yes. It's annoying, but not enough to make me want to fix it.

Stroller.



* Re: [gentoo-dev] Racing in init scripts
  2003-10-15  8:42   ` Eric Sammer
@ 2003-10-15 13:11     ` Corvus Corax
  2003-10-15 13:35       ` Paul de Vrieze
  0 siblings, 1 reply; 7+ messages in thread
From: Corvus Corax @ 2003-10-15 13:11 UTC
  To: gentoo-dev

On Wed, 15 Oct 2003 04:42:29 -0400
Eric Sammer <eric@ineoconcepts.com> wrote:

> I did find one instance where this is problematic: When using logrotate 
> to handle log files, some apps need to be restarted to create and start 
> logging again. In this case, one would be shooting themselves in the 
> foot by using the init scripts in their current incarnation - your 
> services wouldn't come back up. This would be A Very Bad Thing(tm).
> 
> -- 
> Eric Sammer
> eric@ineoconcepts.com
> http://www.ineoconcepts.com
> 

This is Corvus Corax. I unfortunately stepped into exactly that "Very Bad Thing"(tm)
when porting some automatic re-connection scripts
(where net-dependent services had to be partially restarted)
to a slower machine, where "restart" was just too fast for the services to come down.

I had to do it via "xxx stop && sleep 5 && xxx start".

The problem is that a "net.eth0 restart", for example, would also restart all net-dependent services.
With a "stop -- start" they don't come up again automatically, so I have to handle each running service
manually. Status confirmed as "very evil, very nasty thing" (tm).


CvC





* Re: [gentoo-dev] Racing in init scripts
  2003-10-15 13:11     ` Corvus Corax
@ 2003-10-15 13:35       ` Paul de Vrieze
  0 siblings, 0 replies; 7+ messages in thread
From: Paul de Vrieze @ 2003-10-15 13:35 UTC
  To: gentoo-dev

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wednesday 15 October 2003 15:11, Corvus Corax wrote:
> On Wed, 15 Oct 2003 04:42:29 -0400
>
> This is Corvus Corax. I unfortunately stepped into exactly that "Very Bad
> Thing"(tm) when porting some automatic re-connection scripts
> (where net-dependent services had to be partially restarted)
> to a slower machine, where "restart" was just too fast for the services to
> come down.
>
> I had to do it via "xxx stop && sleep 5 && xxx start".
>
> The problem is that a "net.eth0 restart", for example, would also restart
> all net-dependent services. With a "stop -- start" they don't come up again
> automatically, so I have to handle each running service manually. Status
> confirmed as "very evil, very nasty thing" (tm).
>

What you could do is add some check to the start script and have it wait in 
that case. I will however try to look into fixing this "feature" of the init 
scripts by adding some actual status support (so that status does not report 
apache as running after it has crashed).

Paul

- -- 
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.3 (GNU/Linux)

iD8DBQE/jU1PbKx5DBjWFdsRAmBeAJ9EugA1h5IwvB1VZ4C0BLDco+rZOQCffl5Y
PRnPi6SWBJPpeXLxSx7qKzw=
=Iuo/
-----END PGP SIGNATURE-----



* [gentoo-dev] Re: Racing in init scripts
  2003-10-15  1:36 [gentoo-dev] Racing in init scripts Eric Sammer
  2003-10-15  8:21 ` Paul de Vrieze
  2003-10-15 13:01 ` Stroller
@ 2003-10-15 22:11 ` Charlie C
  2 siblings, 0 replies; 7+ messages in thread
From: Charlie C @ 2003-10-15 22:11 UTC
  To: gentoo-dev

It is indeed a "very evil, very nasty thing", as bugs like this tend to 
end up "WORKSFORME". See #22421, for example.

In almost all cases, I suspect, the solution is a "--retry -TERM/60", or 
similar, in the "start-stop-daemon --stop" line....
This should probably be standard practice for all runscripts which use 
start-stop-daemon. My own preference is to explicitly specify the signal 
and timeout, but it can be simplified (... do the man thing!).
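
For illustration, here is a minimal sketch of what such a stop() function might look like in a runscript. The pid file path is an assumption and must match whatever the daemon actually writes; ebegin and eend are the usual runscript helper functions. The schedule "-TERM/60" is the one suggested above: send SIGTERM, then wait up to 60 seconds for the process to exit before returning.

```shell
# Fragment of a runscript, sketched for illustration only.
# ASSUMPTION: the pid file path below; adjust it to your daemon's setup.
PIDFILE=/var/run/named/named.pid

stop() {
    ebegin "Stopping named"
    # --retry makes start-stop-daemon wait for the process to exit
    # (up to 60 seconds after SIGTERM) instead of returning at once,
    # so a following start() does not race the shutdown.
    start-stop-daemon --stop --quiet --pidfile "${PIDFILE}" \
        --retry -TERM/60
    eend $?
}
```

With this in place, 'restart' only reaches start() after the old process has exited or the timeout has expired, in which case start-stop-daemon reports a failure through its exit status.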

There are many, many, runscripts which (theoretically) will suffer from 
a race like this. One potential "show stopper" is sshd...I wonder if 
anyone has been locked out of a machine because of that one?
I tried, but was unable, to get "/etc/init.d/sshd restart" to fail - 
perhaps the problem only bites when the daemon or process in question 
has a trap for SIGTERM?

Anyway, for an example, please see 
http://bugs.gentoo.org/show_bug.cgi?id=31125 (from my own fair hand, 
this very morning), or indeed #29932 or #28345.

Having a rather slow (and overloaded) machine, I've probably seen these 
race conditions more than most.

Best wishes to all devs, and everyone else,

Charlie

-----------------
Eric Sammer wrote:
> While not exactly a "show stopper," I find that a number of the init 
> scripts (namely apache, bind, and maybe cyrus, IIRC) race too quickly 
> during a 'restart.'
> 



