* RE: [gentoo-dev] init script guidelines
@ 2005-07-19 18:08 Eric Brown
2005-07-19 18:40 ` Chris Gianelloni
2005-07-19 20:03 ` Mike Frysinger
0 siblings, 2 replies; 19+ messages in thread
From: Eric Brown @ 2005-07-19 18:08 UTC (permalink / raw
To: gentoo-dev
A few responses:
(Please forgive the lack of normal formatting)
1) To Chris Gianelloni
I really do agree that it's silly for a daemon to lie about its
initialization status. However, after actually having taken some of
these issues upstream (in particular Apache 1.3), I realized that the
upstream devs don't really consider these bugs all of the time. In
Apache's case, it's a bug, but one that's never going to be fixed in 1.3
(2.0 supposedly fixes it). I think there was one case where pure-ftpd
actually fixed one of these bugs when I reported it.
My point is that Snort and Apache are not alone in this, so I suppose
quite a few upstream developers just disagree with us on what proper
initialization means. Why should our users suffer?
2) To Mike Frysinger
Most of these services are pretty common, and the suckage is usually
limited to this area of initialization =)
I do see how timing could be an issue for sleeps, but I would personally
much rather have a timeout variable in conf.d somewhere rather than no
check at all.
I would also much rather have a simple check that itself produces some
false positives (which is what the init scripts are doing now),
as long as it cuts down on the total number of false positives.
3) To anyone else
So far it looks like developer awareness is the best we can do?
What about making standard functions or check services available to help
developers who are aware and need to use them?
Even if developers just become willing to add checks, that would be
great. Right now most devs simply rely on upstream (although I think
upstream should certainly be a part of each case).
--
gentoo-dev@gentoo.org mailing list
* RE: [gentoo-dev] init script guidelines
2005-07-19 18:08 [gentoo-dev] init script guidelines Eric Brown
@ 2005-07-19 18:40 ` Chris Gianelloni
2005-07-19 20:43 ` Michael Cummings
2005-07-19 21:53 ` Martin Schlemmer
2005-07-19 20:03 ` Mike Frysinger
1 sibling, 2 replies; 19+ messages in thread
From: Chris Gianelloni @ 2005-07-19 18:40 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 667 bytes --]
On Tue, 2005-07-19 at 14:08 -0400, Eric Brown wrote:
> My point is that Snort and Apache are not alone in this, so I suppose
> quite a few upstream developers just disagree with us on what proper
> initialization means. Why should our users suffer?
They shouldn't, but that doesn't mean implementing some half-baked hack
to resolve the situation. It might be better to instead patch the
daemon in question and send the patches upstream. Upstream developers
(usually) are much more willing to make changes when you've done the
work for them... ;]
--
Chris Gianelloni
Release Engineering - Strategic Lead/QA Manager
Games - Developer
Gentoo Linux
* Re: [gentoo-dev] init script guidelines
2005-07-19 18:40 ` Chris Gianelloni
@ 2005-07-19 20:43 ` Michael Cummings
2005-07-19 21:07 ` Chris Gianelloni
2005-07-19 21:53 ` Martin Schlemmer
1 sibling, 1 reply; 19+ messages in thread
From: Michael Cummings @ 2005-07-19 20:43 UTC (permalink / raw
To: gentoo-dev
not to detract from the discussion, but...anyone else notice this?
On Tue, 19 Jul 2005 14:40:01 -0400
Chris Gianelloni <wolf31o2@gentoo.org> wrote:
> They shouldn't, but that doesn't mean implementing some half-baked
> hack to resolve the situation. It might be better to instead patch
> the daemon in question and send the patches upstream. Upstream
> developers (usually) are much more willing to make changes when you've
> done the work for them... ;]
>
On Tue, 19 Jul 2005 15:39:16 -0400
"Eric Brown" <ebrown@magbank.com> wrote:
>
> They shouldn't, but that doesn't mean implementing some half-baked
> hack to resolve the situation. It might be better to instead patch
> the daemon in question and send the patches upstream. Upstream
> developers (usually) are much more willing to make changes when you've
> done the work for them... ;]
>
I'm beginning to suspect Eric and Chris are the same person. Prove they
aren't - show evidence of them independently in the same room at the
same time ;)
(and being a mid-stream developer, I know *I* like working patches more
than 'fix your junk, it broke' reports)
* Re: [gentoo-dev] init script guidelines
2005-07-19 20:43 ` Michael Cummings
@ 2005-07-19 21:07 ` Chris Gianelloni
0 siblings, 0 replies; 19+ messages in thread
From: Chris Gianelloni @ 2005-07-19 21:07 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 1541 bytes --]
On Tue, 2005-07-19 at 16:43 -0400, Michael Cummings wrote:
> not to detract from the discussion, but...anyone else notice this?
He quoted me. His text was above mine.
People have met me. They know I exist. Though Eric might be a figment
of my shattered subconscious psyche. Who knows? :P
> On Tue, 19 Jul 2005 14:40:01 -0400
> Chris Gianelloni <wolf31o2@gentoo.org> wrote:
>
> > They shouldn't, but that doesn't mean implementing some half-baked
> > hack to resolve the situation. It might be better to instead patch
> > the daemon in question and send the patches upstream. Upstream
> > developers (usually) are much more willing to make changes when you've
> > done the work for them... ;]
> >
>
> On Tue, 19 Jul 2005 15:39:16 -0400
> "Eric Brown" <ebrown@magbank.com> wrote:
> >
> > They shouldn't, but that doesn't mean implementing some half-baked
> > hack to resolve the situation. It might be better to instead patch
> > the daemon in question and send the patches upstream. Upstream
> > developers (usually) are much more willing to make changes when you've
> > done the work for them... ;]
> >
>
> I'm beginning to suspect Eric and Chris are the same person. Prove they
> aren't - show evidence of them independently in the same room at the
> same time ;)
>
> (and being a mid-stream developer, I know *I* like working patches more
> than 'fix your junk, it broke' reports)
--
Chris Gianelloni
Release Engineering - Strategic Lead/QA Manager
Games - Developer
Gentoo Linux
* RE: [gentoo-dev] init script guidelines
2005-07-19 18:40 ` Chris Gianelloni
2005-07-19 20:43 ` Michael Cummings
@ 2005-07-19 21:53 ` Martin Schlemmer
2005-07-20 6:30 ` Roy Marples
1 sibling, 1 reply; 19+ messages in thread
From: Martin Schlemmer @ 2005-07-19 21:53 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 1627 bytes --]
On Tue, 2005-07-19 at 14:40 -0400, Chris Gianelloni wrote:
> On Tue, 2005-07-19 at 14:08 -0400, Eric Brown wrote:
> > My point is that Snort and Apache are not alone in this, so I suppose
> > quite a few upstream developers just disagree with us on what proper
> > initialization means. Why should our users suffer?
>
> They shouldn't, but that doesn't mean implementing some half-baked hack
> to resolve the situation. It might be better to instead patch the
> daemon in question and send the patches upstream. Upstream developers
> (usually) are much more willing to make changes when you've done the
> work for them... ;]
>
I know Roy already did the sleep check in rc-services.sh, which is small
and I think fairly acceptable, but like Mike said, you cannot make it
longer and then do it for all services, as some arches are just too slow,
and I'm going to guess that fewer than 10% of services have this issue?
Personally I think the issue should be taken on a per-package basis:
if somebody sees an issue, open a bug against snort/apache/whatever to
add a timeout, and then check some way or other whether it has actually
started.
For the developer awareness issue ... it's not always such an open/shut
case. I can't remember what had this issue, but some daemon only
displayed it on slower boxes, and not the faster ones, so it
really will depend on what type of hardware the developer has.
So yeah, better awareness by adding a section to the developer
manual or something to the test for new developers might help, but it's
not foolproof.
--
Martin Schlemmer
* Re: [gentoo-dev] init script guidelines
2005-07-19 18:08 [gentoo-dev] init script guidelines Eric Brown
2005-07-19 18:40 ` Chris Gianelloni
@ 2005-07-19 20:03 ` Mike Frysinger
1 sibling, 0 replies; 19+ messages in thread
From: Mike Frysinger @ 2005-07-19 20:03 UTC (permalink / raw
To: gentoo-dev
On Tuesday 19 July 2005 02:08 pm, Eric Brown wrote:
> I do see how timing could be an issue for sleeps, but I would personally
> much rather have a timeout variable in conf.d somewhere rather than no
> check at all.
because you're only looking at one side of the race condition
your check goes to sleep for 3 seconds ... then the service starts up but
because it's a slow CPU, it takes 10 seconds to get to the config file
parsing where it fails and exits silently ... when the check wakes back up it
goes 'hey, service is still running, all is good'
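The race described above can be reproduced with a stand-in for the daemon. This is only a minimal sketch: fake_daemon (which "parses its config" for two seconds and then fails) and check_after (which sleeps once and then tests the PID with kill -0) are hypothetical names, not part of baselayout.

```shell
# Stand-in for a daemon that starts fine, then fails while parsing its config.
# fake_daemon and check_after are hypothetical, for illustration only.
fake_daemon() {
    sleep 2      # a slow box still working its way to the config file
    exit 1       # ...where it fails and exits silently
}

# Sleep once for $1 seconds, then report whether PID $2 is still alive.
check_after() {
    sleep "$1"
    kill -0 "$2" 2>/dev/null
}

fake_daemon &
pid=$!

if check_after 1 "${pid}"; then
    early="running"      # the check woke up before the failure happened
else
    early="dead"
fi

# Collect the real exit status of the "daemon".
status=0
wait "${pid}" || status=$?

echo "check after 1s said: ${early}; daemon exit status: ${status}"
```

Making the sleep longer only moves the window; any fixed value can still be shorter than the time a slow machine needs to reach the failing step.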
-mike
* RE: [gentoo-dev] init script guidelines
@ 2005-07-19 19:39 Eric Brown
0 siblings, 0 replies; 19+ messages in thread
From: Eric Brown @ 2005-07-19 19:39 UTC (permalink / raw
To: gentoo-dev
Not everyone can patch them; more people would be capable of writing
half-baked hacks that resolve most of the issues.
Anyway I guess the new baselayout sounds promising here.
> My point is that Snort and Apache are not alone in this, so I suppose
> quite a few upstream developers just disagree with us on what proper
> initialization means. Why should our users suffer?
They shouldn't, but that doesn't mean implementing some half-baked hack
to resolve the situation. It might be better to instead patch the
daemon in question and send the patches upstream. Upstream developers
(usually) are much more willing to make changes when you've done the
work for them... ;]
* [gentoo-dev] init script guidelines
@ 2005-07-19 16:42 Eric Brown
2005-07-19 17:22 ` Chris Gianelloni
` (3 more replies)
0 siblings, 4 replies; 19+ messages in thread
From: Eric Brown @ 2005-07-19 16:42 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 2298 bytes --]
Services that use Gentoo init scripts often report a status of [started] or
[OK] even though they fail to start. The most recent bug like this that I've
found is with snort. If you have a bad rule, snort will initialize, the
rc-scripts will give it an [OK] status, and then it will die once it parses
the rules.
The real problem is not that the daemons don't return errors, but that our
init scripts do not make reasonable attempts to verify service startup. If a
Gentoo init script claims that a service started, it should make an effort to
check that the processes are actually running shortly after the script is
run, even if start-stop-daemon says the parent process initialized. Relying
on the return value of start-stop-daemon is simply insufficient for some
services.
I am aware that there are services that can monitor the status of other
services (app-admin/mon?) but I think this issue is a little different. If an
ebuild developer is aware of an error condition that can commonly occur
shortly after a daemon initializes, why not attempt to catch those errors?
Most of them could probably be caught by simply checking to see if the
process is still running shortly after the script is run.
I propose increasing developer awareness of this problem, perhaps through
some formal guidelines for ebuild developers. At the very least, I would like
to see these bugs being acknowledged in bugs.gentoo.org instead of getting
the same old upstream/it's not our fault response. We are responsible for our
init scripts, and they are important to our users.
I have 2 ideas for the actual implementation:
1) Some kind of check() function in the init.d script, or a generic check()
function that just checks with ps | grep. This might typically be called
after having the init script sleep for a certain amount of time.
2) Some kind of special init script that checks registered daemons after all
services have started (i.e. it depends on all daemons, or they are put into
its config file). With this scheme we could avoid excessive sleeping during
startup (to keep it fast), and perhaps even keep using service-specific
check() functions.
Does anyone else think this idea is worth looking into?
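Idea 1 might look roughly like the following. This is a sketch under assumptions, not an existing baselayout function: check_started and the CHECK_DELAY setting are hypothetical names, and it tests the PID from a pidfile with kill -0 rather than parsing ps | grep output.

```shell
# Hypothetical delay in seconds; in practice this might be set in conf.d.
CHECK_DELAY="${CHECK_DELAY:-1}"

# check_started PIDFILE
# Wait CHECK_DELAY seconds, then succeed only if the PID recorded in
# PIDFILE belongs to a live process. Assumes the caller may signal the
# daemon (init scripts normally run as root).
check_started() {
    pidfile="$1"
    sleep "${CHECK_DELAY}"
    [ -r "${pidfile}" ] || return 1      # no pidfile: treat as failed
    pid=$(cat "${pidfile}")
    [ -n "${pid}" ] || return 1          # empty pidfile: treat as failed
    kill -0 "${pid}" 2>/dev/null         # status 0 only if the PID is alive
}
```

An init script's start() could call this right after its start-stop-daemon --start line and pass the result to eend. It narrows the window for silent failures but, being a single timed check, it cannot close it.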
[-- Attachment #2: Type: text/html, Size: 8233 bytes --]
* Re: [gentoo-dev] init script guidelines
2005-07-19 16:42 Eric Brown
@ 2005-07-19 17:22 ` Chris Gianelloni
2005-07-19 17:45 ` Mike Frysinger
` (2 subsequent siblings)
3 siblings, 0 replies; 19+ messages in thread
From: Chris Gianelloni @ 2005-07-19 17:22 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 4033 bytes --]
On Tue, 2005-07-19 at 12:42 -0400, Eric Brown wrote:
> Services that use Gentoo init scripts often report a status of [started] or
> [OK] even though they fail to start. The most recent bug like this that I've
> found is with snort. If you have a bad rule, snort will initialize, the
> rc-scripts will give it an [OK] status, and then it will die once it parses the
> rules.
So snort shouldn't be giving the OK until it really is OK.
>
> The real problem is not that the daemons don't return errors, but that our init
> scripts do not make reasonable attempts to verify service startup. If a Gentoo
> init script claims that a service started, it should make an effort to check
> that the processes are actually running shortly after the script is run, even if
> start-stop-daemon says the parent process initialized. Relying on the return
> value of start-stop-daemon is simply insufficient for some services.
Not really. An init script is simply a script. It doesn't guarantee
anything other than what the service told it. If a service is returning
status codes when it really hasn't completed its initialization, that is
a bug in that service, not in the init script code. While code might
need to be adjusted in the init script, this will most likely require
patches to the upstream sources.
>
> I am aware that there are services that can monitor the status of other services
> (app-admin/mon?) but I think this issue is a little different. If an ebuild
> developer is aware of an error condition that can commonly occur shortly after a
> daemon initializes, why not attempt to catch those errors? Most of them could
> probably be caught by simply checking to see if the process is still running
> shortly after the script is run.
I agree with you that we should catch the errors, but running another
check is simply a waste of time. The service should not ever show a
completed state until it is completed. It shouldn't ever be like "Yes,
snort worked.......... oh wait, no it didn't." That is even more
confusing for users.
> I propose increasing developer awareness of this problem, perhaps through some
> formal guidelines for ebuild developers. At the very least, I would like to see
> these bugs being acknowledged in bugs.gentoo.org instead of getting the same old
> upstream/it's not our fault response. We are responsible for our init scripts,
> and they are important to our users.
You really need to take this up with the developers in question, as this
is not a global matter, but really a matter with specific packages.
Those are bugs in those packages. If the ebuild maintainers are
refusing to resolve issues in the init scripts, which are definitely
Gentoo's work, please take it up with user relations or attempt to
provide a fix for the problem.
>
> I have 2 ideas for the actual implementation:
>
> 1) Some kind of check() function in the init.d script, or a generic check() function
> that just checks with ps | grep. This might typically be called after having the
> init script sleep for a certain amount of time.
I would object to this. Having a function to check the status of a
service for all of the possible services, when it is only a few that are
showing this error, is a bad idea. It adds extra load on all developers
that have any init scripts, and is unnecessary in most cases.
>
> 2) Some kind of special init script that checks registered daemons after all services
> have started (i.e. it depends on all daemons, or they are put into its config file).
> With this scheme we could avoid excessive sleeping during startup (to keep it fast),
> And perhaps even keep using service specific check() functions
This would require much more knowledge on the end-user's part. Plus, it
will need to be aware of init script dependencies. All in all, it
sounds like a bad patch for a situation.
--
Chris Gianelloni
Release Engineering - Strategic Lead/QA Manager
Games - Developer
Gentoo Linux
* Re: [gentoo-dev] init script guidelines
2005-07-19 16:42 Eric Brown
2005-07-19 17:22 ` Chris Gianelloni
@ 2005-07-19 17:45 ` Mike Frysinger
2005-07-19 18:00 ` Roy Marples
2005-07-19 18:14 ` Francesco R
3 siblings, 0 replies; 19+ messages in thread
From: Mike Frysinger @ 2005-07-19 17:45 UTC (permalink / raw
To: gentoo-dev
On Tuesday 19 July 2005 12:42 pm, Eric Brown wrote:
> The real problem is not that the daemons don't return errors, but that
> our init scripts do not make reasonable attempts to verify service startup.
i'd disagree ... if a service sucks, it sucks
adding some code to try and guess whether the service actually started is a
roundabout (and by no means fool proof) way of doing things ... it may result
in correct results sometimes, but i imagine it'll also be susceptible to
false positives
> If a Gentoo init script claims that a service started, it should make an
> effort to check that the processes are actually running shortly after the
> script is run
how do you define 'short' ? really anything that relies on some timeout value
like this is a flawed design ... just cause your smokin fast amd64 should
complete in .1 seconds doesnt mean my not-very-smokin-fast-at-all arm
netwinder can complete inside of 3 seconds
> Relying on the return
> value of start-stop-daemon is simply insufficient for some services.
then those services should not be using ssd
> I propose increasing developer awareness of this problem, perhaps
> through some formal guidelines for ebuild developers.
this seems to be the only feasible approach (and one i'm all for)
-mike
* Re: [gentoo-dev] init script guidelines
2005-07-19 16:42 Eric Brown
2005-07-19 17:22 ` Chris Gianelloni
2005-07-19 17:45 ` Mike Frysinger
@ 2005-07-19 18:00 ` Roy Marples
2005-07-19 22:16 ` Francesco R
2005-08-23 14:09 ` Paul de Vrieze
2005-07-19 18:14 ` Francesco R
3 siblings, 2 replies; 19+ messages in thread
From: Roy Marples @ 2005-07-19 18:00 UTC (permalink / raw
To: gentoo-dev
On Tue, 2005-07-19 at 12:42 -0400, Eric Brown wrote:
> The real problem is not that the daemons don't return errors, but that our init
> scripts do not make reasonable attempts to verify service startup. If a Gentoo
> init script claims that a service started, it should make an effort to check
> that the processes are actually running shortly after the script is run, even if
> start-stop-daemon says the parent process initialized. Relying on the return
> value of start-stop-daemon is simply insufficient for some services.
I agree.
In fact, rc-services.sh (/lib/rcscripts/sh) has been totally re-written
for the baselayout-1.12.x branch. It now intercepts calls to
start-stop-daemon and checks if the daemon is still active after a
default time of 0.1 (adjustable) seconds. If not, then we assume the
daemon failed. This solves many existing bugs :)
Also, we kill any rogue processes and other such checks when a stop call
to start-stop-daemon is made - which is handy for when asterisk fails to
start and leaves mpg123 processes lying around :)
Check it out when baselayout-1.12.0pre1 hits portage!
Caveat: some init scripts abuse start-stop-daemon. One example is the
courier scripts, which all pass the env program as a daemon. This is easily
worked around, but we fail badly if env then calls a shell script which
in turn launches a daemon. Of all the server stuff I run, only courier
has this issue - but there may be other programs too. Basically
start-stop-daemon should only call daemons!
http://bugs.gentoo.org/show_bug.cgi?id=98745
Roy
* Re: [gentoo-dev] init script guidelines
2005-07-19 18:00 ` Roy Marples
@ 2005-07-19 22:16 ` Francesco R
2005-08-23 14:09 ` Paul de Vrieze
1 sibling, 0 replies; 19+ messages in thread
From: Francesco R @ 2005-07-19 22:16 UTC (permalink / raw
To: gentoo-dev
Roy Marples wrote:
>On Tue, 2005-07-19 at 12:42 -0400, Eric Brown wrote:
>
>
>
>
>>The real problem is not that the daemons don't return errors, but that our init
>>scripts do not make reasonable attempts to verify service startup. If a Gentoo
>>init script claims that a service started, it should make an effort to check
>>that the processes are actually running shortly after the script is run, even if
>>start-stop-daemon says the parent process initialized. Relying on the return
>>value of start-stop-daemon is simply insufficient for some services.
>>
>>
>
>I agree.
>
>In fact, rc-services.sh (/lib/rcscripts/sh) has been totally re-written
>for the baselayout-1.12.x branch. It now intercepts calls to
>start-stop-daemon and checks if the daemon is still active after a
>default time of 0.1 (adjustable) seconds. If not, then we assume the
>daemon failed. This solves many existing bugs :)
>
>Also, we kill any rogue processes and other such checks when a stop call
>to start-stop-daemon is made - which is handy for when asterisk fails to
>start and leaves mpg123 processes lying around :)
>
>Check it out when baselayout-1.12.0pre1 hits portage!
>
>Caveat: some init scripts abuse start-stop-daemon. One example is the
>courier scripts, which all pass the env program as a daemon. This is easily
>worked around, but we fail badly if env then calls a shell script which
>in turn launches a daemon. Of all the server stuff I run, only courier
>has this issue - but there may be other programs too. Basically
>start-stop-daemon should only call daemons!
>
>http://bugs.gentoo.org/show_bug.cgi?id=98745
>
>Roy
>
>
What about defining two additional functions,
check_startup() and check_shutdown(),
intended to be filled in by the package maintainer?
The rc scripts can call these to check whether a service has
started/stopped or not.
If not, they wait and retry until a timeout is reached.
This also opens the road to centralized policies for the waits between
checks, like:
(1,1,1,1,1,1) (1,2,3,4,5,6) (1,2,4,8,16,32) and other nice stuff.
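The retry idea could be sketched as below. Only check_startup comes from the proposal; wait_for_startup and the way the schedule is passed as arguments are assumptions made for illustration.

```shell
# Default hook. A package's init script would override this with a real
# test (for example, probing a socket or a pidfile).
check_startup() {
    return 0
}

# wait_for_startup DELAY...
# Run check_startup once immediately, then once after each DELAY (seconds).
# Returns 0 as soon as a check passes, 1 when the schedule is exhausted.
wait_for_startup() {
    check_startup && return 0
    for delay in "$@"; do
        sleep "${delay}"
        check_startup && return 0
    done
    return 1
}
```

With this shape, wait_for_startup 1 1 1 1 1 1 is the constant policy and wait_for_startup 1 2 4 8 16 32 the doubling one, which gives up after at most 63 seconds of waiting. A central setting could supply the schedule so individual scripts never hard-code it.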
Francesco
* Re: [gentoo-dev] init script guidelines
2005-07-19 18:00 ` Roy Marples
2005-07-19 22:16 ` Francesco R
@ 2005-08-23 14:09 ` Paul de Vrieze
2005-08-31 7:13 ` Roy Marples
1 sibling, 1 reply; 19+ messages in thread
From: Paul de Vrieze @ 2005-08-23 14:09 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 2191 bytes --]
On Tuesday 19 July 2005 20:00, Roy Marples wrote:
> On Tue, 2005-07-19 at 12:42 -0400, Eric Brown wrote:
> > The real problem is not that the daemons don't return errors, but
> > that our init scripts do not make reasonable attempts to verify
> > service startup. If a Gentoo init script claims that a service
> > started, it should make an effort to check that the processes are
> > actually running shortly after the script is run, even if
> > start-stop-daemon says the parent process initialized. Relying on
> > the return value of start-stop-daemon is simply insufficient for some
> > services.
>
> I agree.
>
> In fact, rc-services.sh (/lib/rcscripts/sh) has been totally re-written
> for the baselayout-1.12.x branch. It now intercepts calls to
> start-stop-daemon and checks if the daemon is still active after a
> default time of 0.1 (adjustable) seconds. If not, then we assume the
> daemon failed. This solves many existing bugs :)
>
> Also, we kill any rogue processes and other such checks when a stop
> call to start-stop-daemon is made - which is handy for when asterisk
> fails to start and leaves mpg123 processes lying around :)
>
> Check it out when baselayout-1.12.0pre1 hits portage!
>
> Caveat: some init scripts abuse start-stop-daemon. One example is the
> courier scripts, which all pass the env program as a daemon. This is
> easily worked around, but we fail badly if env then calls a shell
> script which in turn launches a daemon. Of all the server stuff I run,
> only courier has this issue - but there may be other programs too.
> Basically start-stop-daemon should only call daemons!
What I would really like to see in the init system is a way that
initscripts can check whether the services they are responsible for are
still running and then adjust their status accordingly, along with some
nice output. This would then allow the execution of rc-status to give
proper information of actually running daemons, and the "rc" command the
possibility to actually bring online all daemons that should be running.
Paul
--
Paul de Vrieze
Gentoo Developer
Mail: pauldv@gentoo.org
Homepage: http://www.devrieze.net
* Re: [gentoo-dev] init script guidelines
2005-08-23 14:09 ` Paul de Vrieze
@ 2005-08-31 7:13 ` Roy Marples
2005-08-31 8:05 ` Roy Marples
2005-08-31 17:38 ` Roy Marples
0 siblings, 2 replies; 19+ messages in thread
From: Roy Marples @ 2005-08-31 7:13 UTC (permalink / raw
To: gentoo-dev
[-- Attachment #1: Type: text/plain, Size: 1515 bytes --]
On Tue, 2005-08-23 at 16:09 +0200, Paul de Vrieze wrote:
> What I would really like to see in the init system is a way that
> initscripts can check whether the services they are responsible for are
> still running and then adjust their status accordingly, along with some
> nice output. This would then allow the execution of rc-status to give
> proper information of actually running daemons, and the "rc" command the
> possibility to actually bring online all daemons that should be running.
>
> Paul
>
Attached is a patch to baselayout-1.12.0_pre6-r3 that allows this.
Basically when an init script calls start-stop-daemon --start then we
log what it started (and hopefully a pidfile) in
${svcdir}/daemons/${myservice}
When its status is asked for (either init.d/foo status or rc-status)
then we load this daemon file and check to see if the given daemons are
still running. If not then we call init.d/foo stop. We do this instead
of just marking the daemon as stopped in case there is any clean-up code
that needs to be run by the init script.
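In simplified form the status check works something like the sketch below. It is not the attached patch: all_daemons_alive and update_status are hypothetical names, and the state file here holds one pidfile path per line instead of the bash array assignments the patch writes.

```shell
# all_daemons_alive STATEFILE
# Return 0 only if every pidfile listed in STATEFILE names a live process.
all_daemons_alive() {
    statefile="$1"
    [ -f "${statefile}" ] || return 0      # nothing recorded, nothing to check
    while IFS= read -r pidfile; do
        [ -r "${pidfile}" ] || return 1    # pidfile gone: presume daemon dead
        pid=$(cat "${pidfile}")
        kill -0 "${pid}" 2>/dev/null || return 1
    done < "${statefile}"
    return 0
}

# update_status STATEFILE INITSCRIPT
# If any recorded daemon has died, run the init script's stop action so
# that its clean-up code still executes.
update_status() {
    all_daemons_alive "$1" || "$2" stop
}
```

Calling the script's stop action, rather than just flipping a status flag, is what lets a service remove stale pidfiles or helper processes after its daemon has vanished.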
For this to work well, start-stop-daemon needs to be used correctly, not
just to stop the daemon (as most init scripts seem to do). sshd is a
popular init script and on most Gentoo'ers systems, so I've attached a
patch to show how init scripts should use start-stop-daemon so this works
correctly.
What do people think about this? Is this worthwhile, and is it worth fixing
all the init scripts in the tree to use start-stop-daemon correctly AND for
starting up?
Thanks
Roy
[-- Attachment #2: rc-init-status.patch --]
[-- Type: text/x-patch, Size: 5126 bytes --]
--- rc-status 2005-08-01 21:26:00.000000000 +0100
+++ /bin/rc-status 2005-08-31 07:57:15.000000000 +0100
@@ -31,6 +31,7 @@
# grab settings from conf.d/rc
source /etc/conf.d/rc
+source "${svclib}/sh/rc-daemon.sh"
################################################################################
# Parse command line options #
@@ -157,10 +158,19 @@
# Now collect information about the status of the various services; whether #
# they're started, broken, or failed. Put all of this into arrays. #
################################################################################
-# Read services from ${svcdir}/{started,failed,broken}
+if [[ -x ${svcdir}/started ]]; then
+ started=$(ls ${svcdir}/started)
+ # If we're root then update service statuses in case any naughty daemons
+ # stopped running without our say so
+ if [[ ${EUID} == 0 ]]; then
+ for service in ${started}; do
+ update_service_status "${service}"
+ done
+ started=$(ls ${svcdir}/started)
+ fi
+fi
[[ -x ${svcdir}/starting ]] && starting=$(ls ${svcdir}/starting)
[[ -x ${svcdir}/inactive ]] && inactive=$(ls ${svcdir}/inactive)
-[[ -x ${svcdir}/started ]] && started=$(ls ${svcdir}/started)
[[ -x ${svcdir}/stopping ]] && stopping=$(ls ${svcdir}/stopping)
################################################################################
--- runscript.sh 2005-08-21 18:08:24.000000000 +0100
+++ /sbin/runscript.sh 2005-08-31 07:59:30.000000000 +0100
@@ -413,6 +413,10 @@
# to work with the printed " * status: foo".
local efunc="" state=""
+ # If we are effectively root, check to see if required daemons are running
+ # and update our status accordingly
+ [[ ${EUID} == 0 ]] && update_service_status "${myservice}"
+
if service_starting "${myservice}" ; then
efunc="einfo"
state="starting"
--- rc-daemon.sh 2005-08-30 07:22:39.000000000 +0100
+++ /lib/rcscripts/sh/rc-daemon.sh 2005-08-31 07:53:14.000000000 +0100
@@ -19,6 +19,7 @@
RC_GOT_DAEMON="yes"
[[ ${RC_GOT_FUNCTIONS} != "yes" ]] && source /sbin/functions.sh
+[[ ${RC_GOT_SERVICES} != "yes" ]] && source "${svclib}/sh/rc-services.sh"
RC_RETRY_KILL="no"
RC_RETRY_TIMEOUT=1
@@ -285,14 +286,45 @@
return "${retval}"
}
+# void update_service_status(char *service)
+#
+# Loads the service state file and ensures that all listed daemons are still
+# running - hopefully on their correct pids too
+# If not, we stop the service
+update_service_status() {
+ local service="$1" daemonfile="${svcdir}/daemons/$1" i
+ local -a RC_DAEMONS=() RC_PIDFILES=()
+
+ # We only care about marking started services as stopped if the daemon(s)
+ # for it are no longer running
+ ! service_started "${service}" && return
+ [[ ! -f ${daemonfile} ]] && return
+
+ # OK, now check that every daemon launched is active
+ # If the --start command was any good a pidfile was specified too
+ source "${daemonfile}"
+ for (( i=0; i<${#RC_DAEMONS[@]}; i++ )); do
+ if ! is_daemon_running ${RC_DAEMONS[i]} "${RC_PIDFILES[i]}" ; then
+ if [[ -e "/etc/init.d/${service}" ]]; then
+ /etc/init.d/"${service}" stop &>/dev/null
+ break
+ fi
+ fi
+ done
+}
+
# int start-stop-daemon(...)
#
# Provide a wrapper to start-stop-daemon
# Return the result of start_daemon or stop_daemon depending on
# how we are called
start-stop-daemon() {
- local args=$( requote "$@" )
- local cmd pidfile pid stopping signal nothing=false
+ local args=$( requote "$@" ) result i
+ local cmd pidfile pid stopping signal nothing=false
+ local daemonfile="${svcdir}/daemons/${myservice}"
+ local -a RC_DAEMONS=() RC_PIDFILES=()
+
+ [[ -e ${daemonfile} ]] && source "${daemonfile}"
rc_setup_daemon_vars
@@ -303,10 +335,49 @@
fi
if ${stopping}; then
- rc_stop_daemon
+ rc_stop_daemon
+ result="$?"
+ if [[ ${result} == "0" ]]; then
+ # We stopped the daemon successfully
+ # so we remove it from our state
+ for (( i=0; i<${#RC_DAEMONS[@]}; i++ )); do
+ # We should really check for valid cmd AND pidfile
+ # But most calls to --stop only set the pidfile
+ if [[ ${RC_DAEMONS[i]} == "${cmd}" \
+ || ${RC_PIDFILES[i]} == "${pidfile}" ]]; then
+ unset RC_DAEMONS[i] RC_PIDFILES[i]
+ RC_DAEMONS=( "${RC_DAEMONS[@]}" )
+ RC_PIDFILES=( "${RC_PIDFILES[@]}" )
+ break
+ fi
+ done
+ fi
else
rc_start_daemon
+ result="$?"
+ if [[ ${result} == "0" ]]; then
+ # We started the daemon successfully
+ # so we add it to our state
+ local max="${#RC_DAEMONS[@]}"
+ RC_DAEMONS[max]="${cmd}"
+ RC_PIDFILES[max]="${pidfile}"
+ fi
+ fi
+
+ # Write the new list of daemon states for this service
+ if [[ ${#RC_DAEMONS[@]} == "0" ]]; then
+ [[ -f ${daemonfile} ]] && rm -f "${daemonfile}"
+ else
+ echo "RC_DAEMONS[0]=\"${RC_DAEMONS[0]}\"" > "${daemonfile}"
+ echo "RC_PIDFILES[0]=\"${RC_PIDFILES[0]}\"" >> "${daemonfile}"
+
+ for (( i=1; i<${#RC_DAEMONS[@]}; i++ )); do
+ echo "RC_DAEMONS[${i}]=\"${RC_DAEMONS[i]}\"" >> "${daemonfile}"
+ echo "RC_PIDFILES[${i}]=\"${RC_PIDFILES[i]}\"" >> "${daemonfile}"
+ done
fi
+
+ return "${result}"
}
# vim:ts=4
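For illustration, the state handling in the patch above boils down to saving two parallel arrays to a file and sourcing them back later. The sketch below is not part of the patch: save_daemon_state and load_daemon_state are hypothetical helper names, and it assumes printf %q in place of the patch's plain echo so that values containing spaces or quotes survive the round trip.

```shell
#!/bin/bash
# Sketch only: save_daemon_state and load_daemon_state are hypothetical
# helpers, not functions from baselayout.

# Write the parallel RC_DAEMONS / RC_PIDFILES arrays to a state file.
# An empty list removes the file, as the patch does.
save_daemon_state() {
    local statefile="$1" i
    if (( ${#RC_DAEMONS[@]} == 0 )); then
        rm -f "${statefile}"
        return 0
    fi
    : > "${statefile}"
    for (( i = 0; i < ${#RC_DAEMONS[@]}; i++ )); do
        # %q quotes the value so that sourcing the file restores it exactly
        printf 'RC_DAEMONS[%d]=%q\n' "${i}" "${RC_DAEMONS[i]}" >> "${statefile}"
        printf 'RC_PIDFILES[%d]=%q\n' "${i}" "${RC_PIDFILES[i]}" >> "${statefile}"
    done
}

# Reset both arrays, then reload them from the state file if it exists.
load_daemon_state() {
    local statefile="$1"
    RC_DAEMONS=() RC_PIDFILES=()
    [[ -f ${statefile} ]] && source "${statefile}"
    return 0
}
```

Keeping the two arrays index-aligned means a daemon started without a pidfile still gets an (empty) RC_PIDFILES entry, which is why the empty string is written out rather than skipped.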
[-- Attachment #3: sshd-ssd.patch --]
[-- Type: text/x-patch, Size: 490 bytes --]
--- sshd.orig 2005-08-31 08:07:05.000000000 +0100
+++ sshd 2005-08-31 08:08:05.000000000 +0100
@@ -40,12 +40,14 @@
start() {
checkconfig || return 1
ebegin "Starting sshd"
- /usr/sbin/sshd
+ start-stop-daemon --start --exec /usr/sbin/sshd \
+ --pidfile /var/run/sshd.pid
eend $?
}
stop() {
ebegin "Stopping sshd"
- start-stop-daemon --stop --quiet --pidfile /var/run/sshd.pid
+ start-stop-daemon --stop --exec /usr/sbin/sshd \
+ --pidfile /var/run/sshd.pid
eend $?
}
* Re: [gentoo-dev] init script guidelines
2005-08-31 7:13 ` Roy Marples
@ 2005-08-31 8:05 ` Roy Marples
2005-08-31 8:24 ` Georgi Georgiev
2005-08-31 17:38 ` Roy Marples
1 sibling, 1 reply; 19+ messages in thread
From: Roy Marples @ 2005-08-31 8:05 UTC
To: gentoo-dev
On Wed, 2005-08-31 at 08:13 +0100, Roy Marples wrote:
> Attached is a patch to baselayout-1.12.0_pre6-r3 that allows this.
> Basically when an init script calls start-stop-daemon --start then we
> log what it started (and hopefully a pidfile) in
> ${svcdir}/daemons/${myservice}
Forgot to attach a patch for depscan.sh
Roy
[-- Attachment #2: depscan.patch --]
[-- Type: text/x-patch, Size: 340 bytes --]
--- depscan.sh 2005-08-17 22:04:34.000000000 +0100
+++ /sbin/depscan.sh 2005-08-31 06:25:11.000000000 +0100
@@ -16,7 +16,7 @@
fi
fi
-for x in softscripts snapshot options \
+for x in softscripts snapshot options daemons \
started starting inactive stopping failed \
exclusive exitcodes ; do
if [[ ! -d "${svcdir}/${x}" ]] ; then
* Re: [gentoo-dev] init script guidelines
2005-08-31 8:05 ` Roy Marples
@ 2005-08-31 8:24 ` Georgi Georgiev
0 siblings, 0 replies; 19+ messages in thread
From: Georgi Georgiev @ 2005-08-31 8:24 UTC
To: gentoo-dev
maillog: 31/08/2005-09:05:51(+0100): Roy Marples types
> On Wed, 2005-08-31 at 08:13 +0100, Roy Marples wrote:
> > Attached is a patch to baselayout-1.12.0_pre6-r3 that allows this.
> > Basically when an init script calls start-stop-daemon --start then we
> > log what it started (and hopefully a pidfile) in
> > ${svcdir}/daemons/${myservice}
>
> Forgot to attach a patch for depscan.sh
Not related, but why not apply this as well, while you're at it:
--- /sbin/depscan.sh 2005-08-25 17:28:51.000000000 +0900
+++ /sbin/depscan.sh 2005-08-31 17:21:37.000000000 +0900
@@ -1,7 +1,7 @@
#!/bin/bash
# Copyright 1999-2004 Gentoo Foundation
# Distributed under the terms of the GNU General Public License v2
-# $Header$
+# $Header: $
source /etc/init.d/functions.sh
--
/ Georgi Georgiev / Depart in pieces, i.e., split. /
\ chutz@gg3.net \ \
/ +81(90)2877-8845 / /
* Re: [gentoo-dev] init script guidelines
2005-08-31 7:13 ` Roy Marples
2005-08-31 8:05 ` Roy Marples
@ 2005-08-31 17:38 ` Roy Marples
1 sibling, 0 replies; 19+ messages in thread
From: Roy Marples @ 2005-08-31 17:38 UTC
To: gentoo-dev
On Wed, 2005-08-31 at 08:13 +0100, Roy Marples wrote:
> Attached is a patch to baselayout-1.12.0_pre6-r3 that allows this.
> Basically when an init script calls start-stop-daemon --start then we
> log what it started (and hopefully a pidfile) in
> ${svcdir}/daemons/${myservice}
in pre7 :)
--
Roy Marples <uberlord@gentoo.org>
Gentoo Linux Developer
* Re: [gentoo-dev] init script guidelines
2005-07-19 16:42 Eric Brown
` (2 preceding siblings ...)
2005-07-19 18:00 ` Roy Marples
@ 2005-07-19 18:14 ` Francesco R
3 siblings, 0 replies; 19+ messages in thread
From: Francesco R @ 2005-07-19 18:14 UTC
To: gentoo-dev
Eric Brown wrote:
> Services that use Gentoo init scripts often report a status of [started] or
> [OK] even though they fail to start. The most recent bug like this that I've
> found is with snort. If you have a bad rule, snort will initialize, the
> rc-scripts will give it an [OK] status, and then it will die once it parses the
> rules.
>
> The real problem is not that the daemons don't return errors, but that our init
> scripts do not make reasonable attempts to verify service startup. If a Gentoo
> init script claims that a service started, it should make an effort to check
> that the processes are actually running shortly after the script is run, even if
> start-stop-daemon says the parent process initialized. Relying on the return
> value of start-stop-daemon is simply insufficient for some services.
>
> I am aware that there are services that can monitor the status of other services
> (app-admin/mon?) but I think this issue is a little different. If an ebuild
> developer is aware of an error condition that can commonly occur shortly after a
> daemon initializes, why not attempt to catch those errors? Most of them could
> probably be caught by simply checking to see if the process is still running
> shortly after the script is run.
>
> I propose increasing developer awareness of this problem, perhaps through some
> formal guidelines for ebuild developers. At the very least, I would like to see
> these bugs being acknowledged in bugs.gentoo.org instead of getting the same old
> upstream/it's not our fault response. We are responsible for our init scripts,
> and they are important to our users.
>
> I have 2 ideas for the actual implementation:
>
> 1) Some kind of check() function in the init.d script, or a generic check() function
> that just checks with ps | grep. This might typically be called after having the
> init script sleep for a certain amount of time.
>
> 2) Some kind of special init script that checks registered daemons after all services
> have started (i.e. it depends on all daemons, or they are put into its config file).
> With this scheme we could avoid excessive sleeping during startup (to keep it fast),
> and perhaps even keep using service-specific check() functions.
>
> Does anyone else think this idea is worth looking into?
>
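Idea 1 in the quoted message could be sketched roughly as follows. This is only an illustration under assumptions: check_still_running and CHECK_TIMEOUT are hypothetical names (the thread only suggests "a timeout variable in conf.d somewhere"), and it checks the pid recorded in a pidfile with kill -0 rather than the ps | grep approach mentioned above.

```shell
#!/bin/bash
# Sketch only: check_still_running and CHECK_TIMEOUT are hypothetical names.

# Return 0 if the pid recorded in the pidfile is still alive after a
# grace period of ${CHECK_TIMEOUT} seconds (default 2), 1 otherwise.
check_still_running() {
    local pidfile="$1" timeout="${CHECK_TIMEOUT:-2}" pid=""
    sleep "${timeout}"
    [ -r "${pidfile}" ] || return 1
    # Accept a pidfile without a trailing newline as well
    read -r pid < "${pidfile}" || [ -n "${pid}" ] || return 1
    # Reject anything that is not a plain number
    case "${pid}" in
        ''|*[!0-9]*) return 1 ;;
    esac
    # kill -0 sends no signal; it only tests that the process exists
    # and that we are allowed to signal it
    kill -0 "${pid}" 2>/dev/null
}
```

An init script could call this right after start-stop-daemon --start returns and pass the result to eend, so that a daemon which dies while parsing its configuration is reported as failed. The sleep is the cost discussed elsewhere in this thread, which is why the grace period is left configurable.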
http://bugs.gentoo.org/show_bug.cgi?id=90471
We handled this by checking for the socket that mysql always creates on
*nix. There is a timeout of five seconds, though: if there is neither an
error message nor a socket in that time, the script assumes the server
started.
I'm the first to say that this needs to be improved, but it's a start.
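As a rough sketch of the socket check described above (not the script attached to the bug): wait_for_socket is a hypothetical helper, and unlike the behaviour described, this variant assumes a timeout should count as a failure instead of as a successful start.

```shell
#!/bin/bash
# Sketch only: wait_for_socket is a hypothetical helper.

# Poll once per second until ${socket} exists as a Unix socket, or give
# up after ${timeout} seconds (default 5).
# Returns 0 if the socket appeared, 1 if it did not.
wait_for_socket() {
    local socket="$1" timeout="${2:-5}" waited=0
    while [ "${waited}" -lt "${timeout}" ]; do
        [ -S "${socket}" ] && return 0
        sleep 1
        waited=$(( waited + 1 ))
    done
    # One last look, so a socket created during the final sleep still counts
    [ -S "${socket}" ]
}
```

Polling returns as soon as the socket shows up, so a fast start costs almost nothing, while a slow or failed start costs at most the timeout.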
Thread overview: 19+ messages
2005-07-19 18:08 [gentoo-dev] init script guidelines Eric Brown
2005-07-19 18:40 ` Chris Gianelloni
2005-07-19 20:43 ` Michael Cummings
2005-07-19 21:07 ` Chris Gianelloni
2005-07-19 21:53 ` Martin Schlemmer
2005-07-20 6:30 ` Roy Marples
2005-07-19 20:03 ` Mike Frysinger
-- strict thread matches above, loose matches on Subject: below --
2005-07-19 19:39 Eric Brown
2005-07-19 16:42 Eric Brown
2005-07-19 17:22 ` Chris Gianelloni
2005-07-19 17:45 ` Mike Frysinger
2005-07-19 18:00 ` Roy Marples
2005-07-19 22:16 ` Francesco R
2005-08-23 14:09 ` Paul de Vrieze
2005-08-31 7:13 ` Roy Marples
2005-08-31 8:05 ` Roy Marples
2005-08-31 8:24 ` Georgi Georgiev
2005-08-31 17:38 ` Roy Marples
2005-07-19 18:14 ` Francesco R