From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from pigeon.gentoo.org ([69.77.167.62] helo=lists.gentoo.org)
	by finch.gentoo.org with esmtp (Exim 4.60) (envelope-from )
	id 1JjIXi-0002L4-E3 for garchives@archives.gentoo.org;
	Tue, 08 Apr 2008 18:27:38 +0000
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
	by pigeon.gentoo.org (Postfix) with SMTP id 6C4EAE039F;
	Tue, 8 Apr 2008 18:27:35 +0000 (UTC)
Received: from duke.localdomain (p78-102.acedsl.com [66.114.78.102])
	by pigeon.gentoo.org (Postfix) with ESMTP id 204EFE039F for ;
	Tue, 8 Apr 2008 18:27:35 +0000 (UTC)
Received: from [127.0.0.1] (duke.wrkhors.com [127.0.0.1])
	by duke.localdomain (Postfix) with ESMTP id AA26228D675 for ;
	Tue, 8 Apr 2008 14:19:46 -0400 (EDT)
Message-ID: <47FBB742.2050806@wrkhors.com>
Date: Tue, 08 Apr 2008 14:19:46 -0400
From: Steven Lembark
Organization: Workhorse Computing
User-Agent: Thunderbird 2.0.0.9 (X11/20071212)
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-user@lists.gentoo.org
Reply-to: gentoo-user@lists.gentoo.org
MIME-Version: 1.0
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Re: Emergency shutdown, how to?
References: <47EC9F50.5070503@bellsouth.net>
	<200804021549.15550.dirk.heinrichs.ext@nsn.com>
	<3297039.CL3N7pcUCC@schmarck.cn>
	<200804021628.29507.dirk.heinrichs.ext@nsn.com>
	<28748152.K45aiMzFyV@michael-schmarck.my-fqdn.de>
	<20080402184829.1c6d2b9c@zaphod.digimed.co.uk>
	<5bdc1c8b0804041405u6f3fdef1r802963828f3bf8c5@mail.gmail.com>
	<47F7AC03.6010101@wrkhors.com>
	<1207530457.15340.11.camel@orpheus>
	<47FA59DA.9090404@wrkhors.com>
	<1207610982.15340.50.camel@orpheus>
In-Reply-To: <1207610982.15340.50.camel@orpheus>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Archives-Salt: 94cc1b71-b226-457c-ad2e-dbe504ef0fcf
X-Archives-Hash: f4e63b67daf7638b0544f864e8f56008

> I agree that your script is nice and simple, and hence less prone to
> errors.
> I coded mine in C++ because I use it not only for a machine-type
> watchdog, but also for a task-based watchdog that reboots the machine
> depending on whether certain tasks are alive. Each task has to register
> with the watchdog server and continually tell the server it is alive, or
> reboot! But that's a story for another thread...

#!/path/to/perl

use strict;
use warnings;

use IO::Handle;
use Sys::Syslog qw( :standard :macros );

open my $fh, '>', '/dev/watchdog'
    or die "/dev/watchdog: $!";

# unbuffered, so each poke reaches the device right away.

$fh->autoflush( 1 );

# if any of these go away we need to notice it.
# ok... you'll notice the first one anyway.

my @watchz
= qw
(
    init
    ntpd
    apache
    /opt/sybase/ASE-12_5/bin/dataserver
);

# wd timeout / 2, or 1 for minimum sleep
# (avoid usleep: too much overhead).

my $cycle = 15;

# get the syslog handle

openlog blah blah blah
    or die 'Et tu, syslog?';

CYCLE:
for(;;)
{
    # wake up on even multiples of $cycle.

    sleep ( $cycle - ( time % $cycle ) );

    # split and args vary by O/S, this works on linux.

    my @procz = map { ( split /\s+/, $_, 6 )[5] } qx( ps a );

    # whatever is left in %checkz after deleting the
    # running proc's is missing.

    my %checkz = ();
    @checkz{ @watchz } = ();
    delete @checkz{ @procz };

    if( %checkz )
    {
        # oops, current proc's don't include the
        # list of processes being watched.
        #
        # this can happen twice in a w/d interval
        # before the system goes down.
        #
        # LOG_FOO stands in for whatever facility
        # you log to.

        my $nastygram = join "\t", "Missing proc's:", keys %checkz;

        syslog LOG_CRIT | LOG_FOO, '%s', $nastygram;

        next CYCLE;

        # alternative here is to close $fh and
        # bounce the system immediately. the
        # approach of looping allows an
        # intentional restart of the service
        # (in less than 1 w/d cycle) w/o bouncing the box.
    }

    # if the proc check got this far then the w/d
    # file gets poked and we live for another loop.

    print $fh "\n";
}

# this isn't a module

0

__END__

-- 
Steven Lembark                                   85-09 90th St.
Workhorse Computing                        Woodhaven, NY, 11421
lembark@wrkhors.com                             +1 888 359 3508
-- 
gentoo-user@lists.gentoo.org mailing list
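
The two pieces of the loop that carry the logic are the hash-slice check for missing processes and the sleep that lines wakeups up on multiples of the cycle. A minimal sketch of both in isolation follows, so they can be tried without touching /dev/watchdog. The sub names missing_procs and sleep_for and the sample process lists are made up for illustration; they are not part of the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Return the watched names that do not appear among the running ones.
# Both arguments are array references. (Hypothetical helper.)
sub missing_procs
{
    my ( $watched, $running ) = @_;

    my %check = ();
    @check{ @$watched } = ();       # one key per watched name
    delete @check{ @$running };     # drop every name that is running

    return sort keys %check;        # whatever is left is missing
}

# Seconds to sleep so the next wakeup lands on a multiple of $cycle.
# The result is always between 1 and $cycle, never zero.
# (Hypothetical helper.)
sub sleep_for
{
    my ( $now, $cycle ) = @_;

    return $cycle - ( $now % $cycle );
}

# sample data, assumed for the example only
my @missing = missing_procs
(
    [ qw( init ntpd apache ) ],
    [ qw( init apache bash ) ],
);

print "missing: @missing\n";        # missing: ntpd
print sleep_for( 100, 15 ), "\n";   # 5
print sleep_for(  90, 15 ), "\n";   # 15
```

Names in the running list that are not being watched (bash here) are ignored by the delete, so only the watched-but-absent names survive.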