* [gentoo-user] log messages
From: Harry Putnam @ 2010-02-16 22:36 UTC
To: gentoo-user

Hundreds, maybe thousands of lines like this (wrapped for mail):

  Feb 16 09:38:47 reader kernel: [162289.090685] usb 4-2.1:1.1: uevent

  Feb 16 09:38:48 reader kernel: [162289.467065] hdc: status error:
  status=0x00 { }

  Feb 16 09:38:48 reader kernel: [162289.467071] hdc: possibly failed
  opcode: 0xa0

  Feb 16 09:38:48 reader kernel: [162289.467079] ide-atapi: hdc:
  Strange, packet command initiated yet DRQ isn't asserted

When I noticed this output involving the cdrom I wondered if I might
have left something in it but that was not the case.

* Re: [gentoo-user] log messages
From: Alan McKinnon @ 2010-02-16 23:00 UTC
To: gentoo-user

On Wednesday 17 February 2010 00:36:42 Harry Putnam wrote:
> Hundreds, maybe thousands of lines like this (wrapped for mail):
>
> Feb 16 09:38:47 reader kernel: [162289.090685] usb 4-2.1:1.1: uevent
>
> Feb 16 09:38:48 reader kernel: [162289.467065] hdc: status error:
> status=0x00 { }
>
> Feb 16 09:38:48 reader kernel: [162289.467071] hdc: possibly failed
> opcode: 0xa0
>
> Feb 16 09:38:48 reader kernel: [162289.467079] ide-atapi: hdc:
> Strange, packet command initiated yet DRQ isn't asserted
>
> When I noticed this output involving the cdrom I wondered if I might
> have left something in it but that was not the case.

Do you have hal configured to poll your cdrom drive every two seconds, to
see if a disk is inserted? And if so, is the logging verbosity cranked up
way higher than it should be?

I haven't personally had to fix this myself (so can't give pointers on
where to fix it), but it seems to be a common occurrence judging from
posts I see here and at other forums.

--
alan dot mckinnon at gmail dot com

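One way to test the polling theory above is to measure how far apart the hdc errors are: a steady gap of about two seconds would point at a poller. The following is only a rough sketch, not something posted in the thread. It assumes the log has one message per line (unwrapped, unlike the mail excerpt) and uses the bracketed kernel timestamp; the function name `error_intervals` and the default search string are illustrative choices.

```python
import re
import sys
from statistics import median

# Matches the kernel uptime stamp, e.g. "kernel: [162289.467065]".
KERNEL_TS = re.compile(r"kernel: \[\s*(\d+\.\d+)\]")


def error_intervals(lines, needle="hdc: status error"):
    """Return the gaps in seconds between consecutive lines containing needle."""
    stamps = []
    for line in lines:
        if needle not in line:
            continue
        match = KERNEL_TS.search(line)
        if match:
            stamps.append(float(match.group(1)))
    # Pair each timestamp with the next one and take the difference.
    return [round(later - earlier, 6) for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    if len(sys.argv) != 2:
        sys.exit("usage: intervals.py LOGFILE")
    # The log file path is supplied by the caller; no location is assumed here.
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        gaps = error_intervals(log)
    if gaps:
        print(f"{len(gaps) + 1} matching lines, median gap {median(gaps):.2f} s")
    else:
        print("fewer than two matching lines found")
```

A median gap near 2 s would be consistent with polling; irregular gaps would suggest looking elsewhere.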
* [gentoo-user] Re: log messages
From: Harry Putnam @ 2010-02-17 6:49 UTC
To: gentoo-user

Alan McKinnon <alan.mckinnon@gmail.com> writes:

> On Wednesday 17 February 2010 00:36:42 Harry Putnam wrote:
>> Hundreds, maybe thousands of lines like this (wrapped for mail):
>>
>> Feb 16 09:38:47 reader kernel: [162289.090685] usb 4-2.1:1.1: uevent
>>
>> Feb 16 09:38:48 reader kernel: [162289.467065] hdc: status error:
>> status=0x00 { }
>>
>> Feb 16 09:38:48 reader kernel: [162289.467071] hdc: possibly failed
>> opcode: 0xa0
>>
>> Feb 16 09:38:48 reader kernel: [162289.467079] ide-atapi: hdc:
>> Strange, packet command initiated yet DRQ isn't asserted
>>
>> When I noticed this output involving the cdrom I wondered if I might
>> have left something in it but that was not the case.
>
> Do you have hal configured to poll your cdrom drive every two seconds, to see
> if a disk is inserted? And if so, is the verbosity logging cranked up way
> higher than it should be?
>
> I haven't personally had to fix this myself (so can't give pointers on where
> to fix it), but it seems to be a common occurrence judging from posts I see
> here and at other forums.

I do have hald running, but made no special config regarding cdrom
polling. At least not on purpose.

The messages do appear to be continuous. I will reboot soon but don't
want to right now.

The reason I'm pondering this and following it up is that I experience
a serious freeze after some unspecified amount of uptime. Mouse and
keyboard become unresponsive... and eventually the OS cannot be
accessed at all.

SSH appears to stop, and I cannot contact the machine remotely either.

This began happening quite some time ago... on a different, earlier
install. I never could see anything in the logs that gave a clue as to
why.

I created a script that ran from cron. It pinged a remote host and
logged a unique, easily findable string into the log using `logger'
every 5 minutes. With that I was able to narrow down the time frame
of the freeze to within the last 5 minutes (of log lines).

Even then, there was nothing to indicate a problem. This was an OS
that had been running a very long time with upgrade after upgrade.

Though I hated having to rebuild all the customizations etc., I finally
reinstalled completely from scratch, hoping to catch the problem with
the shotgun approach.

In that earlier OS there were no log messages regarding hdc being
generated (by the way).

Shortly after completing the new install and a couple of weeks of
getting set up the way I wanted, I began to experience the freezes
again.

I have caught the freeze in the early stages, before completely losing
the network, when just the mouse and keyboard had become unresponsive.
I was able to ssh in and noticed that restarting hald held off the
freeze for some (again unspecified) amount of time.

So, cutting the lengthy narrative down a bit, and briefly put, I'm
looking for anything unusual that is causing this. The hdc messages
are the only odd thing I'm seeing.

Something appears to be jamming up the hal layer somehow, but not
leaving findable tracks. At least not findable by someone with
many years of experience with linux but not much real debugging of
complicated problems under his belt.

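The cron job described above could look roughly like the sketch below. It is not Harry's actual script, which was never posted: it uses Python's standard `syslog` module in place of the `logger' command, and the marker string, target address, and function names are all made up for illustration. The `ping` flags assume the Linux iputils version.

```python
import subprocess
import syslog

MARKER = "HEARTBEAT-7f3a"  # hypothetical unique string, easy to grep for


def ping_ok(host, timeout_s=2):
    """Send one ICMP echo request and report whether a reply arrived."""
    try:
        result = subprocess.run(
            ["ping", "-c", "1", "-W", str(timeout_s), host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s + 3,  # hard stop in case ping itself hangs
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def heartbeat_line(host, reachable):
    """Build the text written to the system log on each run."""
    state = "ok" if reachable else "FAILED"
    return f"{MARKER} ping {host} {state}"


if __name__ == "__main__":
    host = "192.0.2.1"  # placeholder address; substitute a real nearby host
    syslog.openlog("heartbeat")
    syslog.syslog(syslog.LOG_INFO, heartbeat_line(host, ping_ok(host)))
```

Run from cron every five minutes, for example with an entry such as `*/5 * * * * python3 /path/to/heartbeat.py` (the path is a placeholder), the last marker in the log bounds the moment of the freeze to a five-minute window, as described above.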
* Re: [gentoo-user] Re: log messages
From: Alan McKinnon @ 2010-02-17 8:47 UTC
To: gentoo-user

On Wednesday 17 February 2010 08:49:28 Harry Putnam wrote:
> I have caught the freeze in the early stages, before completely losing
> the network, when just the mouse and keyboard had become unresponsive.
> I was able to ssh in and noticed that restarting hald held off the
> freeze for some (again unspecified) amount of time.
>
> So, cutting the lengthy narrative down a bit, and briefly put, I'm
> looking for anything unusual that is causing this. The hdc messages
> are the only odd thing I'm seeing.
>
> Something appears to be jamming up the hal layer somehow, but not
> leaving findable tracks. At least not findable by someone with
> many years of experience with linux but not much real debugging of
> complicated problems under his belt.

You say the box runs ssh, implying that other hosts are nearby, so what I
would suggest is to configure your syslogger to send all logs to another
host and have that host write them to a known location.

I find that machines that freeze often still send logs to syslog properly
right up to the moment of the freeze, but these do not get written to disk
because IO is blocked. Then we restart the box, guaranteeing that the logs
are lost :-)

Set up remote logging and just leave it until the machine freezes again;
that will hopefully give you the useful logs you need to identify the
problem. To save disk space you can configure logrotate on the remote
logger to delete the previous day's stuff - you don't need logs from days
when the box was working fine.

Another option is to look at the pattern here: one day, out of the blue, a
stable system developed problems, and they still surface at random times.
This is one of the characteristics of failing hardware. Have you done a
full, thorough hardware test, including such things as memtest and smart?

--
alan dot mckinnon at gmail dot com

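To illustrate the receiving side of the remote-logging idea above, here is a minimal stand-in for a log host, written with Python's standard `socketserver` module. It is only a sketch of the principle, not a replacement for a real syslog daemon's remote-logging support: the port 5514 (an unprivileged stand-in for the standard syslog UDP port 514), the output file name, and the names `format_record` and `make_server` are all assumptions made for this example.

```python
import socketserver


def format_record(sender, data):
    """Turn one received datagram into a single log line tagged with its sender."""
    text = data.decode("utf-8", errors="replace").rstrip("\r\n")
    return f"{sender} {text}\n"


class SyslogUDPHandler(socketserver.BaseRequestHandler):
    def handle(self):
        # For UDP servers, self.request is a (data, socket) pair.
        data, _sock = self.request
        record = format_record(self.client_address[0], data)
        # Open, append, and close per message so each line is handed
        # to the OS right away instead of sitting in a buffer.
        with open(self.server.log_path, "a", encoding="utf-8") as out:
            out.write(record)


def make_server(host, port, log_path):
    """Create a UDP server that appends every datagram to log_path."""
    server = socketserver.UDPServer((host, port), SyslogUDPHandler)
    server.log_path = log_path  # custom attribute read by the handler
    return server


if __name__ == "__main__":
    # Hypothetical port and file name; adjust both for a real setup.
    with make_server("0.0.0.0", 5514, "remote-syslog.log") as server:
        server.serve_forever()
```

Because the lines are written on a different machine, they survive even when the frozen box can no longer flush its own logs to disk, which is the point of the suggestion.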
* [gentoo-user] Re: log messages
From: Harry Putnam @ 2010-02-17 14:32 UTC
To: gentoo-user

Alan McKinnon <alan.mckinnon@gmail.com> writes:

> Remote logging and just leave it till the machine freezes again will
> hopefully give you the useful logs you need to identify the
> problem. To save disk space you can configure logrotate on the
> remote logger to delete the previous days stuff - you don't need
> logs from days where the box was working fine.

Thanks, that may be worth a try... I wonder whether, with rsyslog (my
logger of choice), it may be possible to log to localhost as well as to
a remote host? I think I'll look into that too.

> Another option is to look at the pattern here: one day out of the
> blue a stable system developed problems and they still surface at
> random times. This is one of the characteristics of failing
> hardware. Have you done a full thorough hardware test, including
> such things as memtest and smart?

I agree that it sounds like hardware, but even then some log tracks
should appear, right? (Maybe I'll see them with the remote logging
suggestion.)

> . . . . . . . . . . . . . . . . . . . . . .Have you done a full
> thorough hardware test

I haven't done the memtest or smart tests. But as far as `full' goes,
what other tests might I try?

PS - I did find some reiserfs errors and am currently running
   reiserfsck --rebuild-tree
on that (now unmounted) disk after a full backup, so maybe that is
related and will cure the problem (fingers crossed hard).

* [gentoo-user] Re: log messages
From: Jörg Schaible @ 2010-02-17 18:51 UTC
To: gentoo-user

Hi Harry,

Harry Putnam wrote:

> Alan McKinnon <alan.mckinnon@gmail.com> writes:
>
>> On Wednesday 17 February 2010 00:36:42 Harry Putnam wrote:
>>> Hundreds, maybe thousands of lines like this (wrapped for mail):
>>>
>>> Feb 16 09:38:47 reader kernel: [162289.090685] usb 4-2.1:1.1: uevent
>>>
>>> Feb 16 09:38:48 reader kernel: [162289.467065] hdc: status error:
>>> status=0x00 { }
>>>
>>> Feb 16 09:38:48 reader kernel: [162289.467071] hdc: possibly failed
>>> opcode: 0xa0
>>>
>>> Feb 16 09:38:48 reader kernel: [162289.467079] ide-atapi: hdc:
>>> Strange, packet command initiated yet DRQ isn't asserted
>>>
>>> When I noticed this output involving the cdrom I wondered if I might
>>> have left something in it but that was not the case.
>>
>> Do you have hal configured to poll your cdrom drive every two seconds, to
>> see if a disk is inserted? And if so, is the verbosity logging cranked up
>> way higher than it should be?
>>
>> I haven't personally had to fix this myself (so can't give pointers on
>> where to fix it), but it seems to be a common occurrence judging from
>> posts I see here and at other forums.
>
> I do have hald running, but made no special config regarding cdrom
> polling. At least not on purpose.
>
> The messages do appear to be continuous. I will execute a reboot soon
> but don't want to right now.
>
> Why I'm pondering and following this up, is that I experience a
> serious freeze after some unspecified amount of uptime. Mouse and
> keyboard become unresponsive... and eventually the OS cannot be
> accessed at all.
>
> SSH appears to stop and cannot contact remotely either.
>
> This began happening quite some time ago... on a different earlier
> install. I never could see anything in the logs that gave a clue to
> why.
>
> I created a script that ran from cron. It pinged a remote host, and
> logged a unique easily findable string into the log using `logger',
> every 5 minutes. With that I was able to narrow down the time frame
> of freeze to within the last 5 minutes (of log lines).
>
> Even then, there was nothing to indicate a problem. This was an OS
> that had been running a very long time with upgrade after upgrade.
>
> Though I hated having to rebuild all the customizations etc, I finally
> completely reinstalled from scratch hoping to catch the problem with
> the shotgun approach.
>
> In that earlier OS there were no log messages regarding hdc being
> generated (by the way).
>
> Shortly after completing the new install and a couple of weeks of
> getting setup the way I wanted, I began to experience the freezes
> again.
>
> I have caught the freeze in the early stages before completely losing
> the network when just mouse and keyboard became unresponsive, was able
> to ssh in and noticed that restarting hald held off the freeze for
> some (again unspecified) amount of time.
>
> So cutting the lengthy narrative down a bit, and briefly put, I'm
> looking for anything unusual that is causing this. The hdc messages
> are the only odd thing I'm seeing.
>
> Something appears to be jamming up the hal layer somehow, but not
> leaving findable tracks. At least not findable by someone with
> many yrs experience with linux but not much real debugging of
> complicated problems under his belt.

I once had similar freezes and effects, until I realized that our rabbit
had bitten into a USB cable and the computer was getting undefined signals
on the USB bus because the bare wires of the cable were touching each
other. Try disconnecting all external USB devices first and check whether
the problem persists.

- Jörg