From mboxrd@z Thu Jan 1 00:00:00 1970
Received: (qmail 30096 invoked by uid 1002); 31 Dec 2002 02:27:13 -0000
Mailing-List: contact gentoo-dev-help@gentoo.org; run by ezmlm
Precedence: bulk
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-dev@gentoo.org
Received: (qmail 26368 invoked from network); 31 Dec 2002 02:27:13 -0000
Date: Mon, 30 Dec 2002 21:25:40 -0500 (EST)
From: Denis Shcherbakov
To: gentoo-dev@gentoo.org
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII
Subject: [gentoo-dev] SMP system hard-halts, weird crashes - revisited... tests...
X-Archives-Salt: 4e6d4316-9948-49ce-94c2-1607d7b64fb5
X-Archives-Hash: e0ab733a0002f1a08323130aa3863d73

Hello, folks,

For those of you who remember my earlier threads about X halts, SMP kernel halts, and so on: I think I was able to track down the issue, with Riyad's kind guidance :) It turned out to be a problem with my power supply, I believe. I did, however, run a lot of tests, which some of you might be interested to hear about.

First, I thought it could be my IDE drives or the IDE controller. So I put a good Seagate SCSI drive in my system, did a fresh install of Gentoo, and tried several kernel versions: 2.4.19-gentoo-r7, 2.4.19-gentoo-r10, 2.4.20-vanilla, and 2.5.52-development. The crash was reproducible with every one of them, and the development kernel was no more unstable than the others. After reproducing the crashes without any devices hooked up to the IDE or floppy controllers, it was clear that the problem wasn't a controller or a drive. I also tested with each kernel whether it was a hyperthreading issue, and it wasn't.

This may seem like obvious junk, but it does take quite a bit of time to test all of these configs. By the end of the week (yes, a week), it was clear that it was either the power supply or the board itself.
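For anyone repeating this kind of elimination testing, the set of configurations described above can be written down up front. What follows is only a rough sketch: the kernel names come from the message, but the variable and function names are illustrative assumptions, and Python is used purely for convenience.

```python
from itertools import product

# Kernel versions named in the message.
kernels = [
    "2.4.19-gentoo-r7",
    "2.4.19-gentoo-r10",
    "2.4.20-vanilla",
    "2.5.52-development",
]

# Hyperthreading on/off is the extra axis the message says was checked per kernel.
hyperthreading = [True, False]


def build_test_matrix(kernel_list, ht_settings):
    """Return every kernel/hyperthreading combination as a list of dicts."""
    return [
        {"kernel": kernel, "hyperthreading": ht}
        for kernel, ht in product(kernel_list, ht_settings)
    ]


matrix = build_test_matrix(kernels, hyperthreading)

for config in matrix:
    ht_label = "on" if config["hyperthreading"] else "off"
    print(f"{config['kernel']:<22} hyperthreading {ht_label}")

print(len(matrix), "configurations to boot and stress")
```

Four kernels times two hyperthreading settings already gives eight boots, before counting the IDE-versus-SCSI and X-versus-text-mode variations, which is roughly why this kind of testing eats a week.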
There was an idea from others that my nVidia GeForce3 card could be conflicting with the Tyan board, or drawing too much power or something, but I proved that untrue by reproducing all the crashes in text-only mode. I use a Tyan Thunder i860 S2603 dual board with 2.2-GHz Intel Xeons, and this board has received very good marks in all the reviews I've seen. Actually, Tyan makes excellent boards... Period.

The power supply was the easier and less expensive (and more probable) thing to try :) I had a 460-W Zippy Emacs (Taiwanese) supply in my box, which came from the guys who sold me this machine. I decided to go for the best this time and purchase Antec's True550 EPS12V power supply for dual Xeon boards. My Tyan board requires a 24-pin main Molex power connector and an auxiliary 8-pin Molex power connector. An EPS supply accommodates the number of pins, but as I found out when I hooked it up, the pins are all in the wrong places!! So I plugged it in, turned it on... Silence. A few minutes of Googling yielded a big DUH as to why I hadn't done the search before. The S2603 is a non-EPS board, which means it needs an EPS-to-nonEPS converter, which is sold by Enhance Electronics (out of California, www.enhanceusa.com).

Surely I could have made one myself (Enhance Electronics gives a nice schematic of the adapter, if anyone is curious, and I had it in front of me), but I didn't have any Molex connectors or crimpers on hand, and most of the people who would have the tools (the electrical engineering department) were gone for Christmas! So I had to buy this converter, and now things look (and sound) pretty sweet. The system is up and running. The problem seems to be gone, unless it was something else (i.e. the board).

One more note for those of you who have run into such mysterious crashes before. There's not a great deal of material about this on the net.
Apparently, these are mostly caused by low voltage or noise on the +5VSB line from the power supply to the motherboard, which is a "standby" voltage line. This certainly explains why my system wouldn't awaken from sleep in Windoze. :) It may also explain why the system would halt after a string of strenuous operations, though I do wonder why it halted in the middle of them if it's really a VSB problem. Maybe the power supply wasn't too good and the voltages would droop at high load on the other channels as well. I didn't check it with a meter. No time. It's never good to go cheap on power (or RAM, or anything for that matter).

If you really don't want to replace a power supply, you can put a capacitor between the Common and +5VSB lines, which stabilizes the board by smoothing out such voltage droops and noise on the VSB line. The capacitor I've seen used on the web was an electrolytic one rated 6.3 V and 1000 microfarads (although I wouldn't bank on it: it was hard to tell from the photo what farad units those were, but the numbers were pretty clear :)). A power cable extension with such a capacitor built in is sold by www.highpowersupply.com (JDResearch). They only sell it for the usual 20-pin power connectors, not the 24-pin ones. I ordered it just to see for myself exactly which capacitor is on it ;) The principle is the same for 24-pin connectors, and one could trivially make this with a soldering iron and the right capacitor.

So, here are the pearls... Enjoy! :) This was the result of lots of searches. So, if you get mysterious crashes and system halts that point to something other than I/O devices, replace the power supply or try putting a capacitor between +5VSB and Common to stabilize the board.

Gentoo rocks. :) Jeez, the installs are soo damn fast now that I have half a clue as to what I'm doing in Gentoo :)

A brief note on the 2.5 dev kernel. It's real cool!!
It compiles in a flash, it loads in a flash, and I haven't run into any instabilities with it yet!! It's absolutely blazing compared to 2.4.20 vanilla (or 2.4.19-gentoo, sorry :)) The only thing is, the nVidia kernel modules don't compile with this kernel. The modules are now called *.ko rather than *.o :)

Alright, this is all for now. Sorry for making this so long, but there's so much to share. You gurus have probably been through most of this already, so forgive me for insulting your intelligence with this, but it's pretty exciting stuff to a novice like me!

All the best to everyone for the Holiday Season!

Denis

P.S. Riyad - Many thanks!! You rock!!

_________
Graduate Student
Chemical Engineering
Princeton University
Princeton, NJ 08544-5263

--
gentoo-dev@gentoo.org mailing list