Reply-to: gentoo-user@lists.gentoo.org
Date: Sun, 16 May 2010 10:56:02 -0700
Subject: [gentoo-user] RAID problems - Is udev at fault here?
From: Mark Knecht
To: gentoo-user@lists.gentoo.org
Content-Type: text/plain; charset=UTF-8

I have a newish high-end machine here that's causing me some problems with RAID, but looking at log files and dmesg I don't think the problem is actually RAID; udev seems more likely. I'm looking for some ideas on how to debug this.

The hardware:

Asus Rampage II Extreme
Intel Core i7-980x
12GB DRAM
5 WD5002ABYS RAID Edition 500GB drives

The drives are arranged as a 3-drive RAID1 and a 2-drive RAID0 using mdadm.

The issue is that when booting reaches the point where mdadm starts, about 50% of the time mdadm fails to find some of the partitions. It then either starts the RAID1 with missing drives or, in the case of the RAID0, won't start the array at all. For instance, /dev/md5 might start with a missing member because one of /dev/sda5, sdb5 or sdc5 isn't found. So far, once the problem has occurred, the only fix I've found is a reboot.

Looking through dmesg after a failure, I don't see the missing partition ever being identified, and the partition isn't in the /dev directory either.

Personally I don't think the problem is with the drives. The BIOS shows me a table of the attached drives before booting, and all 5 drives are _always_ shown. If I drop into the BIOS proper and use the BIOS tools to look at the drives, I can _always_ read SMART data, and all drives respond to DOS-based tools like SpinRite. It's only when I get into Linux that they aren't found.

The problem hasn't changed much across kernels from 2.6.32 through 2.6.34, nor do I see any difference between vanilla-sources and gentoo-sources. Currently I'm using udev-149 with devfs-compat and extra flags enabled.

Where might I start looking for the root cause of a problem like this? Let me know what other info would be helpful.

Thanks,
Mark