From: Alan Mackenzie
Date: Fri, 20 Dec 2024 10:47:41 +0000
To: gentoo-user@lists.gentoo.org
Subject: [gentoo-user] Fun with mdadm (Software RAID)

Hello, Gentoo.

After having got the syslinux boot manager working well, I lost the root
partition on my newer machine.
I spent the entire evening yesterday trying to get it back again, with
various expedients for recovering ext4 partitions from backup
superblocks, and so on. It wasn't until the middle of the night that it
dawned on me what had happened, and I immediately got up and had it
fixed within twenty minutes.

The cause was me booting up the machine with a rescue disk. This
assembled my RAID partitions /dev/md127 and /dev/md126 reversed, but
also wrote those wrong identifiers, 126 and 127, into the "preferred
minor" field of the partitions' superblocks. In essence, they got
swapped. Hence, when I tried to boot up into my normal system,
/dev/md126, which should have been the root partition, was an
unformatted empty space on the SSD.

I don't blame the rescue disk for this occurrence. For some reason,
when the kernel assembles /dev/md devices, it only seems to pay
attention to the "preferred minor" fields when they are wrong. :-(

mdadm appears to write the "preferred minor" fields at random when
assembling the RAID arrays. I don't think it should, unless explicitly
asked. There is an argument to mdadm which specifies the writing of
these fields. In fact I used this to effect a repair, ironically
enough, from the rescue disk booted with the option to suppress the
automatic assembly of the arrays.

Just for the record, all my RAID arrays have metadata version 0.90, the
(old-fashioned) one that allows auto-assembly by the kernel without the
need of an initramfs.

The moral of the story: if your system uses software RAID, be careful
indeed before you boot up with a rescue disk.

-- 
Alan Mackenzie (Nuremberg, Germany).
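
To check what a rescue disk has written before rebooting, the "preferred
minor" of a 0.90 array member can be read straight from its superblock.
What follows is only a rough sketch in Python, not a tested tool. It
assumes the 0.90 on-disk layout as I understand it from the kernel's
md_p.h: the superblock sits in the last 64 KiB-aligned block of the
device, and md_minor is the twelfth 32-bit word. It also assumes a
little-endian host such as x86, since 0.90 superblocks are stored in
host byte order. The function names are made up for illustration.

```python
import os
import struct

MD_SB_MAGIC = 0xA92B4EFC        # magic number at the start of an md superblock
MD_RESERVED_BYTES = 64 * 1024   # 0.90 superblock occupies the last 64 KiB-aligned block
MD_MINOR_WORD = 11              # assumed index of md_minor ("preferred minor")


def superblock_offset(device_size: int) -> int:
    """Byte offset of the 0.90 superblock on a device of the given size."""
    return (device_size & ~(MD_RESERVED_BYTES - 1)) - MD_RESERVED_BYTES


def preferred_minor(sb: bytes, byteorder: str = "<") -> int:
    """Extract the preferred minor from the first 48 bytes of a 0.90 superblock."""
    words = struct.unpack_from(f"{byteorder}12I", sb, 0)
    if words[0] != MD_SB_MAGIC:
        raise ValueError("no md superblock magic found")
    if words[1] != 0:
        raise ValueError("not a version 0.x superblock")
    return words[MD_MINOR_WORD]


def read_preferred_minor(path: str) -> int:
    """Read the preferred minor from an array member (read-only, needs root)."""
    with open(path, "rb") as dev:
        size = dev.seek(0, os.SEEK_END)   # works for block devices on Linux
        dev.seek(superblock_offset(size))
        return preferred_minor(dev.read(48))
```

Calling read_preferred_minor() on each member partition and comparing
the results with the minors the normal system expects would show a swap
like the one described above. The same field should also appear as
"Preferred Minor" in the output of mdadm --examine, and the mdadm
argument for rewriting it is presumably --update=super-minor, used
together with --assemble, as documented in the mdadm man page.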