* [gentoo-user] Software RAID-1
From: Peter Humphrey @ 2014-08-24 13:51 UTC
To: gentoo-user
Hello list,
For several years I've been running with / on /dev/md5 (0.99 metadata), which
is built on /dev/sd[ab]5. At each boot I see a message scroll by saying
something like "No devices found in config file or automatically" and then lvm
continues to assemble md5 anyway and mount its file system. The rest of my
partitions are on /dev/md7 (1.0 metadata), which is built on /dev/sd[ab]7. Oh,
except for /boot, which is on /dev/sda1 with a copy on /dev/sdb1.
So I decided to clean up /etc/mdadm.conf by adding these lines:
DEVICE /dev/sda* /dev/sdb*
ARRAY /dev/md5 devices=/dev/sda5,/dev/sdb5
ARRAY /dev/md7 devices=/dev/sda7,/dev/sdb7
ARRAY /dev/md9 devices=/dev/sda9,/dev/sdb9
Now at boot time the no-devices error doesn't appear, but I get a blank line
with a red asterisk.
What am I doing wrong?
This is stimulated by this week's upgrade of lvm2, for which I need to make
some configuration changes, and I thought I ought to fix one thing at a time.
--
Regards
Peter
* Re: [gentoo-user] Software RAID-1
From: Mick @ 2014-08-24 16:06 UTC
To: gentoo-user
On Sunday 24 Aug 2014 14:51:23 Peter Humphrey wrote:
> Hello list,
>
> For several years I've been running with / on /dev/md5 (0.99 metadata),
> which is built on /dev/sd[ab]5. At each boot I see a message scroll by
> saying something like "No devices found in config file or automatically"
> and then lvm continues to assemble md5 anyway and mount its file system.
> The rest of my partitions are on /dev/md7 (1.0 metadata), which is built
> on /dev/sd[ab]7. Oh, except for /boot, which is on /dev/sda1 with a copy
> on /dev/sdb1.
>
> So I decided to clean up /etc/mdadm.conf by adding these lines:
>
> DEVICE /dev/sda* /dev/sdb*
> ARRAY /dev/md5 devices=/dev/sda5,/dev/sdb5
> ARRAY /dev/md7 devices=/dev/sda7,/dev/sdb7
> ARRAY /dev/md9 devices=/dev/sda9,/dev/sdb9
>
> Now at boot time the no-devices error doesn't appear, but I get a blank
> line with a red asterisk.
>
> What am I doing wrong?
I suspect that mdadm is warning you that it can't find any arrays in
mdadm.conf and then proceeds to automatically scan and mount RAIDs.
If you want to add something to it, then you can use the output of:
mdadm --examine --scan
and append the relevant line to your mdadm.conf.
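For example, the output is typically a set of ARRAY lines roughly of this
shape (the UUIDs below are only placeholders, not values from a real box):

ARRAY /dev/md0 UUID=aaaaaaaa:bbbbbbbb:cccccccc:dddddddd
ARRAY /dev/md/1 metadata=1.2 name=somehost:1 UUID=eeeeeeee:ffffffff:11111111:22222222

mdadm then matches those UUIDs against the superblocks it finds on the
member partitions, so you don't have to list the partitions by hand.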
--
Regards,
Mick
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-24 18:22 UTC
To: gentoo-user
On 24/08/2014 14:51, Peter Humphrey wrote:
> Hello list,
>
> For several years I've been running with / on /dev/md5 (0.99 metadata), which
> is built on /dev/sd[ab]5. At each boot I see a message scroll by saying
> something like "No devices found in config file or automatically" and then lvm
LVM does not handle md arrays.
> continues to assemble md5 anyway and mount its file system. The rest of my
> partitions are on /dev/md7 (1.0 metadata), which is built on /dev/sd[ab]7. Oh,
> except for /boot, which is on /dev/sda1 with a copy on /dev/sdb1.
>
> So I decided to clean up /etc/mdadm.conf by adding these lines:
>
> DEVICE /dev/sda* /dev/sdb*
> ARRAY /dev/md5 devices=/dev/sda5,/dev/sdb5
> ARRAY /dev/md7 devices=/dev/sda7,/dev/sdb7
> ARRAY /dev/md9 devices=/dev/sda9,/dev/sdb9
>
Perhaps you should not include /dev/md5 here. As you have made a point
of building the array containing the root filesystem with 0.99 metadata,
I would assume that it is being assembled in kernelspace as a result of
CONFIG_MD_AUTODETECT being enabled. Alternatively, perhaps you are using
an initramfs.
Either way, by the time the mdraid init.d script executes, the /dev/md5
array must - by definition - be up and mounted. Does it make a
difference if you add the following line to the config?
AUTO +1.x homehost -all
That will prevent it from considering arrays with 0.99 metadata.
On a related note, despite upstream's efforts to make this as awkward as
possible, it is possible to mimic the kernel's autodetect functionality
in userspace with a config such as this:
HOMEHOST <ignore>
DEVICE partitions
AUTO +1.x -all
Bear in mind that the mdraid script runs `mdadm --assemble --scan`.
There is no need to specifically map out the properties of each array.
This is what the metadata is for.
--Kerin
* Re: [gentoo-user] Software RAID-1
From: Peter Humphrey @ 2014-08-25 9:22 UTC
To: gentoo-user
On Sunday 24 August 2014 19:22:40 Kerin Millar wrote:
> On 24/08/2014 14:51, Peter Humphrey wrote:
--->8
> > So I decided to clean up /etc/mdadm.conf by adding these lines:
> >
> > DEVICE /dev/sda* /dev/sdb*
> > ARRAY /dev/md5 devices=/dev/sda5,/dev/sdb5
> > ARRAY /dev/md7 devices=/dev/sda7,/dev/sdb7
> > ARRAY /dev/md9 devices=/dev/sda9,/dev/sdb9
>
> Perhaps you should not include /dev/md5 here.
I wondered about that.
> As you have made a point of building the array containing the root
> filesystem with 0.99 metadata, ...
...as was instructed in the howto at the time...
> I would assume that it is being assembled in kernelspace as a result of
> CONFIG_MD_AUTODETECT being enabled.
Yes, I think that's what's happening.
> Alternatively, perhaps you are using an initramfs.
Nope.
> Either way, by the time the mdraid init.d script executes, the /dev/md5
> array must - by definition - be up and mounted. Does it make a
> difference if you add the following line to the config?
>
> AUTO +1.x homehost -all
>
> That will prevent it from considering arrays with 0.99 metadata.
No, I get the same result. Just a red asterisk at the left end of the line
after "Starting up RAID devices..."
Now that I look at /etc/init.d/mdraid I see a few things that aren't quite
kosher. The first is that it runs "mdadm -As 2>&1", which returns null after
booting is finished (whence the empty line before the asterisk). Then it tests
for the existence of /dev/md_d*. That also doesn't exist, though /dev/md*
does:
# ls -l /dev/md*
brw-rw---- 1 root disk 9, 0 Aug 25 10:03 /dev/md0
brw-rw---- 1 root disk 9, 5 Aug 25 10:03 /dev/md5
brw-rw---- 1 root disk 9, 7 Aug 25 10:03 /dev/md7
brw-rw---- 1 root disk 9, 9 Aug 25 10:03 /dev/md9
/dev/md:
total 0
lrwxrwxrwx 1 root root 6 Aug 25 10:03 5_0 -> ../md5
lrwxrwxrwx 1 root root 6 Aug 25 10:03 7_0 -> ../md7
lrwxrwxrwx 1 root root 6 Aug 25 10:03 9_0 -> ../md9
Looks like I have some experimenting to do.
I forgot to mention in my first post that, on shutdown, when the script runs
"mdadm -Ss 2>&1" I always get "Cannot get exclusive access to /dev/md5..."
I've always just ignored it until now, but perhaps it's important?
> On a related note, despite upstream's efforts to make this as awkward as
> possible, it is possible to mimic the kernel's autodetect functionality
> in userspace with a config such as this:
>
> HOMEHOST <ignore>
> DEVICE partitions
> AUTO +1.x -all
>
> Bear in mind that the mdraid script runs `mdadm --assemble --scan`.
> There is no need to specifically map out the properties of each array.
> This is what the metadata is for.
Thanks for the info, and the help. The fog is dispersing a bit...
--
Regards
Peter
* Re: [gentoo-user] Software RAID-1
From: Peter Humphrey @ 2014-08-25 11:17 UTC
To: gentoo-user
On Monday 25 August 2014 10:22:31 Peter Humphrey wrote:
> On Sunday 24 August 2014 19:22:40 Kerin Millar wrote:
> > On 24/08/2014 14:51, Peter Humphrey wrote:
> --->8
>
> > > So I decided to clean up /etc/mdadm.conf by adding these lines:
> > >
> > > DEVICE /dev/sda* /dev/sdb*
> > > ARRAY /dev/md5 devices=/dev/sda5,/dev/sdb5
> > > ARRAY /dev/md7 devices=/dev/sda7,/dev/sdb7
> > > ARRAY /dev/md9 devices=/dev/sda9,/dev/sdb9
> >
> > Perhaps you should not include /dev/md5 here.
>
> I wondered about that.
>
> > As you have made a point of building the array containing the root
> > filesystem with 0.99 metadata, ...
>
> ...as was instructed in the howto at the time...
>
> > I would assume that it is being assembled in kernelspace as a result of
> > CONFIG_MD_AUTODETECT being enabled.
>
> Yes, I think that's what's happening.
>
> > Alternatively, perhaps you are using an initramfs.
>
> Nope.
>
> > Either way, by the time the mdraid init.d script executes, the /dev/md5
> > array must - by definition - be up and mounted. Does it make a
> > difference if you add the following line to the config?
> >
> > AUTO +1.x homehost -all
> >
> > That will prevent it from considering arrays with 0.99 metadata.
>
> No, I get the same result. Just a red asterisk at the left end of the line
> after "Starting up RAID devices..."
>
> Now that I look at /etc/init.d/mdraid I see a few things that aren't quite
> kosher. The first is that it runs "mdadm -As 2>&1", which returns null after
> booting is finished (whence the empty line before the asterisk). Then it
> tests for the existence of /dev/md_d*. That also doesn't exist, though
> /dev/md* does:
>
> # ls -l /dev/md*
> brw-rw---- 1 root disk 9, 0 Aug 25 10:03 /dev/md0
> brw-rw---- 1 root disk 9, 5 Aug 25 10:03 /dev/md5
> brw-rw---- 1 root disk 9, 7 Aug 25 10:03 /dev/md7
> brw-rw---- 1 root disk 9, 9 Aug 25 10:03 /dev/md9
>
> /dev/md:
> total 0
> lrwxrwxrwx 1 root root 6 Aug 25 10:03 5_0 -> ../md5
> lrwxrwxrwx 1 root root 6 Aug 25 10:03 7_0 -> ../md7
> lrwxrwxrwx 1 root root 6 Aug 25 10:03 9_0 -> ../md9
>
> Looks like I have some experimenting to do.
Well, it was simple. I just said "rc-update del mdraid boot" and all is now
well. I'd better revisit the docs to see if they still give the same advice.
--
Regards
Peter
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-25 12:18 UTC
To: gentoo-user
On 25/08/2014 10:22, Peter Humphrey wrote:
> On Sunday 24 August 2014 19:22:40 Kerin Millar wrote:
>> On 24/08/2014 14:51, Peter Humphrey wrote:
> --->8
>>> So I decided to clean up /etc/mdadm.conf by adding these lines:
>>>
>>> DEVICE /dev/sda* /dev/sdb*
>>> ARRAY /dev/md5 devices=/dev/sda5,/dev/sdb5
>>> ARRAY /dev/md7 devices=/dev/sda7,/dev/sdb7
>>> ARRAY /dev/md9 devices=/dev/sda9,/dev/sdb9
>>
>> Perhaps you should not include /dev/md5 here.
>
> I wondered about that.
>
>> As you have made a point of building the array containing the root
>> filesystem with 0.99 metadata, ...
>
> ...as was instructed in the howto at the time...
>
>> I would assume that it is being assembled in kernelspace as a result of
>> CONFIG_MD_AUTODETECT being enabled.
>
> Yes, I think that's what's happening.
>
>> Alternatively, perhaps you are using an initramfs.
>
> Nope.
>
>> Either way, by the time the mdraid init.d script executes, the /dev/md5
>> array must - by definition - be up and mounted. Does it make a
>> difference if you add the following line to the config?
>>
>> AUTO +1.x homehost -all
>>
>> That will prevent it from considering arrays with 0.99 metadata.
>
> No, I get the same result. Just a red asterisk at the left end of the line
> after "Starting up RAID devices..."
It has since dawned on me that defining AUTO as such won't help because
you define the arrays explicitly. Can you try again with the mdraid
script in the default runlevel but without the line defining /dev/md5?
>
> Now that I look at /etc/init.d/mdraid I see a few things that aren't quite
> kosher. The first is that it runs "mdadm -As 2>&1", which returns null after
> booting is finished (whence the empty line before the asterisk). Then it tests
Interesting. I think that you should file a bug because the implication
is that mdadm is returning a non-zero exit status in the case of arrays
that have already been assembled. Here's a post from the Arch forums
suggesting the same:
https://bbs.archlinux.org/viewtopic.php?pid=706175#p706175
Is the exit status something other than 1? Try inserting eerror "$?"
immediately after the call to mdadm -As. Granted, it's just an annoyance
but it looks silly, not to mention unduly worrying.
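From memory, the debugging change would look something like this
(paraphrased, not the literal contents of /etc/init.d/mdraid):

    output=$(mdadm -As 2>&1)
    rc=$?
    eerror "mdadm -As exit status: ${rc}"
    eend ${rc} "${output}"

That should at least reveal which status code mdadm is handing back.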
> for the existence of /dev/md_d*. That also doesn't exist, though /dev/md*
> does:
>
> # ls -l /dev/md*
> brw-rw---- 1 root disk 9, 0 Aug 25 10:03 /dev/md0
> brw-rw---- 1 root disk 9, 5 Aug 25 10:03 /dev/md5
> brw-rw---- 1 root disk 9, 7 Aug 25 10:03 /dev/md7
> brw-rw---- 1 root disk 9, 9 Aug 25 10:03 /dev/md9
>
> /dev/md:
> total 0
> lrwxrwxrwx 1 root root 6 Aug 25 10:03 5_0 -> ../md5
> lrwxrwxrwx 1 root root 6 Aug 25 10:03 7_0 -> ../md7
> lrwxrwxrwx 1 root root 6 Aug 25 10:03 9_0 -> ../md9
>
I think this has something to do with partitionable RAID. Yes, it is
possible to superimpose partitions upon an md device, though I have
never seen fit to do so myself. For those that do not, the md_d* device
nodes won't exist.
> Looks like I have some experimenting to do.
>
> I forgot to mention in my first post that, on shutdown, when the script runs
> "mdadm -Ss 2>&1" I always get "Cannot get exclusive access to /dev/md5..."
> I've always just ignored it until now, but perhaps it's important?
I would guess that it's because a) the array hosts the root filesystem and
b) you have the array explicitly defined in mdadm.conf and mdadm is
being called with -s/--scan again.
>
>> On a related note, despite upstream's efforts to make this as awkward as
>> possible, it is possible to mimic the kernel's autodetect functionality
>> in userspace with a config such as this:
>>
>> HOMEHOST <ignore>
>> DEVICE partitions
>> AUTO +1.x -all
>>
>> Bear in mind that the mdraid script runs `mdadm --assemble --scan`.
>> There is no need to specifically map out the properties of each array.
>> This is what the metadata is for.
>
> Thanks for the info, and the help. The fog is dispersing a bit...
>
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-25 12:24 UTC
To: gentoo-user
On 25/08/2014 13:18, Kerin Millar wrote:
<snip>
>>
>> No, I get the same result. Just a red asterisk at the left end of the
>> line
>> after "Starting up RAID devices..."
>
> It has since dawned on me that defining AUTO as such won't help because
> you define the arrays explicitly. Can you try again with the mdraid
> script in the default runlevel but without the line defining /dev/md5?
Sorry, I wasn't clear. Would you remove/comment the line describing
/dev/md5 but also include this line:-
AUTO +1.x -all
--Kerin
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-25 12:35 UTC
To: gentoo-user
On 25/08/2014 12:17, Peter Humphrey wrote:
<snip>
> Well, it was simple. I just said "rc-update del mdraid boot" and all is now
> well. I'd better revisit the docs to see if they still give the same advice.
>
> -- Regards Peter
Very interesting indeed. I now wonder if this is a race condition
between the init script running `mdadm -As` and the fact that the mdadm
package installs udev rules that allow for automatic incremental
assembly? Refer to /lib/udev/rules.d/64-md-raid.rules and you'll see
that it calls `mdadm --incremental` for newly added devices.
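From memory the rule is roughly of this shape - paraphrased, so check the
installed file for the exact wording:

SUBSYSTEM=="block", ACTION=="add", ENV{ID_FS_TYPE}=="linux_raid_member", \
    RUN+="/sbin/mdadm --incremental $devnode"

In other words, as soon as udev sees a block device carrying an md
superblock, it hands it to mdadm for incremental assembly.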
With that in mind, here's something else for you to try. Doing this will
render these udev rules null and void:
# touch /etc/udev/rules.d/64-md-raid.rules
Thereafter, the mdraid script will be the only agent trying to assemble
the 1.x metadata arrays so make sure that it is re-enabled.
I'm not actually sure that there is any point in calling mdadm -As where
the udev rules are present. I would expect it to be one approach or the
other, but not both at the same time.
Incidentally, the udev rules were a source of controversy in the
following bug. Not everyone appreciates that they are installed by default.
https://bugs.gentoo.org/show_bug.cgi?id=401707
--Kerin
* Re: [gentoo-user] Software RAID-1
From: Peter Humphrey @ 2014-08-25 16:51 UTC
To: gentoo-user
On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
> On 25/08/2014 12:17, Peter Humphrey wrote:
>
> <snip>
>
> > Well, it was simple. I just said "rc-update del mdraid boot" and all is
> > now
> > well. I'd better revisit the docs to see if they still give the same
> > advice.
>
> Very interesting indeed.
You wrote this e-mail after the other two, so I'll stick to this route,
leaving the other idea for later if needed.
> I now wonder if this is a race condition between the init script running
> `mdadm -As` and the fact that the mdadm package installs udev rules that
> allow for automatic incremental assembly?
Isn't it just that, with the kernel auto-assembly of the root partition, and
udev rules having assembled the rest, all the work's been done by the time the
mdraid init script is called? I had wondered about the time that udev startup
takes; assembling the raids would account for it.
> Refer to /lib/udev/rules.d/64-md-raid.rules and you'll see that it calls
> `mdadm --incremental` for newly added devices.
# ls -l /lib/udev/rules.d | grep raid
-rw-r--r-- 1 root root 2.1K Aug 23 10:34 63-md-raid-arrays.rules
-rw-r--r-- 1 root root 1.4K Aug 23 10:34 64-md-raid-assembly.rules
> With that in mind, here's something else for you to try. Doing this will
> render these udev rules null and void:
>
> # touch /etc/udev/rules.d/64-md-raid.rules
I did that, but I think I need instead to
# touch /etc/udev/rules.d/63-md-raid-arrays.rules
# touch /etc/udev/rules.d/64-md-raid-assembly.rules
I'll try it now...
> Thereafter, the mdraid script will be the only agent trying to assemble
> the 1.x metadata arrays so make sure that it is re-enabled.
Right. Here's the position:
1. I've left /etc/init.d/mdraid out of all run levels. I have nothing but
comments in mdadm.conf, but then it's not likely to be read anyway if the
init script isn't running.
2. I have empty /etc/udev rules files as above.
3. I have kernel auto-assembly of raid enabled.
4. I don't use an init ram disk.
5. The root partition is on /dev/md5 (0.99 metadata)
6. All other partitions except /boot are under /dev/vg7 which is built on
top of /dev/md7 (1.x metadata).
7. The system boots normally.
> I'm not actually sure that there is any point in calling mdadm -As where
> the udev rules are present. I would expect it to be one approach or the
> other, but not both at the same time.
That makes sense to me too. Do I even need sys-fs/mdadm installed? Maybe I'll
try removing it. I have a little rescue system in the same box, so it'd be
easy to put it back if necessary.
> Incidentally, the udev rules were a source of controversy in the
> following bug. Not everyone appreciates that they are installed by default.
>
> https://bugs.gentoo.org/show_bug.cgi?id=401707
I'll have a look at that - thanks.
--
Regards
Peter
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-25 17:46 UTC
To: gentoo-user
On 25/08/2014 17:51, Peter Humphrey wrote:
> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
>> On 25/08/2014 12:17, Peter Humphrey wrote:
>>
>> <snip>
>>
>>> Well, it was simple. I just said "rc-update del mdraid boot" and all is
>>> now
>>> well. I'd better revisit the docs to see if they still give the same
>>> advice.
>>
>> Very interesting indeed.
>
> You wrote this e-mail after the other two, so I'll stick to this route,
> leaving the other idea for later if needed.
>
>> I now wonder if this is a race condition between the init script running
>> `mdadm -As` and the fact that the mdadm package installs udev rules that
>> allow for automatic incremental assembly?
>
> Isn't it just that, with the kernel auto-assembly of the root partition, and
> udev rules having assembled the rest, all the work's been done by the time the
> mdraid init script is called? I had wondered about the time that udev startup
> takes; assembling the raids would account for it.
Yes, it's a possibility and would constitute a race condition - even
though it might ultimately be a harmless one. As touched upon in the
preceding post, I'd really like to know why mdadm sees fit to return a
non-zero exit code given that the arrays are actually assembled
successfully.
After all, even if the arrays are assembled at the point that mdadm is
executed by the mdraid init script, partially or fully, it surely ought
not to matter. As long as the arrays are fully assembled by the time
mdadm exits, it should return 0 to signify success. Nothing else makes
sense, in my opinion. It's absurd that the mdraid script is drawn into
printing a blank error message where nothing has gone wrong.
Further, the mdadm ebuild still prints elog messages stating that mdraid
is a requirement for the boot runlevel but, with udev rules, I don't see
how that can be true. With udev being event-driven and calling mdadm
upon the introduction of a new device, the array should be up and
running as of the very moment that all the disks are seen, no matter
whether the mdraid init script is executed or not.
>> Refer to /lib/udev/rules.d/64-md-raid.rules and you'll see that it calls
>> `mdadm --incremental` for newly added devices.
>
> # ls -l /lib/udev/rules.d | grep raid
> -rw-r--r-- 1 root root 2.1K Aug 23 10:34 63-md-raid-arrays.rules
> -rw-r--r-- 1 root root 1.4K Aug 23 10:34 64-md-raid-assembly.rules
>
>> With that in mind, here's something else for you to try. Doing this will
>> render these udev rules null and void:
>>
>> # touch /etc/udev/rules.d/64-md-raid.rules
>
> I did that, but I think I need instead to
> # touch /etc/udev/rules.d/63-md-raid-arrays.rules
> # touch /etc/udev/rules.d/64-md-raid-assembly.rules
Ah, yes. Looks like the rules have changed in >=mdadm-3.3. I'm still
using mdadm-3.2.6-r1.
>
> I'll try it now...
>
>> Thereafter, the mdraid script will be the only agent trying to assemble
>> the 1.x metadata arrays so make sure that it is re-enabled.
>
> Right. Here's the position:
> 1. I've left /etc/init.d/mdraid out of all run levels. I have nothing but
> comments in mdadm.conf, but then it's not likely to be read anyway if the
> init script isn't running.
> 2. I have empty /etc/udev rules files as above.
> 3. I have kernel auto-assembly of raid enabled.
> 4. I don't use an init ram disk.
> 5. The root partition is on /dev/md5 (0.99 metadata)
> 6. All other partitions except /boot are under /dev/vg7 which is built on
> top of /dev/md7 (1.x metadata).
> 7. The system boots normally.
I must confess that this boggles my mind. Under these circumstances, I
cannot fathom how - or when - the 1.x arrays are being assembled.
Something has to be executing mdadm at some point.
>
>> I'm not actually sure that there is any point in calling mdadm -As where
>> the udev rules are present. I would expect it to be one approach or the
>> other, but not both at the same time.
>
> That makes sense to me too. Do I even need sys-fs/mdadm installed? Maybe I'll
> try removing it. I have a little rescue system in the same box, so it'd be
> easy to put it back if necessary.
Yes, you need mdadm because 1.x metadata arrays must be assembled in
userspace. In Gentoo, there are three contexts I know of in which this
may occur:-
1) Within an initramfs
2) As a result of the udev rules
3) As a result of the mdraid script
>
>> Incidentally, the udev rules were a source of controversy in the
>> following bug. Not everyone appreciates that they are installed by default.
>>
>> https://bugs.gentoo.org/show_bug.cgi?id=401707
>
> I'll have a look at that - thanks.
>
--Kerin
* Re: [gentoo-user] Software RAID-1
From: Peter Humphrey @ 2014-08-26 9:38 UTC
To: gentoo-user
On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
> On 25/08/2014 17:51, Peter Humphrey wrote:
> > On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
> >> I now wonder if this is a race condition between the init script running
> >> `mdadm -As` and the fact that the mdadm package installs udev rules that
> >> allow for automatic incremental assembly?
> >
> > Isn't it just that, with the kernel auto-assembly of the root partition,
> > and udev rules having assembled the rest, all the work's been done by the
> > time the mdraid init script is called? I had wondered about the time that
> > udev startup takes; assembling the raids would account for it.
>
> Yes, it's a possibility and would constitute a race condition - even
> though it might ultimately be a harmless one.
I thought a race involved the competitors setting off at more-or-less the same
time, not one waiting until the other had finished. No matter.
> As touched upon in the preceding post, I'd really like to know why mdadm
> sees fit to return a non-zero exit code given that the arrays are actually
> assembled successfully.
I can see why a dev might think "I haven't managed to do my job" here.
> After all, even if the arrays are assembled at the point that mdadm is
> executed by the mdraid init script, partially or fully, it surely ought
> not to matter. As long as the arrays are fully assembled by the time
> mdadm exits, it should return 0 to signify success. Nothing else makes
> sense, in my opinion. It's absurd that the mdraid script is drawn into
> printing a blank error message where nothing has gone wrong.
I agree, that is absurd.
> Further, the mdadm ebuild still prints elog messages stating that mdraid
> is a requirement for the boot runlevel but, with udev rules, I don't see
> how that can be true. With udev being event-driven and calling mdadm
> upon the introduction of a new device, the array should be up and
> running as of the very moment that all the disks are seen, no matter
> whether the mdraid init script is executed or not.
We agree again. The question is what to do about it. Maybe a bug report
against mdadm?
--->8
> > Right. Here's the position:
> > 1. I've left /etc/init.d/mdraid out of all run levels. I have nothing but
> > comments in mdadm.conf, but then it's not likely to be read anyway if the
> > init script isn't running.
> > 2. I have empty /etc/udev rules files as above.
> > 3. I have kernel auto-assembly of raid enabled.
> > 4. I don't use an init ram disk.
> > 5. The root partition is on /dev/md5 (0.99 metadata)
> > 6. All other partitions except /boot are under /dev/vg7 which is built on
> > top of /dev/md7 (1.x metadata).
> > 7. The system boots normally.
>
> I must confess that this boggles my mind. Under these circumstances, I
> cannot fathom how - or when - the 1.x arrays are being assembled.
> Something has to be executing mdadm at some point.
I think it's udev. I had a look at the rules, but I no grok. I do see
references to mdadm though.
> > Do I even need sys-fs/mdadm installed? Maybe
> > I'll try removing it. I have a little rescue system in the same box, so
> > it'd be easy to put it back if necessary.
>
> Yes, you need mdadm because 1.x metadata arrays must be assembled in
> userspace.
I realised after writing that that I may well need it for maintenance. I'd do
that from my rescue system though, which does have it installed, so I think I
can ditch it from the main system.
--
Regards
Peter
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-26 13:21 UTC
To: gentoo-user
On 26/08/2014 10:38, Peter Humphrey wrote:
> On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
>> On 25/08/2014 17:51, Peter Humphrey wrote:
>>> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
>>>> I now wonder if this is a race condition between the init script running
>>>> `mdadm -As` and the fact that the mdadm package installs udev rules that
>>>> allow for automatic incremental assembly?
>>>
>>> Isn't it just that, with the kernel auto-assembly of the root partition,
>>> and udev rules having assembled the rest, all the work's been done by the
>>> time the mdraid init script is called? I had wondered about the time that
>>> udev startup takes; assembling the raids would account for it.
>>
>> Yes, it's a possibility and would constitute a race condition - even
>> though it might ultimately be a harmless one.
>
> I thought a race involved the competitors setting off at more-or-less the same
> time, not one waiting until the other had finished. No matter.
The mdraid script can assemble arrays and runs at a particular point in
the boot sequence. The udev rules can also assemble arrays and, being
event-driven, I suspect that they are likely to prevail. The point is
that both the sequence and timing of these two mechanisms are not
deterministic. There is definitely the potential for a race condition. I
just don't yet know whether it is a harmful race condition.
>
>> As touched upon in the preceding post, I'd really like to know why mdadm
>> sees fit to return a non-zero exit code given that the arrays are actually
>> assembled successfully.
>
> I can see why a dev might think "I haven't managed to do my job" here.
It may be that mdadm returns different non-zero exit codes depending on
the exact circumstances. It does have this characteristic for certain
other operations (such as -t --detail).
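For example, if memory serves, --detail --test sets the exit status
according to the state of the array (see the EXIT STATUS section of man
mdadm before relying on the exact values):

# mdadm --detail --test /dev/md7; echo $?

0 should indicate a healthy array, with other values for degraded, failed
or missing arrays.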
>
>> After all, even if the arrays are assembled at the point that mdadm is
>> executed by the mdraid init script, partially or fully, it surely ought
>> not to matter. As long as the arrays are fully assembled by the time
>> mdadm exits, it should return 0 to signify success. Nothing else makes
>> sense, in my opinion. It's absurd that the mdraid script is drawn into
>> printing a blank error message where nothing has gone wrong.
>
> I agree, that is absurd.
>
>> Further, the mdadm ebuild still prints elog messages stating that mdraid
>> is a requirement for the boot runlevel but, with udev rules, I don't see
>> how that can be true. With udev being event-driven and calling mdadm
>> upon the introduction of a new device, the array should be up and
>> running as of the very moment that all the disks are seen, no matter
>> whether the mdraid init script is executed or not.
>
> We agree again. The question is what to do about it. Maybe a bug report
> against mdadm?
Definitely. Again, can you find out what the exit status is under the
circumstances that mdadm produces a blank error? I am hoping it is
something other than 1. If so, solving this problem might be as simple
as having the mdraid script consider only a specific non-zero value to
indicate an intractable error.
There is also the matter of whether it makes sense to explicitly
assemble the arrays in the script where udev rules are already doing the
job. However, I think this would require further investigation before
considering making a bug of it.
>
> --->8
>
>>> Right. Here's the position:
>>> 1. I've left /etc/init.d/mdraid out of all run levels. I have nothing but
>>> comments in mdadm.conf, but then it's not likely to be read anyway if the
>>> init script isn't running.
>>> 2. I have empty /etc/udev rules files as above.
>>> 3. I have kernel auto-assembly of raid enabled.
>>> 4. I don't use an init ram disk.
>>> 5. The root partition is on /dev/md5 (0.99 metadata)
>>> 6. All other partitions except /boot are under /dev/vg7 which is built on
>>> top of /dev/md7 (1.x metadata).
>>> 7. The system boots normally.
>>
>> I must confess that this boggles my mind. Under these circumstances, I
>> cannot fathom how - or when - the 1.x arrays are being assembled.
>> Something has to be executing mdadm at some point.
>
> I think it's udev. I had a look at the rules, but I no grok. I do see
> references to mdadm though.
So would I, only you said in step 2 that you have "empty" rules, which I
take to mean that you had overridden the mdadm-provided udev rules with
empty files. If all of the conditions you describe were true, you would
have eliminated all three of the aforementioned contexts in which mdadm
can be invoked. Given that mdadm is needed to assemble your 1.x arrays
(see below), I would expect such conditions to result in mount errors on
account of the missing arrays.
>
>>> Do I even need sys-fs/mdadm installed? Maybe
>>> I'll try removing it. I have a little rescue system in the same box, so
>>> it'd be easy to put it back if necessary.
>>
>> Yes, you need mdadm because 1.x metadata arrays must be assembled in
>> userspace.
>
> I realised after writing that that I may well need it for maintenance. I'd do
> that from my rescue system though, which does have it installed, so I think I
> can ditch it from the main system.
Again, 1.x arrays must be assembled in userspace. The kernel cannot
assemble them by itself as it can with 0.9x arrays. If you uninstall
mdadm, you will be removing the very userspace tool that is employed for
assembly. Neither udev nor mdraid will be able to execute it, which
cannot end well.
It's a different matter when using an initramfs, because it will bundle
and make use of its own copy of mdadm.
--Kerin
* Re: [gentoo-user] Software RAID-1
From: Peter Humphrey @ 2014-08-26 14:54 UTC
To: gentoo-user
On Tuesday 26 August 2014 14:21:19 Kerin Millar wrote:
> On 26/08/2014 10:38, Peter Humphrey wrote:
> > On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
> >> On 25/08/2014 17:51, Peter Humphrey wrote:
> >>> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
--->8
> Again, can you find out what the exit status is under the circumstances that
> mdadm produces a blank error? I am hoping it is something other than 1.
I've remerged mdadm to run this test. I'll report the result in a moment.
[...] In fact it returned status 1. Sorry to disappoint :)
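(What I ran, more or less, with all the arrays already assembled, was:

# mdadm -As 2>&1; echo "exit status: $?"
exit status: 1

The first command printed nothing, which matches the blank line the init
script shows.)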
> >>> Here's the position:
> >>> 1. I've left /etc/init.d/mdraid out of all run levels. I have nothing
> >>> but comments in mdadm.conf, but then it's not likely to be read anyway
> >>> if the init script isn't running.
> >>> 2. I have empty /etc/udev rules files as above.
> >>> 3. I have kernel auto-assembly of raid enabled.
> >>> 4. I don't use an init ram disk.
> >>> 5. The root partition is on /dev/md5 (0.99 metadata)
> >>> 6. All other partitions except /boot are under /dev/vg7 which is built
> >>> on top of /dev/md7 (1.x metadata).
> >>> 7. The system boots normally.
> >>
> >> I must confess that this boggles my mind. Under these circumstances, I
> >> cannot fathom how - or when - the 1.x arrays are being assembled.
> >> Something has to be executing mdadm at some point.
> >
> > I think it's udev. I had a look at the rules, but I no grok. I do see
> > references to mdadm though.
> So would I, only you said in step 2 that you have "empty" rules, which I
> take to mean that you had overridden the mdadm-provided udev rules with
> empty files.
Correct; that's what I did, but since removing mdadm I've also removed the
corresponding, empty /etc/udev files.
I don't think it's udev any more; I now think the kernel is cleverer than we
gave it credit for (see below and attached dmesg).
> If all of the conditions you describe were true, you would have eliminated
> all three of the aforementioned contexts in which mdadm can be invoked. Given
> that mdadm is needed to assemble your 1.x arrays (see below), I would expect
> such conditions to result in mount errors on account of the missing arrays.
--->8
> Again, 1.x arrays must be assembled in userspace. The kernel cannot
> assemble them by itself as it can with 0.9x arrays. If you uninstall
> mdadm, you will be removing the very userspace tool that is employed for
> assembly. Neither udev nor mdraid will be able to execute it, which
> cannot end well.
I had done that, with no ill effect. I've just booted the box with no mdadm
present. It seems the kernel can after all assemble the arrays (see attached
dmesg.txt, edited). Or maybe I was wrong about the metadata and they're all
0.99. In course of checking this I tried a couple of things:
# lvm pvck /dev/md7
Found label on /dev/md7, sector 1, type=LVM2 001
Found text metadata area: offset=4096, size=1044480
# lvm vgdisplay
  --- Volume group ---
  VG Name               vg7
  System ID
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  14
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                13
  Open LV               13
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               500.00 GiB
  PE Size               4.00 MiB
  Total PE              127999
  Alloc PE / Size       108800 / 425.00 GiB
  Free  PE / Size       19199 / 75.00 GiB
  VG UUID               ll8OHc-if2H-DVTf-AxrQ-5EW0-FOLM-Z73y0z
Can you tell from that which metadata version I used when I created vg7? It
looks like 1.x to me, since man lvm refers to formats (=metadata types) lvm1
and lvm2 - or am I reading too much into that?
See here what the postinst message said when I remerged sys-fs/mdadm-3.3.1-r2
for the return-code test you asked for:
* If you're not relying on kernel auto-detect of your RAID
* devices, you need to add 'mdraid' to your 'boot' runlevel:
* rc-update add mdraid boot
Could be thought ambiguous.
Is nobody else experiencing this behaviour?
--
Regards
Peter
[-- Attachment #2: dmesg.txt --]
I seem to have a BIOS problem here. I switched DMA relocation off in the
kernel config when I found this error the first time, but it still appears,
as you see.
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: CPU: 0 PID: 0 at drivers/iommu/dmar.c:503 warn_invalid_dmar+0x7c/0x8e()
[ 0.000000] Your BIOS is broken; DMAR reported at address fed90000 returns all ones!
BIOS vendor: American Megatrends Inc.; Ver: 1102 ; Product Version: System Version
[ 0.000000] Modules linked in:
[ 0.000000] CPU: 0 PID: 0 Comm: swapper Not tainted 3.14.14-gentoo #2
[ 0.000000] Hardware name: System manufacturer System Product Name/P7P55D, BIOS 1102 11/23/2009
[ 0.000000] 000000000000000b ffffffff81801e08 ffffffff814ba5c6 00000000000000a8
[ 0.000000] ffffffff81801e58 ffffffff81801e48 ffffffff81041887 ffffffff81801e68
[ 0.000000] ffffffff81af001c ffffffff81af0058 00000000fed90000 0000000000000000
[ 0.000000] Call Trace:
[ 0.000000] [<ffffffff814ba5c6>] dump_stack+0x46/0x58
[ 0.000000] [<ffffffff81041887>] warn_slowpath_common+0x87/0xb0
[ 0.000000] [<ffffffff8104190a>] warn_slowpath_fmt_taint+0x3a/0x40
[ 0.000000] [<ffffffff812e06bf>] ? acpi_tb_verify_checksum+0x20/0x55
[ 0.000000] [<ffffffff814bb97e>] warn_invalid_dmar+0x7c/0x8e
[ 0.000000] [<ffffffff818c81e9>] detect_intel_iommu+0xd4/0x13e
[ 0.000000] [<ffffffff81897e07>] pci_iommu_alloc+0x4a/0x72
[ 0.000000] [<ffffffff818a43b6>] mem_init+0x9/0x3e
[ 0.000000] [<ffffffff81893ca2>] start_kernel+0x1f8/0x35f
[ 0.000000] [<ffffffff818938a9>] ? repair_env_string+0x5e/0x5e
[ 0.000000] [<ffffffff818935af>] x86_64_start_reservations+0x2a/0x2c
[ 0.000000] [<ffffffff818936a9>] x86_64_start_kernel+0xf8/0xfc
[ 0.000000] ---[ end trace 492cc958e666c6fa ]---
Nevertheless, I still get this:
[ 0.108051] Last level iTLB entries: 4KB 512, 2MB 7, 4MB 7
Last level dTLB entries: 4KB 512, 2MB 32, 4MB 32, 1GB 0
tlb_flushall_shift: 6
[ 0.108330] Freeing SMP alternatives memory: 20K (ffffffff81951000 - ffffffff81956000)
[ 0.108515] dmar: Host address width 36
[ 0.108605] dmar: DRHD base: 0x000000fed90000 flags: 0x1
[ 0.108704] dmar: IOMMU: failed to map dmar0
[ 0.108794] dmar: parse DMAR table failure.
[ 0.109268] ..TIMER: vector=0x30 apic1=0 pin1=2 apic2=-1 pin2=-1
[ 0.119369] smpboot: CPU0: Intel(R) Core(TM) i5 CPU 750 @ 2.67GHz (fam: 06, model: 1e, stepping: 05)
Here's where disk detection starts:
[ 0.340082] ahci 0000:00:1f.2: version 3.0
[ 0.340170] ahci 0000:00:1f.2: irq 44 for MSI/MSI-X
[ 0.340192] ahci 0000:00:1f.2: SSS flag set, parallel bus scan disabled
[ 0.351054] ahci 0000:00:1f.2: AHCI 0001.0300 32 slots 6 ports 3 Gbps 0x3f impl SATA mode
[ 0.351196] ahci 0000:00:1f.2: flags: 64bit ncq sntf stag pm led clo pmp pio slum part ems sxs apst
[ 0.362882] scsi0 : ahci
[ 0.363489] scsi1 : ahci
[ 0.364060] scsi2 : ahci
[ 0.364673] scsi3 : ahci
[ 0.365266] scsi4 : ahci
[ 0.365881] scsi5 : ahci
[ 0.366044] ata1: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7100 irq 44
[ 0.366185] ata2: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7180 irq 44
[ 0.366326] ata3: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7200 irq 44
[ 0.366466] ata4: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7280 irq 44
[ 0.366606] ata5: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7300 irq 44
[ 0.366747] ata6: SATA max UDMA/133 abar m2048@0xf7ff7000 port 0xf7ff7380 irq 44
[ 0.366966] i8042: PNP: PS/2 Controller [PNP0303:PS2K,PNP0f03:PS2M] at 0x60,0x64 irq 1,12
[ 0.370095] serio: i8042 KBD port at 0x60,0x64 irq 1
[ 0.370192] serio: i8042 AUX port at 0x60,0x64 irq 12
[ 0.370478] mousedev: PS/2 mouse device common for all mice
[ 0.370602] md: raid1 personality registered for level 1
[ 0.370958] device-mapper: ioctl: 4.27.0-ioctl (2013-10-30) initialised: dm-devel@redhat.com
[ 0.371119] hidraw: raw HID events driver (C) Jiri Kosina
[ 0.372090] snd_hda_intel 0000:00:1b.0: irq 45 for MSI/MSI-X
[ 0.372208] NET: Registered protocol family 17
[ 0.372304] NET: Registered protocol family 15
[ 0.372400] Key type dns_resolver registered
[ 0.372866] registered taskstats version 1
[ 0.373630] ALSA device list:
[ 0.373728] No soundcards found.
[ 0.396880] input: AT Translated Set 2 keyboard as /devices/platform/i8042/serio0/input/input0
[ 0.672182] ata1: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 0.678134] ata1.00: ATA-8: SAMSUNG HD103SJ, 1AJ100E4, max UDMA/133
[ 0.678242] ata1.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[ 0.684315] ata1.00: configured for UDMA/133
[ 0.684729] scsi 0:0:0:0: Direct-Access ATA SAMSUNG HD103SJ 1AJ1 PQ: 0 ANSI: 5
[ 0.685048] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 0.685234] sd 0:0:0:0: [sda] Write Protect is off
[ 0.685326] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[ 0.685337] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 0.747681] sda: sda1 sda2 sda3 sda4 < sda5 sda6 sda7 sda8 sda9 >
[ 0.749559] sd 0:0:0:0: [sda] Attached SCSI disk
[ 0.990341] ata2: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 0.996335] ata2.00: ATA-8: SAMSUNG HD103SJ, 1AJ100E4, max UDMA/133
[ 0.996443] ata2.00: 1953525168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
[ 1.002515] ata2.00: configured for UDMA/133
[ 1.002917] scsi 1:0:0:0: Direct-Access ATA SAMSUNG HD103SJ 1AJ1 PQ: 0 ANSI: 5
[ 1.003243] sd 1:0:0:0: [sdb] 1953525168 512-byte logical blocks: (1.00 TB/931 GiB)
[ 1.003418] sd 1:0:0:0: [sdb] Write Protect is off
[ 1.003511] sd 1:0:0:0: [sdb] Mode Sense: 00 3a 00 00
[ 1.003522] sd 1:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[ 1.043495] sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
[ 1.045159] sd 1:0:0:0: [sdb] Attached SCSI disk
[ 1.308476] ata3: SATA link up 1.5 Gbps (SStatus 113 SControl 300)
[ 1.310264] ata3.00: ATAPI: Optiarc DVD RW AD-7240S, 1.02, max UDMA/100
[ 1.312470] ata3.00: configured for UDMA/100
[ 1.314570] scsi 2:0:0:0: CD-ROM Optiarc DVD RW AD-7240S 1.02 PQ: 0 ANSI: 5
[ 1.337522] tsc: Refined TSC clocksource calibration: 2674.966 MHz
[ 1.619654] ata4: SATA link down (SStatus 0 SControl 300)
[ 1.924871] ata5: SATA link down (SStatus 0 SControl 300)
[ 2.229959] ata6: SATA link down (SStatus 0 SControl 300)
[ 2.230247] md: Waiting for all devices to be available before autodetect
[ 2.230354] md: If you don't use raid, use raid=noautodetect
[ 2.230706] md: Autodetecting RAID arrays.
[ 2.291257] md: Scanned 6 and added 6 devices.
[ 2.291359] md: autorun ...
[ 2.291451] md: considering sdb9 ...
[ 2.291542] md: adding sdb9 ...
[ 2.291631] md: sdb7 has different UUID to sdb9
[ 2.291722] md: sdb5 has different UUID to sdb9
[ 2.291814] md: adding sda9 ...
[ 2.291911] md: sda7 has different UUID to sdb9
[ 2.292003] md: sda5 has different UUID to sdb9
[ 2.292381] md: created md9
[ 2.292479] md: bind<sda9>
[ 2.292576] md: bind<sdb9>
[ 2.292670] md: running: <sdb9><sda9>
[ 2.293140] kworker/u8:5 (67) used greatest stack depth: 6728 bytes left
[ 2.293351] kworker/u8:5 (66) used greatest stack depth: 6688 bytes left
[ 2.293837] md/raid1:md9: active with 2 out of 2 mirrors
[ 2.293982] md9: detected capacity change from 0 to 405345533952
[ 2.294087] md: considering sdb7 ...
[ 2.294178] md: adding sdb7 ...
[ 2.294267] md: sdb5 has different UUID to sdb7
[ 2.294359] md: adding sda7 ...
[ 2.294448] md: sda5 has different UUID to sdb7
[ 2.294634] md: created md7
[ 2.294723] md: bind<sda7>
[ 2.294820] md: bind<sdb7>
[ 2.294921] md: running: <sdb7><sda7>
[ 2.295504] md/raid1:md7: active with 2 out of 2 mirrors
[ 2.295632] md7: detected capacity change from 0 to 536870846464
[ 2.295738] md: considering sdb5 ...
[ 2.295828] md: adding sdb5 ...
[ 2.295948] md: adding sda5 ...
[ 2.296248] md: created md5
[ 2.296345] md: bind<sda5>
[ 2.296440] md: bind<sdb5>
[ 2.296534] md: running: <sdb5><sda5>
[ 2.297089] md/raid1:md5: active with 2 out of 2 mirrors
[ 2.297219] md5: detected capacity change from 0 to 21474770944
[ 2.297316] md: ... autorun DONE.
[ 2.319988] md5: unknown partition table
[ 2.338158] Switched to clocksource tsc
[ 2.346224] EXT4-fs (md5): mounted filesystem with ordered data mode. Opts: (null)
[ 2.346370] VFS: Mounted root (ext4 filesystem) readonly on device 9:5.
[ 2.365114] devtmpfs: mounted
[ 2.366034] Freeing unused kernel memory: 840K (ffffffff8187f000 - ffffffff81951000)
[ 2.366174] Write protecting the kernel read-only data: 8192k
[ 2.369403] Freeing unused kernel memory: 1252K (ffff8800014c7000 - ffff880001600000)
[ 2.370560] Freeing unused kernel memory: 424K (ffff880001796000 - ffff880001800000)
[ 2.416614] random: nonblocking pool is initialized
[ 3.003048] setfont (84) used greatest stack depth: 4176 bytes left
Here's where udev starts (it's the next entry in dmesg - I haven't cut
anything here):
[ 4.831342] systemd-udevd[446]: starting version 215
[ 5.218493] input: Power Button as /devices/LNXSYSTM:00/device:00/PNP0C0C:00/input/input2
[ 5.218498] ACPI: Power Button [PWRB]
[ 5.218550] input: Power Button as /devices/LNXSYSTM:00/LNXPWRBN:00/input/input3
[ 5.218552] ACPI: Power Button [PWRF]
[ 5.219707] rtc_cmos 00:02: RTC can wake from S4
[ 5.219819] rtc_cmos 00:02: rtc core: registered rtc_cmos as rtc0
[ 5.219845] rtc_cmos 00:02: alarms up to one month, y3k, 114 bytes nvram, hpet irqs
And now the file-systems:
[ 8.847508] EXT4-fs (md5): re-mounted. Opts: (null)
[ 9.099859] mount (867) used greatest stack depth: 4136 bytes left
[ 9.203031] Adding 2097148k swap on /dev/sda3. Priority:10 extents:1 across:2097148k FS
[ 9.207315] Adding 2097148k swap on /dev/sdb3. Priority:10 extents:1 across:2097148k FS
[ 9.216701] Adding 20971516k swap on /dev/sda6. Priority:1 extents:1 across:20971516k FS
[ 9.224521] Adding 20971516k swap on /dev/sdb6. Priority:1 extents:1 across:20971516k FS
[ 9.292744] EXT4-fs (dm-0): mounted filesystem with ordered data mode. Opts: (null)
[ 9.362160] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: (null)
[ 9.386163] EXT4-fs (dm-1): mounted filesystem with ordered data mode. Opts: (null)
[ 9.419314] EXT4-fs (dm-9): mounted filesystem with ordered data mode. Opts: (null)
[ 9.461981] EXT4-fs (dm-3): mounted filesystem with ordered data mode. Opts: (null)
[ 9.511962] EXT4-fs (dm-4): mounted filesystem with ordered data mode. Opts: (null)
[ 9.590775] EXT4-fs (dm-5): mounted filesystem with ordered data mode. Opts: (null)
[ 9.622095] EXT4-fs (dm-6): mounted filesystem with ordered data mode. Opts: (null)
[ 9.643743] EXT4-fs (dm-7): mounted filesystem with ordered data mode. Opts: (null)
[ 9.674996] EXT4-fs (dm-8): mounted filesystem with ordered data mode. Opts: (null)
[ 9.699230] EXT4-fs (dm-12): mounted filesystem with ordered data mode. Opts: (null)
[ 13.602089] r8169 0000:02:00.0 eth0: link down
[ 13.602108] r8169 0000:02:00.0 eth0: link down
[ 13.602150] ip (1728) used greatest stack depth: 2520 bytes left
[ 14.281876] NET: Registered protocol family 10
[ 14.282472] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 15.205672] w83627ehf: Found W83667HG-B chip at 0x290
[ 15.989801] r8169 0000:02:00.0 eth0: link up
[ 15.989812] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 18.788649] ip_tables: (C) 2000-2006 Netfilter Core Team
[ 20.034863] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[ 31.426556] EXT4-fs (md5): re-mounted. Opts: commit=0
[ 31.627416] EXT4-fs (dm-0): re-mounted. Opts: commit=0
[ 31.685620] EXT4-fs (dm-2): re-mounted. Opts: commit=0
[ 31.939499] EXT4-fs (dm-1): re-mounted. Opts: commit=0
[ 32.010016] EXT4-fs (dm-9): re-mounted. Opts: commit=0
[ 32.033846] EXT4-fs (dm-3): re-mounted. Opts: commit=0
[ 32.075817] EXT4-fs (dm-4): re-mounted. Opts: commit=0
[ 32.117996] EXT4-fs (dm-5): re-mounted. Opts: commit=0
[ 32.251796] EXT4-fs (dm-6): re-mounted. Opts: commit=0
[ 32.293969] EXT4-fs (dm-7): re-mounted. Opts: commit=0
[ 32.308344] EXT4-fs (dm-8): re-mounted. Opts: commit=0
[ 32.351373] EXT4-fs (dm-12): re-mounted. Opts: commit=0
Remaining entries snipped.
* Re: [gentoo-user] Software RAID-1
From: Kerin Millar @ 2014-08-26 16:00 UTC
To: gentoo-user
On 26/08/2014 15:54, Peter Humphrey wrote:
> On Tuesday 26 August 2014 14:21:19 Kerin Millar wrote:
>> On 26/08/2014 10:38, Peter Humphrey wrote:
>>> On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
>>>> On 25/08/2014 17:51, Peter Humphrey wrote:
>>>>> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
> --->8
>> Again, can you find out what the exit status is under the circumstances that
>> mdadm produces a blank error? I am hoping it is something other than 1.
>
> I've remerged mdadm to run this test. I'll report the result in a moment.
> [...] In fact it returned status 1. Sorry to disappoint :)
Thanks for testing. Can you tell me exactly what /etc/mdadm.conf
contained at the time?
>
>>>>> Here's the position:
>>>>> 1. I've left /etc/init.d/mdraid out of all run levels. I have nothing
>>>>> but comments in mdadm.conf, but then it's not likely to be read anyway
>>>>> if the init script isn't running.
>>>>> 2. I have empty /etc/udev rules files as above.
>>>>> 3. I have kernel auto-assembly of raid enabled.
>>>>> 4. I don't use an init ram disk.
>>>>> 5. The root partition is on /dev/md5 (0.99 metadata)
>>>>> 6. All other partitions except /boot are under /dev/vg7 which is built
>>>>> on top of /dev/md7 (1.x metadata).
>>>>> 7. The system boots normally.
>>>>
>>>> I must confess that this boggles my mind. Under these circumstances, I
>>>> cannot fathom how - or when - the 1.x arrays are being assembled.
>>>> Something has to be executing mdadm at some point.
>>>
>>> I think it's udev. I had a look at the rules, but I no grok. I do see
>>> references to mdadm though.
>> So would I, only you said in step 2 that you have "empty" rules, which I
>> take to mean that you had overridden the mdadm-provided udev rules with
>> empty files.
>
> Correct; that's what I did, but since removing mdadm I've also removed the
> corresponding, empty /etc/udev files.
>
> I don't think it's udev any more; I now think the kernel is cleverer than we
> gave it credit for (see below and attached dmesg).
Absolutely not ...
https://raid.wiki.kernel.org/index.php/RAID_superblock_formats#A_Note_about_kernel_autodetection_of_different_superblock_formats
https://github.com/torvalds/linux/blob/master/Documentation/md.txt
Both texts state unequivocally that kernel autodetection/assembly only
works with the old superblock format.
Having read your dmesg.txt, I can only conclude that all of the arrays
that the kernel is assembling are using the old superblock format,
contrary to the information you have provided up until now. If so, then
you do not rely on any of the three methods that I (correctly) said were
necessary for 1.x superblock arrays.
To settle the matter, check the superblock versions using the method
described below.
>
>> If all of the conditions you describe were true, you would have eliminated
>> all three of the aforementioned contexts in which mdadm can be invoked. Given
>> that mdadm is needed to assemble your 1.x arrays (see below), I would expect
>> such conditions to result in mount errors on account of the missing arrays.
> --->8
>> Again, 1.x arrays must be assembled in userspace. The kernel cannot
>> assemble them by itself as it can with 0.9x arrays. If you uninstall
>> mdadm, you will be removing the very userspace tool that is employed for
>> assembly. Neither udev nor mdraid will be able to execute it, which
>> cannot end well.
>
> I had done that, with no ill effect. I've just booted the box with no mdadm
> present. It seems the kernel can after all assemble the arrays (see attached
> dmesg.txt, edited). Or maybe I was wrong about the metadata and they're all
> 0.99. In course of checking this I tried a couple of things:
>
> # lvm pvck /dev/md7
> Found label on /dev/md7, sector 1, type=LVM2 001
> Found text metadata area: offset=4096, size=1044480
> # lvm vgdisplay
> --- Volume group ---
> VG Name vg7
> System ID
> Format lvm2
> Metadata Areas 1
> Metadata Sequence No 14
> VG Access read/write
> VG Status resizable
> MAX LV 0
> Cur LV 13
> Open LV 13
> Max PV 0
> Cur PV 1
> Act PV 1
> VG Size 500.00 GiB
> PE Size 4.00 MiB
> Total PE 127999
> Alloc PE / Size 108800 / 425.00 GiB
> Free PE / Size 19199 / 75.00 GiB
> VG UUID ll8OHc-if2H-DVTf-AxrQ-5EW0-FOLM-Z73y0z
>
> Can you tell from that which metadata version I used when I created vg7? It
> looks like 1.x to me, since man lvm refers to formats (=metadata types) lvm1
> and lvm2 - or am I reading too much into that?
LVM has nothing to do with md. I did allude to this in my first response
on the thread. The above output demonstrates that you have designated an
md block device as a PV (LVM physical volume). Any block device can be a
PV - LVM does not care.
When I talk about 1.x metadata, I am talking about the md superblock.
You can find out what the metadata format is like so:-
# mdadm --detail /dev/md7 | grep Version
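On an old-format array the output would be along the lines of:

        Version : 0.90

whereas a new-style superblock shows 1.0, 1.1 or 1.2 there instead.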
To be clear, LVM does not enter into it.
>
> See here what the postinst message said when I remerged sys-fs/mdadm-3.3.1-r2
> for the return-code test you asked for:
>
> * If you're not relying on kernel auto-detect of your RAID
> * devices, you need to add 'mdraid' to your 'boot' runlevel:
> * rc-update add mdraid boot
>
> Could be thought ambiguous.
I would go so far as to say it is false, but this is a distinct matter.
>
> Is nobody else experiencing this behaviour?
>
--Kerin
* Re: [gentoo-user] Software RAID-1 - FIXED
From: Peter Humphrey @ 2014-08-26 16:49 UTC
To: gentoo-user
On Tuesday 26 August 2014 17:00:37 Kerin Millar wrote:
> On 26/08/2014 15:54, Peter Humphrey wrote:
> > On Tuesday 26 August 2014 14:21:19 Kerin Millar wrote:
> >> On 26/08/2014 10:38, Peter Humphrey wrote:
> >>> On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
> >>>> On 25/08/2014 17:51, Peter Humphrey wrote:
> >>>>> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
> > --->8
> >
> >> Again, can you find out what the exit status is under the circumstances
> >> that mdadm produces a blank error? I am hoping it is something other
> >> than 1.>
> > I've remerged mdadm to run this test. I'll report the result in a moment.
> > [...] In fact it returned status 1. Sorry to disappoint :)
>
> Thanks for testing. Can you tell me exactly what /etc/mdadm.conf
> contained at the time?
It was the installed file, untouched, which contains only comments.
> LVM has nothing to do with md.
No, I know. I was just searching around for sources of info.
> When I talk about 1.x metadata, I am talking about the md superblock.
> You can find out what the metadata format is like so:-
>
> # mdadm --detail /dev/md7 | grep Version
That's what I was looking for - thanks. It shows version 0.90. I did suspect
that before, as I said, but couldn't find the command to check. If I had, I
might not have started this thread.
So all this has been for nothing. I was sure I'd set 1.x metadata when
creating the md device, but I must eat humble pie and glare once again at my
own memory.
Many thanks for the effort you've put into this for me.
--
Regards
Peter
* Re: [gentoo-user] Software RAID-1 - FIXED
From: Kerin Millar @ 2014-08-26 17:25 UTC
To: gentoo-user
On 26/08/2014 17:49, Peter Humphrey wrote:
> On Tuesday 26 August 2014 17:00:37 Kerin Millar wrote:
>> On 26/08/2014 15:54, Peter Humphrey wrote:
>>> On Tuesday 26 August 2014 14:21:19 Kerin Millar wrote:
>>>> On 26/08/2014 10:38, Peter Humphrey wrote:
>>>>> On Monday 25 August 2014 18:46:23 Kerin Millar wrote:
>>>>>> On 25/08/2014 17:51, Peter Humphrey wrote:
>>>>>>> On Monday 25 August 2014 13:35:11 Kerin Millar wrote:
>>> --->8
>>>
>>>> Again, can you find out what the exit status is under the circumstances
>>>> that mdadm produces a blank error? I am hoping it is something other
>>>> than 1.>
>>> I've remerged mdadm to run this test. I'll report the result in a moment.
>>> [...] In fact it returned status 1. Sorry to disappoint :)
>>
>> Thanks for testing. Can you tell me exactly what /etc/mdadm.conf
>> contained at the time?
>
> It was the installed file, untouched, which contains only comments.
>
>> LVM has nothing to do with md.
>
> No, I know. I was just searching around for sources of info.
>
>> When I talk about 1.x metadata, I am talking about the md superblock.
>> You can find out what the metadata format is like so:-
>>
>> # mdadm --detail /dev/md7 | grep Version
>
> That's what I was looking for - thanks. It shows version 0.90. I did suspect
> that before, as I said, but couldn't find the command to check. If I had, I
> might not have started this thread.
>
> So all this has been for nothing. I was sure I'd set 1.x metadata when
> creating the md device, but I must eat humble pie and glare once again at my
> own memory.
Not to worry. However, I still think that it's a bug that mdadm behaves
as it does, leading to the curious behaviour of the mdraid script.
Please consider filing one and, if you do so, cc me into it. I have an
interest in pursuing it.
--Kerin
* Re: [gentoo-user] Software RAID-1 - FIXED
From: Peter Humphrey @ 2014-08-27 9:43 UTC
To: gentoo-user
On Tuesday 26 August 2014 18:25:00 Kerin Millar wrote:
> On 26/08/2014 17:49, Peter Humphrey wrote:
> > So all this has been for nothing. I was sure I'd set 1.x metadata when
> > creating the md device, but I must eat humble pie and glare once again at
> > my own memory.
>
> Not to worry. However, I still think that it's a bug that mdadm behaves
> as it does, leading to the curious behaviour of the mdraid script.
> Please consider filing one and, if you do so, cc me into it. I have an
> interest in pursuing it.
Done: https://bugs.gentoo.org/show_bug.cgi?id=521280
--
Regards
Peter
Thread overview: 17 messages
2014-08-24 13:51 [gentoo-user] Software RAID-1 Peter Humphrey
2014-08-24 16:06 ` Mick
2014-08-24 18:22 ` Kerin Millar
2014-08-25 9:22 ` Peter Humphrey
2014-08-25 11:17 ` Peter Humphrey
2014-08-25 12:35 ` Kerin Millar
2014-08-25 16:51 ` Peter Humphrey
2014-08-25 17:46 ` Kerin Millar
2014-08-26 9:38 ` Peter Humphrey
2014-08-26 13:21 ` Kerin Millar
2014-08-26 14:54 ` Peter Humphrey
2014-08-26 16:00 ` Kerin Millar
2014-08-26 16:49 ` [gentoo-user] Software RAID-1 - FIXED Peter Humphrey
2014-08-26 17:25 ` Kerin Millar
2014-08-27 9:43 ` Peter Humphrey
2014-08-25 12:18 ` [gentoo-user] Software RAID-1 Kerin Millar
2014-08-25 12:24 ` Kerin Millar