* Re: [gentoo-user] zfs repair needed (due to fingers being faster than brain)
2021-03-01 22:25 [gentoo-user] zfs repair needed (due to fingers being faster than brain) John Blinka
@ 2021-03-02 1:15 ` antlists
2021-03-02 2:30 ` Grant Taylor
[not found] ` <7407510.jJDZkT8p0M@robert-notebook>
2 siblings, 0 replies; 5+ messages in thread
From: antlists @ 2021-03-02 1:15 UTC
To: gentoo-user
Firstly, I'll say I'm not experienced with zfs, but I do know a fair bit
about raid and recovering corrupted arrays ...
On 01/03/2021 22:25, John Blinka wrote:
> HI, Gentooers!
>
> So, I typed dd if=/dev/zero of=/dev/sd<wrong letter>, and despite
> hitting ctrl-c quite quickly, zeroed out some portion of the initial
> part of a disk. Which did this to my zfs raidz3 array:
>
> NAME STATE READ WRITE CKSUM
> zfs DEGRADED 0 0 0
> raidz3-0 DEGRADED 0 0 0
> ata-HGST_HUS724030ALE640_PK1234P8JJJVKP ONLINE 0 0 0
> ata-HGST_HUS724030ALE640_PK1234P8JJP3AP ONLINE 0 0 0
> ata-ST4000NM0033-9ZM170_Z1Z80P4C ONLINE 0 0 0
> ata-ST4000NM0033-9ZM170_Z1ZAZ8F1 ONLINE 0 0 0
> 14296253848142792483 UNAVAIL 0 0 0 was /dev/disk/by-id/ata-ST4000NM0033-9ZM170_Z1ZAZDJ0-part1
> ata-ST4000NM0033-9ZM170_Z1Z80KG0 ONLINE 0 0 0
>
> Could have been worse. I do have backups, and it is raid3, so all
> I've injured is my pride, but I do want to fix things. I'd
> appreciate some guidance before I attempt doing this - I have no
> experience at it myself.
>
> The steps I envision are
>
> 1) zpool offline zfs 14296253848142792483 (What's that number?)
> 2) do something to repair the damaged disk
> 3) zpool online zfs <repaired disk>
>
> Right now, the device name for the damaged disk is /dev/sda. Gdisk
> says this about it:
>
> Caution: invalid main GPT header, but valid backup; regenerating main header
> from backup!
The GPT is stored twice, a primary copy at the start of the disk and a
backup at the end. This is telling you the primary copy is trashed, but
the backup seems okay ...
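
To see why a dd aimed at the start of the disk leaves the backup alone, here is a rough sketch of where the two copies sit. It assumes 512-byte sectors and the usual 128 partition entries of 128 bytes each (32 sectors); the function name gpt_layout and the sector count in the example call are illustrative assumptions, not values read from the disk in this thread.

```shell
#!/bin/sh
# Print where the two GPT copies sit on a disk of a given size.
# Assumes 512-byte sectors and a 32-sector partition entry array.
gpt_layout() {
    total=$1                      # disk size in sectors
    entries=32                    # sectors used by the partition entry array
    last=$((total - 1))           # LBAs are numbered from 0
    echo "primary header: LBA 1"
    echo "primary entries: LBA 2-$((1 + entries))"
    echo "backup entries: LBA $((last - entries))-$((last - 1))"
    echo "backup header: LBA $last"
}

# Example: sector count commonly reported for a 4 TB disk (an assumption here).
gpt_layout 7814037168
```

The primary copy lives in the first 34 sectors, so even a very short dd run destroys it, while the backup sits in the last 33 sectors and is only reached if dd runs to the end.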
>
> Warning: Invalid CRC on main header data; loaded backup partition table.
> Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
> on the recovery & transformation menu to examine the two tables.
>
> Warning! Main partition table CRC mismatch! Loaded backup partition table
> instead of main partition table!
>
> Warning! One or more CRCs don't match. You should repair the disk!
> Main header: ERROR
> Backup header: OK
> Main partition table: ERROR
> Backup partition table: OK
>
> Partition table scan:
> MBR: not present
> BSD: not present
> APM: not present
> GPT: damaged
>
> Found invalid MBR and corrupt GPT. What do you want to do? (Using the
> GPT MAY permit recovery of GPT data.)
> 1 - Use current GPT
> 2 - Create blank GPT
>
> Your answer: ( I haven't given one yet)
>
> I'm not exactly sure what this is telling me. But I'm guessing it
> means that the main partition table is gone, but there's a good
> backup.
Yup. I don't fully understand that prompt, but I THINK it's saying that
if you choose option 1, it will recover your partition table for you.
> In addition, some, but not all disk id info is gone:
> 1) /dev/disk/by-id still shows ata-ST4000NM0033-9ZM170_Z1ZAZDJ0 (the
> damaged disk) but none of its former partitions
Because this entry identifies the disk itself (model and serial number),
and you've only damaged its contents, it is completely unaffected.
> 2) /dev/disk/by-partlabel shows entries for the undamaged disks in the
> pool, but not the damaged one
> 3) /dev/disk/by-partuuid similar to /dev/disk/by-partlabel
For both of these, "part" is short for partition, and you've just
trashed them ...
> 4) /dev/disk/by-uuid does not show the damaged disk
>
Because the filesystem uuid is stored inside the partition, and with the
partition table gone the kernel no longer sees that partition.
> This particular disk is from a batch of 4 I bought with the same make
> and specification and very similar ids (/dev/disk/by-id). Can I
> repair this disk by copying something off one of those other disks
> onto this one?
GOD NO! You'll start copying uuids, so they'll no longer be unique, and
things really will be broken!
> Is repair just repartitioning - as in the Gentoo
> handbook? Is it as simple as running gdisk and typing 1 to accept
> gdisk's attempt at recovering the gpt? Is running gdisk's recovery
> and transformation facilities the way to go (the b option looks like
> it's made for exactly this situation)?
>
> Anybody experienced at this and willing to guide me?
>
Make sure that option 1 really does recover the GPT, then use it. Of
course, the question then becomes what further damage will rear its head.
You need to make sure that your raidz3 array can recover from a corrupt
disk. THIS IS IMPORTANT. If you tried to recover an md-raid-5 array from
this situation you'd almost certainly trash it completely.
Actually, if your setup is raid, I'd just blow out the trashed disk
completely. Take it out of your system, replace it, and let zfs repair
itself onto the new disk.
You can then zero out the old disk and it's now a spare.
Just be careful here, because I don't know what zfs does. Btrfs by
default mirrors metadata but not data, so you'd think a mirrored
filesystem could repair itself, but it can't ... If you want to repair
the filesystem without rebuilding from scratch, you need to know rather
more about zfs than I do ...
Cheers,
Wol
* Re: [gentoo-user] zfs repair needed (due to fingers being faster than brain)
From: Grant Taylor @ 2021-03-02 2:30 UTC
To: gentoo-user
On 3/1/21 3:25 PM, John Blinka wrote:
> HI, Gentooers!
Hi,
> So, I typed dd if=/dev/zero of=/dev/sd<wrong letter>, and despite
> hitting ctrl-c quite quickly, zeroed out some portion of the initial
> part of a disk. Which did this to my zfs raidz3 array:
OOPS!!!
> NAME STATE READ WRITE CKSUM
> zfs DEGRADED 0 0 0
> raidz3-0 DEGRADED 0 0 0
> ata-HGST_HUS724030ALE640_PK1234P8JJJVKP ONLINE 0 0 0
> ata-HGST_HUS724030ALE640_PK1234P8JJP3AP ONLINE 0 0 0
> ata-ST4000NM0033-9ZM170_Z1Z80P4C ONLINE 0 0 0
> ata-ST4000NM0033-9ZM170_Z1ZAZ8F1 ONLINE 0 0 0
> 14296253848142792483 UNAVAIL 0 0 0 was /dev/disk/by-id/ata-ST4000NM0033-9ZM170_Z1ZAZDJ0-part1
> ata-ST4000NM0033-9ZM170_Z1Z80KG0 ONLINE 0 0 0
Okay. So the pool is online and the data is accessible. That's
actually better than I originally thought. -- I thought you had
accidentally damaged part of the ZFS partition that existed on a single
disk. -- I've been able to repair this with minimal data loss (zeros)
with Oracle's help on Solaris in the past.
Aside: My understanding is that ZFS stores multiple copies of its
metadata on the disk (assuming single disk) and that it is possible to
recover a pool if any one (or maybe two for consistency checks) are
viable. Though doing so is further into the weeds than you normally
want to be.
> Could have been worse. I do have backups, and it is raid3, so all I've
> injured is my pride, but I do want to fix things. I'd appreciate
> some guidance before I attempt doing this - I have no experience at
> it myself.
First, your pool / its raidz3 is only 'DEGRADED', which means that the
data is still accessible. 'OFFLINE' would be more problematic.
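
If you want a quick way to see which member devices are not healthy, something like the following might help. It is only a sketch: the helper name unhealthy_vdevs is made up for illustration, and it assumes the NAME / STATE / READ / WRITE / CKSUM column layout shown in the status output quoted above.

```shell
#!/bin/sh
# List devices that 'zpool status' does not report as usable.
# Reads status text on stdin; prints "<name> <state>" per matching line.
# NF >= 5 restricts the match to rows of the device table, so a header
# line such as "state: FAULTED" is ignored.
unhealthy_vdevs() {
    awk 'NF >= 5 && $2 ~ /^(UNAVAIL|FAULTED|OFFLINE|REMOVED)$/ { print $1, $2 }'
}

# Typical use on a live system (pool name "zfs" as in this thread):
#   zpool status zfs | unhealthy_vdevs
```

For the output quoted above this should print only the 14296253848142792483 line, since the pool and the raidz3-0 vdev are DEGRADED rather than unavailable.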
> The steps I envision are
>
> 1) zpool offline zfs 14296253848142792483 (What's that number?)
I'm guessing it's ZFS's internal identifier (GUID) for that device, shown
because the device itself can't be opened. You will probably need to
reference it.
I see no reason to take the pool offline.
> 2) do something to repair the damaged disk
I don't think you need to do anything at the individual disk level yet.
> 3) zpool online zfs <repaired disk>
I think you can fix this with the pool online.
> Right now, the device name for the damaged disk is /dev/sda.
> Gdisk says this about it:
>
> Caution: invalid main GPT header,
This is to be expected.
> but valid backup; regenerating main header from backup!
This looks promising.
> Warning: Invalid CRC on main header data; loaded backup partition table.
> Warning! Main and backup partition tables differ! Use the 'c' and 'e' options
> on the recovery & transformation menu to examine the two tables.
I'm assuming that the main partition table is at the start of the disk
and that it's what got wiped out.
So I'd think that you can look at the 'c' and 'e' options on the
recovery & transformation menu for options to repair the main partition
table.
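
To convince yourself that a short dd run only hits the front of the disk, you can re-enact the accident on a scratch file. This is a sketch under obvious assumptions: a 1 MiB file stands in for the disk, 64 KiB stands in for whatever dd managed to write before ctrl-c, and the function name zero_head_demo is made up. It never touches a real device.

```shell
#!/bin/sh
# Re-enact the accident on a scratch file, never on a real device.
zero_head_demo() (
    dir=$(mktemp -d) || exit 1
    img="$dir/disk.img"
    # 1 MiB of 'A' bytes stands in for a disk full of data.
    dd if=/dev/zero bs=1024 count=1024 2>/dev/null | tr '\000' 'A' > "$img"
    # The "oops": zero the first 64 KiB; conv=notrunc keeps the file size.
    dd if=/dev/zero of="$img" bs=1024 count=64 conv=notrunc 2>/dev/null
    # Count the non-zero bytes left in the first and last 64 KiB.
    head_left=$(head -c 65536 "$img" | tr -d '\000' | wc -c)
    tail_left=$(tail -c 65536 "$img" | tr -d '\000' | wc -c)
    echo "head: $((head_left)) non-zero bytes, tail: $((tail_left)) non-zero bytes"
    rm -rf "$dir"
)

zero_head_demo
```

The head comes back empty and the tail is untouched, which matches gdisk reporting the main header and table as ERROR and both backups as OK.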
> Warning! Main partition table CRC mismatch! Loaded backup partition table
> instead of main partition table!
I know. Thank you for using the backup partition table.
> Warning! One or more CRCs don't match. You should repair the disk!
I'm guessing that this is a direct result of the dd oops. I would want
more evidence to support it being a larger problem.
The CRC may be calculated over a partially zeroed chunk of disk. (Chunk
because I don't know what term is best here and I want to avoid implying
anything specific or incorrectly.)
> Main header: ERROR
> Backup header: OK
> Main partition table: ERROR
> Backup partition table: OK
ACK
> Partition table scan:
> MBR: not present
> BSD: not present
> APM: not present
> GPT: damaged
>
> Found invalid MBR and corrupt GPT. What do you want to do? (Using the
> GPT MAY permit recovery of GPT data.)
> 1 - Use current GPT
> 2 - Create blank GPT
>
> Your answer: ( I haven't given one yet)
I'd assume #1, Use current GPT.
> I'm not exactly sure what this is telling me. But I'm guessing it
> means that the main partition table is gone, but there's a good
> backup.
That's my interpretation too.
It jibes with the description of what happened.
> In addition, some, but not all disk id info is gone:
> 1) /dev/disk/by-id still shows ata-ST4000NM0033-9ZM170_Z1ZAZDJ0
> (the damaged disk) but none of its former partitions
The disk ID still being there may be a symptom / side effect of when
udev creates the links. I would expect it to not be there post-reboot.
Well, maybe. The disk serial number is independent of any data on the disk.
Partitions by ID would probably be gone post reboot (or eject and
re-insertion).
> 2) /dev/disk/by-partlabel shows entries for the undamaged disks in
> the pool, but not the damaged one
Okay. That means that udev is recognizing the change faster than I
would have expected.
That probably means that the ID in #1 has survived any such update.
> 3) /dev/disk/by-partuuid similar to /dev/disk/by-partlabel
Given #2, I'm not surprised at #3.
> 4) /dev/disk/by-uuid does not show the damaged disk
Hum.
> This particular disk is from a batch of 4 I bought with the same make
> and specification and very similar ids (/dev/disk/by-id). Can I
> repair this disk by copying something off one of those other disks
> onto this one?
Maybe. But I would not bother. (See below.)
> Is repair just repartitioning - as in the Gentoo handbook? Is it
> as simple as running gdisk and typing 1 to accept gdisk's attempt at
> recovering the gpt? Is running gdisk's recovery and transformation
> facilities the way to go (the b option looks like it's made for
> exactly this situation)?
gdisk will address the partition problem. But that doesn't do anything
for ZFS.
> Anybody experienced at this and willing to guide me?
I've not dealt with this particular problem. But I have dealt with a
few different things.
My course of action would be:
0) Copy the entire disk to another disk if possible and if you are
sufficiently paranoid.
1) Let gdisk repair the main partition table using the data from the
backup partition table.
2) Leverage ZFS's RAIDZ functionality to recover the ZFS data.
I /think/ that #2 can be done with one command. Do your homework to
understand, check, and validate this. You are responsible for your own
actions, despite what some random on the Internet says. ;-)
# zpool replace zfs 14296253848142792483 sda
Assuming that /dev/sda is the corrupted disk.
This will cause ZFS to remove the 14296253848142792483 disk from the
pool and rebuild onto the (/dev/)sda disk. -- ZFS doesn't care that
they are the same disk.
You can keep track of the resilver with something like the following:
# while true; do zpool status zfs; sleep 60; done
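
A variant that stops by itself once the resilver is done could look like the sketch below. It assumes that zpool status prints "resilver in progress" on its scan line while the rebuild is running; the function names resilver_running and watch_resilver are made up for illustration.

```shell
#!/bin/sh
# Succeeds while the status text on stdin reports a running resilver.
resilver_running() {
    grep -q 'resilver in progress'
}

# Print the pool status once a minute until the resilver has finished,
# then print it one last time. $1 is the pool name.
watch_resilver() {
    while zpool status "$1" | resilver_running; do
        zpool status "$1"
        sleep 60
    done
    zpool status "$1"
}

# Usage on a live system (pool name "zfs" as in this thread):
#   watch_resilver zfs
```

Check the final status output for errors yourself; the loop only detects that the resilver ended, not that it ended cleanly.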
Since your pool is only 'DEGRADED', you are probably in an okay
position. It's just a matter of not making things worse while trying to
make them better.
Given that you have a RAIDZ3 and all of the other disks are ONLINE, your
data should currently be safe.
--
Grant. . . .
unix || die