* [gentoo-user] Help running fsck on reiserfs lvm /var on production server?
From: Tanstaafl @ 2013-06-08 21:37 UTC
To: gentoo-user
Hi everyone,
What is best practice for doing this?
If I reboot in single user mode, will my lvm volumes (ie, /var) be
available for fsck'ing, or do I have to mount them first?
The current problem started after a different problem required me to do
a hard reset on the server - had to do with a mounted QNAP device being
unavailable when I initiated a reboot, and everything just hung.
Ever since I did this hard reset, the server hangs at unmounting /var.
I've let it sit there for at least an hour, and it never goes past that.
Then after I hard reset it, it fsck's /var partition again, maybe fixes
minor problems very quickly, and everything works fine until I have to
reboot or shutdown again.
This became a major problem this weekend when we had one extended power
outage (about 8 hours) yesterday evening, then another one (about 4
hours) this morning right after I got everything back up and running
from last night's outage.
Anyway, I need to do this this weekend if at all possible, so...
Anyone have any pointers to detailed docs and/or willing to hold my hand
through this a little?
Thanks,
Charles
* Re: [gentoo-user] Help running fsck on reiserfs lvm /var on production server?
From: Alan McKinnon @ 2013-06-09 7:45 UTC
To: gentoo-user
On 08/06/2013 23:37, Tanstaafl wrote:
> Hi everyone,
>
> What is best practice for doing this?
>
> If I reboot in single user mode, will my lvm volumes (ie, /var) be
> available for fsck'ing, or do I have to mount them first?
>
> The current problem started after a different problem required me to do
> a hard reset on the server - had to do with a mounted QNAP device being
> unavailable when I initiated a reboot, and everything just hung.
>
> Ever since I did this hard reset, the server hangs at unmounting /var.
> I've let it sit there for at least an hour, and it never goes past that.
>
> Then after I hard reset it, it fsck's /var partition again, maybe fixes
> minor problems very quickly, and everything works fine until I have to
> reboot or shutdown again.
>
> This became a major problem this weekend when we had one extended power
> outage (about 8 hours) yesterday evening, then another one (about 4
> hours) this morning right after I got everything back up and running
> from last night's outage.
>
> Anyway, I need to do this this weekend if at all possible, so...
>
> Anyone have any pointers to detailed docs and/or willing to hold my hand
> through this a little?
fsck'ing that filesystem should be no different from any other fsck - it
should find what it finds and fix what it can. The fs must be unmounted
of course which means you have to do it in single-user mode, or from
booting a rescue system (I prefer the second, I find it easier as none
of the production filesystems are required to be mounted).
fsck.reiserfs has several modes; IIRC there's --rebuild-tree or similar
that does an extensive check but takes ages. I needed to do this 2 or 3
times when I was still using reiser. There's also an option to do no
writes if you want a sanity check first.
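A rough sketch of that sequence, as it might be run from a rescue system or
single-user mode: activate the LVM volume groups, then start with the
reiserfsck mode that reports problems without repairing them. The device
path /dev/vg0/var is a made-up example (use whatever lvs shows), and the
script only prints the commands unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: /dev/vg0/var is a hypothetical LV path, substitute your own.
DEVICE="${DEVICE:-/dev/vg0/var}"
DRY_RUN="${DRY_RUN:-1}"          # default: print commands, change nothing

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

check_reiserfs() {
    run vgchange -ay                 # activate all LVM volume groups
    run lvs                          # list logical volumes to confirm the path
    run reiserfsck --check "$1"      # report problems, repair nothing
    # Only if --check recommends it, and after taking a copy of the volume:
    #   reiserfsck --fix-fixable "$1"
    #   reiserfsck --rebuild-tree "$1"   (slow, last resort)
}

check_reiserfs "$DEVICE"
```

The filesystem has to be unmounted before any of the repairing modes are
used; the dry-run default is just a guard against pasting this onto a
production box by accident.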
I'm not convinced a power outage broke the fs so that you now can't
umount it, I'm having a hard time imagining how that would happen. More
likely some other script file elsewhere is damaged and leaves files open
when the system wants to umount /var.
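To test the open-files theory before the next reboot, something along these
lines can list which processes still hold files under a directory. This is a
minimal sketch that reads Linux /proc directly; the function name is made up
for illustration, and it only sees processes the calling user is allowed to
inspect (run it as root for the full picture). Where psmisc is installed,
fuser -vm /var reports similar information.

```shell
#!/bin/sh
# List "PID path" for every open file descriptor pointing below a directory.
# Hypothetical helper, Linux-only (relies on /proc/<pid>/fd symlinks).
open_files_under() {
    dir=$1
    for fd in /proc/[0-9]*/fd/*; do
        # readlink fails for descriptors we may not inspect; skip those.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            "$dir"/*)
                pid=${fd#/proc/}     # strip leading "/proc/"
                pid=${pid%%/*}       # keep only the PID component
                echo "$pid $target"
                ;;
        esac
    done | sort -u
}

# Example:
#   open_files_under /var
```

Anything that shows up here just before shutdown is a candidate for what
keeps /var busy.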
You have some options:
This requires considerable downtime, easily an hour or more. You can dd
/var somewhere to get a copy you can experiment on with another host. At
least you will then know how much downtime to schedule.
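For the dd copy, a minimal sketch could look like the following. The helper
name and both paths in the example are hypothetical; /var must be unmounted
while the copy runs, and the target needs as much free space as the whole
volume.

```shell
#!/bin/sh
# Copy a block device (or any file) to an image file and verify the copy.
# Hypothetical helper; returns non-zero if dd fails or the copy differs.
image_device() {
    src=$1
    dst=$2
    dd if="$src" of="$dst" bs=1M 2>/dev/null || return 1
    # Byte-for-byte comparison; only meaningful if src did not change meanwhile.
    cmp -s "$src" "$dst"
}

# Example (made-up paths):
#   image_device /dev/vg0/var /mnt/scratch/var.img
```

The image can then be moved to another host and attached as a loop device,
so the full check can be timed there without touching the production volume.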
You should do a full check and repair on all filesystems to be 100% certain.
For the umount issues, that is trickier as you won't have log files in
/var after the fact. Any clues on the Alt-F12 console whilst shutting
down? Try configuring your syslogger to send logs to another host, you
might be lucky enough to get some logs that way that describe what is
going on.
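As an illustration of the remote logging idea, the sketch below appends a
forward-everything rule in classic syslog.conf syntax, where "*.* @host"
sends all messages to that host over UDP. It assumes a syslogger that reads
this syntax (sysklogd or rsyslog, for example); syslog-ng uses a different
configuration format. The function name, file path and host name are made up.

```shell
#!/bin/sh
# Append a "forward everything" rule to a syslog.conf-style file.
# Hypothetical helper; safe to run repeatedly.
add_forward_rule() {
    conf=$1
    host=$2
    rule="*.* @$host"
    # Idempotent: do nothing if the exact rule is already present.
    grep -qxF -- "$rule" "$conf" 2>/dev/null || printf '%s\n' "$rule" >> "$conf"
}

# Example (made-up file and host name):
#   add_forward_rule /etc/syslog.conf loghost.example.com
```

The syslogger has to be restarted afterwards, and the receiving host must be
set up to accept remote messages.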
--
Alan McKinnon
alan.mckinnon@gmail.com
* Re: [gentoo-user] Help running fsck on reiserfs lvm /var on production server?
From: Tanstaafl @ 2013-06-09 14:43 UTC
To: gentoo-user
On 2013-06-09 3:45 AM, Alan McKinnon <alan.mckinnon@gmail.com> wrote:
> I'm not convinced a power outage broke the fs so that you now can't
> umount it, I'm having a hard time imagining how that would happen. More
> likely some other script file elsewhere is damaged and leaves files open
> when the system wants to umount /var.
Hmmm....
Admittedly, I don't reboot this system often, so maybe I'm
misremembering when the problem crept in.
Could it be the NFS mount that it is hanging on, and unmounting /var is
just the last thing showing on the screen?
I think I'll try manually unmounting that before rebooting the next time
I need to (need to update the kernel soon anyway)...
I do know the last few times this has happened, the NFS mount was
'unavailable' (the device had powered down without first unmounting it
from the server)...
I hope that is all it is...
Thanks Alan
* Re: [gentoo-user] Help running fsck on reiserfs lvm /var on production server?
From: Alan McKinnon @ 2013-06-09 15:14 UTC
To: gentoo-user
On 09/06/2013 16:43, Tanstaafl wrote:
> On 2013-06-09 3:45 AM, Alan McKinnon <alan.mckinnon@gmail.com> wrote:
>> I'm not convinced a power outage broke the fs so that you now can't
>> umount it, I'm having a hard time imagining how that would happen. More
>> likely some other script file elsewhere is damaged and leaves files open
>> when the system wants to umount /var.
>
> Hmmm....
>
> Admittedly, I don't reboot this system often, so maybe I'm
> misremembering when the problem crept in.
>
> Could it be the NFS mount that it is hanging on, and unmounting /var is
> just the last thing showing on the screen?
>
> I think I'll try manually unmounting that before rebooting the next time
> I need to (need to update the kernel soon anyway)...
>
> I do know the last few times this has happened, the NFS mount was
> 'unavailable' (the device had powered down without first unmounting it
> from the server)...
>
> I hope that is all it is...
Ugh, NFS complicates things :-)
I have a similar thing with my notebook and NFS mounts at home, I often
forget to umount the NFS dirs, causing issues when I then go to work and
wake the machine up.
If you have NFS in the mix, I'd certainly investigate that first before
getting into more complex things. Also check that your NFS and mount
stuff in /etc/init.d are doing the right thing in the right order with
both startup and shutdown.
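For a quick manual test before the next reboot, a sketch like this lists the
NFS mount points and unmounts them by hand. The function names are made up;
the optional argument is only there so the listing can be tried against a
saved copy of the mount table. Note that a plain umount can itself block
when the NFS server is unreachable, which is why a forced, lazy unmount is
offered via FORCE=1.

```shell
#!/bin/sh
# Print mount points whose filesystem type is nfs or nfs4.
# Reads /proc/mounts by default, or a file in the same format.
nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2 }' "${1:-/proc/mounts}"
}

# Unmount every NFS mount point found (hypothetical helper, run as root).
# FORCE=1 switches to a forced, lazy unmount for servers that have gone away.
unmount_nfs() {
    nfs_mounts "$@" | while read -r mp; do
        if [ "${FORCE:-0}" = "1" ]; then
            umount -f -l "$mp"
        else
            umount "$mp"
        fi
    done
}

# Example:
#   nfs_mounts            # see what would be affected
#   FORCE=1 unmount_nfs   # then unmount, even if the server is down
```

If the reboot goes through cleanly after this, the shutdown ordering of the
NFS and mount init scripts is the next thing to look at.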
--
Alan McKinnon
alan.mckinnon@gmail.com
* Re: [gentoo-user] Help running fsck on reiserfs lvm /var on production server?
From: Neil Bothwick @ 2013-06-09 15:27 UTC
To: gentoo-user
On Sun, 09 Jun 2013 17:14:57 +0200, Alan McKinnon wrote:
> I have a similar thing with my notebook and NFS mounts at home, I often
> forget to umount the NFS dirs, causing issues when I then go to work and
> wake the machine up
That's why I have my hibernate script unmount NFS shares and take the
network down before hibernating.
--
Neil Bothwick
WinErr 01B: Illegal error - You are not allowed to get this error.
Next time you will get a penalty for that.
* Re: [gentoo-user] Help running fsck on reiserfs lvm /var on production server?
From: Tanstaafl @ 2013-06-09 20:21 UTC
To: gentoo-user
On 2013-06-09 11:14 AM, Alan McKinnon <alan.mckinnon@gmail.com> wrote:
> On 09/06/2013 16:43, Tanstaafl wrote:
>> I do know the last few times this has happened, the NFS mount was
>> 'unavailable' (the device had powered down without first unmounting it
>> from the server)...
>>
>> I hope that is all it is...
> Ugh, NFS complicates things :-)
>
> I have a similar thing with my notebook and NFS mounts at home, I often
> forget to umount the NFS dirs, causing issues when I then go to work and
> wake the machine up
>
> If you have NFS in the mix, I'd certainly investigate that first before
> getting into more complex things. Also check that your NFS and mount
> stuff in /etc/init.d are doing the right thing in the right order with
> both startup and shutdown
Yep, that bugger was it...
I umounted the NFS mount and the reboot went smooth as silk.
Now, to figure out why the NFS mount isn't unmounting properly during a
shutdown or reboot, but that will be another thread...
Thanks Alan!