* [gentoo-user] Mysteriously dismounting partition
@ 2015-10-26 14:47 Peter Humphrey
2015-10-27 11:04 ` Stefan G. Weichinger
0 siblings, 1 reply; 7+ messages in thread
From: Peter Humphrey @ 2015-10-26 14:47 UTC
To: gentoo-user
Hello list,
I have a small rescue system in this box, using /dev/sda3 and /dev/sdb3 in a
traditional partition layout. The disks are (supposedly) identical SSDs. All
goes well when I boot the system, but by the time I come to write to sdb3 it's
dismounted itself. It even dismounted itself once in the middle of syncing
portage. Here's a snippet from fstab:
LABEL=RescueSys / ext4 relatime 1 1
LABEL=RescUsrBits /usr-bits ext4 relatime 1 2
I keep the portage tree under /usr-bits.
# dmesg | grep sdb3
[ 1.753508] sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
[ 4.833460] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
[ 107.205918] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
You can see the successful mount at 4.8 s; the entry at 107 s is me mounting
it again manually.
I've rewritten the partition label, and I've run a smartctl test which
reported no faults found. I've also just reduced the speed of the chipset,
which has three settings: good performance, better performance and turbo. It
adopts the turbo setting by default and I've now set it to "better". It's too
early yet to see if that will help.
What else can I try?
--
Rgds
Peter
* Re: [gentoo-user] Mysteriously dismounting partition
From: Stefan G. Weichinger @ 2015-10-27 11:04 UTC
To: gentoo-user
On 26.10.2015 at 15:47, Peter Humphrey wrote:
> I keep the portage tree under /usr-bits.
>
> # dmesg | grep sdb3
> [ 1.753508] sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
> [ 4.833460] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
> [ 107.205918] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
>
> You can see the successful mount at 4.8 s; the entry at 107 s is me mounting
> it again manually.
>
> I've rewritten the partition label, and I've run a smartctl test which
> reported no faults found. I've also just reduced the speed of the chipset,
> which has three settings: good performance, better performance and turbo. It
> adopts the turbo setting by default and I've now set it to "better". It's too
> early yet to see if that will help.
interesting ...
What init-system? openrc or systemd?
No trace of the actual unmount in any logs?
Maybe also look/grep for the LABEL of the fs.
Maybe test if using the device-name itself ( /dev/sdb3 ) or the UUID in
fstab changes the behavior.
I use UUIDs here without problems (with systemd).
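In case it helps with that test, here is a minimal sketch for switching one fstab entry from LABEL= to UUID=. The function name label_to_uuid is made up for illustration, and the UUID has to come from the partition itself (for example from the output of blkid /dev/sdb3); nothing here is specific to this box.

```shell
# label_to_uuid LABEL UUID
# Hypothetical helper: reads an fstab on stdin and writes it to stdout,
# replacing a first field of "LABEL=<LABEL>" with "UUID=<UUID>".
# All other lines are passed through unchanged.
label_to_uuid() {
    awk -v old="LABEL=$1" -v new="UUID=$2" '
        $1 == old { sub(/^[^ \t]+/, new) }   # rewrite only the device field
        { print }
    '
}
```

It only prints the result, so something like `label_to_uuid RescUsrBits "$(blkid -s UUID -o value /dev/sdb3)" < /etc/fstab` can be reviewed before anything is copied over the real file.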
* Re: [gentoo-user] Mysteriously dismounting partition
From: Peter Humphrey @ 2015-10-27 12:25 UTC
To: gentoo-user
On Tuesday 27 October 2015 12:04:46 Stefan G. Weichinger wrote:
> On 26.10.2015 at 15:47, Peter Humphrey wrote:
> > I keep the portage tree under /usr-bits.
> >
> > # dmesg | grep sdb3
> > [ 1.753508] sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
> > [ 4.833460] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
> > [ 107.205918] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
> >
> > You can see the successful mount at 4.8 s; the entry at 107 s is me
> > mounting it again manually.
> >
> > I've rewritten the partition label, and I've run a smartctl test which
> > reported no faults found. I've also just reduced the speed of the chipset,
> > which has three settings: good performance, better performance and turbo.
> > It adopts the turbo setting by default and I've now set it to "better".
> > It's too early yet to see if that will help.
>
> interesting ...
>
> What init-system? openrc or systemd?
Openrc.
> No trace of the actual unmount in any logs?
Not that I can find, no.
> Maybe also look/grep for the LABEL of the fs.
Nope, nor that.
> Maybe test if using the device-name itself ( /dev/sdb3 ) or the UUID in
> fstab changes the behavior.
I'll try reverting to /dev/sdb3 and see if that helps.
> I use UUIDs here without problems (with systemd).
The only thing I use UUIDs for here is in mdadm.conf to get the LVs started
reliably for the main system*. Those live in partitions /dev/sd[ab][5789].
Three more things: I've had the cover off and checked the seating of the SATA
cables; while the lid was off I watched the MB LEDs during startup, which
seemed okay; and today the kernel was upgraded from 4.0.5 to 4.0.9; that may
help too. (Hm ... too many changes at once.)
* Now that I think of it, one of the LVs came up as inactive the other day,
and nothing I could think of would activate it (consulting man mdadm of
course). In the end I had to reboot. This machine has shown some bizarre
behaviour over the last few months. Something is definitely wrong; I just can't
figure out what it is.
--
Rgds
Peter
* Re: [gentoo-user] Mysteriously dismounting partition
From: J. Roeleveld @ 2015-10-27 14:16 UTC
To: gentoo-user
On Tuesday, October 27, 2015 12:25:07 PM Peter Humphrey wrote:
> On Tuesday 27 October 2015 12:04:46 Stefan G. Weichinger wrote:
> > On 26.10.2015 at 15:47, Peter Humphrey wrote:
> > > I keep the portage tree under /usr-bits.
> > >
> > > # dmesg | grep sdb3
> > > [ 1.753508] sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 sdb7 sdb8 sdb9 >
> > > [ 4.833460] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
> > > [ 107.205918] EXT4-fs (sdb3): mounted filesystem with ordered data mode. Opts: (null)
> > >
> > > You can see the successful mount at 4.8 s; the entry at 107 s is me
> > > mounting it again manually.
> > >
> > > I've rewritten the partition label, and I've run a smartctl test which
> > > reported no faults found. I've also just reduced the speed of the chipset,
> > > which has three settings: good performance, better performance and turbo.
> > > It adopts the turbo setting by default and I've now set it to "better".
> > > It's too early yet to see if that will help.
> >
> > interesting ...
> >
> > What init-system? openrc or systemd?
>
> Openrc.
>
> > No trace of the actual unmount in any logs?
>
> Not that I can find, no.
>
> > Maybe also look/grep for the LABEL of the fs.
>
> Nope, nor that.
>
> > Maybe test if using the device-name itself ( /dev/sdb3 ) or the UUID in
> > fstab changes the behavior.
>
> I'll try reverting to /dev/sdb3 and see if that helps.
>
> > I use UUIDs here without problems (with systemd).
>
> The only thing I use UUIDs for here is in mdadm.conf to get the LVs started
> reliably for the main system*. Those live in partitions /dev/sd[ab][5789].
>
> Three more things: I've had the cover off and checked the seating of the
> SATA cables; while the lid was off I watched the MB LEDs during startup,
> which seemed okay; and today the kernel was upgraded from 4.0.5 to 4.0.9;
> that may help too. (Hm ... too many changes at once.)
>
> * Now that I think of it, one of the LVs came up as inactive the other day,
> and nothing I could think of would activate it (consulting man mdadm of
> course). In the end I had to reboot. This machine has shown some bizarre
> behaviour over the last few months. Something is definitely wrong; I just
> can't figure out what it is.
The full log for that entire period might be useful.
If a disk is umounted/removed, something needs to be logged somewhere.
Might even be a comment from the scsi-subsystem or the SATA driver.
I usually only grep the log to try to find specific messages.
If I know the time-period something weird happened in, I tend to go through
the unfiltered log for that period.
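A rough sketch of how to cut such a period out of the kernel log, assuming the usual "[ seconds.micros]" timestamps that dmesg prints. The function name dmesg_window is only an illustration.

```shell
# dmesg_window START END
# Reads kernel log lines on stdin and prints, unfiltered, every line whose
# bracketed timestamp (seconds since boot) lies within [START, END].
# Lines without a leading timestamp are skipped.
dmesg_window() {
    awk -v lo="$1" -v hi="$2" '
        match($0, /^\[ *[0-9]+\.[0-9]+\]/) {
            # Strip the surrounding brackets and force a numeric value.
            ts = substr($0, 2, RLENGTH - 2) + 0
            if (ts >= lo && ts <= hi) print
        }
    '
}
```

For the boot quoted above, `dmesg | dmesg_window 4 108` would show everything the kernel logged between the automatic mount and the manual remount, including anything from the SCSI or SATA layers.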
--
Joost
* Re: [gentoo-user] Mysteriously dismounting partition
From: Peter Humphrey @ 2015-10-27 16:36 UTC
To: gentoo-user
On Tuesday 27 October 2015 15:16:26 J. Roeleveld wrote:
> If a disk is umounted/removed, something needs to be logged somewhere.
> Might even be a comment from the scsi-subsystem or the SATA driver.
>
> I usually only grep the log to try to find specific messages.
> If I know the time-period something weird happened in, I tend to go through
> the unfiltered log for that period.
I have been scanning dmesg and /var/log/messages by eye and not noticed
anything. I'll keep doing it though.
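One way to narrow down where to look would be to record the moment the partition disappears from the mount table. The following is only a sketch: the names is_mounted and watch_mount are invented here, and it assumes /proc/mounts reflects the current mounts, as it does on Linux.

```shell
# is_mounted MOUNTPOINT [MOUNTS_FILE]
# Succeeds if MOUNTPOINT is listed as a mount point (second field) in the
# mounts table, which defaults to /proc/mounts.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' \
        "${2:-/proc/mounts}"
}

# watch_mount MOUNTPOINT [INTERVAL] [MOUNTS_FILE]
# Polls every INTERVAL seconds (default 5) and prints a timestamped line
# as soon as MOUNTPOINT is no longer mounted, then returns.
watch_mount() {
    while is_mounted "$1" "${3:-/proc/mounts}"; do
        sleep "${2:-5}"
    done
    printf '%s %s is no longer mounted\n' \
        "$(date '+%Y-%m-%d %H:%M:%S')" "$1"
}
```

Leaving `watch_mount /usr-bits >> /root/mount-watch.log` running in a spare terminal (the log path is arbitrary) would give a time, accurate to the polling interval, to compare against dmesg and /var/log/messages.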
--
Rgds
Peter
* Re: [gentoo-user] Mysteriously dismounting partition
From: J. Roeleveld @ 2015-10-27 17:18 UTC
To: gentoo-user
On 27 October 2015 17:36:00 CET, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> On Tuesday 27 October 2015 15:16:26 J. Roeleveld wrote:
>
> > If a disk is umounted/removed, something needs to be logged somewhere.
> > Might even be a comment from the scsi-subsystem or the SATA driver.
> >
> > I usually only grep the log to try to find specific messages.
> > If I know the time-period something weird happened in, I tend to go through
> > the unfiltered log for that period.
>
> I have been scanning dmesg and /var/log/messages by eye and not noticed
> anything. I'll keep doing it though.
What does your fstab look like?
And maybe some more info, like which kernel version. Mount version.
And maybe check for some weird crontab entry somewhere?
You could also rule out the use of umount by replacing it with a wrapper script that logs every call with as much info as is possible?
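A minimal sketch of such a wrapper, under some assumptions: the real binary has first been moved aside to /bin/umount.real, and the log goes to /var/log/umount-calls.log. Both paths and the function name log_umount are made up for illustration and can be overridden through the REAL_UMOUNT and UMOUNT_LOG variables.

```shell
# log_umount ARGS...
# Logs who called umount and with which arguments, then runs the real
# binary with the same arguments and returns its exit status.
log_umount() {
    real="${REAL_UMOUNT:-/bin/umount.real}"        # assumed location
    log="${UMOUNT_LOG:-/var/log/umount-calls.log}" # assumed location
    {
        printf '%s pid=%s ppid=%s uid=%s args=%s\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')" "$$" "$PPID" "$(id -u)" "$*"
        # /proc/<pid>/cmdline is NUL-separated; show the parent's command line.
        if [ -r "/proc/$PPID/cmdline" ]; then
            printf '  parent: %s\n' "$(tr '\0' ' ' < "/proc/$PPID/cmdline")"
        fi
    } >> "$log" 2>/dev/null
    "$real" "$@"
}
```

Saved as an executable script in place of umount, with a `#!/bin/sh` first line and a final line of `log_umount "$@"`, this records every call that goes through the umount command. It cannot see a program that makes the umount system call directly, and a util-linux update will overwrite it, so an empty log does not prove that nothing unmounted the filesystem.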
--
Joost
* Re: [gentoo-user] Mysteriously dismounting partition
From: Peter Humphrey @ 2015-10-28 10:27 UTC
To: gentoo-user
On Tuesday 27 October 2015 18:18:18 J. Roeleveld wrote:
> On 27 October 2015 17:36:00 CET, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> > On Tuesday 27 October 2015 15:16:26 J. Roeleveld wrote:
> > > If a disk is umounted/removed, something needs to be logged somewhere.
> > > Might even be a comment from the scsi-subsystem or the SATA driver.
> > >
> > > I usually only grep the log to try to find specific messages.
> > > If I know the time-period something weird happened in, I tend to go
> > > through the unfiltered log for that period.
> >
> > I have been scanning dmesg and /var/log/messages by eye and not noticed
> > anything. I'll keep doing it though.
>
> What does your fstab look like?
I gave the two relevant lines in my first message, but here's the whole thing:
LABEL=RescueSys    /                               ext4    relatime                     1 1
LABEL=RescUsrBits  /usr-bits                       ext4    relatime                     1 2
/dev/md1           /boot                           ext2    relatime,noauto              1 2
/dev/md5           /mnt/main                       ext4    relatime,noauto,dev,exec     0 2
/dev/vg7/local     /mnt/main/usr/local             ext4    relatime,noauto              0 3
/dev/vg7/home      /mnt/main/home                  ext4    relatime,noauto              0 3
/dev/vg7/common    /mnt/main/home/prh/common       ext4    relatime,noauto              0 4
/dev/vg7/virt      /mnt/main/home/prh/.VirtualBox  ext4    relatime,noauto              0 4
/dev/vg7/boinc     /mnt/main/home/prh/boinc        ext4    relatime,noauto              0 4
/dev/vg7/var       /mnt/main/var                   ext4    relatime,noauto              0 2
/dev/vg7/portage   /mnt/main/usr/portage           ext4    relatime,noauto              0 2
/dev/vg7/packages  /mnt/main/usr/portage/packages  ext4    relatime,noauto              0 3
/dev/vg7/distfiles /mnt/main/usr/portage/distfiles ext4    relatime,noauto              0 3
/dev/vg7/opt       /mnt/main/opt                   ext4    relatime,noauto              0 3
/dev/vg7/atom      /mnt/main/mnt/atom              ext4    relatime,noauto              0 3
/dev/vg7/atomresc  /mnt/main/mnt/atomresc          ext4    relatime,noauto              0 3
/dev/vg7/tpad      /mnt/main/mnt/tpad              ext4    relatime,noauto              0 3
/dev/vg7/vartmp    /mnt/main/var/tmp               ext4    relatime,noauto              0 3
/dev/sdc5          /mnt/sdc                        ext4    relatime,noauto,user         0 0
/dev/sdd5          /mnt/sdd                        ext4    relatime,noauto,user         0 0
/dev/sr0           /mnt/dvd                        iso9660 noauto,user                  0 0
/dev/sda2          none                            swap    sw                           0 0
/dev/sdb2          none                            swap    sw                           0 0
tmpfs              /tmp                            tmpfs   nodev,nosuid,size=6G         0 0
proc               /proc                           proc    defaults                     0 0
shm                /dev/shm                        tmpfs   nodev,nosuid,noexec          0 0
/dev/md8           /mnt/qt5                        ext4    noauto,relatime              0 0
/dev/vg9/home      /mnt/qt5/home                   ext4    noauto,relatime              0 0
/dev/vg9/var       /mnt/qt5/var                    ext4    noauto,relatime              0 0
/dev/vg9/vartmp    /mnt/qt5/var/tmp                ext4    noauto,relatime,nosuid,nodev 0 0
/dev/vg9/local     /mnt/qt5/usr/local              ext4    noauto,relatime              0 0
/dev/vg9/portage   /mnt/qt5/usr/portage            ext4    noauto,relatime              0 0
/dev/vg9/packages  /mnt/qt5/usr/portage/packages   ext4    noauto,relatime              0 0
/dev/vg9/distfiles /mnt/qt5/usr/portage/distfiles  ext4    noauto,relatime              0 0
> And maybe some more info, like which kernel version. Mount version.
> And maybe check for some weird crontab entry somewhere?
No cron installed. Kernel was 4.0.5 at the time of writing, upgraded to 4.0.9
yesterday. Mount is in sys-apps/util-linux-2.26.2 - was 2.25.2-r2 until 27
September but the problem occurred with both versions.
> You could also rule out the use of umount by replacing it with a wrapper
> script that logs every call with as much info as is possible?
Hm. That may be above my bash grade.
I'm inclined to suspect the kernel. No real evidence, just that I've booted
the rescue system twice since installing the new kernel and everything worked
as it should.
--
Rgds
Peter