public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] Is my SSD dying?
@ 2017-09-01  9:46 Peter Humphrey
  2017-09-01  9:54 ` Arthur Țițeică
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Peter Humphrey @ 2017-09-01  9:46 UTC
  To: gentoo-user

Hello list,

For the last week or two my NVMe SSD isn't being detected on startup. I get 
this error on manual invocation:

# smartctl -a /dev/nvme0n1
smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)
Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org

/dev/nvme0n1: Unable to detect device type
Please specify device type with the -d option.


Most things still seem to be working, but do I need to rush out and buy 
another drive? This one's only 18 months old. I don't really want to box up 
the machine and send it to Watford under warranty.

Two things that aren't working properly are KMail (surprise!), and BOINC, 
which insists that VirtualBox isn't installed, when of course it (more or 
less) always has been.

-- 
Regards,
Peter.



* Re: [gentoo-user] Is my SSD dying?
  2017-09-01  9:46 [gentoo-user] Is my SSD dying? Peter Humphrey
@ 2017-09-01  9:54 ` Arthur Țițeică
  2017-09-02  9:24   ` Peter Humphrey
  2017-09-01 14:48 ` Daniel Frey
  2017-09-02  1:24 ` Adam Carter
  2 siblings, 1 reply; 15+ messages in thread
From: Arthur Țițeică @ 2017-09-01  9:54 UTC
  To: gentoo-user, Peter Humphrey



On 1 September 2017 12:46:39 EEST, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>Hello list,
>
>For the last week or two my NVMe SSD isn't being detected on startup. I get
>this error on manual invocation:
>
># smartctl -a /dev/nvme0n1
>smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)
>Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
>
>/dev/nvme0n1: Unable to detect device type
>Please specify device type with the -d option.
>

Smartmontools supports NVMe starting from version 6.5.
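
A minimal sketch of that version check, assuming a `sort` that understands `-V` (GNU coreutils does). The helper names `version_at_least` and `smartctl_version` are made up for illustration; only `smartctl --version` comes from smartmontools itself.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if version $1 is at least version $2.
version_at_least() {
    # sort -V orders version strings; if the minimum sorts first
    # (or the two are equal), $1 is new enough.
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Hypothetical helper: pull "6.4" out of a banner line such as
# "smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)".
smartctl_version() {
    awk '$1 == "smartctl" { print $2; exit }'
}

if command -v smartctl >/dev/null 2>&1; then
    ver=$(smartctl --version | smartctl_version)
    if version_at_least "$ver" 6.5; then
        echo "smartmontools $ver: NVMe devices should be recognised"
    else
        echo "smartmontools $ver: too old for NVMe, upgrade to 6.5 or later"
    fi
fi
```

With the 6.4 banner quoted above, the second branch would be taken, which matches the "Unable to detect device type" message.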

>
>Most things still seem to be working, but do I need to rush out and buy
>another drive? This one's only 18 months old. I don't really want to box up
>the machine and send it to Watford under warranty.
>
>Two things that aren't working properly are KMail (surprise!), and BOINC,
>which insists that VirtualBox isn't installed, when of course it (more or
>less) always has been.

Is the boinc user in the vboxusers group?
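
One way to check this, sketched under the assumption that the BOINC daemon runs as a user literally named `boinc`. The helper `has_group` is hypothetical; `id -nG` and `gpasswd -a` are standard tools.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if group $2 appears in the
# whitespace-separated group list $1.
has_group() {
    for g in $1; do
        [ "$g" = "$2" ] && return 0
    done
    return 1
}

user=boinc   # assumed account name
if groups_of=$(id -nG "$user" 2>/dev/null); then
    if has_group "$groups_of" vboxusers; then
        echo "$user is in vboxusers"
    else
        echo "$user is NOT in vboxusers; try: gpasswd -a $user vboxusers"
    fi
fi
```

A group change only applies to processes started afterwards, so the BOINC client would need restarting after any change.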



* Re: [gentoo-user] Is my SSD dying?
  2017-09-01  9:46 [gentoo-user] Is my SSD dying? Peter Humphrey
  2017-09-01  9:54 ` Arthur Țițeică
@ 2017-09-01 14:48 ` Daniel Frey
  2017-09-02  1:24 ` Adam Carter
  2 siblings, 0 replies; 15+ messages in thread
From: Daniel Frey @ 2017-09-01 14:48 UTC
  To: gentoo-user

On 09/01/2017 02:46 AM, Peter Humphrey wrote:
> Hello list,
> 
> For the last week or two my NVMe SSD isn't being detected on startup. I get 
> this error on manual invocation:
> 
> # smartctl -a /dev/nvme0n1
> smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)
> Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
> 
> /dev/nvme0n1: Unable to detect device type
> Please specify device type with the -d option.
> 
> 
> Most things still seem to be working, but do I need to rush out and buy 
> another drive? This one's only 18 months old. I don't really want to box up 
> the machine and send it to Watford under warranty.
> 
> Two things that aren't working properly are KMail (surprise!), and BOINC, 
> which insists that VirtualBox isn't installed, when of course it (more or 
> less) always has been.
> 

If your BIOS isn't detecting it, it's probably on its way out. Before
writing it off, though, I'd check whether there's a firmware update for it.

I had a Crucial (regular SSD, non-NVMe) that was doing this, and after a
firmware update it's still going 3 years later.

Make sure you back up the data on it (if you can).

Dan



* Re: [gentoo-user] Is my SSD dying?
  2017-09-01  9:46 [gentoo-user] Is my SSD dying? Peter Humphrey
  2017-09-01  9:54 ` Arthur Țițeică
  2017-09-01 14:48 ` Daniel Frey
@ 2017-09-02  1:24 ` Adam Carter
  2017-09-02  9:32   ` Peter Humphrey
  2 siblings, 1 reply; 15+ messages in thread
From: Adam Carter @ 2017-09-02  1:24 UTC
  To: gentoo-user@lists.gentoo.org


On Fri, Sep 1, 2017 at 7:46 PM, Peter Humphrey <peter@prh.myzen.co.uk>
wrote:

> Hello list,
>
> For the last week or two my NVMe SSD isn't being detected on startup. I get
> this error on manual invocation:
>
> # smartctl -a /dev/nvme0n1
> smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)
>
>
It's probably also worth updating the kernel to 4.12.10: there are some
important-sounding security fixes in it, and the changelogs for 4.12.6 and
4.12.8 mention nvme.


* Re: [gentoo-user] Is my SSD dying?
  2017-09-01  9:54 ` Arthur Țițeică
@ 2017-09-02  9:24   ` Peter Humphrey
  0 siblings, 0 replies; 15+ messages in thread
From: Peter Humphrey @ 2017-09-02  9:24 UTC
  To: gentoo-user

On Friday, 1 September 2017 10:54:45 BST Arthur Țițeică wrote:
> On 1 September 2017 12:46:39 EEST, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> >Hello list,
> >
> >For the last week or two my NVMe SSD isn't being detected on startup. I get
> >this error on manual invocation:
> >
> ># smartctl -a /dev/nvme0n1
> >smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)
> >Copyright (C) 2002-15, Bruce Allen, Christian Franke, www.smartmontools.org
> >
> >/dev/nvme0n1: Unable to detect device type
> >Please specify device type with the -d option.
> 
> Smartmontools supports NVMe starting from version 6.5.

That was it - thanks. It's odd, though, that I hadn't noticed this before.

> >Most things still seem to be working, but do I need to rush out and buy
> >another drive? This one's only 18 months old. I don't really want to box up
> >the machine and send it to Watford under warranty.
> >
> >Two things that aren't working properly are KMail (surprise!), and BOINC,
> >which insists that VirtualBox isn't installed, when of course it (more or
> >less) always has been.
> 
> Is the boinc user in the vboxusers group?

Yes:

# groups boinc
vboxusers boinc

-- 
Regards,
Peter.




* Re: [gentoo-user] Is my SSD dying?
  2017-09-02  1:24 ` Adam Carter
@ 2017-09-02  9:32   ` Peter Humphrey
  2017-09-02  9:51     ` Peter Humphrey
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Humphrey @ 2017-09-02  9:32 UTC
  To: gentoo-user

On Saturday, 2 September 2017 02:24:57 BST Adam Carter wrote:
> On Fri, Sep 1, 2017 at 7:46 PM, Peter Humphrey <peter@prh.myzen.co.uk>
> wrote:

> > For the last week or two my NVMe SSD isn't being detected on startup. I
> > get this error on manual invocation:
> > 
> > # smartctl -a /dev/nvme0n1
> > smartctl 6.4 2015-06-04 r4109 [x86_64-linux-4.12.5-gentoo] (local build)
> 
> Probably also worth updating to 4.12.10, there's some important sounding
> security fixes in it, and the Changelogs for 4.12.6 and 4.12.8 mention
> nvme.

I went to smartmontools version 6.5. Now smartmon appears to run ok - provided that I 
remove DEVICESCAN from /etc/smartd.conf and give it a specific device to 
monitor, like this:

/dev/nvme0n1 -a -o on -S on -s (S/../.././02|L/../../6/03)

(following an example in the file). So I'm still feeling somewhat edgy.
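
For readers unfamiliar with the `-s` directive: as far as I understand smartd.conf(5), smartd builds a string of the form `T/MM/DD/d/HH` (test type, month, day of month, day of week with 1 = Monday, hour) and starts a test when the pattern matches it. The line above therefore asks for a short self-test every day at 02:00 and a long one every Saturday at 03:00. The sketch below only mimics that matching with `grep -E`; the helper `schedule_matches` is hypothetical, and anchoring the pattern to the whole string is an assumption. The `-o on` and `-S on` directives are, to my knowledge, ATA features and may have no effect on an NVMe device.

```shell
#!/bin/sh
# Hypothetical helper: does smartd -s pattern $1 match the
# T/MM/DD/d/HH string $2? (Assumes the whole string must match.)
schedule_matches() {
    printf '%s\n' "$2" | grep -Eq "^($1)\$"
}

pattern='(S/../.././02|L/../../6/03)'

# Which test types would the pattern start in the current hour?
now=$(date +%m/%d/%u/%H)   # %u gives the day of week, 1 = Monday ... 7 = Sunday
for t in S L; do
    if schedule_matches "$pattern" "$t/$now"; then
        echo "test type $t is due this hour"
    fi
done
```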

-- 
Regards,
Peter.



* Re: [gentoo-user] Is my SSD dying?
  2017-09-02  9:32   ` Peter Humphrey
@ 2017-09-02  9:51     ` Peter Humphrey
  2017-09-02 12:28       ` Jacques Montier
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Humphrey @ 2017-09-02  9:51 UTC
  To: gentoo-user

On Saturday, 2 September 2017 10:32:23 BST I wrote:

> ... Now smartmon appears to run ok - provided that I remove DEVICESCAN
> from /etc/smartd.conf and give it a specific device to monitor ...

Some months ago someone here mentioned a test suite for SSDs, but I can't 
remember what it was called and now I can't find it. Can someone point me in 
the right direction, please?

-- 
Regards,
Peter.




* Re: [gentoo-user] Is my SSD dying?
  2017-09-02  9:51     ` Peter Humphrey
@ 2017-09-02 12:28       ` Jacques Montier
  2017-09-02 13:49         ` Peter Humphrey
  0 siblings, 1 reply; 15+ messages in thread
From: Jacques Montier @ 2017-09-02 12:28 UTC
  To: gentoo-user


Hello,

I once ran into this problem with my Crucial SSD.
I found a procedure for getting the SSD detected again, which worked for me:
http://forums.crucial.com/t5/Crucial-SSDs/Why-did-my-SSD-quot-disappear-quot-from-my-system/ta-p/65215
I hope this helps.

Cheers,



--
Jacques

2017-09-02 11:51 GMT+02:00 Peter Humphrey <peter@prh.myzen.co.uk>:

> On Saturday, 2 September 2017 10:32:23 BST I wrote:
>
> > ... Now smartmon appears to run ok - provided that I remove DEVICESCAN
> > from /etc/smartd.conf and give it a specific device to monitor ...
>
> Some months ago someone here mentioned a test suite for SSDs, but I can't
> remember what it was called and now I can't find it. Can someone point me
> in the right direction, please?
>
> --
> Regards,
> Peter.
>
>
>


* Re: [gentoo-user] Is my SSD dying?
  2017-09-02 12:28       ` Jacques Montier
@ 2017-09-02 13:49         ` Peter Humphrey
  2017-09-03  2:34           ` R0b0t1
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Humphrey @ 2017-09-02 13:49 UTC
  To: gentoo-user

On Saturday, 2 September 2017 13:28:44 BST Jacques Montier wrote:

> I once encountered the problem with my Crucial SSD.
> I found a procedure to make the SSD detected which worked for me.
> http://forums.crucial.com/t5/Crucial-SSDs/Why-did-my-SSD-quot-disappear-quot-from-my-system/ta-p/65215
> Hope this will help.

Mine's by Samsung:

# lspci -v -s 05:00.0
05:00.0 Non-Volatile memory controller: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951 (rev 01) (prog-if 02 [NVM Express])
        Subsystem: Samsung Electronics Co Ltd NVMe SSD Controller SM951/PM951
        Flags: bus master, fast devsel, latency 0, IRQ 35, NUMA node 0
        Memory at fbd00000 (64-bit, non-prefetchable) [size=16K]
        I/O ports at d000 [size=256]
        Capabilities: [40] Power Management version 3
        Capabilities: [50] MSI: Enable- Count=1/8 Maskable- 64bit+
        Capabilities: [70] Express Endpoint, MSI 00
        Capabilities: [b0] MSI-X: Enable+ Count=9 Masked-
        Capabilities: [100] Advanced Error Reporting
        Capabilities: [148] Device Serial Number 00-00-00-00-00-00-00-00
        Capabilities: [158] Power Budgeting <?>
        Capabilities: [168] #19
        Capabilities: [188] Latency Tolerance Reporting
        Capabilities: [190] L1 PM Substates
        Kernel driver in use: nvme

It's clearly visible to the kernel, and smartd finds it too if I tell it 
what to look for.

I don't want to start unplugging it unless I have to, as it's in a PCI slot 
and I'd probably make things worse. And it is only 18 months old, as I said.

A week or two ago I was investigating some other weirdnesses and at one 
point I zeroed out the first partition: the unformatted one containing the 
UEFI data. It took longer than I expected, having only 2MB to fill. I wonder 
if it strayed outside the partition...
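
As far as I know, a write aimed at a partition node (for example /dev/nvme0n1p1) is bounded by the kernel and cannot run past the end of that partition; only a write aimed at the whole-disk node could do that. When zeroing, it is still safer to bound the byte count explicitly. Below is a minimal sketch on a scratch file rather than a real device; the helper `zero_prefix` is hypothetical and relies on `head -c`, which is widespread but not strictly POSIX.

```shell
#!/bin/sh
# Hypothetical helper: write exactly $2 zero bytes to the start of $1
# without truncating whatever follows.
zero_prefix() {
    target=$1
    bytes=$2
    # head bounds the amount of data; conv=notrunc preserves the rest.
    head -c "$bytes" /dev/zero | dd of="$target" conv=notrunc 2>/dev/null
}

# Demonstration on a scratch file standing in for a disk.
scratch=$(mktemp)
head -c 4194304 /dev/zero | tr '\0' 'x' > "$scratch"   # 4 MiB of 'x'
zero_prefix "$scratch" 2097152                          # zero the first 2 MiB
echo "non-zero bytes left in first 2 MiB: $(head -c 2097152 "$scratch" | tr -d '\0' | wc -c)"
```

The second half of the scratch file keeps its 'x' bytes, which is the property one wants when the target is a real disk.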

-- 
Regards,
Peter.



* Re: [gentoo-user] Is my SSD dying?
  2017-09-02 13:49         ` Peter Humphrey
@ 2017-09-03  2:34           ` R0b0t1
  2017-09-03  7:39             ` Peter Humphrey
  0 siblings, 1 reply; 15+ messages in thread
From: R0b0t1 @ 2017-09-03  2:34 UTC
  To: gentoo-user

On Sat, Sep 2, 2017 at 8:49 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> A week or two ago I was investigating some other weirdnesses and at one
> point I zeroed out the first partition: the unformatted one containing the
> UEFI data. It took longer than I expected, having only 2MB to fill. I wonder
> if it strayed outside the partition...
>

Are you trimming your drive?



* Re: [gentoo-user] Is my SSD dying?
  2017-09-03  2:34           ` R0b0t1
@ 2017-09-03  7:39             ` Peter Humphrey
  2017-09-03 20:56               ` R0b0t1
  0 siblings, 1 reply; 15+ messages in thread
From: Peter Humphrey @ 2017-09-03  7:39 UTC
  To: gentoo-user

On Sunday, 3 September 2017 03:34:06 BST R0b0t1 wrote:
> On Sat, Sep 2, 2017 at 8:49 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> > A week or two ago I was investigating some other weirdnesses and at one
> > point I zeroed out the first partition: the unformatted one containing
> > the UEFI data. It took longer than I expected, having only 2MB to fill.
> > I wonder if it strayed outside the partition...
> 
> Are you trimming your drive?

Yes; this is root's crontab:

9 3,15 * * *    /sbin/fstrim -a
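
Read as cron fields (minute, hour, day of month, month, day of week), that entry runs `fstrim -a` twice a day. The small sketch below only expands the minute and hour fields into clock times; `cron_times` is a hypothetical helper and handles nothing beyond plain numbers and comma-separated hour lists.

```shell
#!/bin/sh
# Hypothetical helper: expand a cron minute ($1) and a comma-separated
# hour list ($2) into HH:MM run times. Plain decimal values only; a
# value with a leading zero such as 08 would be misread as octal by printf.
cron_times() {
    minute=$1
    hours=$2
    for h in $(printf '%s' "$hours" | tr ',' ' '); do
        printf '%02d:%02d\n' "$h" "$minute"
    done
}

cron_times 9 3,15
```

Running `fstrim -av` by hand instead should also report how many bytes were trimmed on each mounted filesystem.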

-- 
Regards,
Peter.



* Re: [gentoo-user] Is my SSD dying?
  2017-09-03  7:39             ` Peter Humphrey
@ 2017-09-03 20:56               ` R0b0t1
  2017-09-06 10:01                 ` Peter Humphrey
  0 siblings, 1 reply; 15+ messages in thread
From: R0b0t1 @ 2017-09-03 20:56 UTC
  To: gentoo-user

On Sun, Sep 3, 2017 at 2:39 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> On Sunday, 3 September 2017 03:34:06 BST R0b0t1 wrote:
>> On Sat, Sep 2, 2017 at 8:49 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>> > A week or two ago I was investigating some other weirdnesses and at one
>> > point I zeroed out the first partition: the unformatted one containing
>> > the UEFI data. It took longer than I expected, having only 2MB to fill.
>> > I wonder if it strayed outside the partition...
>>
>> Are you trimming your drive?
>
> Yes; this is root's crontab:
>
> 9 3,15 * * *    /sbin/fstrim -a
>

I think a reduction in drive performance (when you are maintaining it
properly) is the best argument for being ready to replace the drive,
as this seems to me unlikely to happen unless the drive is actually
wearing out.

At the same time, I have seen this exact situation fixed by a firmware
upgrade. Still, this seems more alarming than the other issues you've
described.



* Re: [gentoo-user] Is my SSD dying?
  2017-09-03 20:56               ` R0b0t1
@ 2017-09-06 10:01                 ` Peter Humphrey
  2017-09-06 14:11                   ` Daniel Frey
  2017-09-06 17:13                   ` R0b0t1
  0 siblings, 2 replies; 15+ messages in thread
From: Peter Humphrey @ 2017-09-06 10:01 UTC
  To: gentoo-user

On Sunday, 3 September 2017 21:56:43 BST R0b0t1 wrote:
> On Sun, Sep 3, 2017 at 2:39 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> > On Sunday, 3 September 2017 03:34:06 BST R0b0t1 wrote:
> >> On Sat, Sep 2, 2017 at 8:49 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> >> > A week or two ago I was investigating some other weirdnesses and at one
> >> > point I zeroed out the first partition: the unformatted one containing
> >> > the UEFI data. It took longer than I expected, having only 2MB to fill.
> >> > I wonder if it strayed outside the partition...
> >> 
> >> Are you trimming your drive?
> > 
> > Yes; this is root's crontab:
> > 
> > 9 3,15 * * *    /sbin/fstrim -a
> 
> I think a reduction in drive performance (when you are maintaining it
> properly) is the best argument for being ready to replace the drive,
> as this seems unlikely to happen to me unless the drive is actually
> wearing out.

I haven't noticed any degradation of performance, though I haven't run any 
tests.

> At the same time I have seen this exact situation fixed by a firmware
> upgrade. Still, this seems more alarming than the other issues you've
> described.

Do you mean the firmware of the NVMe drive? How would I go about that? I 
don't see any mention of firmware on Samsung's site.

-- 
Regards
Peter




* Re: [gentoo-user] Is my SSD dying?
  2017-09-06 10:01                 ` Peter Humphrey
@ 2017-09-06 14:11                   ` Daniel Frey
  2017-09-06 17:13                   ` R0b0t1
  1 sibling, 0 replies; 15+ messages in thread
From: Daniel Frey @ 2017-09-06 14:11 UTC
  To: gentoo-user

On 09/06/2017 03:01 AM, Peter Humphrey wrote:
>> At the same time I have seen this exact situation fixed by a firmware
>> upgrade. Still, this seems more alarming than the other issues you've
>> described.
> 
> Do you mean the firmware of the NVMe drive? How would I go about that? I 
> don't see any mention of firmware on Samsung's site.
> 

I mentioned the same in an earlier post, but unfortunately it seems
Samsung's Magician software (a Windows application) is required to do this.
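
Before looking for an update it helps to know which firmware revision the drive currently reports. A minimal sketch, assuming the Linux nvme driver exposes `model` and `firmware_rev` attributes under /sys/class/nvme/nvmeN (my understanding of current kernels; worth verifying on 4.12). The helper `nvme_firmware` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print model and firmware revision for one NVMe
# controller directory in sysfs, e.g. /sys/class/nvme/nvme0.
nvme_firmware() {
    ctrl=$1
    [ -r "$ctrl/firmware_rev" ] || return 1
    # sysfs pads these values with spaces; strip trailing whitespace.
    model=$(sed 's/[[:space:]]*$//' "$ctrl/model" 2>/dev/null)
    rev=$(sed 's/[[:space:]]*$//' "$ctrl/firmware_rev")
    printf '%s: model=%s firmware=%s\n' "${ctrl##*/}" "$model" "$rev"
}

for c in /sys/class/nvme/nvme*; do
    if [ -d "$c" ]; then
        nvme_firmware "$c" || true
    fi
done
```

`smartctl -i` on a smartmontools version with NVMe support should report the same firmware version.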

Dan



* Re: [gentoo-user] Is my SSD dying?
  2017-09-06 10:01                 ` Peter Humphrey
  2017-09-06 14:11                   ` Daniel Frey
@ 2017-09-06 17:13                   ` R0b0t1
  1 sibling, 0 replies; 15+ messages in thread
From: R0b0t1 @ 2017-09-06 17:13 UTC
  To: gentoo-user@lists.gentoo.org


On Wednesday, September 6, 2017, Peter Humphrey <peter@prh.myzen.co.uk>
wrote:
> On Sunday, 3 September 2017 21:56:43 BST R0b0t1 wrote:
>> On Sun, Sep 3, 2017 at 2:39 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>> > On Sunday, 3 September 2017 03:34:06 BST R0b0t1 wrote:
>> >> On Sat, Sep 2, 2017 at 8:49 AM, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>> >> > A week or two ago I was investigating some other weirdnesses and at one
>> >> > point I zeroed out the first partition: the unformatted one containing
>> >> > the UEFI data. It took longer than I expected, having only 2MB to fill.
>> >> > I wonder if it strayed outside the partition...
>> >>
>> >> Are you trimming your drive?
>> >
>> > Yes; this is root's crontab:
>> >
>> > 9 3,15 * * *    /sbin/fstrim -a
>>
>> I think a reduction in drive performance (when you are maintaining it
>> properly) is the best argument for being ready to replace the drive,
>> as this seems unlikely to happen to me unless the drive is actually
>> wearing out.
>
> I haven't noticed any degradation of performance, though I haven't run any
> tests.
>

I interpreted the slow zeroing as a performance decrease. If you can run a
benchmark to check, you may want to.

If that situation doesn't correspond to a general decrease in performance I
will be very surprised.
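
A very rough way to get a first number is a read-only sequential test with `dd`; this is a sketch, not a proper benchmark. The helper `read_test` is hypothetical and the device name in the example is an assumption. Repeated runs over the same region are likely to be served from the page cache, so results after the first run say little about the drive; GNU dd's `iflag=direct` can bypass the cache on a block device.

```shell
#!/bin/sh
# Hypothetical helper: time a sequential read. Read-only: it never
# writes to the target.
read_test() {
    target=$1   # block device (needs root) or any large file
    mib=$2      # number of MiB to read
    # dd reports bytes copied, elapsed time and rate on stderr;
    # keep only that summary line.
    LC_ALL=C dd if="$target" of=/dev/null bs=1048576 count="$mib" 2>&1 | tail -n 1
}

# Example (assumed device name; run as root):
#   read_test /dev/nvme0n1 1024
```

Comparing the reported rate with the figure in the drive's data sheet, or with an earlier run on the same machine, would show whether performance has really dropped.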

>> At the same time I have seen this exact situation fixed by a firmware
>> upgrade. Still, this seems more alarming than the other issues you've
>> described.
>
> Do you mean the firmware of the NVMe drive? How would I go about that? I
> don't see any mention of firmware on Samsung's site.
>

Typically through a closed-source Windows program, which may bundle the
firmware.
