* [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
@ 2010-03-26 20:17 Paul Hartman
2010-03-26 21:31 ` Neil Walker
2010-03-27 3:23 ` Stroller
0 siblings, 2 replies; 9+ messages in thread
From: Paul Hartman @ 2010-03-26 20:17 UTC (permalink / raw
To: gentoo-user
Hi,
Setting up and testing my new system (after wasting nearly 1 month
with bad RAM modules), I got this error today:
[48055.741389] ata3.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
[48055.741393] ata3.00: failed command: READ FPDMA QUEUED
[48055.741398] ata3.00: cmd 60/20:08:38:15:03/01:00:18:00:00/40 tag 1 ncq 147456 in
[48055.741400] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[48055.741402] ata3.00: status: { DRDY }
[48055.741405] ata3: hard resetting link
[48056.198746] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[48056.210514] ata3.00: configured for UDMA/133
[48056.210518] ata3.00: device reported invalid CHS sector 0
[48056.210523] ata3: EH complete
I really don't understand what it means, but the "timeout", "hard
resetting link" and "invalid CHS sector 0" look scary to me...
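For readers trying to interpret the "cmd" line, here is a minimal sketch that decodes it. It assumes the field order libata's error handler prints (command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device) and 512-byte logical sectors. The function name `decode_ncq_cmd` is made up for illustration and is not part of any kernel tool.

```python
import re

# Assumed layout of the libata "cmd" line:
# command/feature:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
CMD_RE = re.compile(
    r"cmd\s+([0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/((?:[0-9a-f]{2}:){4}[0-9a-f]{2})"
    r"/([0-9a-f]{2})\s+tag\s+(\d+)",
    re.IGNORECASE,
)


def decode_ncq_cmd(line, sector_size=512):
    """Decode an NCQ read/write taskfile from a libata error line (hypothetical helper)."""
    m = CMD_RE.search(line)
    if m is None:
        raise ValueError("no libata cmd taskfile found in line")
    command = int(m.group(1), 16)
    feature, nsect, lbal, lbam, lbah = (int(x, 16) for x in m.group(2).split(":"))
    hob_feature, _hob_nsect, hob_lbal, hob_lbam, hob_lbah = (
        int(x, 16) for x in m.group(3).split(":")
    )
    if command not in (0x60, 0x61):  # READ / WRITE FPDMA QUEUED
        raise ValueError("not an NCQ read/write command")
    # For NCQ commands the sector count travels in the feature registers;
    # a count of zero means 65536 sectors.
    sectors = ((hob_feature << 8) | feature) or 65536
    # The 48-bit LBA is split across the six LBA registers.
    lba = (lbal | lbam << 8 | lbah << 16
           | hob_lbal << 24 | hob_lbam << 32 | hob_lbah << 40)
    return {
        "command": "READ FPDMA QUEUED" if command == 0x60 else "WRITE FPDMA QUEUED",
        "sectors": sectors,
        "bytes": sectors * sector_size,
        "lba": lba,
        "tag": int(m.group(5)),
        "tag_in_nsect": nsect >> 3,  # NCQ tag is carried in bits 7:3 of nsect
    }


if __name__ == "__main__":
    line = "ata3.00: cmd 60/20:08:38:15:03/01:00:18:00:00/40 tag 1 ncq 147456 in"
    print(decode_ncq_cmd(line))
```

Under these assumptions the line above decodes to a read of 288 sectors (288 x 512 = 147456 bytes, matching the "ncq 147456 in" that the kernel printed) starting at LBA 0x18031538, which is well inside the 3907029168 sectors the drive reports. Nothing about the request itself looks invalid; the drive simply did not answer within the timeout.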
Initial bootup messages for this device were:
Mar 25 22:02:32 [kernel] [ 4.496102] ata3: SATA max UDMA/133 abar m2048@0xfbffc000 port 0xfbffc200 irq 34
Mar 25 22:02:32 [kernel] [ 8.519169] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
Mar 25 22:02:32 [kernel] [ 8.536681] ata3.00: ATA-8: SAMSUNG HD203WI, 1AN10002, max UDMA/133
Mar 25 22:02:32 [kernel] [ 8.548388] ata3.00: 3907029168 sectors, multi 0: LBA48 NCQ (depth 31/32), AA
Mar 25 22:02:32 [kernel] [ 8.566100] ata3.00: configured for UDMA/133
That disk is part of an md RAID5, but I was at work when this error
happened, so I didn't see whether the RAID repaired itself or did
whatever else happens in this case (I don't have mdadm monitoring
configured yet). Right now all RAID disks are up and healthy.
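Until mdadm monitoring is configured, array health can be checked by hand from /proc/mdstat. The following is a rough sketch, not a replacement for mdadm's own monitor mode. It assumes the usual "[n/m] [UU_U]" status notation, and the function name and sample text are invented for illustration.

```python
import re

# Matches the "[expected/active] [UU_U]" status that follows each array.
STATUS_RE = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")
NAME_RE = re.compile(r"(md\S*)\s*:")


def degraded_arrays(mdstat_text):
    """Return names of md arrays whose status shows a missing member."""
    degraded = []
    current = None
    for line in mdstat_text.splitlines():
        name = NAME_RE.match(line)
        if name:
            current = name.group(1)
            continue
        status = STATUS_RE.search(line)
        if status and current:
            expected, active = int(status.group(1)), int(status.group(2))
            if active < expected or "_" in status.group(3):
                degraded.append(current)
            current = None
    return degraded


# Made-up sample in the shape of /proc/mdstat for a six-disk RAID5.
SAMPLE_HEALTHY = """\
Personalities : [raid6] [raid5] [raid4]
md0 : active raid5 sdf1[5] sde1[4] sdd1[3] sdc1[2] sdb1[1] sda1[0]
      9767564800 blocks level 5, 64k chunk, algorithm 2 [6/6] [UUUUUU]

unused devices: <none>
"""

if __name__ == "__main__":
    print(degraded_arrays(SAMPLE_HEALTHY))
```

On a live system the same function could be fed the contents of /proc/mdstat; an empty list means every array reports all of its members.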
I googled it but most of the results are pastebin snippets. I'm using
kernel 2.6.33 and ahci driver for the SATA controllers.
The libata documentation, in the section about timeout errors, says:
"Most often this is due to an unrelated interrupt subsystem bug (try
booting with 'pci=nomsi' or 'acpi=off' or 'noapic'), which failed to
deliver an interrupt when we were expecting one from the hardware."
I really don't know the potential implications of disabling MSI or
APIC, but in /proc/interrupts I do see AHCI related to both MSI and
APIC rows. So at least I know they are active right now.
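As a quick way to see which interrupt delivery the AHCI controllers currently use, here is a small sketch that scans /proc/interrupts text for rows naming the ahci driver. The function name and the sample rows are invented for illustration. The classification is a plain substring check, since the exact column text differs between kernel versions.

```python
def ahci_interrupt_modes(interrupts_text):
    """Map IRQ number -> delivery type for rows that mention the ahci driver."""
    modes = {}
    for line in interrupts_text.splitlines():
        irq, sep, rest = line.partition(":")
        # Skip the CPU header and named rows such as NMI or LOC.
        if not sep or not irq.strip().isdigit():
            continue
        if "ahci" not in rest:
            continue
        if "MSI" in rest:
            kind = "MSI"
        elif "IO-APIC" in rest:
            kind = "IO-APIC"
        elif "XT-PIC" in rest:
            kind = "XT-PIC"
        else:
            kind = "unknown"
        modes[int(irq)] = kind
    return modes


# Made-up sample rows in the general shape of /proc/interrupts.
SAMPLE = """\
           CPU0       CPU1
  0:        126          0   IO-APIC-edge      timer
 19:      20311          0   IO-APIC-fasteoi   ahci, uhci_hcd:usb3
 34:      51234          0   PCI-MSI-edge      ahci
NMI:          0          0   Non-maskable interrupts
"""

if __name__ == "__main__":
    print(ahci_interrupt_modes(SAMPLE))
```

Comparing the result before and after booting with 'pci=nomsi' would show whether the option actually moved the controller off MSI.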
Temperatures in my system are good; hddtemp says the drive in question
is at 21 °C right now.
Another possibility is that I need to increase voltage on the
motherboard, since it is running 6 hdd's and 1 DVD-ROM. I'll have to
research to see which voltage is related to this. (X58 motherboard)
Thanks in advance if anyone has any knowledge about this, otherwise I
go to trial-and-hopefully-no-error mode. :)
Paul
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-26 20:17 [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link Paul Hartman
@ 2010-03-26 21:31 ` Neil Walker
2010-03-26 22:21 ` Paul Hartman
2010-03-27 3:23 ` Stroller
1 sibling, 1 reply; 9+ messages in thread
From: Neil Walker @ 2010-03-26 21:31 UTC (permalink / raw
To: gentoo-user
On 26/03/10 20:17, Paul Hartman wrote:
> I'm using kernel 2.6.33 and ahci driver for the SATA controllers.
>
I never had good results with the ahci driver - hardware-specific
drivers have always worked better for me.
> Another possibility is that I need to increase voltage on the
> motherboard, since it is running 6 hdd's and 1 DVD-ROM. I'll have to
> research to see which voltage is related to this. (X58 motherboard)
>
No, no, no! Don't touch the motherboard voltages! They have nothing
to do with the connected drives. Surely you must have noticed that
power to the drives comes directly from the power supply. If the
drives are not receiving sufficient voltage, a new, bigger power supply
is the only option.
Be lucky,
Neil
http://www.neiljw.com/
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-26 21:31 ` Neil Walker
@ 2010-03-26 22:21 ` Paul Hartman
2010-03-27 3:39 ` Stroller
0 siblings, 1 reply; 9+ messages in thread
From: Paul Hartman @ 2010-03-26 22:21 UTC (permalink / raw
To: gentoo-user
On Fri, Mar 26, 2010 at 4:31 PM, Neil Walker <neil@ep.mine.nu> wrote:
> On 26/03/10 20:17, Paul Hartman wrote:
>> I'm using kernel 2.6.33 and ahci driver for the SATA controllers.
>>
>
> I never had good results with the ahci driver - hardware-specific
> drivers have always worked better for me.
>
>> Another possibility is that I need to increase voltage on the
>> motherboard, since it is running 6 hdd's and 1 DVD-ROM. I'll have to
>> research to see which voltage is related to this. (X58 motherboard)
>>
>
> No, no, no! Don't touch the motherboard voltages! They have nothing
> to do with the connected drives. Surely you must have noticed that
> power to the drives comes directly from the power supply. If the
> drives are not receiving sufficient voltage, a new, bigger power supply
> is the only option.
Well, I was thinking more about something like altering the IOH/ICH
voltage, or whichever voltage powers the SATA controllers. This
motherboard has 12 SATA headers but I don't know how many it can
realistically handle at once. Hopefully all of my disks. :)
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-26 20:17 [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link Paul Hartman
2010-03-26 21:31 ` Neil Walker
@ 2010-03-27 3:23 ` Stroller
2010-03-27 16:41 ` Paul Hartman
1 sibling, 1 reply; 9+ messages in thread
From: Stroller @ 2010-03-27 3:23 UTC (permalink / raw
To: gentoo-user
On 26 Mar 2010, at 20:17, Paul Hartman wrote:
> ...
> Setting up and testing my new system (after wasting nearly 1 month
> with bad RAM modules), I got this error today:
>
> [48055.741389] ata3.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
> [48055.741393] ata3.00: failed command: READ FPDMA QUEUED
> [48055.741398] ata3.00: cmd 60/20:08:38:15:03/01:00:18:00:00/40 tag 1 ncq 147456 in
> [48055.741400] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [48055.741402] ata3.00: status: { DRDY }
> [48055.741405] ata3: hard resetting link
> [48056.198746] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [48056.210514] ata3.00: configured for UDMA/133
> [48056.210518] ata3.00: device reported invalid CHS sector 0
> [48056.210523] ata3: EH complete
>
> I really don't understand what it means, but the "timeout", "hard
> resetting link" and "invalid CHS sector 0" look scary to me...
How new is this motherboard? One may see something similar if one
connects a SATA2 drive to an older SATA motherboard without the jumper
(on the drive) set to restrict it to 1.5 Gbps.
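The link speed that was actually negotiated can be read from the SStatus value in the log ("SStatus 123", printed in hex). A minimal sketch of the decoding, assuming the standard SATA SStatus layout (DET in bits 3:0, SPD in bits 7:4, IPM in bits 11:8); the function name is made up for illustration.

```python
# Field meanings per the SATA SStatus register layout (assumed here).
DET = {
    0x0: "no device detected",
    0x1: "device detected, no PHY communication",
    0x3: "device detected, PHY communication established",
    0x4: "PHY offline",
}
SPD = {0x0: "no speed negotiated", 0x1: "1.5 Gbps", 0x2: "3.0 Gbps", 0x3: "6.0 Gbps"}
IPM = {0x0: "not present", 0x1: "active", 0x2: "partial", 0x6: "slumber"}


def decode_sstatus(value):
    """Split an SStatus value into its detection, speed and power fields."""
    det = value & 0xF
    spd = (value >> 4) & 0xF
    ipm = (value >> 8) & 0xF
    return {
        "det": DET.get(det, "reserved"),
        "speed": SPD.get(spd, "reserved"),
        "power": IPM.get(ipm, "reserved"),
    }


if __name__ == "__main__":
    # The kernel prints the register in hex, so "SStatus 123" is 0x123.
    print(decode_sstatus(0x123))
```

For 0x123 this gives an established link at 3.0 Gbps in the active power state, which agrees with the "SATA link up 3.0 Gbps" message both before and after the hard reset. The link did not fall back to 1.5 Gbps.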
Stroller.
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-26 22:21 ` Paul Hartman
@ 2010-03-27 3:39 ` Stroller
2010-03-27 16:56 ` Paul Hartman
0 siblings, 1 reply; 9+ messages in thread
From: Stroller @ 2010-03-27 3:39 UTC (permalink / raw
To: gentoo-user
On 26 Mar 2010, at 22:21, Paul Hartman wrote:
> ...
> Well, I was thinking more about something like altering IOH/ICH
> voltage, or whichever voltage powers the SATA controllers. It has 12
> SATA headers on this motherboard but I don't know how much it can
> realistically handle at once. Hopefully all of my disks. :)
Honestly, I'm not sure that I'd do that.
If I thought it was a power issue - and that seems quite a reasonable
possibility - I would replace the PSU first.
What PSU are you using at the moment? Brand / wattage?
I think these are probably overkill:
http://www.bluepoint.net/AKAPSU066
http://www.bluepoint.net/AKAPSU071
However, if you're using a 450W PSU at the moment, you can get an OCZ
branded 600W for less than £50. I tend to be suspicious of cheap
unbranded and Wong Fu 350W - 450W PSUs. Often a 350W PSU from a
manufacturer with a half-decent brand (eg Trust) will be better than a
no-name 450W PSU.
What seems important is not only the actual wattage that the PSU gives
out (measured with an analogue-needle multimeter) but also how stable
that voltage is: how smooth, consistent and reliable. Unless your PSU
is absolute top-notch quality, try to under-utilise it.
With 6 x hard-drives, I would be generous with supplying power - I
think a 600W PSU would easily be justified.
Stroller.
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-27 3:23 ` Stroller
@ 2010-03-27 16:41 ` Paul Hartman
0 siblings, 0 replies; 9+ messages in thread
From: Paul Hartman @ 2010-03-27 16:41 UTC (permalink / raw
To: gentoo-user
On Fri, Mar 26, 2010 at 10:23 PM, Stroller
<stroller@stellar.eclipse.co.uk> wrote:
>
> On 26 Mar 2010, at 20:17, Paul Hartman wrote:
>>
>> ...
>> Setting up and testing my new system (after wasting nearly 1 month
>> with bad RAM modules), I got this error today:
>>
>> [48055.741389] ata3.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
>> [48055.741393] ata3.00: failed command: READ FPDMA QUEUED
>> [48055.741398] ata3.00: cmd 60/20:08:38:15:03/01:00:18:00:00/40 tag 1 ncq 147456 in
>> [48055.741400] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
>> [48055.741402] ata3.00: status: { DRDY }
>> [48055.741405] ata3: hard resetting link
>> [48056.198746] ata3: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
>> [48056.210514] ata3.00: configured for UDMA/133
>> [48056.210518] ata3.00: device reported invalid CHS sector 0
>> [48056.210523] ata3: EH complete
>>
>> I really don't understand what it means, but the "timeout", "hard
>> resetting link" and "invalid CHS sector 0" look scary to me...
>
> How new is this motherboard? One may see something similar if one connects a
> SATA2 drive to an older SATA motherboard without the jumper (on the drive)
> set to restrict it to 1.5 Gbps.
It is a new motherboard with 8 SATA2 headers, 2 SATA3 (6Gbps) and 2
eSATA. I'm only using the internal SATA2 ports now, not the SATA3. One
floppy and one IDE also unused :)
The drive is 2TB and new as well. No jumpers. Thanks.
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-27 3:39 ` Stroller
@ 2010-03-27 16:56 ` Paul Hartman
2010-08-12 5:50 ` Paul Hartman
0 siblings, 1 reply; 9+ messages in thread
From: Paul Hartman @ 2010-03-27 16:56 UTC (permalink / raw
To: gentoo-user
On Fri, Mar 26, 2010 at 10:39 PM, Stroller
<stroller@stellar.eclipse.co.uk> wrote:
>
> On 26 Mar 2010, at 22:21, Paul Hartman wrote:
>>
>> ...
>> Well, I was thinking more about something like altering IOH/ICH
>> voltage, or whichever voltage powers the SATA controllers. It has 12
>> SATA headers on this motherboard but I don't know how much it can
>> realistically handle at once. Hopefully all of my disks. :)
>
> Honestly, I'm not sure that I'd do that.
>
> If I thought it was a power issue - and that seems quite a reasonable
> possibility - I would replace the PSU first.
>
> What PSU are you using at the moment? Brand / wattage?
>
> I think these are probably overkill:
>
> http://www.bluepoint.net/AKAPSU066
> http://www.bluepoint.net/AKAPSU071
>
> However, if you're using a 450W PSU at the moment, you can get an OCZ
> branded 600W for less than £50. I tend to be suspicious of cheap unbranded
> and Wong Fu 350W - 450W PSUs. Often a 350W PSU from a manufacturer with a
> half-decent brand (eg Trust) will be better than a no-name 450W PSU.
>
> It seems important not only the actual wattage that the PSU gives out
> (measured with an analogue-needle multimeter) but also how stable that
> voltage is, how smooth, consistent and reliable it is. Unless your PSU is
> absolute top-notch quality, try to under-utilise it.
>
> With 6 x hard-drives, I would be generous with supplying power - I think a
> 600W PSU would easily be justified.
>
> Stroller.
>
I have a 750W Corsair 750TX, which should be plenty (if not overkill).
It seems to have all the bells and whistles (though I wish I had bought
the modular version). I definitely agree about the cheap no-name PSUs;
I've seen some USD$10 ones go up in smoke after just a few minutes of
use.
Based on the little info I could find by Googling, the most likely
explanations seem to be:
- IRQ handling weirdness (but that was more probable with older
  motherboards and kernels)
- ahci driver issues (I will reboot with the controllers in "legacy"
  mode to use the non-ahci drivers and see what happens)
- a bad drive, bad cable, bad controller, bad sector, or something
  else physically bad.
Overnight I copied several hundred gigs of data (and still going) and
the error has not been reproduced. Before installing, I did a full
SMART test (a few hours) and a badblocks read-only test (even more
hours) and neither showed any errors. I did not do a full read/write
test, though.
I haven't heard any unexpected noises coming from the drives (no
clicks of doom), though I have so many fans going it may be hard to
hear.
Thanks.
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-03-27 16:56 ` Paul Hartman
@ 2010-08-12 5:50 ` Paul Hartman
2010-08-13 7:11 ` Mick
0 siblings, 1 reply; 9+ messages in thread
From: Paul Hartman @ 2010-08-12 5:50 UTC (permalink / raw
To: gentoo-user
On Sat, Mar 27, 2010 at 11:56 AM, Paul Hartman
<paul.hartman+gentoo@gmail.com> wrote:
> On Fri, Mar 26, 2010 at 10:39 PM, Stroller
> <stroller@stellar.eclipse.co.uk> wrote:
>>
>> On 26 Mar 2010, at 22:21, Paul Hartman wrote:
>>>
>>> ...
>>> Well, I was thinking more about something like altering IOH/ICH
>>> voltage, or whichever voltage powers the SATA controllers. It has 12
>>> SATA headers on this motherboard but I don't know how much it can
>>> realistically handle at once. Hopefully all of my disks. :)
>>
>> Honestly, I'm not sure that I'd do that.
>>
>> If I thought it was a power issue - and that seems quite a reasonable
>> possibility - I would replace the PSU first.
>>
>> What PSU are you using at the moment? Brand / wattage?
>>
>> I think these are probably overkill:
>>
>> http://www.bluepoint.net/AKAPSU066
>> http://www.bluepoint.net/AKAPSU071
>>
>> However, if you're using a 450W PSU at the moment, you can get an OCZ
>> branded 600W for less than £50. I tend to be suspicious of cheap unbranded
>> and Wong Fu 350W - 450W PSUs. Often a 350W PSU from a manufacturer with a
>> half-decent brand (eg Trust) will be better than a no-name 450W PSU.
>>
>> It seems important not only the actual wattage that the PSU gives out
>> (measured with an analogue-needle multimeter) but also how stable that
>> voltage is, how smooth, consistent and reliable it is. Unless your PSU is
>> absolute top-notch quality, try to under-utilise it.
>>
>> With 6 x hard-drives, I would be generous with supplying power - I think a
>> 600W PSU would easily be justified.
>>
>> Stroller.
>>
>
> I have a 750W Corsair 750TX which should be plenty (if not overkill).
> Seems to have all the bells and whistles (though I wish I had bought
> the modular version). I definitely agree about the cheap no-name PSUs,
> I've seen some USD$10 ones go up in smoke after just a few minutes of
> use.
>
> Based on the little info I could find Googling it seems the most
> likely explanations are:
>
> IRQ handling weirdness (but that was more probable in older
> motherboards and kernels)
> ahci driver issues (I will reboot with the controllers in "legacy"
> mode to use the non-ahci drivers and see what happens)
> a bad drive, bad cable, bad controller, bad sector, something physically bad.
>
> Overnight I copied several hundred gigs of data (and still going) and
> the error has not been reproduced. Before installing, I did a full
> SMART test (a few hours) and a badblocks read-only test (even more
> hours) and neither showed any errors. I did not do a full read/write
> test, though.
>
> I haven't heard any unexpected noises coming from the drives (no
> clicks of doom), though I have so many fans going it may be hard to
> hear.
>
> Thanks.
>
I have nothing to add, but almost 5 months on from my original post I
just had this same error message for the second time. This time it was
on a different physical disk, which -- strangely -- makes me feel
better. Different kernel revision (2.6.35.1 now). Still don't know
what's the cause... it's an odd one that I'll be keeping my eye on.
[ 1435.938398] ata6.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
[ 1435.938403] ata6.00: failed command: READ FPDMA QUEUED
[ 1435.938409] ata6.00: cmd 60/e8:08:38:a5:86/01:00:01:00:00/40 tag 1 ncq 249856 in
[ 1435.938410] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
[ 1435.938412] ata6.00: status: { DRDY }
[ 1435.938416] ata6: hard resetting link
[ 1436.395618] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
[ 1436.407546] ata6.00: configured for UDMA/133
[ 1436.407552] ata6.00: device reported invalid CHS sector 0
[ 1436.407559] ata6: EH complete
* Re: [gentoo-user] Kernel2.6.33: ATA failed command: READ FPDMA QUEUED, hard resetting link
2010-08-12 5:50 ` Paul Hartman
@ 2010-08-13 7:11 ` Mick
0 siblings, 0 replies; 9+ messages in thread
From: Mick @ 2010-08-13 7:11 UTC (permalink / raw
To: gentoo-user
On Thursday 12 August 2010 06:50:22 Paul Hartman wrote:
> I have nothing to add, but almost 5 months on from my original post I
> just had this same error message for the second time. This time it was
> on a different physical disk, which -- strangely -- makes me feel
> better. Different kernel revision (2.6.35.1 now). Still don't know
> what's the cause... it's an odd one that I'll be keeping my eye on.
>
> [ 1435.938398] ata6.00: exception Emask 0x0 SAct 0x2 SErr 0x0 action 0x6 frozen
> [ 1435.938403] ata6.00: failed command: READ FPDMA QUEUED
> [ 1435.938409] ata6.00: cmd 60/e8:08:38:a5:86/01:00:01:00:00/40 tag 1 ncq 249856 in
> [ 1435.938410] res 40/00:00:00:00:00/00:00:00:00:00/00 Emask 0x4 (timeout)
> [ 1435.938412] ata6.00: status: { DRDY }
> [ 1435.938416] ata6: hard resetting link
> [ 1436.395618] ata6: SATA link up 3.0 Gbps (SStatus 123 SControl 300)
> [ 1436.407546] ata6.00: configured for UDMA/133
> [ 1436.407552] ata6.00: device reported invalid CHS sector 0
> [ 1436.407559] ata6: EH complete
Have you tried replacing the SATA cable, or at least removing and
reinserting it?
--
Regards,
Mick