public inbox for gentoo-embedded@lists.gentoo.org
* [gentoo-embedded] embedded ext2 and fsck
@ 2010-04-06 14:16 Relson, David
  2010-04-06 17:02 ` Ed W
  2010-04-06 21:20 ` Marcus Priesch
  0 siblings, 2 replies; 9+ messages in thread
From: Relson, David @ 2010-04-06 14:16 UTC
  To: gentoo-embedded

G'day,

My embedded environment is evolving.  The Disk-On-Module currently has
the following partitions:
	/dev/hda2 - /	- root (ext2)
	/dev/hda1 - /boot	- syslinux boot partition (FAT16)
	/dev/hda3 - /var	- ext2, rw

The system has a 486 and is running kernel 2.6.29.6.

Over the past month I've encountered numerous "Stale NFS file handle"
errors.  The device isn't networked and there's no apparent reason for
them (as best I can tell).

How important is running fsck in an embedded ext2 environment?

I'm considering 
	1) "fsck -C -T -a" on every boot
	2) letting fsck run according to the tune2fs count
	3) using "tune2fs -c 0" to disable checking totally
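
For reference on the options above: per the fsck man page, -C shows a
progress bar, -T suppresses the title line, and -a repairs automatically
without asking.  In tune2fs, the periodic checks are governed by the
maximum mount count (-c) and the check interval (-i).  If fsck does run
on every boot, its exit status is worth acting on, since it is a bitmask
rather than a plain pass/fail.  Below is a minimal Python sketch that
decodes it; the function names and the "stop the boot" policy are
assumptions made for illustration, not something from this thread.

```python
# Meaning of each bit in the fsck(8) exit status, as listed in the man page.
FSCK_BITS = {
    1: "filesystem errors corrected",
    2: "system should be rebooted",
    4: "filesystem errors left uncorrected",
    8: "operational error",
    16: "usage or syntax error",
    32: "checking canceled by user request",
    128: "shared-library error",
}


def decode_fsck_status(status):
    """Return the list of conditions encoded in an fsck exit status."""
    if status == 0:
        return ["no errors"]
    return [text for bit, text in sorted(FSCK_BITS.items()) if status & bit]


def needs_reboot(status):
    """True if fsck asked for a reboot (bit 2)."""
    return bool(status & 2)


def boot_should_stop(status):
    """Hypothetical policy: stop if errors remain or fsck itself failed."""
    return bool(status & (4 | 8))
```

A boot script could run the check, pass the exit status to
decode_fsck_status() for logging, and reboot or drop to a rescue mode
depending on the two helper functions.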

What do y'all do and recommend?

Thanks.

David




* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-06 14:16 [gentoo-embedded] embedded ext2 and fsck Relson, David
@ 2010-04-06 17:02 ` Ed W
  2010-04-06 21:20 ` Marcus Priesch
  1 sibling, 0 replies; 9+ messages in thread
From: Ed W @ 2010-04-06 17:02 UTC
  To: gentoo-embedded

On 06/04/2010 15:16, Relson, David wrote:
> Over the past month I've encountered numerous "Stale NFS file handle"
> errors.  The device isn't networked and there's no apparent reason for
> them (as best I can tell).
>    

I believe (someone please shoot me down) that errors of this type
indicate some on-disk corruption - I'm not sure why the message refers
to NFS, though.
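
On why the message mentions NFS: the text is simply what the C library
prints for the error code ESTALE, which was named with NFS in mind.  As
far as I can tell from the kernel source, the ext2 driver returns ESTALE
when a lookup lands on an inode that is marked as deleted, which is one
kind of on-disk inconsistency.  A small Python sketch that prints the
code and message on the local system (the helper name describe_errno is
made up for illustration, and the exact wording of the message varies
between C library versions):

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME (number): message' for an errno value."""
    name = errno.errorcode.get(code, "UNKNOWN")
    return "%s (%d): %s" % (name, code, os.strerror(code))


# Print what the local C library says for ESTALE.
print(describe_errno(errno.ESTALE))
```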

So I think you have the big problem here that the fsck adds a good chunk 
to the boot time, but disabling it leads to silent corruption and 
potentially big problems down the line... One compromise would perhaps 
be to split the device to read-only and read-write (which I think you 
may have done?) and this then perhaps allows you to split the disk into 
high risk and safe data... Now you can play roulette with less important 
stuff and use ext3, etc with the more important stuff?

Just an idea

Good luck

Ed W




* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-06 14:16 [gentoo-embedded] embedded ext2 and fsck Relson, David
  2010-04-06 17:02 ` Ed W
@ 2010-04-06 21:20 ` Marcus Priesch
  2010-04-07 14:27   ` Ed W
  2010-04-08 21:21   ` Janusz Syrytczyk
  1 sibling, 2 replies; 9+ messages in thread
From: Marcus Priesch @ 2010-04-06 21:20 UTC
  To: gentoo-embedded

Hi David,

On Tuesday, 06.04.2010, at 10:16 -0400, Relson, David wrote:
[...]
> Over the past month I've encountered numerous "Stale NFS file handle"
> errors.  The device isn't networked and there's no apparent reason for
> them (as best I can tell).

i had the same problems on a (i think it was ext2) root-filesystem which
was never written to, but was also not mounted ro. 

i thought that i would not need the fsck on boot up, because there are
no write accesses to the device ... 

however after some months of operation the device failed to boot with
exactly the same error message ... 

the reason i suspect was that due to power failures the ext2 got
inconsistent somehow ... which resulted in "stale NFS file handle
messages" ... not very intuitive ;)

i put in a check & repair of the filesystem (with -y) on every boot and
now those errors are gone ... 

i think the problem occurs when a not cleanly shut down ext2 fs gets
mounted over and over again ... and ... maybe something got written to
it without my knowing ... it was a mini-itx running gentoo
with /var (mostly) on another (rw) partition - but not very "embedded"
in terms of being stripped down ... ;)

hope this helps, 
marcus.





* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-06 21:20 ` Marcus Priesch
@ 2010-04-07 14:27   ` Ed W
  2010-04-09  0:15     ` Peter Stuge
  2010-04-08 21:21   ` Janusz Syrytczyk
  1 sibling, 1 reply; 9+ messages in thread
From: Ed W @ 2010-04-07 14:27 UTC
  To: gentoo-embedded

On 06/04/2010 22:20, Marcus Priesch wrote:
> however after some months of operation the device failed to boot with
> exactly the same error message ...
>
> the reason i suspect was that due to power failures the ext2 got
> inconsistent somehow ... which resulted in "stale NFS file handle
> messages" ... not very intuitive ;)
>    


It would be interesting to hear if these errors "go away" by switching 
to EXT3?

There seem to be several things happening here:

1) The CF card is quietly shuffling data around, so in theory it might 
move a good sector onto a patch of flash which is worn out, causing it 
to be corrupted on next read.  Similarly when you "write" the card does 
quite a lot of work in the background and theoretically if power was 
lost during the shuffling around of sectors this could also cause data loss?

2) Sudden shutdowns causing the ext2 to be marked dirty and causing 
subsequent problems (i.e. not fully read-only mounted)

To be honest, I don't know a lot about how ext2 is mounted read-only, 
but option 2) above seems unlikely...?

This suggests that there are real problems with CF cards getting old and 
the wear levelling causing data to be shuffled onto worn out sectors.  
And/or it may prove that the wear levelling causes corruption if power is 
removed during a write and sectors are only partly shuffled (which kind 
of makes sense).  Neither idea seems to be well discussed, and 
there is huge disagreement about the probable lifetimes of various flash 
devices.  Certainly I haven't ever had a bad device, so I have never 
really seen how they fail.  However, I have experienced weird 
corruption (on Windows!) with certain devices if I unplug them suddenly 
(i.e. they lose power suddenly) while they are writing - this could 
indicate that certain devices have poor implementations of wear levelling?

Interesting stuff... However, if switching to ext3 fixes things then 
this sounds like an OS issue and not a CF card issue?

Cheers

Ed W




* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-06 21:20 ` Marcus Priesch
  2010-04-07 14:27   ` Ed W
@ 2010-04-08 21:21   ` Janusz Syrytczyk
  1 sibling, 0 replies; 9+ messages in thread
From: Janusz Syrytczyk @ 2010-04-08 21:21 UTC
  To: gentoo-embedded

[-- Attachment #1: Type: Text/Plain, Size: 1740 bytes --]

> Hi David,
> 
> On Tuesday, 06.04.2010, at 10:16 -0400, Relson, David wrote:
> [...]
> 
> > Over the past month I've encountered numerous "Stale NFS file handle"
> > errors.  The device isn't networked and there's no apparent reason for
> > them (as best I can tell).

Stale NFS file handle, you say.... those meant my custom compilation passed away, as 
the filesystem became unusable. I've chopped the flash and bought a new one.

Until now, I've managed to kill several flashes. Apparently they're not as 
wear-proof as vendors say. This could be a coincidence, but all of them were 
Kingstons with lifetime warranty.

Can I include a screenshot here? I manage my Personal Collection of Failures (TM) 
and I've got a screenshot of this failure too :-))

[-- Attachment #2: flash-kaput.png --]
[-- Type: image/png, Size: 182385 bytes --]


* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-07 14:27   ` Ed W
@ 2010-04-09  0:15     ` Peter Stuge
  2010-04-09 12:24       ` Relson, David
  2010-04-09 12:28       ` Ed W
  0 siblings, 2 replies; 9+ messages in thread
From: Peter Stuge @ 2010-04-09  0:15 UTC
  To: gentoo-embedded

Relson, David wrote:
> My embedded environment is evolving.  The Disk-On-Module currently
> has the following partitions:
> 	/dev/hda2 - /	- root (ext2)
> 	/dev/hda1 - /boot	- syslinux boot partition (FAT16)
> 	/dev/hda3 - /var	- ext2, rw
..
> How important is running fsck in an embedded ext2 environment?

For read-only partitions on perfect media it is never needed.


> What do y'all do and recommend?

Since you are having problems related to writes, I would recommend
splitting things up so that you have one physical medium which is
exclusively read-only, and another physical medium which is
read-write. This is what I use for my customers.


Ed W wrote:
> 1) The CF card is quietly shuffling data around, so in theory it
> might move a good sector onto a patch of flash which is worn out,
> causing it to be corrupted on next read.

This will of course destroy a previously healthy ext2 fs.


> 2) Sudden shutdowns causing the ext2 to be marked dirty and causing
> subsequent problems (i.e. not fully read-only mounted)
>
> To be honest, I don't know a lot about how ext2 is mounted
> read-only, but option 2) above seems unlikely...?

If ext2 is mounted ro then it will never be written to by the kernel
and thus never corrupted by power failure.
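
One way to confirm that a partition really is mounted read-only is to
look at the options column of /proc/mounts.  This matters because, as
far as I know, an ext2 filesystem mounted read-write has its superblock
updated (mount count, state flag) even if no files are ever written.  A
minimal Python sketch; the sample text and the helper names are
assumptions made for illustration:

```python
def mount_options(mounts_text, mount_point):
    """Return the option list for mount_point from /proc/mounts-style text.

    Returns None if the mount point is not listed.  When a mount point
    appears more than once, the last entry wins, since later mounts
    cover earlier ones.
    """
    found = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[1] == mount_point:
            found = fields[3].split(",")
    return found


def is_read_only(mounts_text, mount_point):
    """True if mount_point is listed and carries the 'ro' option."""
    options = mount_options(mounts_text, mount_point)
    return options is not None and "ro" in options


# Illustrative sample only; on a live system read /proc/mounts instead.
SAMPLE = """\
/dev/hda2 / ext2 ro,errors=continue 0 0
/dev/hda3 /var ext2 rw,sync,errors=continue 0 0
"""

print(is_read_only(SAMPLE, "/"), is_read_only(SAMPLE, "/var"))
```

On the target itself, the text would come from
open("/proc/mounts").read() rather than from the sample string.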

Of course, if the media itself gets corrupted for whatever reason,
you lose anyway. Hence: use separate media.


//Peter




* RE: [gentoo-embedded] embedded ext2 and fsck
  2010-04-09  0:15     ` Peter Stuge
@ 2010-04-09 12:24       ` Relson, David
  2010-04-09 13:23         ` Karl Hiramoto
  2010-04-09 12:28       ` Ed W
  1 sibling, 1 reply; 9+ messages in thread
From: Relson, David @ 2010-04-09 12:24 UTC
  To: gentoo-embedded

<history>For those who didn't read previous related threads, the
underlying problem encountered is "Stale NFS file handle" errors
appearing for no obvious reason.  As best I recall, they have occurred
even when care has been taken to properly use halt, sync, shutdown, etc.
</history>

We're presently running with 3 partitions:

  /dev/hda1 - /boot FAT16,ro - syslinux boot partition
  /dev/hda2 - /     EXT2,ro  - linux system and application program
  /dev/hda3 - /var  EXT2,rw,sync - data partition

The program is calling sync() after every call to close().  This is
slow, but the number of open,write,close,sync cycles is 4 per minute, so
the slowness is livable.  Probably this redundant "belt and suspenders"
approach can be optimized to rw,async and sync().  An alternate idea is
to use FAT16 for the data partition (which would work fine because the
program has been ported from DOS and uses 8.3 filenames).
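
As a rough illustration of the alternative to a global sync() after
every close(), the sketch below flushes only the file that was just
written, plus its directory entry, using fsync().  It is written in
Python purely for brevity; the underlying POSIX calls are the same from
C.  The function name and the 8.3-style file name are made up, and
nothing here protects against the flash device itself misbehaving when
power is lost.

```python
import os
import tempfile


def write_and_flush(path, data):
    """Write data to path, then flush the file and its directory entry."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()                 # push Python's buffer to the kernel
        os.fsync(f.fileno())      # ask the kernel to push the file to the device
    dir_fd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dir_fd)          # flush the directory entry of a new file
    finally:
        os.close(dir_fd)


# Small demonstration in a temporary directory.
demo_dir = tempfile.mkdtemp()
demo_path = os.path.join(demo_dir, "SAMPLE01.DAT")
write_and_flush(demo_path, b"reading 1\n")
```

With the file flushed explicitly like this, mounting the data partition
with the sync option and also calling sync() would be the redundant
parts of the "belt and suspenders" arrangement.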

Regards,
 
David





* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-09  0:15     ` Peter Stuge
  2010-04-09 12:24       ` Relson, David
@ 2010-04-09 12:28       ` Ed W
  1 sibling, 0 replies; 9+ messages in thread
From: Ed W @ 2010-04-09 12:28 UTC
  To: gentoo-embedded


> Ed W wrote:
>    
>> 1) The CF card is quietly shuffling data around, so in theory it
>> might move a good sector onto a patch of flash which is worn out,
>> causing it to be corrupted on next read.
>>      
> This will of course destroy a previously healthy ext2 fs.
>
>
>    
>> 2) Sudden shutdowns causing the ext2 to be marked dirty and causing
>> subsequent problems (i.e. not fully read-only mounted)
>>
>> To be honest, I don't know a lot about how ext2 is mounted
>> read-only, but option 2) above seems unlikely...?
>>      
> If ext2 is mounted ro then it will never be written to by the kernel
> and thus never corrupted by power failure.
>    

Sure - but my theory was that badly implemented wear levelling + power 
failure during writes could perhaps cause data to be lost on a read-only 
partition when writing to another partition on the same media?

I have no basis for this claim, just pondering how wear levelling is 
actually implemented in a random off the shelf device...?

I agree that separate media is an excellent idea, but it's not always 
easy to achieve using off the shelf boards?

Anyway, curious to hear of anyone losing data on a read-only partition 
in a manner like the above.  It's perhaps only theoretical curiosity, 
but...

Cheers

Ed W




* Re: [gentoo-embedded] embedded ext2 and fsck
  2010-04-09 12:24       ` Relson, David
@ 2010-04-09 13:23         ` Karl Hiramoto
  0 siblings, 0 replies; 9+ messages in thread
From: Karl Hiramoto @ 2010-04-09 13:23 UTC
  To: gentoo-embedded

On 04/09/2010 02:24 PM, Relson, David wrote:
> We're presently running with 3 partitions:
>
>    /dev/hda1 - /boot FAT16,ro - syslinux boot partition
>    /dev/hda2 - /     EXT2,ro  - linux system and application program
>    /dev/hda3 - /var  EXT2,rw,sync - data partition
>
> The program is calling sync() after every call to close().  This is
> slow, but the number of open,write,close,sync cycles is 4 per minute, so
> the slowness is livable.  Probably this redundant "belt and suspenders"
> approach can be optimized to rw,async and sync().  An alternate idea is
> to use FAT16 for the data partition (which would work fine because the
> program has been ported from DOS and uses 8.3 filenames).
>
> Regards,
>    

I'm not sure what kind of lifetime you expect from your device, but if 
you want to maximize it, you should have the RW partition on a separate 
physical media.    Partitions don't really mean anything to the hardware 
wear leveling.
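
To make that point concrete, here is a toy Python model of a flash
translation layer with static wear leveling.  It is only a sketch under
invented assumptions (block counts, the threshold, the leveling rule);
real controllers are proprietary and may behave quite differently.  The
host only ever rewrites logical blocks 4..7 (standing in for the rw
partition), yet the device ends up physically relocating blocks 0..3
(standing in for the read-only partition) as well.

```python
class ToyFlash:
    """Toy flash translation layer with static wear leveling (illustrative)."""

    def __init__(self, logical_blocks, spare_blocks, threshold=4):
        total = logical_blocks + spare_blocks
        self.erase_count = [0] * total
        self.l2p = list(range(logical_blocks))           # logical -> physical
        self.free = list(range(logical_blocks, total))   # unused physical blocks
        self.threshold = threshold
        self.relocations = {}   # logical block -> times moved by the device

    def _take_free(self, most_worn=False):
        chooser = max if most_worn else min
        pick = chooser(self.free, key=lambda p: self.erase_count[p])
        self.free.remove(pick)
        return pick

    def _release(self, phys):
        self.erase_count[phys] += 1   # erasing is what wears a block out
        self.free.append(phys)

    def write(self, logical):
        """Host rewrites one logical block; data lands on a fresh block."""
        old = self.l2p[logical]
        self.l2p[logical] = self._take_free()
        self._release(old)
        self._level()

    def _level(self):
        """Move the coldest data onto a worn block to free a fresh one."""
        coldest = min(range(len(self.l2p)),
                      key=lambda lb: self.erase_count[self.l2p[lb]])
        cold_phys = self.l2p[coldest]
        if max(self.erase_count) - self.erase_count[cold_phys] < self.threshold:
            return
        self.l2p[coldest] = self._take_free(most_worn=True)
        self._release(cold_phys)
        self.relocations[coldest] = self.relocations.get(coldest, 0) + 1


flash = ToyFlash(logical_blocks=8, spare_blocks=2)
for i in range(200):
    flash.write(4 + i % 4)   # host writes only to the "rw partition"
print(flash.relocations)
```

The model says nothing about what happens if power is lost in the
middle of such a relocation; that depends entirely on the firmware.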

If your data media breaks or wears out you just swap it for a new one.

--
Karl



