[gentoo-user] Emerge --sync source

* [gentoo-user] Emerge --sync source
@ 2019-02-28  8:36 Peter Humphrey
  2019-02-28  8:43 ` Davyd McColl
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Humphrey @ 2019-02-28  8:36 UTC (permalink / raw
  To: gentoo-user

Hello list,

I have a little server box on my LAN, which I use as a git server. I'm having 
a bit of trouble with it pro tem so I decided to switch the git sync source on 
this box.

I removed the entry pointing to the local server in repos.conf/gentoo.conf and 
put in 'sync-uri = https://github.com/gentoo-mirror/gentoo.git'

Emerge --sync still insisted on going to the local server, which was not there 
so it stopped.

I had to remove /usr/portage/.git before the repos.conf/gentoo.conf entry was 
respected. And that meant stripping out the whole of /usr/portage and fetching 
the whole lot again.

Is this expected behaviour?

-- 
Regards,
Peter.


* Re: [gentoo-user] Emerge --sync source
  2019-02-28  8:36 [gentoo-user] Emerge --sync source Peter Humphrey
@ 2019-02-28  8:43 ` Davyd McColl
  2019-02-28 10:38   ` Nils Freydank
  2019-02-28 15:41   ` Peter Humphrey
  0 siblings, 2 replies; 16+ messages in thread
From: Davyd McColl @ 2019-02-28  8:43 UTC (permalink / raw
  To: gentoo-user

On 2019/02/28 10:36:35, Peter Humphrey <peter@prh.myzen.co.uk> wrote:
> Hello list,
>
> I have a little server box on my LAN, which I use as a git server. I'm having
> a bit of trouble with it pro tem so I decided to switch the git sync source on
> this box.
>
> I removed the entry pointing to the local server in repos.conf/gentoo.conf and
> put in 'sync-uri = https://github.com/gentoo-mirror/gentoo.git'
>
> Emerge --sync still insisted on going to the local server, which was not there
> so it stopped.
>
> I had to remove /usr/portage/.git before the repos.conf/gentoo.conf entry was
> respected. And that meant stripping out the whole of /usr/portage and fetching
> the whole lot again.

Well, that's pretty much how git works -- that local repo was still pointing to the old remote. Updating your repos.conf won't change that, as the old remote is stored in the config in the .git folder. However, if you need to do this again, you could:
1) change repos.conf (in case you ever wipe out /usr/portage again -- the URL there is only used for the initial clone)
2) in /usr/portage, run `git remote set-url origin <new-url>` -- this informs git of the change, and your next fetch should work as expected.

I guess emerge could check this and set it for the user, but currently it apparently doesn't.
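
The second step above can be wrapped in a small shell function. This is only a sketch: `switch_origin` is a made-up helper name (not part of portage or git), and it assumes the checkout has a remote called `origin` and a git new enough to have `git remote get-url`.

```shell
# Sketch: point an existing checkout at a new remote without re-cloning.
# switch_origin is a hypothetical helper name, not part of portage or git.
switch_origin() {
    repo_dir=$1   # e.g. /usr/portage
    new_url=$2    # e.g. the sync-uri you put in repos.conf

    # Show what the checkout currently points at.
    echo "old origin: $(git -C "$repo_dir" remote get-url origin)"

    # Rewrite the URL stored in $repo_dir/.git/config.
    git -C "$repo_dir" remote set-url origin "$new_url" || return 1

    echo "new origin: $(git -C "$repo_dir" remote get-url origin)"
}
```

Something like `switch_origin /usr/portage https://github.com/gentoo-mirror/gentoo.git` would then avoid wiping and re-fetching the tree.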

> Is this expected behaviour?
>
> --
> Regards,
> Peter.


* Re: [gentoo-user] Emerge --sync source
  2019-02-28  8:43 ` Davyd McColl
@ 2019-02-28 10:38   ` Nils Freydank
  2019-02-28 15:41   ` Peter Humphrey
  1 sibling, 0 replies; 16+ messages in thread
From: Nils Freydank @ 2019-02-28 10:38 UTC (permalink / raw
  To: gentoo-user

I filed a bug report https://bugs.gentoo.org/679040.

Yes, currently you need to update your git config manually every time you 
change your git remote.


* Re: [gentoo-user] Emerge --sync source
  2019-02-28  8:43 ` Davyd McColl
  2019-02-28 10:38   ` Nils Freydank
@ 2019-02-28 15:41   ` Peter Humphrey
  2019-02-28 15:47     ` Rich Freeman
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Humphrey @ 2019-02-28 15:41 UTC (permalink / raw
  To: gentoo-user

On Thursday, 28 February 2019 08:43:13 GMT Davyd McColl wrote:
> > On 2019/02/28 10:36:35, Peter Humphrey <peter@prh.myzen.co.uk> wrote:

> > I have a little server box on my LAN, which I use as a git server. I'm
> > having a bit of trouble with it pro tem so I decided to switch the git
> > sync source on this box.
> > 
> > I removed the entry pointing to the local server in repos.conf/gentoo.conf
> > and put in 'sync-uri = https://github.com/gentoo-mirror/gentoo.git'
> > 
> > Emerge --sync still insisted on going to the local server, which was not
> > there so it stopped.
> > 
> > I had to remove /usr/portage/.git before the repos.conf/gentoo.conf entry
> > was respected. And that meant stripping out the whole of /usr/portage and
> > fetching the whole lot again.

> Well, that's pretty-much how git works -- that local repo was still pointing
> to the old remote. Updating your repos.conf won't change that as the old
> remote is stored in config in the .git folder.

OK. It'd be helpful if the handbook said that, or somewhere else in the docs. 
Without that, the clear impression is that repos.conf is the place to specify 
the remote source.

> However, if you need to do this again, you could: 1) change repos.conf (in
> case you ever wipe out /usr/portage again -- the url there is only used for
> initial clone) 2) in /usr/portage, run `git remote set-url origin <new-url>`
> -- this informs git of the change, and your next fetch should work as
> expected.

Useful tip - thanks.

> I guess emerge could check this and set it for the user, but currently, it
> apparently doesn't.

Good idea. I hope a suitable developer is listening...

-- 
Regards,
Peter.






* Re: [gentoo-user] Emerge --sync source
  2019-02-28 15:41   ` Peter Humphrey
@ 2019-02-28 15:47     ` Rich Freeman
  2019-03-01 10:12       ` Peter Humphrey
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Freeman @ 2019-02-28 15:47 UTC (permalink / raw
  To: gentoo-user

On Thu, Feb 28, 2019 at 10:41 AM Peter Humphrey <peter@prh.myzen.co.uk> wrote:
>
> On Thursday, 28 February 2019 08:43:13 GMT Davyd McColl wrote:
>
> > Well, that's pretty-much how git works -- that local repo was still pointing
> > to the old remote. Updating your repos.conf won't change that as the old
> > remote is stored in config in the .git folder.
>
> OK. It'd be helpful if the handbook said that, or somewhere else in the docs.
> Without that, the clear impression is that repos.conf is the place to specify
> the remote source.

If you're going to migrate it in-place you really should set it in
both places.  Otherwise you'll end up with a surprise if you remove
/usr/portage.

In general it is usually simplest to just remove /usr/portage anytime
you change the sync settings.  At least until portage gets smarter
about it.
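
To see whether the two places agree before syncing, something along these lines may help. It is a sketch under assumptions: `sync_uri_matches` is a hypothetical helper, the file is assumed to contain a single `sync-uri = ...` line, and the checkout is assumed to use a remote named `origin`.

```shell
# Sketch: warn when repos.conf and the checkout disagree about the remote.
# sync_uri_matches is a hypothetical helper; the sed pattern assumes a single
# "sync-uri = ..." line in the file.
sync_uri_matches() {
    conf_file=$1   # e.g. /etc/portage/repos.conf/gentoo.conf
    repo_dir=$2    # e.g. /usr/portage

    # Value of sync-uri, with surrounding whitespace stripped.
    want=$(sed -n 's/^[[:space:]]*sync-uri[[:space:]]*=[[:space:]]*\(.*[^[:space:]]\)[[:space:]]*$/\1/p' "$conf_file" | head -n 1)

    # URL recorded in the checkout's own .git/config.
    have=$(git -C "$repo_dir" remote get-url origin)

    if [ "$want" = "$have" ]; then
        echo "in sync: $have"
    else
        echo "mismatch: repos.conf has '$want', checkout has '$have'"
        return 1
    fi
}
```

A non-zero return means the checkout still points somewhere other than what repos.conf says.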

-- 
Rich



* Re: [gentoo-user] Emerge --sync source
  2019-02-28 15:47     ` Rich Freeman
@ 2019-03-01 10:12       ` Peter Humphrey
  2019-03-06 16:31         ` Laurence Perkins
  0 siblings, 1 reply; 16+ messages in thread
From: Peter Humphrey @ 2019-03-01 10:12 UTC (permalink / raw
  To: gentoo-user

On Thursday, 28 February 2019 15:47:41 GMT Rich Freeman wrote:

> In general it is usually simplest to just remove /usr/portage anytime
> you change the sync settings.  At least until portage gets smarter
> about it.

That works well on a sufficiently powerful box; it only took - oh, I don't 
know - maybe a couple of minutes on this workstation. On my little Atom box, 
though, it takes 75 minutes.

[OT]
Evidence is mounting that the Atom box is in terminal decline. I get things 
like batches of files in the portage tree changing owner, and then when I 
correct that, long lists of supposedly locally changed ebuilds preventing 
syncing. And when I boot weekly into its little rescue system to backup the 
main system, the root filesystem remounts itself read-only while tar is 
running. Smartd recognises the SSD and runs daily tests, but reports no 
errors. No amount of wiping and reinstalling has helped so far.

-- 
Regards,
Peter.


* Re: [gentoo-user] Emerge --sync source
  2019-03-01 10:12       ` Peter Humphrey
@ 2019-03-06 16:31         ` Laurence Perkins
  2019-03-06 16:51           ` Rich Freeman
  2019-03-07 10:10           ` Peter Humphrey
  0 siblings, 2 replies; 16+ messages in thread
From: Laurence Perkins @ 2019-03-06 16:31 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org

On Fri, 2019-03-01 at 10:12 +0000, Peter Humphrey wrote:
> On Thursday, 28 February 2019 15:47:41 GMT Rich Freeman wrote:
> 
> > In general it is usually simplest to just remove /usr/portage
> > anytime
> > you change the sync settings.  At least until portage gets smarter
> > about it.
> 
> That works well on a sufficiently powerful box; it only took - oh, I
> don't 
> know - maybe a couple of minutes on this workstation. On my little
> Atom box, 
> though, it takes 75 minutes.
> 
> [OT]
> Evidence is mounting that the Atom box is in terminal decline. I get
> things 
> like batches of files in the portage tree changing owner, and then
> when I 
> correct that, long lists of supposedly locally changed ebuilds
> preventing 
> syncing. And when I boot weekly into its little rescue system to
> backup the 
> main system, the root filesystem remounts itself read-only while tar
> is 
> running. Smartd recognises the SSD and runs daily tests, but reports
> no 
> errors. No amount of wiping and reinstalling has helped so far.
> 
What filesystem are you running and how old is the SSD?  That sounds
like some of the symptoms EXT4 had on early generation flash media
where its assumptions about what order writes would physically make it
to the disk in were wrong, leading to corruption.  So unless it was
working correctly at some point in the past, try a different
filesystem.  EXT3 or BTRFS didn't have the same problems.

If it's just that the SSD is failing, then get a new one before
something important gets damaged and you have to redo the whole thing. 
The self-test capability of storage media is almost universally
horrible and you generally don't get a failure report until your data
has already been lost.  If your SMART output gives you the raw
statistics on the device instead of just pass/fail then analyzing that
usually gives a better indication of whether something is about to go
wrong.
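
For drives that do expose raw statistics, `smartctl -A` (from smartmontools) prints the attribute table, while `smartctl -H` gives only the overall pass/fail verdict. The sketch below trims that table down to attribute name and raw value. `smart_raw_values` is a hypothetical helper, and it assumes the usual ATA table layout in which the first column is a numeric ID and the tenth is RAW_VALUE; drives that report in a different format will need a different filter.

```shell
# Sketch: pull attribute names and raw values out of `smartctl -A` output.
# smart_raw_values is a hypothetical helper; it assumes the ATA attribute
# table layout, where column 1 is a numeric ID and column 10 is RAW_VALUE.
smart_raw_values() {
    awk '$1 ~ /^[0-9]+$/ && NF >= 10 { print $2, $10 }'
}

# Usage (needs root and smartmontools; /dev/sda is a placeholder):
#   smartctl -A /dev/sda | smart_raw_values
```

Watching how counters such as reallocated or pending sectors change between runs is usually more telling than any single reading.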

LMP


* Re: [gentoo-user] Emerge --sync source
  2019-03-06 16:31         ` Laurence Perkins
@ 2019-03-06 16:51           ` Rich Freeman
  2019-03-06 17:39             ` Alan Mackenzie
  2019-03-07 10:10           ` Peter Humphrey
  1 sibling, 1 reply; 16+ messages in thread
From: Rich Freeman @ 2019-03-06 16:51 UTC (permalink / raw
  To: gentoo-user

On Wed, Mar 6, 2019 at 11:31 AM Laurence Perkins <lperkins@openeye.net> wrote:
>
> If it's just that the SSD is failing, then get a new one before
> something important gets damaged and you have to redo the whole thing.

IMO any kind of storage device should be treated as if it could fail
at any time without warning.  You should have a plan for what you will
do WHEN this happens, not IF it happens.

If losing a storage device would result in you losing "something
important" then you're doing it wrong.

I keep all my spinning disks in some kind of RAID unless their
contents are completely expendable (ie I won't be upset if I
completely lose it).  For SSDs I generally do frequent rsync or
zfs-send backups to a spinning disk - these are generally used for OS
data which doesn't change as much anyway, and the backups are quick
since they aren't large.  If I had large SSDs I'd run them in some
sort of RAID.

And of course anything I consider really important gets backed up to
the cloud, encrypted.  RAID is more about avoiding downtime and the
inconvenience of an offline restore.

-- 
Rich


* Re: [gentoo-user] Emerge --sync source
  2019-03-06 16:51           ` Rich Freeman
@ 2019-03-06 17:39             ` Alan Mackenzie
  2019-03-06 19:11               ` Wols Lists
  0 siblings, 1 reply; 16+ messages in thread
From: Alan Mackenzie @ 2019-03-06 17:39 UTC (permalink / raw
  To: gentoo-user

Hello, Rich.

I'd like to say hello again to everybody, just to mark that I'm still
here and still using Gentoo, and thank people for (a lot of) help
rendered in the past.  My system, used mainly for SW development, has
been stable and well behaved, with very occasional exceptions, for some
while now.

On Wed, Mar 06, 2019 at 11:51:47 -0500, Rich Freeman wrote:
> On Wed, Mar 6, 2019 at 11:31 AM Laurence Perkins <lperkins@openeye.net> wrote:

> > If it's just that the SSD is failing, then get a new one before
> > something important gets damaged and you have to redo the whole thing.

> IMO any kind of storage device should be treated as if it could fail
> at any time without warning.  You should have a plan for what you will
> do WHEN this happens, not IF it happens.

> If losing a storage device would result in you losing "something
> important" then you're doing it wrong.

> I keep all my spinning disks in some kind of RAID unless their
> contents are completely expendable (ie I won't be upset if I
> completely lose it).  For SSDs I generally do frequent rsync or
> zfs-send backups to a spinning disk - these are generally used for OS
> data which doesn't change as much anyway, and the backups are quick
> since they aren't large.  If I had large SSDs I'd run them in some
> sort of RAID.

> And of course anything I consider really important gets backed up to
> the cloud, encrypted.  RAID is more about avoiding downtime and the
> inconvenience of an offline restore.

On my new box, from 2017-04, I don't have any spinning disks (other than
the DVD burner).  I just have a pair of NVMe SSDs in a RAID 1
configuration, with everything bar /boot mirrored.

I back up once a week to one of two DVD+RWs (alternately), and encrypted
to a "cloud" service (what used to be known as a "computer bureau").

Up to now, I've never had a HDD or SSD fail on me.  :-)  I hope that
when this does eventually happen, I'll be prepared.

> -- 
> Rich

-- 
Alan Mackenzie (Nuremberg, Germany).


* Re: [gentoo-user] Emerge --sync source
  2019-03-06 17:39             ` Alan Mackenzie
@ 2019-03-06 19:11               ` Wols Lists
  0 siblings, 0 replies; 16+ messages in thread
From: Wols Lists @ 2019-03-06 19:11 UTC (permalink / raw
  To: gentoo-user

On 06/03/19 17:39, Alan Mackenzie wrote:
> Up to now, I've never had a HDD or SSD fail on me.  :-)  I hope that
> when this does eventually happen, I'll be prepared.

I don't think I've had one of mine fail. I have, however, done recovery
jobs on two drives that did fail that I managed to revive long enough to
get the data off.

And I currently have a second drive that is properly dead, whose owner
has asked me to destroy it to make sure that nothing can be recovered
off it (the first drive I had in that state was a 60 *G*B drive, which
tells you that it was a long time ago).

Drives are reliable. Drives do last a long time, and I think many drives
have been upgraded before they failed. But nowadays, drives are so large
that people don't fill them up and upgrade, so they are used a lot
longer, and you're seeing them fail more often. I think I've handled
five dead drives (friends and acquaintances) in my career and I'm sure
others have seen a lot more.

Cheers,
Wol


* Re: [gentoo-user] Emerge --sync source
  2019-03-06 16:31         ` Laurence Perkins
  2019-03-06 16:51           ` Rich Freeman
@ 2019-03-07 10:10           ` Peter Humphrey
  2019-03-07 11:32             ` Mick
  1 sibling, 1 reply; 16+ messages in thread
From: Peter Humphrey @ 2019-03-07 10:10 UTC (permalink / raw
  To: gentoo-user

On Wednesday, 6 March 2019 16:31:27 GMT Laurence Perkins wrote:
> On Fri, 2019-03-01 at 10:12 +0000, Peter Humphrey wrote:

> > [OT]
> > Evidence is mounting that the Atom box is in terminal decline. I get
> > things like batches of files in the portage tree changing owner, and then
> > when I correct that, long lists of supposedly locally changed ebuilds
> > preventing syncing. And when I boot weekly into its little rescue system
> > to backup the main system, the root filesystem remounts itself read-only
> > while tar is running. Smartd recognises the SSD and runs daily tests, but
> > reports no errors. No amount of wiping and reinstalling has helped so far.
> 
> What filesystem are you running and how old is the SSD?  That sounds
> like some of the symptoms EXT4 had on early generation flash media
> where its assumptions about what order writes would physically make it
> to the disk in were wrong, leading to corruption. 

The disk is a 64GB SanDisk SDSSDP device, which I bought five years ago to 
replace a failed spinning disk. All partitions are ext4 except /boot, which is 
ext2.

> So unless it was working correctly at some point in the past, try a
> different filesystem.  EXT3 or BTRFS didn't have the same problems.

It was working just fine until recently.

> If it's just that the SSD is failing, then get a new one before
> something important gets damaged and you have to redo the whole thing.

Everything on it is disposable.

The box is getting a bit long in the tooth: I bought it in November 2010. It's 
a single-core, 32-bit Atom N270 (not N2700). It doesn't owe me anything now, 
in spite of having cost £450 at the time. I don't know whether it's worth 
throwing any more money at it. On the other hand, I see Amazon are only asking 
for £20 for a small SSD.

The repeatability of some of the errors it throws makes me question whether 
the disk or something else is at fault. (What would cause a file system to be 
remounted read-only in the middle of its work?)
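
One way to narrow this down is to note which filesystems are read-only at the moment it happens and then look in the kernel log for the reason. A minimal sketch, assuming a Linux /proc/mounts; `ro_mounts` is a hypothetical helper name, and its optional argument exists only so it can read a saved copy of the mount table:

```shell
# Sketch: list mount points currently mounted read-only.
# ro_mounts is a hypothetical helper; the optional argument lets it read a
# saved copy of the mount table instead of /proc/mounts.
ro_mounts() {
    awk '{
        # Field 4 holds the comma-separated mount options.
        n = split($4, opts, ",")
        for (i = 1; i <= n; i++)
            if (opts[i] == "ro") { print $2; break }
    }' "${1:-/proc/mounts}"
}
```

If the root filesystem shows up in that list after a backup run, the `dmesg` output from around that time is probably the next thing to read.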

I have a spare four-core, 64-bit Celeron box (I bought it for a purpose that's 
gone away). I've been wondering what to do with it, so maybe it can replace 
the Atom box. It's powerful enough to compile its own software, whereas the 
Atom needs help. Whichever I use, its job will be as a server of DNS, LAN 
mail, time and git. Maybe print too. Also it will fetch my ISP's POP mail and 
serve it over IMAP to this box.

> The self-test capability of storage media is almost universally
> horrible and you generally don't get a failure report until your data
> has already been lost.  If your SMART output gives you the raw
> statistics on the device instead of just pass/fail then analyzing that
> usually gives a better indication of whether something is about to go
> wrong.

It seems to report only pass/fail, so that's not much help.

Decisions, decisions...

-- 
Regards,
Peter.


* Re: [gentoo-user] Emerge --sync source
  2019-03-07 10:10           ` Peter Humphrey
@ 2019-03-07 11:32             ` Mick
  2019-03-07 14:29               ` [gentoo-user] " Grant Edwards
  2019-03-07 15:37               ` [gentoo-user] " Dale
  0 siblings, 2 replies; 16+ messages in thread
From: Mick @ 2019-03-07 11:32 UTC (permalink / raw
  To: gentoo-user

On Thursday, 7 March 2019 10:10:53 GMT Peter Humphrey wrote:
> On Wednesday, 6 March 2019 16:31:27 GMT Laurence Perkins wrote:
> > On Fri, 2019-03-01 at 10:12 +0000, Peter Humphrey wrote:
> > > [OT]
> > > Evidence is mounting that the Atom box is in terminal decline. I get
> > > things like batches of files in the portage tree changing owner, and
> > > then
> > > when I correct that, long lists of supposedly locally changed ebuilds
> > > preventing syncing. And when I boot weekly into its little rescue system
> > > to backup the main system, the root filesystem remounts itself read-only
> > > while tar is running. Smartd recognises the SSD and runs daily tests,
> > > but
> > > reports no errors. No amount of wiping and reinstalling has helped so
> > > far.
> > 
> > What filesystem are you running and how old is the SSD?  That sounds
> > like some of the symptoms EXT4 had on early generation flash media
> > where its assumptions about what order writes would physically make it
> > to the disk in were wrong, leading to corruption.
> 
> The disk is a 64GB SanDisk SDSSDP device, which I bought five years ago to
> replace a failed spinning disk. All partitions are ext4 except /boot, which
> is ext2.
> 
> > So unless it was working correctly at some point in the past, try a
> > different filesystem.  EXT3 or BTRFS didn't have the same problems.
> 
> It was working just fine until recently.
> 
> > If it's just that the SSD is failing, then get a new one before
> > something important gets damaged and you have to redo the whole thing.
> 
> Everything on it is disposable.
> 
> The box is getting a bit long in the tooth: I bought it in November 2010.
> It's a single-core, 32-bit Atom N270 (not N2700). It doesn't owe me
> anything now, in spite of having cost £450 at the time. I don't know
> whether it's worth throwing any more money at it. On the other hand, I see
> Amazon are only asking for £20 for a small SSD.
> 
> The repeatability of some of the errors it throws makes me question whether
> the disk or something else is at fault. (What would cause a file system to
> be remounted read-only in the middle of its work?)

I can think of 3 things, but more learned M/L contributors may add to these:

1. The SATA connection has come loose.  With time and movement it can come 
(slightly) adrift.  Pushing it back in fully fixes this problem - also see No.
2 below.

2. The physical connector's contacts are beginning to oxidise.  Reseat the 
SATA cable connectors both on the drive and any ribbons on the MoBo.  This 
usually cleans any oxidisation.

3. The AHCI driver is deploying energy saving measures (aka. Aggressive Link 
Power Management - ALPM).  Check the output of:

 cat /sys/class/scsi_host/host*/link_power_management_policy

If it doesn't say 'max_performance' you'll need to revisit your BIOS settings 
and also PCIEASPM settings in the kernel.

4. Finally, there is a chance the PSU is playing up.


1 & 2 above are more noticeable on spinning disks, but it is a matter of scale 
before SSDs are affected too.  If BIOS, kernel settings and drivers were not 
altered recently, then 1 & 2 merit attention in the first instance.
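
The check in item 3 above can be scripted so that only the hosts needing attention are printed. This is a sketch: `check_alpm` is a hypothetical helper, and the optional directory argument is an assumption added purely so the function can be tried against a scratch directory instead of /sys.

```shell
# Sketch: list SATA hosts whose link power policy is not max_performance.
# check_alpm is a hypothetical helper; the optional argument exists only so
# the function can be pointed at a test directory instead of /sys.
check_alpm() {
    base=${1:-/sys/class/scsi_host}
    for f in "$base"/host*/link_power_management_policy; do
        [ -r "$f" ] || continue            # no readable policy file here
        policy=$(cat "$f")
        if [ "$policy" != "max_performance" ]; then
            echo "${f%/link_power_management_policy}: $policy"
        fi
    done
}
```

No output means every host that exposes the setting already reports max_performance.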


> I have a spare four-core, 64-bit Celeron box (I bought it for a purpose
> that's gone away). I've been wondering what to do with it, so maybe it can
> replace the Atom box. It's powerful enough to compile its own software,
> whereas the Atom needs help. Whichever I use, its job will be as a server
> of DNS, LAN mail, time and git. Maybe print too. Also it will fetch my
> ISP's POP mail and serve it over IMAP to this box.
> 
> > The self-test capability of storage media is almost universally
> > horrible and you generally don't get a failure report until your data
> > has already been lost.  If your SMART output gives you the raw
> > statistics on the device instead of just pass/fail then analyzing that
> > usually gives a better indication of whether something is about to go
> > wrong.
> 
> It seems to report only pass/fail, so that's not much help.
> 
> Decisions, decisions...

Do short/long smartctl tests report no errors, assuming the disk comes with 
S.M.A.R.T. capability?

-- 
Regards,
Mick


* [gentoo-user] Re: Emerge --sync source
  2019-03-07 11:32             ` Mick
@ 2019-03-07 14:29               ` Grant Edwards
  2019-03-07 14:45                 ` Rich Freeman
  2019-03-07 15:37               ` [gentoo-user] " Dale
  1 sibling, 1 reply; 16+ messages in thread
From: Grant Edwards @ 2019-03-07 14:29 UTC (permalink / raw
  To: gentoo-user

On 2019-03-07, Mick <michaelkintzios@gmail.com> wrote:

> I can think of 3 things, but more learned M/L contributors may add to these:
>
> 1. The SATA connection has come loose.  With time and movement it can come 
> (slightly) adrift.  Pushing it back in fully fixes this problem - also see No.
> 2 below.
>
> 2. The physical connector's contacts are beginning to oxidise.  Reseat the 
> SATA cable connectors both on the drive and any ribbons on the MoBo.  This 
> usually cleans any oxidisation.
>
> 3. The AHCI driver is deploying energy saving measures (aka. Aggressive Link 
> Power Management - ALPM).  Check the output of:
>
>  cat /sys/class/scsi_host/host*/link_power_management_policy
>
> If it doesn't say 'max_performance' you'll need to revisit your BIOS settings 
> and also PCIEASPM settings in the kernel.
>
> 4. Finally, there is a chance the PSU is playing up.

Perhaps it's already been mentioned, but failing RAM can cause all
sorts of failures that might appear to be failing disks, failing network
cards, failing video cards, whatever.  I'd run memtest86 for at least
12 hours just to make sure...

-- 
Grant Edwards               grant.b.edwards        Yow! Well, O.K.
                                  at               I'll compromise with my
                              gmail.com            principles because of
                                                   EXISTENTIAL DESPAIR!




* Re: [gentoo-user] Re: Emerge --sync source
  2019-03-07 14:29               ` [gentoo-user] " Grant Edwards
@ 2019-03-07 14:45                 ` Rich Freeman
  2019-03-07 15:11                   ` Mick
  0 siblings, 1 reply; 16+ messages in thread
From: Rich Freeman @ 2019-03-07 14:45 UTC (permalink / raw
  To: gentoo-user

On Thu, Mar 7, 2019 at 9:29 AM Grant Edwards <grant.b.edwards@gmail.com> wrote:
>
> On 2019-03-07, Mick <michaelkintzios@gmail.com> wrote:
>
> > I can think of 3 things, but more learned M/L contributors may add to these:
> >
> > 1. The SATA connection has come loose.  With time and movement it can come
> > (slightly) adrift.  Pushing it back in fully fixes this problem - also see No.
> > 2 below.
> >
> > 2. The physical connector's contacts are beginning to oxidise.  Reseat the
> > SATA cable connectors both on the drive and any ribbons on the MoBo.  This
> > usually cleans any oxidisation.
> >
> > 3. The AHCI driver is deploying energy saving measures (aka. Aggressive Link
> > Power Management - ALPM).  Check the output of:
> >
> >  cat /sys/class/scsi_host/host*/link_power_management_policy
> >
> > If it doesn't say 'max_performance' you'll need to revisit your BIOS settings
> > and also PCIEASPM settings in the kernel.
> >
> > 4. Finally, there is a chance the PSU is playing up.
>
> Perhaps it's already been mentioned, but failing RAM can cause all
> sorts of failures that might appear to be failing disks, failing network
> cards, failing video cards, whatever.  I'd run memtest86 for at least
> 12 hours just to make sure...
>

Failing RAM or failing power certainly can cause all manner of
filesystem and other corruption.  I've seen it firsthand and cleaning
up from it is a total mess (usually best to restore from backup).  I
would definitely start with a memory test - if the motherboard is good
then you can work outwards from there.

From what I've heard SSDs can have bizarre failure modes since they
interpose a logical layer between the physical storage media and the
rest of the system.  They're doing wear-leveling and so on behind the
scenes, which means that if something goes wrong all kinds of bizarre
problems can occur.

I've also experienced a spinning hard drive exhibit lots of data
corruption issues due to a faulty SATA interface (not sure where in
the interface it was - chipset, port, or cable).  ZFS saved me there with
detection and resolution of errors, and when I moved the drive to a
different HBA it worked fine after a scrub.  I'd never seen anything
like it before but it really made me appreciate ZFS (btrfs should have
also worked) - I don't think mdadm would have had any way to resolve
these errors easily, though maybe if I used a hex editor to figure out
which drive was the bad one I might have been able to move it, wipe
it, then re-add it to the mirror pair and let it rebuild.  With ZFS I
just got an email complaining about errors from zed and it just kept
beating back the hordes until I fixed the connection.  I forget if it
dropped the drive or not - I didn't have any spares but if I did I
suspect it would have swapped it in after enough problems.

-- 
Rich


* Re: [gentoo-user] Re: Emerge --sync source
  2019-03-07 14:45                 ` Rich Freeman
@ 2019-03-07 15:11                   ` Mick
  0 siblings, 0 replies; 16+ messages in thread
From: Mick @ 2019-03-07 15:11 UTC (permalink / raw
  To: gentoo-user

On Thursday, 7 March 2019 14:45:31 GMT Rich Freeman wrote:
> On Thu, Mar 7, 2019 at 9:29 AM Grant Edwards <grant.b.edwards@gmail.com> wrote:
> > On 2019-03-07, Mick <michaelkintzios@gmail.com> wrote:
> > > I can think of 3 things, but more learned M/L contributors may add to
> > > these:
> > > 
> > > 1. The SATA connection has come loose.  With time and movement it can
> > > come
> > > (slightly) adrift.  Pushing it back in fully fixes this problem - also
> > > see No. 2 below.
> > > 
> > > 2. The physical connector's contacts are beginning to oxidise.  Reseat
> > > the
> > > SATA cable connectors both on the drive and any ribbons on the MoBo. 
> > > This
> > > usually cleans any oxidisation.
> > > 
> > > 3. The AHCI driver is deploying energy saving measures (aka. Aggressive
> > > Link Power Management - ALPM).  Check the output of:
> > >
> > >  cat /sys/class/scsi_host/host*/link_power_management_policy
> > > 
> > > If it doesn't say 'max_performance' you'll need to revisit your BIOS
> > > settings and also PCIEASPM settings in the kernel.
> > > 
> > > 4. Finally, there is a chance the PSU is playing up.
> > 
> > Perhaps it's already been mentioned, but failing RAM can cause all
> > sorts of failures that might appear to be failing disks, failing network
> > cards, failing video cards, whatever.  I'd run memtest86 for at least
> > 12 hours just to make sure...
> 
> Failing RAM or failing power certainly can cause all manner of
> filesystem and other corruption.  I've seen it firsthand and cleaning
> up from it is a total mess (usually best to restore from backup).  I
> would definitely start with a memory test - if the motherboard is good
> then you can work outwards from there.
> 
> From what I've heard SSDs can have bizarre failure modes since they
> interpose a logical layer between the physical storage media and the
> rest of the system.  They're doing wear-leveling and so on behind the
> scenes, which means that if something goes wrong all kinds of bizarre
> problems can occur.
> 
> I've also experienced a spinning hard drive exhibit lots of data
> corruption issues due to a faulty SATA interface (not sure where in
> the interface it was - chipset, port, or cable).  ZFS saved me there with
> detection and resolution of errors, and when I moved the drive to a
> different HBA it worked fine after a scrub.  I'd never seen anything
> like it before but it really made me appreciate ZFS (btrfs should have
> also worked) - I don't think mdadm would have had any way to resolve
> these errors easily, though maybe if I used a hex editor to figure out
> which drive was the bad one I might have been able to move it, wipe
> it, then re-add it to the mirror pair and let it rebuild.  With ZFS I
> just got an email complaining about errors from zed and it just kept
> beating back the hordes until I fixed the connection.  I forget if it
> dropped the drive or not - I didn't have any spares but if I did I
> suspect it would have swapped it in after enough problems.

Good points raised re. faulty memory.  Oxidisation can also occur on RAM 
modules' contacts and reseating them works well.  However, I can't recall the 
OP mentioning corrupt data, which is usually the first thing observed with 
faulty memory.

-- 
Regards,
Mick


* Re: [gentoo-user] Emerge --sync source
  2019-03-07 11:32             ` Mick
  2019-03-07 14:29               ` [gentoo-user] " Grant Edwards
@ 2019-03-07 15:37               ` Dale
  1 sibling, 0 replies; 16+ messages in thread
From: Dale @ 2019-03-07 15:37 UTC (permalink / raw
  To: gentoo-user

Mick wrote:
>
> I can think of 3 things, but more learned M/L contributors may add to these:
>
> 1. The SATA connection has come loose.  With time and movement it can come 
> (slightly) adrift.  Pushing it back in fully fixes this problem - also see No.
> 2 below.
>
> 2. The physical connector's contacts are beginning to oxidise.  Reseat the 
> SATA cable connectors both on the drive and any ribbons on the MoBo.  This 
> usually cleans any oxidisation.
>


I recently had to replace a SATA cable because it was causing errors on
a drive.  I tried reseating it because that usually works but in that
case, it must have been a bad wire somewhere inside the cable.  Maybe at
some point it was bent around too much or something and was weak or
almost broken.  Once I replaced the cable, the drive started working
correctly. 

I mention that to say this.  Just try another cable even if only
temporarily if you can.  It's one sure way to know that isn't the
problem at least. 

Dale

:-)  :-)

