public inbox for gentoo-user@lists.gentoo.org
* [gentoo-user] Root on NFS Suspend/Resume support
@ 2018-12-11  3:03 Tsukasa Mcp_Reznor
  2018-12-11  3:14 ` Grant Taylor
  0 siblings, 1 reply; 12+ messages in thread
From: Tsukasa Mcp_Reznor @ 2018-12-11  3:03 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org

Has anyone managed to get suspend/resume to work on diskless machines using NFS as the root?

Suspend works as normal, but resume hard-locks. I can't get any errors or other output, since naturally nothing makes it to the log files.

I have 3 machines currently running this setup, just trying to save some power.  If it helps they are all using Realtek NICs.

My google-fu hasn't turned up anything in the last 5 years.

Thanks

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11  3:03 [gentoo-user] Root on NFS Suspend/Resume support Tsukasa Mcp_Reznor
@ 2018-12-11  3:14 ` Grant Taylor
  2018-12-11 11:23   ` Tsukasa Mcp_Reznor
  0 siblings, 1 reply; 12+ messages in thread
From: Grant Taylor @ 2018-12-11  3:14 UTC (permalink / raw
  To: gentoo-user

On 12/10/18 8:03 PM, Tsukasa Mcp_Reznor wrote:
> Has anyone managed to get suspend/resume to work on diskless machines 
> using NFS as the root?

~blink~

I haven't tried to suspend / resume diskless machines.  (I've not done 
much with diskless machines, but it's on my to do list.)

But I don't think I would have thought about trying to suspend / resume 
a diskless machine.

Are we talking about a wired Ethernet network connection with static 
IP(s)?  Or something more complex?

Aside: I'm wondering why a diskless machine is using suspend / resume. 
If you're bored, I'd like to have my (apparently limited) world view 
expanded.

> Suspend works like normal, but resume hard locks, can't seem to get any 
> error's or anything as it's not sending to any log files naturally.

Have you tried using any network based logging?

Can syslog log to a network block device?

Doesn't the kernel have some network logging?  Or the ability to log 
debug info somewhere other than a file?
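
The kernel does have such a facility: netconsole sends kernel messages over UDP, and the `no_console_suspend` boot parameter keeps consoles active across suspend. A minimal sketch of building the netconsole parameter follows. The addresses, ports and interface name are placeholders, and there is no guarantee that messages get out while the NIC itself is still resuming.

```shell
#!/bin/sh
# Build a netconsole parameter string in the documented form
#   netconsole=[src-port]@[src-ip]/[dev],[tgt-port]@<tgt-ip>/[tgt-mac]
# All values passed in here are placeholders for illustration.
netconsole_param() {
    src_port=$1 src_ip=$2 dev=$3 tgt_port=$4 tgt_ip=$5
    printf 'netconsole=%s@%s/%s,%s@%s/\n' \
        "$src_port" "$src_ip" "$dev" "$tgt_port" "$tgt_ip"
}

netconsole_param 6665 192.168.1.50 eth0 6666 192.168.1.10
# With netconsole built into the kernel, the string goes on the kernel
# command line; as a module:  modprobe netconsole "netconsole=..."
# On the log host:            nc -u -l 6666   (flags vary by netcat flavour)
```

Leaving the target MAC empty makes the kernel fall back to the broadcast address, which is usually good enough on a small LAN.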

> I have 3 machines currently running this setup, just trying to save 
> some power.  If it helps they are all using Realtek NICs.

Okay.  I conceptually get saving power.

How are you waking them up?  User interaction?  Clock?  Magic packet?

> My google-fu hasn't turned up anything in the last 5 years.

So, you've been working on it for a while.

Are any of your problems related to stale file handles?  I.e. the 
diskless NFS client disagreeing with the NFS server about the state of 
the files?  Is the NFS server closing the files after a timeout?

> Thanks

You're welcome.  But I'm not sure I helped.  I would like to learn what 
you figure out.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11  3:14 ` Grant Taylor
@ 2018-12-11 11:23   ` Tsukasa Mcp_Reznor
  2018-12-11 18:04     ` Grant Taylor
  2018-12-11 18:37     ` J. Roeleveld
  0 siblings, 2 replies; 12+ messages in thread
From: Tsukasa Mcp_Reznor @ 2018-12-11 11:23 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org

You're totally correct, more information would be beneficial. Here goes.
All machines are on wired 1 Gbps connections.
The UEFI IPv4 network stack sends a DHCP request and gets the boot file pxelinux.efi; the default entry then sends the Linux kernel (no initramfs needed, the firmware is built into the kernel image).
Also worth noting: the kernel has the command line for root on NFS built in.
The machine boots and mounts the required mount points over NFS 4.2 (so much faster than the old NFS 3).
LightDM loads and users are free to work, in this case family members playing Steam/Diablo 3/etc.
I switched to root on NFS for a lot of reasons.
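
For anyone reproducing this, the kernel-side pieces implied above (root on NFS, in-kernel DHCP, a built-in command line) map onto a handful of kernel config symbols. The sketch below only checks a `.config` for them. The symbol list is an assumption about what such a setup needs on x86, not the actual config used here, and the `.config` path in the usage line is likewise an assumption. The NIC driver would also have to be built in, which this does not check.

```shell
#!/bin/sh
# Report which kernel options relevant to a built-in NFS root are enabled.
# The option list is an assumption, not taken from the original setup.
check_nfsroot_config() {
    config=$1
    missing=0
    for opt in CONFIG_IP_PNP CONFIG_IP_PNP_DHCP CONFIG_NFS_FS \
               CONFIG_NFS_V4 CONFIG_NFS_V4_2 CONFIG_ROOT_NFS \
               CONFIG_CMDLINE_BOOL; do
        if grep -q "^${opt}=y" "$config"; then
            echo "ok       $opt"
        else
            echo "missing  $opt"
            missing=1
        fi
    done
    return "$missing"
}

# Usage (path is an assumption):
#   check_nfsroot_config /usr/src/linux/.config
```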

Maintaining 4 Gentoo installs on machines of varying specs, and remembering to update each with good updates, added a fair amount of administration time. (4, because the server is included.)

Using chroots on the server as binary build hosts for each machine solves some problems, but it increases space requirements quite a bit and adds latency if you want to use it while it's emerging anything. Plus, compiling, say, LibreOffice or whatever 3+ times in a row is pretty slow.

Side note: if anyone else runs diskless, I have a patch for Wine I can send out that reports the NFS mount as a fixed hard drive, since there are a few apps/games that refuse to install/run on a network share. I also have a patch for Steam that removes the file locking issues so updates run quick and smooth. (Neither will ever be upstreamable; people have tried in the past.)

Thanks for your response. I'd love to help if you have any more questions; it's been a fun experience for me for sure. Also, cachefilesd, if there's a drive available, makes everything feel like it's not a networked machine at all.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11 11:23   ` Tsukasa Mcp_Reznor
@ 2018-12-11 18:04     ` Grant Taylor
  2018-12-11 22:53       ` Tsukasa Mcp_Reznor
  2018-12-11 18:37     ` J. Roeleveld
  1 sibling, 1 reply; 12+ messages in thread
From: Grant Taylor @ 2018-12-11 18:04 UTC (permalink / raw
  To: gentoo-user

On 12/11/2018 04:23 AM, Tsukasa Mcp_Reznor wrote:
> You're totally correct, more information would be beneficial, here goes.

:-)

> All machines are Wired 1Gbps connections.

ACK

That means that you don't have the complications (and performance 
issues) of wireless.

> Uefi IP4 network stack sends dhcp request, gets boot file pxelinux.efi, 
> the default entry sends the linux kernel (no initramfs needed, firmware 
> added to kernel image).

Interesting.

Do you have reservations in the DHCP server?  Or are the addresses truly 
dynamic?

Are you relying on the client's UEFI implementation to provide the menu? 
  Or are you using PXELINUX for the menu?  (I know it's a nuance, but it 
is a difference.)  The latter is much easier to centrally manage than 
the former.

Does the UEFI stack get the same IP via DHCP that the OS gets via DHCP? 
Is there any sort of contention?  Does UEFI release the IP before 
bootstrapping the PXELINUX image?  Does the DHCP server view the 
multiple requests from the same client MAC as a form of a refresh?  Or 
does it just offer the same IP?

> Another good note is the kernel contains the command line built-in for 
> using root on NFS.

Okay.  ~pondering~

Are all clients booting the same kernel, thus using the same command line?

This means that clients must use DHCP to retrieve their IP address.

I guess there is some opportunity to return different files (PXELINUX 
image / config and / or kernel file) to different clients to get 
different behavior.  But that might be more complexity than is necessary.

Would you please share the kernel command line?  I'm quite curious what 
the syntax is for NFS root.

> Machine loads, mounts the required mount points through NFS4.2 (so much 
> better than the old NFS 3 speeds).

Nice.

> LightDM loads and users are free to work, in this case family members 
> playing Steam/Diablo 3/etc.

:-)

> I switched to using Root on NFS for alot of reasons.

:-)

> Maintaining 4 gentoo installs on machines of varying specs and remembering 
> to update each with good updates added a fair amount of administration 
> time. (4, because the server is included)

*nod*

> Using chroots on the server as binary build hosts for each machine 
> solves some problems, but increases space requirements quite a bit, and 
> adds latency if you want to use it while it's emerging anything, plus 
> compiling say Libreoffice or whatever 3+ times in a row is pretty slow.

That makes me think that you are using a separate NFS export for each 
machine's root.

I have wondered about trying to do something similar (likely start in a 
VM) that has (at least) one machine specific export for things like 
/etc, but would then try to use a common export for things like /usr, 
/lib, and maybe /var.

Maybe a common / export and a per machine /etc would accomplish what I'm 
thinking.

> Side note, If anyone else runs diskless I have a patch for wine I can 
> send out that returns the nfs mount as a fixed hard drive, there are a 
> few apps/games that refuse to install/run on a network share, and a patch 
> for steam that removes the file locking issues so updates run quick and 
> smooth

Nice.

> (neither will ever be upstreamable, people have tried in the past)

:-/

> Thanks for your response, I'd love to help if you have any more questions, 
> it's been a fun experience for me for sure. Also, cachefilesd if there's a 
> drive available, makes everything feel like it's not a networked machine 
> at all here.

You're welcome.

Thank you for sharing.

I'd love to know more about how you're doing things.

  - What is common between the diskless clients and what is unique.
     - PXELINUX image / config
     - Kernel image
     - NFS exports
  - What do your exports look like.
  - What sort of configuration you have in your DHCP server that's 
specific to this.
     - Any sticky reservations, possibly with machine specific parameters.
  - Other things that I can't think of at the moment.

Thank you again.  Very interesting stuff.



-- 
Grant. . . .
unix || die


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11 11:23   ` Tsukasa Mcp_Reznor
  2018-12-11 18:04     ` Grant Taylor
@ 2018-12-11 18:37     ` J. Roeleveld
  2018-12-11 22:59       ` Tsukasa Mcp_Reznor
  1 sibling, 1 reply; 12+ messages in thread
From: J. Roeleveld @ 2018-12-11 18:37 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: text/plain, Size: 4165 bytes --]

On December 11, 2018 11:23:27 AM UTC, Tsukasa Mcp_Reznor <mcp_reznor@hotmail.com> wrote:
>You're totally correct, more information would be beneficial, here
>goes.
>All machines are Wired 1Gbps connections.
>Uefi IP4 network stack sends dhcp request, gets boot file pxelinux.efi,
>the default entry sends the linux kernel (no initramfs needed, firmware
>added to kernel image).
>Another good note is the kernel contains the command line built-in for
>using root on NFS.
>Machine loads, mounts the required mount points through NFS4.2 (so much
>better than the old NFS 3 speeds).
>LightDM loads and users are free to work, in this case family members
>playing Steam/Diablo 3/etc.
>I switched to using Root on NFS for alot of reasons.
>
>Maintaining 4 gentoo installs on machines of varying specs and
>remembering to update each with good updates added a fair amount of
>administration time. (4, because the server is included)
>
>Using chroots on the server as binary build hosts for each machine
>solves some problems, but increases space requirements quite a bit, and
>adds latency if you want to use it while it's emerging anything, plus
>compiling say Libreoffice or whatever 3+ times in a row is pretty slow.
>
>Side note, If anyone else runs diskless I have a patch for wine I can
>send out that returns the nfs mount as a fixed hard drive, there are a
>few apps/games that refuse to install/run on a network share, and a
>patch for steam that removes the file locking issues so updates run
>quick and smooth (neither will ever be upstreamable, people have tried
>in the past)
>
>Thanks for your response, I'd love to help if you have any more
>questions, it's been a fun experience for me for sure. Also,
>cachefilesd if there's a drive available, makes everything feel like
>it's not a networked machine at all here.

If you want to resume from NFS, you will need an initramfs that correctly passes the swap device for resuming.
I would try the same method as resuming from encrypted swap.
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[-- Attachment #2: Type: text/html, Size: 4462 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11 18:04     ` Grant Taylor
@ 2018-12-11 22:53       ` Tsukasa Mcp_Reznor
  2018-12-11 23:26         ` Grant Taylor
  0 siblings, 1 reply; 12+ messages in thread
From: Tsukasa Mcp_Reznor @ 2018-12-11 22:53 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org


> Do you have reservations in the DHCP server?  Or are the addresses truly
> dynamic?

Dynamic. Any "servers" that would require forwarding I just run on my server.

> Are you relying on the client's UEFI implementation to provide the menu?
> Or are you using PXELINUX for the menu?  (I know it's a nuance, but it
> is a difference.)  The latter is much easier to centrally manage than
> the former.

Actually, I haven't found the need for a menu at all; dnsmasq serves whatever kernel I have symbolically linked to the clients from their boot folder.
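
A minimal sketch of what a menu-less dnsmasq and PXELINUX setup along these lines might look like. The address range, TFTP root and file names are assumptions for illustration, not the configuration actually in use; with the command line built into the kernel, the PXELINUX entry needs no APPEND line.

```shell
#!/bin/sh
# Write a minimal dnsmasq PXE/TFTP config and a PXELINUX default entry
# into a scratch directory. All paths and addresses are made-up examples.
write_pxe_files() {
    outdir=$1
    mkdir -p "$outdir/tftp/pxelinux.cfg"

    cat > "$outdir/dnsmasq.conf" <<'EOF'
dhcp-range=192.168.1.100,192.168.1.200,12h
enable-tftp
tftp-root=/diskless/tftp
dhcp-boot=pxelinux.efi
EOF

    cat > "$outdir/tftp/pxelinux.cfg/default" <<'EOF'
DEFAULT gentoo
LABEL gentoo
  KERNEL vmlinuz
EOF
}

# Switching kernels is then just a matter of repointing the symlink the
# clients fetch (file names are hypothetical):
#   ln -sfn vmlinuz-<version> /diskless/tftp/vmlinuz
```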


> Does the UEFI stack get the same IP via DHCP that the OS gets via DHCP?
> Is there any sort of contention?  Does UEFI release the IP before
> bootstrapping the PXELINUX image?  Does the DHCP server view the
> multiple requests from the same client MAC as a form of a refresh?  Or
> does it just offer the same IP?

I've had this same IP for months on this machine, and I would assume the same for the others. As far as log messages from the server go, it answers the same to each request from the same MAC as far back as I've scrolled.



> Are all clients booting the same kernel, thus using the same command line?
>
> This means that clients must use DHCP to retrieve their IP address.

Yes, all are booting the same kernel.


> I guess there is some opportunity to return different files (PXELINUX
> image / config and / or kernel file) to different clients to get
> different behavior.  But that might be more complexity than is necessary.
>
> Would you please share the kernel command line?  I'm quite curious what
> the syntax is for NFS root.

ip=dhcp root=/dev/nfs rootfstype=nfs rw nfsroot=ServerIP:/diskless/root,nolock,fsc,tcp,proto=tcp,vers=4,nfsvers=4.2,rsize=1048576,wsize=1048576 raid=noautodetect
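
The kernel documents this parameter as `nfsroot=[<server-ip>:]<root-dir>[,<nfs-options>]`, so everything up to the first comma is the server and the exported path, and the rest are mount options. A small sketch that splits the value from the command line above using plain shell parameter expansion; `split_nfsroot` is a made-up helper name, and IPv6 server addresses are not handled.

```shell
#!/bin/sh
# Split an nfsroot= value of the form [<server-ip>:]<root-dir>[,<nfs-options>].
# split_nfsroot is a hypothetical helper, not part of any existing tool.
split_nfsroot() {
    value=$1
    target=${value%%,*}                 # text before the first comma
    case $value in
        *,*) options=${value#*,} ;;     # text after the first comma
        *)   options= ;;
    esac
    case $target in
        *:*) server=${target%%:*}; rootdir=${target#*:} ;;
        *)   server=; rootdir=$target ;;
    esac
    echo "server:  $server"
    echo "rootdir: $rootdir"
    echo "options: $options"
}

split_nfsroot 'ServerIP:/diskless/root,nolock,fsc,tcp,proto=tcp,vers=4,nfsvers=4.2,rsize=1048576,wsize=1048576'
```

Read this way, `/diskless/root` is the exported directory as a whole, and the mount options start at `nolock`.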



> That makes me think that you are using a separate NFS export for each
> machine's root.

All the same root. I just have a custom bash script in local.d (OpenRC) that handles the specifics for each node (adding DVD/Blu-ray drives or whatever to fstab).
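
A minimal sketch of what such a per-node script might look like, keyed on the NIC's MAC address. The MAC addresses, interface name, devices and mount points are invented for illustration, and this is not the actual script; OpenRC runs executable `/etc/local.d/*.start` files at boot.

```shell
#!/bin/sh
# Hypothetical /etc/local.d/node.start: emit node-specific fstab entries.
# MAC addresses, devices and mount points below are made-up examples.
node_fstab_lines() {
    case $1 in
        aa:bb:cc:dd:ee:01)
            echo '/dev/sr0  /mnt/cdrom  auto  noauto,ro,user  0 0' ;;
        aa:bb:cc:dd:ee:02)
            echo '/dev/sda1  none  swap  sw  0 0' ;;
        *)
            : ;;   # no extras for unknown nodes
    esac
}

# Read the MAC of the boot interface (eth0 is an assumption) if present.
iface=/sys/class/net/eth0/address
if [ -r "$iface" ]; then
    mac=$(cat "$iface")
    node_fstab_lines "$mac"   # a real script would add these to its fstab
fi
```

Keying on the MAC avoids depending on hostnames, which may not be set yet when the script runs.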

> I have wondered about trying to do something similar (likely start in a
> VM) that has (at least) one machine specific export for things like
> /etc, but would then try to use a common export for things like /usr,
> /lib, and maybe /var.

Anything that conflicts, like /var/log, I just have as tmpfs on each machine.


>   - What is common between the diskless clients and what is unique.
>      - PXELINUX image / config
>      - Kernel image
>      - NFS exports
>   - What do your exports look like.
>   - What sort of configuration you have in your DHCP server that's
> specific to this.
>      - Any sticky reservations, possibly with machine specific parameters.
>   - Other things that I can't think of at the moment.
>
> Thank you again.  Very interesting stuff.

I got my start from reading https://wiki.gentoo.org/wiki/Diskless_nodes, unless you count doing diskless with Ubuntu way back in the past. My hard drive died then, and I wasn't about to just use a LiveDVD, unable to really install anything, lol.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11 18:37     ` J. Roeleveld
@ 2018-12-11 22:59       ` Tsukasa Mcp_Reznor
  2018-12-13 21:03         ` J. Roeleveld
  0 siblings, 1 reply; 12+ messages in thread
From: Tsukasa Mcp_Reznor @ 2018-12-11 22:59 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org

> If you want to resume from NFS, you will need an initramfs that correctly passes the swap device for resuming.
> I would try the same method as resuming from encrypted swap.

I appreciate the response. I'm not trying to hibernate, but rather suspend to RAM. I don't use swap over NFS; the machines that do have hard drives installed use them for local swap and cachefilesd (which is amazingly performant).

In the past when I've tried to use an initramfs, it has led to boot hangs whose root cause I never quite figured out. I was trying to use genkernel to build them; maybe I'll give dracut a shot and see if that fixes the problem. You could very well be on to something.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11 22:53       ` Tsukasa Mcp_Reznor
@ 2018-12-11 23:26         ` Grant Taylor
  0 siblings, 0 replies; 12+ messages in thread
From: Grant Taylor @ 2018-12-11 23:26 UTC (permalink / raw
  To: gentoo-user

On 12/11/2018 03:53 PM, Tsukasa Mcp_Reznor wrote:
> Actually I haven't found the need for a menu at all, dnsmasq serves 
> whatever kernel I have symbolically linked to the clients from their 
> boot folder

Nice.

Aside:  I played with a PXELINUX (?) menu to boot a few different 
things.  It's been too long for me to remember details.  But I was quite 
happy with it.  I think I had installers for a couple of different Linux 
distros and a couple of DOS based utilities.

> I've had this same IP for months on this machine, I would assume 
> the same for the others,

Fair enough.

> as far as log messages from the server, it answers the same to each 
> request from the same mac as far back I I've scrolled through

Okay.  So there's no obvious conflict with UEFI and the OS DHCPing from 
the same MAC address.  Relatively clean transition.

> Yes all are booting the same kernel

Nice.

(Kernel parameters moved to individual lines so my brain can absorb them.)

> ip=dhcp

So, there are three things DHCPing.

1)  UEFI firmware
2)  Kernel itself
3)  OS init scripts

Intriguing.

> root=/dev/nfs
> rootfstype=nfs

I had no idea that there was an nfs device.  I am assuming that it's 
specific to the fact that the root file system type is NFS.  -  I must 
research this more.

> rw
> nfsroot=ServerIP:/diskless /root,nolock,fsc,tcp,proto=tcp,vers=4,nfsvers=4.2,rsize=1048576,wsize=1048576

I assume:

"ServerIP" is the NFS server's IP address.

"/diskless" is the NFS export

"/root,nolock,fsc,tcp,proto=tcp,vers=4,nfsvers=4.2,rsize=1048576,wsize=1048576" 
are NFS mount options.

> raid=noautodetect

Why have a raid parameter?  Is there something in the kernel that you 
don't need and are disabling?  Or is this somehow influencing how file 
systems are mounted on boot?

Do OS init scripts try to remount root?  Or is there not an entry in 
/etc/fstab for the root, and just rely on the kernel's mount?

> all the same root, I actually just have a custom bash script in local.d 
> (openrc) for handling specifics for each node (adding dvd/blueray whatever 
> to fstab)

Hum.

How are you handling the hostnames?  Or is that dynamic?

What about user accounts?  Are all your client systems using the same 
password & group files?

What about SSH host keys?

> anything that conflicts like /var/log I just have as tmpfs on 
> each machine

I can see that for logs.  But I don't think that an empty tmpfs is 
sufficient for things like passwd / group files or ssh host keys.

> https://wiki.gentoo.org/wiki/Diskless_nodes I got my start from 
> reading that, well unless you count doing diskless with ubuntu in the 
> way way past, my hard drive died then and I wasn't about to just use a 
> livedvd unable to really install anything lol

$ReadingList++

Thank you for the link and kindling something that's been a latent 
interest of mine.



-- 
Grant. . . .
unix || die


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-11 22:59       ` Tsukasa Mcp_Reznor
@ 2018-12-13 21:03         ` J. Roeleveld
  2018-12-13 21:08           ` Tsukasa Mcp_Reznor
  0 siblings, 1 reply; 12+ messages in thread
From: J. Roeleveld @ 2018-12-13 21:03 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: text/plain, Size: 1311 bytes --]

On December 11, 2018 10:59:47 PM UTC, Tsukasa Mcp_Reznor <mcp_reznor@hotmail.com> wrote:
>I appreciate the response, I'm not trying to use hibernate but rather
>suspend to ram.  I don't use swap over NFS, the machines that do have
>hard drives installed use them for local swap and cachefilesd (which is
>amazingly performant)
>
>In the past when I've tried to use an initramfs, it's lead to boot
>hangs that I haven't quite figured out the root cause for,  I was
>trying to use genkernel to build them, maybe I'll give dracut a shot
>and see if that fixes the problem, you could very well be on to
>something.

I believe "suspend to RAM" might switch off the network (and kill an NFS connection in the process). This might be the cause of the issue.
Do the nodes have enough memory to load the filesystem into RAM and run from there? (Like sysresccd can do)
If yes, that might allow this to work.
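
One way to narrow down which stage of suspend/resume breaks things is the kernel's `/sys/power/pm_test` facility (it needs CONFIG_PM_DEBUG), which runs the suspend sequence only up to a chosen level and then resumes on its own after a few seconds. A rough sketch follows; the sysfs root is a parameter purely so the function can be dry-run against a scratch directory.

```shell
#!/bin/sh
# Run one test suspend cycle at the given pm_test level
# (freezer, devices, platform, processors or core), then reset to "none".
# sysroot defaults to /sys; the parameter exists only to allow dry runs.
pm_test_cycle() {
    level=$1
    sysroot=${2:-/sys}
    echo "$level" > "$sysroot/power/pm_test" || return 1
    echo mem > "$sysroot/power/state"
    status=$?
    echo none > "$sysroot/power/pm_test"
    return "$status"
}

# As root on a client:  pm_test_cycle devices
```

If the `freezer` level survives but `devices` hangs, a driver's suspend or resume path (the NIC being the obvious suspect with an NFS root) is the likely place to look.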

--
Joost
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.

[-- Attachment #2: Type: text/html, Size: 1590 bytes --]

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-13 21:03         ` J. Roeleveld
@ 2018-12-13 21:08           ` Tsukasa Mcp_Reznor
  2018-12-14 14:58             ` Daniel Frey
  0 siblings, 1 reply; 12+ messages in thread
From: Tsukasa Mcp_Reznor @ 2018-12-13 21:08 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org

> I believe "suspend to ram" might switch off the network (and kill a NFS connection in the process). This might be the cause of the issue.
> Do the nodes have enough memory to load the filesystem into RAM and run from there? (Like sysresccd can do)
> If yes, that might allow this to work.

Probably not enough RAM; the lowest machine has 4 GB. As an update, I installed and tried out dracut. That didn't make any difference, but each system booted fine with an initrd, which is a change for sure. If I manually suspend for up to, say, 10 seconds, they resume just fine. I like the idea of S2RAM killing the network as the cause. I thought enabling Wake-on-LAN would keep the NIC from being switched off. I'll see if I can research the suspend/resume routines and blacklist something, or whatever it takes, to keep it running. Thanks for the tip :)
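
Whether Wake-on-LAN is actually armed can be checked with `ethtool`, which prints a `Wake-on:` line for the interface (`g` meaning wake on magic packet, `d` meaning disabled). A small sketch that pulls that value out of `ethtool` output; the interface name `eth0` is an assumption, and having WoL enabled does not by itself show that the NIC keeps its link or driver state across suspend.

```shell
#!/bin/sh
# Print the current Wake-on-LAN mode from `ethtool <iface>` output on stdin.
# Only the "Wake-on:" line is used; "Supports Wake-on:" is skipped.
wol_mode() {
    awk '$1 == "Wake-on:" { print $2 }'
}

# Typical use (eth0 is an assumption; run as root):
#   ethtool eth0 | wol_mode
#   ethtool -s eth0 wol g      # arm wake on magic packet

# Dry run against sample text in the shape ethtool prints:
printf '\tSupports Wake-on: pumbg\n\tWake-on: d\n' | wol_mode
```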


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-13 21:08           ` Tsukasa Mcp_Reznor
@ 2018-12-14 14:58             ` Daniel Frey
  2018-12-14 20:37               ` Tsukasa Mcp_Reznor
  0 siblings, 1 reply; 12+ messages in thread
From: Daniel Frey @ 2018-12-14 14:58 UTC (permalink / raw
  To: gentoo-user

On 12/10/18 7:03 PM, Tsukasa Mcp_Reznor wrote:
> Has anyone managed to get suspend/resume to work on diskless machines
> using NFS as the root?
>
> Suspend works like normal, but resume hard locks, can't seem to get
> any error's or anything as it's not sending to any log files naturally.

On 12/13/18 1:08 PM, Tsukasa Mcp_Reznor wrote:
> If I manually suspend for up to say 10 seconds, they resume just fine.  
> 

Have you checked the power supply?

I don't use a diskless setup but last year (nah, maybe many years ago) I 
had this strange resume problem after suspend. As in, I'd wake the 
machine and it'd sit there with a blinking text cursor in text mode, 
quite stuck. I am pretty sure I posted about it on the list here.

It turned out that when my machine was running the power supply was 
fine. However, when I suspended it, the 5V rail would bleed voltage. So, 
I discovered if I resumed within, say, 5 minutes after suspending my 
machine it would wake normally. After that though, I'd get the blinking 
cursor and it would hang resuming.

I confirmed that the 5V rail was bleeding voltage when in suspend with 
my voltmeter. It turned out to be bad capacitors in the power supply.

Just a suggestion...

Dan


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [gentoo-user] Root on NFS Suspend/Resume support
  2018-12-14 14:58             ` Daniel Frey
@ 2018-12-14 20:37               ` Tsukasa Mcp_Reznor
  0 siblings, 0 replies; 12+ messages in thread
From: Tsukasa Mcp_Reznor @ 2018-12-14 20:37 UTC (permalink / raw
  To: gentoo-user@lists.gentoo.org

> Have you checked the power supply?
>
> I don't use a diskless setup but last year (nah, maybe many years ago) I
> had this strange resume problem after suspend. As in, I'd wake the
> machine and it'd sit there with a blinking text cursor in text mode,
> quite stuck. I am pretty sure I posted about it on the list here.
>
> It turned out that when my machine was running the power supply was
> fine. However, when I suspended it, the 5V rail would bleed voltage. So,
> I discovered if I resumed within, say, 5 minutes after suspending my
> machine it would wake normally. After that though, I'd get the blinking
> cursor and it would hang resuming.
>
> I confirmed that the 5V rail was bleeding voltage when in suspend with
> my voltmeter. It turned out to be bad capacitors in the power supply.
>
> Just a suggestion...
>
> Dan


I appreciate the tip. If I boot my main desktop off a hard drive, it does indeed sleep/resume just fine, and that install was the source of every file that got sent to the network when I started converting to diskless. Maybe I'll throw in a small LiveDVD install and check again.

If it helps: when I'm in LXDE with just a terminal open running top, top updates just ONCE after the screen comes back on before freezing. I can move the mouse cursor, Num Lock toggles, and I can drag the terminal window around, but if I try to switch to vt2, or do anything else such as loading a previously uncached menu from the taskbar, it never loads or switches. So it's definitely (to my eyes at least) a problem with the NFS connection. I don't believe the NIC is powering down, as I turned on Wake-on-LAN, but I'll test and make sure tonight. Aside from blacklisting kernel modules, I have yet to find a way to tweak the suspend/resume functions, but I'm still looking for more information when I have free time.
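
The symptom described (cached things keep working, anything that must reach the server stalls) can be probed from a shell that is already running. A rough sketch: list the NFS mounts from `/proc/mounts` and check whether a path answers within a time limit. The helper names and the 5-second default are arbitrary choices, and on a hung NFS root even launching `timeout` or `stat` may block if they are not already cached.

```shell
#!/bin/sh
# List NFS mount points and their options from a mounts table
# (defaults to /proc/mounts; the argument exists for dry runs).
list_nfs_mounts() {
    awk '$3 == "nfs" || $3 == "nfs4" { print $2, $4 }' "${1:-/proc/mounts}"
}

# Report whether a path can be stat'ed within the given number of seconds.
# SIGKILL is used as the most forceful way to end a probe stuck in I/O.
probe_path() {
    if timeout -s KILL "${2:-5}" stat "$1" >/dev/null 2>&1; then
        echo "responsive: $1"
    else
        echo "no answer:  $1"
        return 1
    fi
}

list_nfs_mounts
probe_path /
```

Running this once before suspend and once after resume would show whether the mount is still listed with the same options and whether it still answers.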


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2018-12-14 20:37 UTC | newest]

Thread overview: 12+ messages
2018-12-11  3:03 [gentoo-user] Root on NFS Suspend/Resume support Tsukasa Mcp_Reznor
2018-12-11  3:14 ` Grant Taylor
2018-12-11 11:23   ` Tsukasa Mcp_Reznor
2018-12-11 18:04     ` Grant Taylor
2018-12-11 22:53       ` Tsukasa Mcp_Reznor
2018-12-11 23:26         ` Grant Taylor
2018-12-11 18:37     ` J. Roeleveld
2018-12-11 22:59       ` Tsukasa Mcp_Reznor
2018-12-13 21:03         ` J. Roeleveld
2018-12-13 21:08           ` Tsukasa Mcp_Reznor
2018-12-14 14:58             ` Daniel Frey
2018-12-14 20:37               ` Tsukasa Mcp_Reznor
