public inbox for gentoo-user@lists.gentoo.org
 help / color / mirror / Atom feed
* [gentoo-user] Machine hangs up with out of memory
@ 2021-04-28  9:58 Kai Peter
  2021-04-30  1:30 ` Adam Carter
  0 siblings, 1 reply; 6+ messages in thread
From: Kai Peter @ 2021-04-28  9:58 UTC (permalink / raw
  To: Gentoo User

Hi,

I have an issue with a machine where I'm not able to detect the real 
root cause. It hangs up totally. It seems like it was running out of 
memory - but why? Hopefully somebody can give me some insight. As far I 
can see right now, it hangs up a few hours after an `emerge --update 
--newuse --deep --with-bdeps=y @world`.

The machine is an Intel Atom with 8 GB RAM (physical, max) and 24 GB 
swap (a file). So 32 GB RAM in total. It has a 250GB SSD. It runs 
gentoo-sources-4.14.83 build with genkernel. Portage uses the stable 
tree only. It basically provides the hardware for a qemu VM which does 
the network management: primary ns, dhcp, apache ssl proxy. This VM uses 
4 GB RAM and has 8 GB swap file. The VM works smoothly. The atom machine 
itself acts further as basic nfs server to an independent dedicated 
server - which does the (re)exports - and as secondary ns. For this I'm 
convinced that 32GB RAM total have to be enough - correct me if I'm 
wrong!

The issue starts round about in February (IIRC). The update of gcc-10.2 
did fail. I have /var/tmp/portage on tmpfs - I did increase the size in 
fstab from 8 to 16 GB. Afterwards gcc build successfully.

After two hang-ups I did increase the swap from 8 to 24 GB. It doesn't 
help. Here is a complete log from /var/log/messages:

Apr 28 05:35:57 Syrin kernel: [1454017.499919] isc-net-0000 invoked 
oom-killer: gfp_mask=0x14201ca(GFP_HIGHUSER_MOVABLE|__GFP_COLD), 
nodemask=(null),  order=0, oom_score_adj=0
Apr 28 05:35:57 Syrin kernel: [1454017.499925] isc-net-0000 cpuset=/ 
mems_allowed=0
Apr 28 05:35:57 Syrin kernel: [1454017.499933] CPU: 0 PID: 27685 Comm: 
isc-net-0000 Not tainted 4.14.83-gentoo #1
Apr 28 05:35:57 Syrin kernel: [1454017.499935] Hardware name: MSI 
MS-7877/J1900I, BIOS V1.2 03/25/2014
Apr 28 05:35:57 Syrin kernel: [1454017.499936] Call Trace:
Apr 28 05:35:57 Syrin kernel: [1454017.499948]  dump_stack+0x67/0x98
Apr 28 05:35:57 Syrin kernel: [1454017.499954]  dump_header+0x94/0x20c
Apr 28 05:35:57 Syrin kernel: [1454017.499958]  
oom_kill_process+0x24a/0x420
Apr 28 05:35:57 Syrin kernel: [1454017.499962]  ? 
oom_badness.part.9+0xd3/0x150
Apr 28 05:35:57 Syrin kernel: [1454017.499965]  out_of_memory+0xf9/0x290
Apr 28 05:35:57 Syrin kernel: [1454017.499968]  
__alloc_pages_nodemask+0xf48/0xff0
Apr 28 05:35:57 Syrin kernel: [1454017.499974]  
filemap_fault+0x294/0x4c0
Apr 28 05:35:57 Syrin kernel: [1454017.499979]  
ext4_filemap_fault+0x2c/0x40
Apr 28 05:35:57 Syrin kernel: [1454017.499983]  __do_fault+0x1f/0xb0
Apr 28 05:35:57 Syrin kernel: [1454017.499986]  
__handle_mm_fault+0x3ed/0xad0
Apr 28 05:35:57 Syrin kernel: [1454017.499991]  
handle_mm_fault+0xaa/0x1f0
Apr 28 05:35:57 Syrin kernel: [1454017.499996]  
__do_page_fault+0x250/0x4f0
Apr 28 05:35:57 Syrin kernel: [1454017.500000]  ? page_fault+0x2f/0x50
Apr 28 05:35:57 Syrin kernel: [1454017.500003]  page_fault+0x45/0x50
Apr 28 05:35:57 Syrin kernel: [1454017.500005] RIP: 0000:          
(null)
Apr 28 05:35:57 Syrin kernel: [1454017.500007] RSP: 
12f83750:0000000000000001 EFLAGS: 7ffa12f837a0
Apr 28 05:35:57 Syrin kernel: [1454017.500010] Mem-Info:
Apr 28 05:35:57 Syrin kernel: [1454017.500017] active_anon:1694713 
inactive_anon:211859 isolated_anon:0
Apr 28 05:35:57 Syrin kernel: [1454017.500017]  active_file:328 
inactive_file:344 isolated_file:32
Apr 28 05:35:57 Syrin kernel: [1454017.500017]  unevictable:1374 dirty:0 
writeback:0 unstable:0
Apr 28 05:35:57 Syrin kernel: [1454017.500017]  slab_reclaimable:4480 
slab_unreclaimable:7449
Apr 28 05:35:57 Syrin kernel: [1454017.500017]  mapped:1071 shmem:3 
pagetables:16352 bounce:0
Apr 28 05:35:57 Syrin kernel: [1454017.500017]  free:11655 free_pcp:534 
free_cma:0
Apr 28 05:35:57 Syrin kernel: [1454017.500021] Node 0 
active_anon:6778852kB inactive_anon:847436kB active_file:1312kB 
inactive_file:1376kB unevictable:5496kB isolated(anon):0kB 
isolated(file):128kB mapped:4284kB dirty:0kB writeback:0kB shmem:12kB 
writeback_tmp:0kB unstable:0kB all_unreclaimable? yes
Apr 28 05:35:57 Syrin kernel: [1454017.500026] DMA free:15836kB min:20kB 
low:32kB high:44kB active_anon:0kB inactive_anon:0kB active_file:0kB 
inactive_file:0kB unevictable:0kB writepending:0kB present:15920kB 
managed:15836kB mlocked:0kB kernel_stack:0kB pagetables:0kB bounce:0kB 
free_pcp:0kB local_pcp:0kB free_cma:0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500027] lowmem_reserve[]: 0 2664 
7647 7647
Apr 28 05:35:57 Syrin kernel: [1454017.500036] DMA32 free:23732kB 
min:3892kB low:6620kB high:9348kB active_anon:2319992kB 
inactive_anon:363260kB active_file:0kB inactive_file:92kB 
unevictable:0kB writepending:0kB present:2825512kB managed:2734888kB 
mlocked:0kB kernel_stack:140kB pagetables:22572kB bounce:0kB 
free_pcp:1136kB local_pcp:476kB free_cma:0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500037] lowmem_reserve[]: 0 0 
4982 4982
Apr 28 05:35:57 Syrin kernel: [1454017.500045] Normal free:7052kB 
min:7284kB low:12384kB high:17484kB active_anon:4458860kB 
inactive_anon:484176kB active_file:1368kB inactive_file:1568kB 
unevictable:5496kB writepending:0kB present:5242880kB managed:5102420kB 
mlocked:5496kB kernel_stack:2276kB pagetables:42836kB bounce:0kB 
free_pcp:1000kB local_pcp:724kB free_cma:0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500046] lowmem_reserve[]: 0 0 0 0
Apr 28 05:35:57 Syrin kernel: [1454017.500050] DMA: 1*4kB (U) 1*8kB (U) 
1*16kB (U) 0*32kB 1*64kB (U) 1*128kB (U) 1*256kB (U) 0*512kB 1*1024kB 
(U) 1*2048kB (M) 3*4096kB (M) = 15836kB
Apr 28 05:35:57 Syrin kernel: [1454017.500070] DMA32: 268*4kB (UE) 
1562*8kB (UE) 645*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 
0*1024kB 0*2048kB 0*4096kB = 23888kB
Apr 28 05:35:57 Syrin kernel: [1454017.500084] Normal: 293*4kB (UME) 
490*8kB (UME) 152*16kB (UE) 0*32kB 0*64kB 0*128kB 0*256kB 0*512kB 
0*1024kB 0*2048kB 0*4096kB = 7524kB
Apr 28 05:35:57 Syrin kernel: [1454017.500100] Node 0 hugepages_total=0 
hugepages_free=0 hugepages_surp=0 hugepages_size=2048kB
Apr 28 05:35:57 Syrin kernel: [1454017.500101] 1818 total pagecache 
pages
Apr 28 05:35:57 Syrin kernel: [1454017.500110] 86 pages in swap cache
Apr 28 05:35:57 Syrin kernel: [1454017.500112] Swap cache stats: add 
13659627, delete 13659513, find 1129210/1793572
Apr 28 05:35:57 Syrin kernel: [1454017.500113] Free swap  = 0kB
Apr 28 05:35:57 Syrin kernel: [1454017.500114] Total swap = 25165820kB
Apr 28 05:35:57 Syrin kernel: [1454017.500115] 2021078 pages RAM
Apr 28 05:35:57 Syrin kernel: [1454017.500116] 0 pages 
HighMem/MovableOnly
Apr 28 05:35:57 Syrin kernel: [1454017.500117] 57792 pages reserved
Apr 28 05:35:57 Syrin kernel: [1454017.500118] 0 pages hwpoisoned
Apr 28 05:35:57 Syrin kernel: [1454017.500119] [ pid ]   uid  tgid 
total_vm      rss nr_ptes nr_pmds swapents oom_score_adj name
Apr 28 05:35:57 Syrin kernel: [1454017.500125] [ 4009]     0  4009    
21207      320       7       3       20             0 apcupsd
Apr 28 05:35:57 Syrin kernel: [1454017.500128] [ 4043]     0  4043    
54371       48      12       3      220             0 rsyslogd
Apr 28 05:35:57 Syrin kernel: [1454017.500131] [ 4084]     0  4084     
1938      178       6       3       18             0 fcron
Apr 28 05:35:57 Syrin kernel: [1454017.500133] [ 4400]     0  4400    
17733     1376       8       3        0             0 ntpd
Apr 28 05:35:57 Syrin kernel: [1454017.500136] [ 4429]     0  4429     
2789        0       7       3      103             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500139] [ 4460]     0  4460     
1689      241       6       3      112         -1000 sshd
Apr 28 05:35:57 Syrin kernel: [1454017.500142] [ 4693]     0  4693  
1067352    80035    1546       7   664044             0 qemu-system-x86
Apr 28 05:35:57 Syrin kernel: [1454017.500145] [ 4863]     0  4863     
1905      465       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500148] [ 4864]     0  4864     
1905      433       7       3       44             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500151] [ 4865]     0  4865     
1905      431       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500153] [ 4866]     0  4866     
1905      443       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500156] [ 4867]     0  4867     
1905      453       7       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500159] [ 4868]     0  4868     
1905      433       8       3       43             0 agetty
Apr 28 05:35:57 Syrin kernel: [1454017.500162] [27439]     0 27439      
675      295       5       3       60             0 rpcbind
Apr 28 05:35:57 Syrin kernel: [1454017.500164] [27509]     0 27509      
750      419       5       3       80             0 rpc.idmapd
Apr 28 05:35:57 Syrin kernel: [1454017.500167] [27520] 65534 27520      
693      421       5       3       53             0 rpc.statd
Apr 28 05:35:57 Syrin kernel: [1454017.500170] [27583]     0 27583      
819      365       5       3      108             0 rpc.mountd
Apr 28 05:35:57 Syrin kernel: [1454017.500173] [27684]    40 27684  
7570889  1823291   14584      32  5626031             0 named
Apr 28 05:35:57 Syrin kernel: [1454017.500176] [10479]     0 10479     
3262      449       7       3      182             0 udevd
Apr 28 05:35:57 Syrin kernel: [1454017.500179] [ 2923]     0  2923     
1938      318       6       3       10             0 fcron
Apr 28 05:35:57 Syrin kernel: [1454017.500182] [ 2924]     0  2924      
981      406       5       3        0             0 backup.cron
Apr 28 05:35:57 Syrin kernel: [1454017.500184] [ 2930]     0  2930      
981      462       5       3        0             0 rsync.sh
Apr 28 05:35:57 Syrin kernel: [1454017.500187] [ 2965]     0  2965    
13185     1261      30       3        0             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500190] [ 2966]     0  2966      
574      158       5       3        0             0 tee
Apr 28 05:35:57 Syrin kernel: [1454017.500192] [ 2968]     0  2968     
2038      482       7       3        0             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500195] [ 2974]     0  2974    
14142     1189      31       3        0             0 rsync
Apr 28 05:35:57 Syrin kernel: [1454017.500198] [ 2995]     0  2995     
1938      271       6       3        7             0 fcron
Apr 28 05:35:57 Syrin kernel: [1454017.500201] [ 2996]     0  2996      
981       68       6       3        0             0 sh
Apr 28 05:35:57 Syrin kernel: [1454017.500203] Out of memory: Kill 
process 27684 (named) score 904 or sacrifice child
Apr 28 05:35:57 Syrin kernel: [1454017.500234] Killed process 27684 
(named) total-vm:30283556kB, anon-rss:7293164kB, file-rss:0kB, 
shmem-rss:0kB
Apr 28 05:36:00 Syrin kernel: [1454019.937636] oom_reaper: reaped 
process 27684 (named), now anon-rss:0kB, file-rss:0kB, shmem-rss:0kB

All the processes after udevd are just from this day (backup job).

For comparison, the last call trace before was nearly the same:

Apr 15 18:26:37 Syrin kernel: [376620.710330] Call Trace:
Apr 15 18:26:37 Syrin kernel: [376620.710341]  dump_stack+0x67/0x98
Apr 15 18:26:37 Syrin kernel: [376620.710347]  dump_header+0x94/0x20c
Apr 15 18:26:37 Syrin kernel: [376620.710352]  
oom_kill_process+0x24a/0x420
Apr 15 18:26:37 Syrin kernel: [376620.710355]  ? 
oom_badness.part.9+0xd3/0x150
Apr 15 18:26:37 Syrin kernel: [376620.710358]  out_of_memory+0xf9/0x290
Apr 15 18:26:37 Syrin kernel: [376620.710361]  
__alloc_pages_nodemask+0xf48/0xff0
Apr 15 18:26:37 Syrin kernel: [376620.710367]  filemap_fault+0x294/0x4c0
Apr 15 18:26:37 Syrin kernel: [376620.710372]  
ext4_filemap_fault+0x2c/0x40
Apr 15 18:26:37 Syrin kernel: [376620.710376]  __do_fault+0x1f/0xb0
Apr 15 18:26:37 Syrin kernel: [376620.710380]  
__handle_mm_fault+0x3ed/0xad0
Apr 15 18:26:37 Syrin kernel: [376620.710385]  
handle_mm_fault+0xaa/0x1f0
Apr 15 18:26:37 Syrin kernel: [376620.710390]  
__do_page_fault+0x250/0x4f0
Apr 15 18:26:37 Syrin kernel: [376620.710394]  ? page_fault+0x2f/0x50
Apr 15 18:26:37 Syrin kernel: [376620.710396]  page_fault+0x45/0x50
Apr 15 18:26:37 Syrin kernel: [376620.710400] RIP: 
21e50088:0x7f41aedee150
Apr 15 18:26:37 Syrin kernel: [376620.710401] RSP: 
21e50070:0000000000000000 EFLAGS: 7f41aedee150
Apr 15 18:26:37 Syrin kernel: [376620.710405] Mem-Info:
Apr 15 18:26:37 Syrin kernel: [376620.710411] active_anon:1695970 
inactive_anon:212020 isolated_anon:0
Apr 15 18:26:37 Syrin kernel: [376620.710411]  active_file:339 
inactive_file:320 isolated_file:0
Apr 15 18:26:37 Syrin kernel: [376620.710411]  unevictable:1374 dirty:0 
writeback:0 unstable:0
Apr 15 18:26:37 Syrin kernel: [376620.710411]  slab_reclaimable:4311 
slab_unreclaimable:7552
Apr 15 18:26:37 Syrin kernel: [376620.710411]  mapped:1059 shmem:3 
pagetables:16264 bounce:0
Apr 15 18:26:37 Syrin kernel: [376620.710411]  free:11840 free_pcp:0 
free_cma:0

Unfortunately I don't understand all the details. Any help is highly 
appreciated.

I assume it has something to do with tmpfs which will not be freed. Just 
an assumption, I'm searching for clarity, not try and error.

Thanks
Kai




^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [gentoo-user] Machine hangs up with out of memory
       [not found] ` <d9456416-d744-43a4-a9d0-6a70eb8ea0d6@iname.com>
@ 2021-04-29 18:01   ` Kai Peter
  0 siblings, 0 replies; 6+ messages in thread
From: Kai Peter @ 2021-04-29 18:01 UTC (permalink / raw
  To: gentoo-user

On 2021-04-29 08:59, J.O. Aho wrote:
> Your named is taking up 7G or memory, are you sure your configuration
> is correct?
Thanks, good point. There were errors in the logs. I will investigate 
and monitor it.

> --
> 
>  //Aho


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [gentoo-user] Machine hangs up with out of memory
  2021-04-28  9:58 Kai Peter
@ 2021-04-30  1:30 ` Adam Carter
  2021-04-30 10:09   ` Michael
  0 siblings, 1 reply; 6+ messages in thread
From: Adam Carter @ 2021-04-30  1:30 UTC (permalink / raw
  To: Gentoo User

[-- Attachment #1: Type: text/plain, Size: 921 bytes --]

On Wed, Apr 28, 2021 at 7:58 PM Kai Peter <kp@lists.qware.org> wrote:

> Hi,
>
> I have an issue with a machine where I'm not able to detect the real
> root cause. It hangs up totally. It seems like it was running out of
> memory - but why? Hopefully somebody can give me some insight. As far I
> can see right now, it hangs up a few hours after an `emerge --update
> --newuse --deep --with-bdeps=y @world`.
>
> The machine is an Intel Atom with 8 GB RAM (physical, max) and 24 GB
> swap (a file). So 32 GB RAM in total.
>

Might be worth adding zswap using zstd or lz4 to your config (uses more CPU
for less IO)
https://www.kernel.org/doc/html/latest/vm/zswap.html

If you want to see which processes are using the most memory, run top then
type 'M' to have top sort by memory instead of CPU. You can also type 'm'
to make top show the memory numbers as an ascii bar graph. The man page
explains what VIRT RES SHR mean.

[-- Attachment #2: Type: text/html, Size: 1394 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [gentoo-user] Machine hangs up with out of memory
  2021-04-30  1:30 ` Adam Carter
@ 2021-04-30 10:09   ` Michael
  2021-04-30 23:07     ` Kai Peter
  2021-05-02  0:39     ` Adam Carter
  0 siblings, 2 replies; 6+ messages in thread
From: Michael @ 2021-04-30 10:09 UTC (permalink / raw
  To: gentoo-user

[-- Attachment #1: Type: text/plain, Size: 2238 bytes --]

On Friday, 30 April 2021 02:30:51 BST Adam Carter wrote:
> On Wed, Apr 28, 2021 at 7:58 PM Kai Peter <kp@lists.qware.org> wrote:
> > Hi,
> > 
> > I have an issue with a machine where I'm not able to detect the real
> > root cause. It hangs up totally. It seems like it was running out of
> > memory - but why? Hopefully somebody can give me some insight. As far I
> > can see right now, it hangs up a few hours after an `emerge --update
> > --newuse --deep --with-bdeps=y @world`.
> > 
> > The machine is an Intel Atom with 8 GB RAM (physical, max) and 24 GB
> > swap (a file). So 32 GB RAM in total.
> 
> Might be worth adding zswap using zstd or lz4 to your config (uses more CPU
> for less IO)
> https://www.kernel.org/doc/html/latest/vm/zswap.html

Zswap will help optimise the swapping of cold pages out of RAM into the 
compressed zswap cache and eventually push these out to the disk.  This won't 
help with the problem of not having enough RAM to start with for compiling 
packages with large resource requirements, like e.g. Chromium.

With monster packages where more RAM is needed even for a single compilation 
job, after all cold pages have been swapped out, hot pages will start being 
swapped out/in.  I think at this point the same I/O race condition will ensue 
as if no zswap were available, plus a compress/decompress CPU load.  I guess 
the point of starting to thrash the on-disk swap trying to free RAM may start 
sooner, since RAM which would otherwise be available for the single 
prioritised compilation job is now being used by zswap.  So, there would be a 
sweet spot in using zswap to improve I/O performance, but it could end up 
becoming a disbenefit on compilation of RAM hungry packages.

Please correct my reasoning above if I have misunderstood how zswap works.

However, the OP problem here seems to be with a leaky BIND?

I found this mentioned upstream - but have not check BGO:

https://gitlab.isc.org/isc-projects/bind9/-/issues/446
 

> If you want to see which processes are using the most memory, run top then
> type 'M' to have top sort by memory instead of CPU. You can also type 'm'
> to make top show the memory numbers as an ascii bar graph. The man page
> explains what VIRT RES SHR mean.


[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [gentoo-user] Machine hangs up with out of memory
  2021-04-30 10:09   ` Michael
@ 2021-04-30 23:07     ` Kai Peter
  2021-05-02  0:39     ` Adam Carter
  1 sibling, 0 replies; 6+ messages in thread
From: Kai Peter @ 2021-04-30 23:07 UTC (permalink / raw
  To: gentoo-user

On 2021-04-30 12:09, Michael wrote:
> 
> However, the OP problem here seems to be with a leaky BIND?
> 
> I found this mentioned upstream - but have not check BGO:
> 
> https://gitlab.isc.org/isc-projects/bind9/-/issues/446

Thanks for reply. rndc runs on other machines daily, but not on this 
one.

Right now I have corrected my bind config and the memory usage is stable 
by 700MB. The issue here was the statement "listen-on-v6 { any; };". 
bind was trying repeatedly to bind ipv6 to br0 and tap{0,1} devices - 
and fails.

The memory usage is stable since 3 days:

$>ps axuww | grep named
named    14414  0.2  1.4 735040 116952 ?       Ssl  Apr29   4:38 
/usr/sbin/named -u named

$>free -h
                total        used        free      shared  buff/cache   
available
Mem:           7.5Gi       480Mi        55Mi       0.0Ki       7.0Gi     
   6.9Gi
Swap:           23Gi       378Mi        23Gi

Lets see what happens after the next update cycle.


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [gentoo-user] Machine hangs up with out of memory
  2021-04-30 10:09   ` Michael
  2021-04-30 23:07     ` Kai Peter
@ 2021-05-02  0:39     ` Adam Carter
  1 sibling, 0 replies; 6+ messages in thread
From: Adam Carter @ 2021-05-02  0:39 UTC (permalink / raw
  To: Gentoo User

[-- Attachment #1: Type: text/plain, Size: 2163 bytes --]

On Fri, Apr 30, 2021 at 8:09 PM Michael <confabulate@kintzios.com> wrote:

> On Friday, 30 April 2021 02:30:51 BST Adam Carter wrote:
> > On Wed, Apr 28, 2021 at 7:58 PM Kai Peter <kp@lists.qware.org> wrote:
> > > Hi,
> > >
> > > I have an issue with a machine where I'm not able to detect the real
> > > root cause. It hangs up totally. It seems like it was running out of
> > > memory - but why? Hopefully somebody can give me some insight. As far I
> > > can see right now, it hangs up a few hours after an `emerge --update
> > > --newuse --deep --with-bdeps=y @world`.
> > >
> > > The machine is an Intel Atom with 8 GB RAM (physical, max) and 24 GB
> > > swap (a file). So 32 GB RAM in total.
> >
> > Might be worth adding zswap using zstd or lz4 to your config (uses more
> CPU
> > for less IO)
> > https://www.kernel.org/doc/html/latest/vm/zswap.html
>
> Zswap will help optimise the swapping of cold pages out of RAM into the
> compressed zswap cache and eventually push these out to the disk.  This
> won't
> help with the problem of not having enough RAM to start with for compiling
> packages with large resource requirements, like e.g. Chromium.
>

Yep, but the system is already at its full RAM, so we're in
workaround/tactical solution territory. Any system that needs 24gig swap to
support 8gig ram is going to really suck.

>
> With monster packages where more RAM is needed even for a single
> compilation
> job, after all cold pages have been swapped out, hot pages will start
> being
> swapped out/in.  I think at this point the same I/O race condition will
> ensue
> as if no zswap were available, plus a compress/decompress CPU load.  I
> guess
> the point of starting to thrash the on-disk swap trying to free RAM may
> start
> sooner, since RAM which would otherwise be available for the single
> prioritised compilation job is now being used by zswap.  So, there would
> be a
> sweet spot in using zswap to improve I/O performance, but it could end up
> becoming a disbenefit on compilation of RAM hungry packages.
>
> Please correct my reasoning above if I have misunderstood how zswap works.
>

What you've said sounds reasonable to me.

[-- Attachment #2: Type: text/html, Size: 3011 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2021-05-02  0:39 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <C8P97-495-1@gated-at.bofh.it>
     [not found] ` <d9456416-d744-43a4-a9d0-6a70eb8ea0d6@iname.com>
2021-04-29 18:01   ` [gentoo-user] Machine hangs up with out of memory Kai Peter
2021-04-28  9:58 Kai Peter
2021-04-30  1:30 ` Adam Carter
2021-04-30 10:09   ` Michael
2021-04-30 23:07     ` Kai Peter
2021-05-02  0:39     ` Adam Carter

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox