From mboxrd@z Thu Jan  1 00:00:00 1970
To: gentoo-amd64@lists.gentoo.org
From: Duncan <1i5t5.duncan@cox.net>
Subject: [gentoo-amd64] Re: oom killer problems
Date: Thu, 29 Sep 2005 00:14:56 -0700
Organization: Sometimes
Message-ID:
References: <200509282235.32195.volker.armin.hemmann@tu-clausthal.de>
List-Id: Gentoo Linux mail
Reply-to: gentoo-amd64@lists.gentoo.org
Mime-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 8bit
User-Agent: Pan/0.14.2.91 (As She Crawled Across the Table)

Hemmann, Volker Armin posted
<200509282235.32195.volker.armin.hemmann@tu-clausthal.de>, excerpted below,
on Wed, 28 Sep 2005 22:35:32 +0200:

> Hi,
> when I try to emerge kdepim-3.4.2 with the kdeenablefinal use-flag I get
> a lot of oom-kills.
> I got them with 512mb, so I upgraded to 1gig and still have them. What
> puzzles me is, that I have a lot of swap free when it happens.. could
> someone please tell me, why the oom-killer becomes active, when there is
> still a lot of free swap?
> I am just an user, so using easy words would be much appreciated ;)
> [snip]
>
> kernel is 2.6.13-r2
> I have 1gb of ram, and approximatly 1gb of swap.
>
> I emerged kdepim without kdeenablefinal, so there is no big pressure,
> I am
> just curious

There's something to that "lots of swap left" thing, below. However, that's theory, so I'll cover the practical stuff first and leave that aspect for later.

kdeenablefinal requires HUGE amounts of memory, no doubt about it. I've not had serious issues with my gig of memory (dual Opterons, as you seem to have), using kdeenablefinal here, but I've been doing things rather differently than you probably have, and any one of the things I've done differently may be the reason I haven't hit the memory issue to the severity you have.

1. I have swap entirely disabled. Here was my reasoning (apart from the issue at hand). I was reading an explanation of some of the aspects of the kernel VMM (virtual memory manager) on LWN (Linux Weekly News, lwn.net), when I suddenly realized that all the complexity they were describing I could probably do without, by turning off swap, since I'd recently upgraded to a gig of RAM.
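(For anyone curious, turning swap off is only a couple of steps, nothing exotic; the device name below is just a placeholder, substitute your own swap partition:

    # stop using swap immediately
    swapoff -a
    # and comment its line out of /etc/fstab so it stays off across reboots:
    #/dev/hda2   none   swap   sw   0 0

It's just as easy to turn back on later with swapon -a and un-commenting that line, so it's a cheap experiment.)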
I reasoned that I normally ran a quarter to a third of that gig in application memory, so even if I doubled normal use at times, I'd still have a third of a gig of free memory available for cache. Further, I reasoned that if something should use all that memory and STILL run out, it was likely a runaway process gobbling all the memory available, and that I might as well have it activate the OOM killer at a gig, without further bogging the system down, rather than at 2 gig (or whatever), dragging the system down with a swap storm so I couldn't do anything about it anyway.

For the most part, I've been quite happy with my decision, altho now that suspend is starting to look like it'll work for dual CPU systems (suspend to RAM sort of worked, for the first time here, early in the .13 rcs, but they reverted it for the .13 release, as it needed more work), I may enable swap again, if only to get suspend-to-disk functionality. Of course, I'm not saying disabling swap is the right thing for you, but I've been happy with it, here.

Anyway, a gig of RAM, swap disabled, so the VMM complexity that's part of managing swap is also disabled. It's possible that's a factor, tho I'm guessing the stuff below is more likely.

2. Possibly the biggest factor is the KDE packages used. I'm using the split ebuilds, NOT the monolithic category packages. It's possible that's the difference. Further, I don't have all the split packages that compose kdepim-meta merged. I have kmail and knode merged, with dependencies of course, but don't have a handheld to worry about syncing to, so I skipped all those split ebuilds that form part of kdepim-meta (and are part of the monolithic ebuild), except where kmail/knode etc. had them as dependencies. Thus, no kitchensync, korn, kandy, kdepim-kresources, etc. There are therefore two possibilities here. One is that one of the individual apps I skipped requires more memory. The other is that the monolithic ebuild you used does several things at once (possibly due to your jobs setting, see below) where the split ebuilds do them in series, therefore limiting the maximum memory required at a given moment.

3. I'm NOT using unsermake. For some reason, it hasn't worked for me since KDE 3.2 or so. I've tried different versions, but always had either an error or, despite my settings, an ebuild that didn't seem to register unsermake and thus used the normal make system. Unsermake is better at parallelizing the various jobs, making more efficient use of multiple CPUs, but also, given the memory required for enablefinal, likely causes higher memory stress than ordinary GNU make does. If you are using it and it's otherwise working for you, that may be the difference.

The rest of the possibilities may or may not apply. You didn't include the output of emerge info, so I can't compare the relevant info from your system to mine. However, I suspect they /do/ apply, for reasons which should be clear as I present them, below.

4. It appears (from the snipped stuff) you are running dual CPU (or a single dual-core CPU). How many jobs do you have portage configured for? With my dual-CPU system, I originally had four set, but after seeing what compiling KDE with kdeenablefinal did to my memory resources, even a gig, I decided I'd better reduce that to three! If you have four or more parallel jobs set, THAT could very possibly be your problem, right there. You can probably do four or more jobs OR kdeenablefinal, but not BOTH -- at least not while running X and KDE at the same time!
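For reference, the jobs knob I'm talking about is MAKEOPTS in /etc/make.conf. Mine currently reads (three jobs for the two CPUs, after backing off from four):

    MAKEOPTS="-j3"

If yours says -j4 or higher, try dropping it to -j3 (or even -j2) and re-merging kdepim with kdeenablefinal, and see whether the OOM kills go away.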
I should mention that I sometimes run multiple emerges (each with three jobs) in parallel. I *DID* run into OOM issues when trying to do that with kmail and another large KDE package. Kmail is of course part of kdepim, and my experience DOES confirm that it's one of the largest in memory requirements with kdeenablefinal set. I could emerge small things in parallel with it, stuff like kworldwatch, say, but nothing major, like konqueror. Thus, I can almost certainly say that six jobs will trigger the OOM killer when some of them are kmail, and could speculate that five jobs would do it at some point in the kmail compilation. Four jobs may or may not work, but three did, for me, under the conditions explained in the other six points, of course. (Note that the unsermake thing could compound the issue here, because as I said, it's better at finding things to run in parallel than the normal make system is.)

5. I'm now running gcc-4.0.1, and have been compiling kde with gcc-4.0.0-preX or later since kde-3.4.0. gcc-4.x is still package.mask-ed on Gentoo, because some packages still don't compile with it. Of course, that's easily worked around because Gentoo slots gcc, so I have the latest gcc-3.4.x installed in addition to gcc-4.x, and can (and do) easily switch between them using gcc-config. However, the fact that gcc-4 is still masked on Gentoo means you probably aren't running it, while I am, and that's what I compile kde with. The 4.x version is different enough from 3.4.x that memory use can be expected to be rather different as well. It's quite possible that the kdeenablefinal stuff requires even more memory with gcc-3.x than it does with the 4.x I've been successfully using.

6. It's also possible something else in the configuration affects compile-time memory usage. There are CFLAGS, of course, and I'm also running newer (and still masked, AFAIK) versions of binutils and glibc, with patches specifically for gcc-4.

7. I don't do my kernels thru Gentoo, preferring instead to use the kernel straight off of kernel.org. You say kernel 2.6.13-r2, the r2 indicating a Gentoo revision, but you don't say /which/ Gentoo kernel you are running. The VMM is complex enough, and has a wide enough variety of patches circulating for it, that it's possible you hit a bug that isn't in the mainline kernel.org kernel I'm running. Or... it may be some other factor in our differing kernel configs.

...

Now to the theory. Why would OOM trigger when you had all that free swap? There are two possible explanations I am aware of, and maybe others that I'm not.

1. "Memory allocation" is a verb as well as a noun.

We know that enablefinal uses lots of memory. The USE flag description mentions that, and we've discovered it to be /very/ true. If you run ksysguard on your panel as I do, and monitor memory using it as I do (or run a VT with a top session going, if compiling at the text console), you are also aware that memory use during compile sessions, particularly KDE compile sessions with enablefinal set, varies VERY drastically! From my observations, each "job" will at times eat more and more memory, until, with kmail in particular, multiple jobs are taking well over 200MB of memory apiece! (See why I mentioned parallel jobs above? At 200, possibly 300+ MB apiece, multiple parallel jobs eat up the memory VERY fast!) After grabbing more and more memory for a while, a job will suddenly complete and release it ALL at once. The memory usage graph will suddenly drop by multiple hundreds of megabytes -- for ONE job!
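If you don't have ksysguard handy, the same pattern is easy enough to watch from a spare VT while the merge runs, with nothing but the standard procps tools; something like:

    # overall numbers, refreshed every couple of seconds
    watch -n 2 free -m

    # or interactively: run top and hit shift-M to sort by memory use
    top

The big consumers during these merges are the cc1plus compiler processes; you can watch their resident size climb and then vanish as each job finishes.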
Well, during the memory usage increase phase, each job will allocate more and more memory, a chunk at a time. It's possible (tho not likely, from my observations of this particular usage pattern) that an app could want X MB of memory all at once, in order to complete its task. Until it gets that memory it can't go any further, and the task it is trying to do is half complete, so it can't release any memory either, without losing what it has already done. If the allocation request is big enough (or you have several of them in parallel at the same time that together are big enough), it can cause the OOM killer to trigger even with what looks like quite a bit of free memory left, because all available cache and other memory that can be freed has already been freed, and no app can continue to the point of being able to release memory without grabbing some memory first. If one of them wants a LOT of memory, and the OOM killer isn't killing it off first (there are various OOM killer algorithms out there, some using different factors for picking the app to die than others), stuff will start dying to allow the app wanting all that memory to get it.

Of course, it could also very plainly be a screwed-up VMM or OOM killer as well. These things aren't exactly simple to get right... and if gcc took an unexpected optimization that has side effects...

2. There is memory and there is "memory", and then there is 'memory' and "'memory'" and '"memory"' as well.

There is of course the obvious difference between real/physical and swap/virtual memory, with real memory being far faster (while at the same time being slower than L2 cache, which is slower than L1 cache, which is slower than the registers, which can be accessed at full CPU speed, but that's beside the point for this discussion). That's only the tip of the iceberg, however.

From the software's perspective, that division mainly affects locked memory vs swappable memory. The kernel is always locked memory -- it cannot be swapped, even drivers that are never used, which is the reason it makes sense to keep your kernel as small as possible, leaving more room in real memory for programs to use. Depending on your kernel and its configuration, various forms of RAMDISK (ramfs vs tmpfs vs ...) may be locked, or not. Likewise, some kernel patches and configs make it easier or harder for applications to lock memory as well. Maybe a complicating factor here is that you had a lot of locked memory, and the compile process required more locked memory than was left? I'm not sure how much locked memory a normal process on a normal kernel can have, if any, but given both that and the fact that the kernel you were running is unknown, it's a possibility.

Then there are the "memory zones". Fortunately, amd64 is less complicated in this respect than x86. However, various memory zones do still exist, and not only do some things require memory in a specific zone, but it can be difficult to transfer in-use memory from one zone to another, even where it COULD be placed in a different zone. Up until earlier this year, it was often impossible to transfer memory between zones except by going thru the backing store (swap) -- that was the /only/ way possible! However, as I said, amd64 is less complicated in this respect than x86, so memory zones weren't likely the issue here, unless something was going wrong, of course.

Finally, there's the "contiguous memory" issue. Right after boot, your system has lots of free memory, in large blobs of contiguous pages.
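As an aside, the kernel will show you how carved-up those blobs currently are, via the buddy allocator statistics; the numbers below are invented purely for illustration, but the format is real:

    $ cat /proc/buddyinfo
    Node 0, zone      DMA      3      5      5      3      2      1      1      0      1      1      1
    Node 0, zone   Normal    243    109     62     29     12      4      2      1      0      0      0

Each column is the count of free blocks of a given size, doubling from single 4kB pages on the left up to 1024-page (4MB) blocks on the right, per memory zone. On a freshly booted box the right-hand columns look healthy; after a long uptime under memory pressure they tend toward zero.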
It's easy to get contiguous memory allocated in blocks of 256, 512, and 1024 pages at once. As uptime increases, however, memory gets fragmented thru normal use. A system that has been up a while will have far fewer 1024-page blocks immediately available for use, and fewer 512- and 256-page blocks as well. Total memory available may be the same, but if it's all in 1- and 2-page blocks, it'll take some serious time to move stuff around to allocate a 1024-page contiguous block -- if it's even possible to do at all.

Given the type of memory access patterns I've observed during kde merges with enablefinal on, and while I'm not technically skilled enough to verify my suspicions, of the possibilities I know of and have listed, I believe this to be the most likely culprit -- the reason the OOM killer was activating even while swap (and possibly even main memory) was still free. I'm sure there are other variations on the theme, however, other memory type restrictions, and it may have been one of /those/ that just so happened to come up short at the time you needed it.

In any case, as should be quite plain by now, a raw "available memory" number doesn't give /anything/ /even/ /close/ to the entire picture, at the detail needed to fully grok why the OOM killer was activating when overall memory wasn't apparently in short supply at all.

I should also mention those numbers I snipped. I know enough to just begin to make a bit of sense out of them, but not enough to /understand/ them, at least to the point of understanding what they say is wrong. You can see the contiguous memory block figures for each of the DMA and normal memory zones. 4kB pages, so the 1024-page blocks are 4MB. I just don't understand enough about the internals to grok either them or this log snip, however. I know the general theories, and hopefully explained them well enough, but don't know how they apply concretely. Perhaps someone else does.

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."
Richard Stallman in
http://www.linuxdevcenter.com/pub/a/linux/2004/12/22/rms_interview.html

-- 
gentoo-amd64@gentoo.org mailing list