From: "Hemmann, Volker Armin"
To: gentoo-amd64@lists.gentoo.org
Subject: Re: [gentoo-amd64] Re: oom killer problems
Date: Thu, 29 Sep 2005 18:27:50 +0200
References: <200509282235.32195.volker.armin.hemmann@tu-clausthal.de>
Message-Id: <200509291827.50297.volker.armin.hemmann@tu-clausthal.de>

On Thursday 29 September 2005 09:14, Duncan wrote:
> Hemmann, Volker Armin posted
> <200509282235.32195.volker.armin.hemmann@tu-clausthal.de>, excerpted
>
> kdeenablefinal requires HUGE amounts of memory, no doubt about it. I've
> not had serious issues with my gig of memory (dual Opterons as you seem to
> have), using kdeenablefinal here, but I've been doing things rather
> differently than you probably have, and any one of the things I've done
> differently may be the reason I haven't had the memory issue to the
> severity you have.
>

Yeah, but with my 32-bit system even 512 MB was enough for building kdepim
with kdeenablefinal.

>
> The rest of the possibilities may or may not apply. You didn't include
> the output of emerge info, so I can't compare the relevant info from
> your system to mine. However, I suspect they /do/ apply, for reasons
> which should be clear as I present them, below.
>
> 4. It appears (from the snipped stuff) you are running dual CPU (or a
> single dual-core CPU). How many jobs do you have portage configured for?
> With my dual-CPU system, I originally had four set, but after seeing what
> KDE compiling with kdeenablefinal did to my memory resources, even a gig,
> I decided I better reduce that to three! If you have four or more
> parallel jobs set, THAT could very possibly be your problem, right there.
> You can probably do four or more jobs OR kdeenablefinal, but not BOTH, at
> least not BOTH, while running X and KDE at the same time!
>

No, single CPU, single core.
Here is my emerge info:

Portage 2.0.52-r1 (default-linux/amd64/2005.1, gcc-3.4.4, glibc-2.3.5-r1,
2.6.13-gentoo-r2 x86_64)
=================================================================
System uname: 2.6.13-gentoo-r2 x86_64 AMD Athlon(tm) 64 Processor 3200+
Gentoo Base System version 1.12.0_pre8
ccache version 2.4 [disabled]
dev-lang/python: 2.3.5, 2.4.2
sys-apps/sandbox: 1.2.13
sys-devel/autoconf: 2.13, 2.59-r7
sys-devel/automake: 1.4_p6, 1.5, 1.6.3, 1.7.9-r1, 1.8.5-r3, 1.9.6
sys-devel/binutils: 2.16.1
sys-devel/libtool: 1.5.20
virtual/os-headers: 2.6.11-r2
ACCEPT_KEYWORDS="amd64 ~amd64"
AUTOCLEAN="yes"
CBUILD="x86_64-pc-linux-gnu"
CFLAGS="-march=k8 -O2 -fweb -ftracer -fpeel-loops -msse3 -pipe"
CHOST="x86_64-pc-linux-gnu"
CONFIG_PROTECT="/etc /usr/kde/2/share/config /usr/kde/3.4/env /usr/kde/3.4/share/config /usr/kde/3.4/shutdown /usr/kde/3/share/config /usr/lib/X11/xkb /usr/share/config /var/qmail/control"
CONFIG_PROTECT_MASK="/etc/gconf /etc/terminfo /etc/env.d"
CXXFLAGS="-march=k8 -O2 -fweb -ftracer -fpeel-loops -msse3 -pipe"
DISTDIR="/usr/portage/distfiles"
FEATURES="autoconfig distlocks sandbox sfperms strict"
GENTOO_MIRRORS="ftp://ftp.tu-clausthal.de/pub/linux/gentoo/"
LC_ALL="de_DE@euro"
LINGUAS="de"
MAKEOPTS="-j2"
PKGDIR="/usr/portage/packages"
PORTAGE_TMPDIR="/var/tmp"
PORTDIR="/usr/portage"
SYNC="rsync://rsync.gentoo.org/gentoo-portage"
USE="amd64 S3TC X acpi alsa audiofile avi bash-completion berkdb bitmap-fonts
bluetooth bzip2 cairo cdparanoia cdr cpudetection crypt curl dvd dvdr dvdread
emboss emul-linux-x86 encode exif ffmpeg fftw foomaticdb fortran ftp gif gimp
glitz glut glx gnokii gpm gstreamer gtk gtk2 icq id3 imagemagick imlib irmc
jabber java javascrip jp2 jpeg jpeg2k kde kdeenablefinal kdepim lame lesstif
libwww lm_sensors lzo lzw
lzw-tiff mad matroska mjpeg mmap mng motif mp3 mpeg
mpeg2 mplayer mysql ncurses nls no-old-linux nocd nosendmail nowin nptl
nsplugin nvidia offensive ogg openal opengl oscar pam pdflib perl player png
posix python qt quicktime rar readline reiserfs scanner sdl sendfile
sharedmem sms sndfile sockets spell ssl stencil-buffer subtitles svg sysfs
tcpd tga theora tiff transcode truetype truetype-fonts type1 type1-fonts
unicode usb userlocales v4l v4l2 vcd videos visualization vorbis wmf xanim
xine xml xml2 xpm xrandr xsl xv xvid xvmc yv12 zlib zvbi linguas_de
userland_GNU kernel_linux elibc_glibc"
Unset: ASFLAGS, CTARGET, LANG, LDFLAGS, PORTDIR_OVERLAY

As you can see, MAKEOPTS is at -j2.

>
> (Note that the unsermake thing could compound the issue here, because as I
> said, it's better at finding things to run in parallel than the normal
> make system is.)
>
> 5. I'm now running gcc-4.0.1, and have been compiling kde with
> gcc-4.0.0-preX or later since kde-3.4.0. gcc-4.x is still package.mask-ed
> on Gentoo, because some packages still don't compile with it. Of course,
> that's easily worked around because Gentoo slots gcc, so I have the latest
> gcc-3.4.x installed, in addition to gcc-4.x, and can (and do) easily
> switch between them using gcc-config. However, the fact that gcc-4 is
> still masked for Gentoo, means you probably aren't running it, while I am,
> and that's what I compile kde with. The 4.x version is enough different
> from 3.4.x that memory use can be expected to be rather different as well.
> It's quite possible that the kdeenablefinal stuff requires even more
> memory with gcc-3.x than it does with the 4.x I've been successfully
> using.

Hm, I read some stuff on AnandTech showing that apps compiled with gcc4 are
a LOT slower than apps compiled with 3.4 on the amd64 platform.
So I stay away from it until I see some numbers that convince me otherwise,
and until I can be sure that almost everything builds with it ;)

>
> 7. I don't do my kernels thru Gentoo, preferring instead to use the
> kernel straight off of kernel.org. You say kernel 2.6.13-r2, the r2
> indicating a Gentoo revision, but you don't say /which/ Gentoo kernel you
> are running. The VMM is complex enough and has a wide enough variety of
> patches circulating for it, that it's possible you hit a bug that wasn't
> in the mainline kernel.org kernel that I'm running. Or... it may be some
> other factor in our differing kernel configs.

Yes, I did say so, at the bottom of my mail: the kernel is 2.6.13-r2.

> ...
>
> Now to the theory. Why would OOM trigger when you had all that free swap?
> There are two possible explanations I am aware of and maybe others that
> I'm not.
>
> 1. "Memory allocation" is a verb as well as a noun.
>
> We know that enablefinal uses lots of memory. The USE flag description
> mentions that and we've discovered it to be /very/ true. If you run
> ksysguard on your panel as I do, and monitor memory using it as I do (or
> run a VT with a top session running if compiling at the text console), you
> are also aware that memory use during compile sessions, particularly KDE
> compile sessions with enablefinal set, varies VERY drastically! From my
> observations, each "job" will at times eat more and more memory, until
> with kmail in particular, multiple jobs are taking well over 200MB of
> memory apiece! (See why I mentioned parallel jobs above? At 200,
> possibly 300+ MB apiece, multiple parallel jobs eat up the memory VERY
> fast!) After grabbing more and more memory for awhile, a job will
> suddenly complete and release it ALL at once. The memory usage graph will
> suddenly drop multiple hundreds of megabytes -- for ONE job!

I watched the memory consumption with gkrellm2.
At first, there were several hundred MB free, dropping fast to ~150 MB free,
which then dropped more slowly to 20-50 MB free. There it was 'locked' for
some time, when suddenly the oom-killer sprang in (I did not watch gkrellm
continuously; even with a 3200+, kdepim takes more time to build than I can
watch gkrellm without a break).
But the behaviour was the same for 512 MB or 1 GB of RAM.

> Well, during the memory usage increase phase, each job will allocate more
> and more memory, a chunk at a time. It's possible (tho not likely from my
> observations of this particular usage pattern) that an app could want X MB
> of memory all at once, in order to complete the task. Until it gets
> that memory it can't go any further, the task it is trying to do is half
> complete so it can't release any memory either, without losing what it has
> already done. If the allocation request is big enough, (or you have
> several of them in parallel all at the same time that together are big
> enough), it can cause the OOM to trigger even with what looks like quite a
> bit of free memory left, because all available cache and other memory that
> can be freed has already been freed, and no app can continue to the point
> of being able to release memory, without grabbing some memory first. If
> one of them is wanting a LOT of memory, and the OOM killer isn't killing
> it off first (there are various OOM killer algorithms out there, some
> using different factors for picking the app to die than others), stuff
> will start dying to allow the app wanting all that memory to get it.
>
> Of course, it could also be very plainly a screwed up VMM or OOM killer,
> as well. These things aren't exactly simple to get right... and if gcc
> took an unexpected optimization that has side effects...
>
> 2. There is memory and there is "memory", and then there is 'memory' and
> "'memory'" and '"memory"' as well.
>
> There is of course the obvious difference between real/physical and
> swap/virtual memory, with real memory being far faster (while at the same
> time being slower than L2 cache, which is slower than L1 cache, which is
> slower than the registers, which can be accessed at full CPU speed, but
> that's beside the point for this discussion).
>
> That's only the tip of the iceberg, however. From the software's
> perspective, that division mainly affects locked memory vs swappable
> memory. The kernel is always locked memory -- it cannot be swapped, even
> drivers that are never used, the reason it makes sense to keep your kernel
> as small as possible, leaving more room in real memory for programs to
> use. Depending on your kernel and its configuration, various forms of
> RAMDISK, ramfs vs tmpfs vs ... may be locked (or not). Likewise, some
> kernel patches and configs make it easier or harder for applications to
> lock memory as well. Maybe a complicating factor here is that you had a
> lot of locked memory and the compile process required more locked memory
> than was left? I'm not sure how much locked memory a normal process on a
> normal kernel can have, if any, but given both that and the fact that the
> kernel you were running is unknown, it's a possibility.

I don't use ramdisks, and the only tmpfs user is udev, with ~180 kB used.

>
> Then there are the "memory zones". Fortunately, amd64 is less complicated
> in this respect than x86. However, various memory zones do still exist,
> and not only do some things require memory in a specific zone, but it can
> be difficult to transfer in-use memory from one zone to another, even
> where it COULD be placed in a different zone. Up until earlier this
> year, it was often impossible to transfer memory between zones without
> using the backing store (swap). That was the /only/ way possible!
> However, as I said, amd64 is less complicated in this respect than x86, so
> memory zones weren't likely the issue here -- unless something was going
> wrong, of course.
>
> Finally, there's the "contiguous memory" issue. Right after boot, your
> system has lots of free memory, in large blobs of contiguous pages. It's
> easy to get contiguous memory allocated in blocks of 256, 512, and 1024
> pages at once. As uptime increases, however, memory gets fragmented thru
> normal use. A system that has been up awhile will have far fewer 1024
> page blocks immediately available for use, and fewer 512 and 256 page
> blocks as well. Total memory available may be the same, but if it's all in
> 1 and 2 page blocks, it'll take some serious time to move stuff around to
> allocate a 1024 page contiguous block -- if it's even possible to do at
> all. Given the type of memory access patterns I've observed during kde
> merges with enablefinal on, while I'm not technically skilled enough to
> verify my suspicions, of the listed possibilities which are those I know,
> I believe this to be the most likely culprit, the reason the OOM killer
> was activating even while swap (and possibly even main memory) was still
> free.
>
> I'm sure there are other variations on the theme, however, other memory
> type restrictions, and it may have been one of /those/ that it just so
> happened came up short at the time you needed it. In any case, as should
> be quite plain by now, a raw "available memory" number doesn't give
> /anything/ /even/ /close/ to the entire picture, at the detail needed to
> fully grok why the OOM killer was activating, when overall memory wasn't
> apparently in short supply at all.
>
> I should also mention those numbers I snipped. I know enough to just
> begin to make a bit of sense out of them, but not enough to /understand/
> them, at least to the point of understanding what they are saying is
> wrong.
> You can see the contiguous memory block figures for each of the
> DMA and normal memory zones. 4kB pages, so the 1024 page blocks are 4MB.
> I just don't understand enough about the internals to grok either them or
> this log snip, however. I know the general theories and hopefully
> explained them well enough, but don't know how they apply concretely.
> Perhaps someone else does.
>

Thanks for your time. I will try vanilla kernel.org kernels this weekend,
and if there is any difference, I will post again.

Glück Auf
Volker
-- 
gentoo-amd64@gentoo.org mailing list
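
Postscript on the contiguous-memory point quoted above: the arithmetic is
easy to check. What follows is only a minimal Python sketch, assuming 4 kB
pages (as stated in the quote) and the column layout of /proc/buddyinfo on
kernels that provide it, where each number after the zone name is the count
of free blocks of 2**order contiguous pages. The sample line and the helper
names are made up for illustration; they are not the figures from the
snipped log.

```python
PAGE_KB = 4  # assumption: 4 kB pages, as in the quoted text


def block_kb(order):
    """Size in kB of one free block of 2**order contiguous pages."""
    return PAGE_KB * (2 ** order)


def parse_buddyinfo_line(line):
    """Split one /proc/buddyinfo-style line into (zone, counts per order)."""
    fields = line.split()
    # Layout assumed: "Node", "0,", "zone", <zone name>, then one count per order
    zone = fields[3]
    counts = [int(value) for value in fields[4:]]
    return zone, counts


def summarize(counts):
    """Return (total free kB, largest order that still has a free block)."""
    total_kb = sum(n * block_kb(order) for order, n in enumerate(counts))
    largest = max((order for order, n in enumerate(counts) if n > 0), default=None)
    return total_kb, largest


if __name__ == "__main__":
    # Hypothetical, heavily fragmented zone: plenty of small blocks, no big ones
    sample = "Node 0, zone   Normal   9000 3000 500 20 3 0 0 0 0 0 0"
    zone, counts = parse_buddyinfo_line(sample)
    total_kb, largest = summarize(counts)
    print(zone, "free kB:", total_kb)
    print("largest free block kB:", block_kb(largest))
    print("a 1024-page request needs kB:", block_kb(10))
```

With the made-up numbers, the zone has tens of MB free in total, yet its
largest free block is far smaller than the 4 MB a 1024-page request needs.
That is the situation the quoted paragraph describes: the total looks fine,
but a large contiguous allocation still cannot be satisfied right away.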