From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from lists.gentoo.org (unknown [208.92.234.80])
    by finch.gentoo.org (Postfix) with ESMTP id DE5021381FA
    for ; Fri, 9 May 2014 15:30:00 +0000 (UTC)
Received: from pigeon.gentoo.org (localhost [127.0.0.1])
    by pigeon.gentoo.org (Postfix) with SMTP id 56DFFE08E1;
    Fri, 9 May 2014 15:29:59 +0000 (UTC)
Received: from mail.gomersbach.nl (unknown [217.198.27.4])
    (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
    (No client certificate requested)
    by pigeon.gentoo.org (Postfix) with ESMTPS id 6DC45E08D9
    for ; Fri, 9 May 2014 15:29:58 +0000 (UTC)
Received: from [192.168.0.4] (tech.ams0.true.nl [87.233.232.233])
    (using TLSv1 with cipher ECDHE-RSA-AES128-SHA (128/128 bits))
    (No client certificate requested)
    (Authenticated sender: mark)
    by mail.gomersbach.nl (Postfix) with ESMTPSA id B09562E0008
    for ; Fri, 9 May 2014 18:22:38 +0200 (CEST)
Message-ID: <536CF473.9050103@gomersbach.nl>
Date: Fri, 09 May 2014 17:29:55 +0200
From: Mark Gomersbach
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:24.0) Gecko/20100101 Thunderbird/24.5.0
Precedence: bulk
List-Post:
List-Help:
List-Unsubscribe:
List-Subscribe:
List-Id: Gentoo Linux mail
X-BeenThere: gentoo-hardened@lists.gentoo.org
Reply-to: gentoo-hardened@lists.gentoo.org
MIME-Version: 1.0
To: gentoo-hardened@lists.gentoo.org
Subject: Re: [gentoo-hardened] Weird coincidental PAX crashes
References: <536CF0F7.3030602@gentoo.org>
In-Reply-To: <536CF0F7.3030602@gentoo.org>
X-Enigmail-Version: 1.6
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
X-Archives-Salt: 978bf945-f4e8-41c9-8ed7-36078fb128a5
X-Archives-Hash: dd6fc1fa7659cb075b81ef4a79669b47

Maybe a bug somewhere else too. Which kernel/grsec/PaX combination was used?

On 05/09/2014 05:15 PM, Michael Orlitzky wrote:
> Last week, the LMTP daemon on our mail server (HP DL360 G6) crashed.
> People noticed that the mail stopped coming in, so I SSHed in to check
> on it, and there were some weird traces in the dmesg. While trying to
> investigate, I noticed some more badness:
>
> # emerge -1 openntpd
> Calculating dependencies... done!
>
> >>> Verifying ebuild manifests
> Killed
>
> At that point I'm thinking, "hardware problem, there goes the weekend."
> Most of my tools are committing suicide so I surrender and reboot. The
> thing comes up fine and has been working ever since.
>
> Today, another one of our web servers (HP DL360 G5?) does the same
> thing. The nightly log report was empty, because there's no syslog
> daemon running. This morning dmesg shows:
>
>> [Fri May 9 11:00:42 2014] PAX: refcount overflow detected in: syslog-ng:21823, uid/euid: 0/0
>> [Fri May 9 11:00:42 2014] CPU: 2 PID: 21823 Comm: syslog-ng Not tainted 3.11.7-hardened-r1 #1
>> [Fri May 9 11:00:42 2014] task: ffff8802cffca080 ti: ffff8802cffca488 task.ti: ffff8802cffca488
>> [Fri May 9 11:00:42 2014] RIP: 0010:[] [] 0xffffffff810e311e
>> [Fri May 9 11:00:42 2014] RSP: 0018:ffff880416f21c78 EFLAGS: 00000a96
>> [Fri May 9 11:00:42 2014] RAX: ffff88041f0048a0 RBX: ffff88041a1edf00 RCX: 0000000040276333
>> [Fri May 9 11:00:42 2014] RDX: 0000000040276332 RSI: 0000000000000000 RDI: ffff88041d858720
>> [Fri May 9 11:00:42 2014] RBP: 0000000000000008 R08: 0000000000010bc0 R09: ffff88042fb10bc0
>> [Fri May 9 11:00:42 2014] R10: 8000000000000000 R11: ffffea000fec3040 R12: ffff88041f0048a0
>> [Fri May 9 11:00:42 2014] R13: ffff88026628ef00 R14: ffff88041d858720 R15: ffff88041a1edf10
>> [Fri May 9 11:00:42 2014] FS: 0000000000000000(0000) GS:ffff88042fb00000(0000) knlGS:0000000000000000
>> [Fri May 9 11:00:42 2014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [Fri May 9 11:00:42 2014] CR2: 0000035fb5abf850 CR3: 000000000138a000 CR4: 00000000000006b0
>> [Fri May 9 11:00:42 2014] Stack:
>> [Fri May 9 11:00:42 2014] 0000000000000000 ffffffff818dde60 ffff8804140ac100 ffff8802cffca570
>> [Fri May 9 11:00:42 2014] ffff8802cffca080 ffff880416eb4200 ffff8802cffca080 ffffffff81052750
>> [Fri May 9 11:00:42 2014] 0000000000000000 0000000000000001 ffff88038e6260d8 ffff8802cffca598
>> [Fri May 9 11:00:42 2014] Call Trace:
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff81052750
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff81036e10
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff810371e8
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff810449cc
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff8100241f
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff81002a89
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff8137c212
>> [Fri May 9 11:00:42 2014] Code: e9 68 fd 01 00 0f 1f 84 00 00 00 00 00 48 8b 43 18 48 8b 7b 10 48 8b 40 30 f0 ff 88 30 01 00 00 71 09 f0 ff 80 30 01 00 00 cd 04 <0f> b7 00 89 c2 66 81 e2 00 b0 66 81 fa 00 20 0f 84 53 ff ff ff
>> [Fri May 9 11:00:42 2014] PAX: refcount overflow detected in: syslog-ng:21823, uid/euid: 0/0
>> [Fri May 9 11:00:42 2014] CPU: 2 PID: 21823 Comm: syslog-ng Not tainted 3.11.7-hardened-r1 #1
>> [Fri May 9 11:00:42 2014] task: ffff8802cffca080 ti: ffff8802cffca488 task.ti: ffff8802cffca488
>> [Fri May 9 11:00:42 2014] RIP: 0010:[] [] 0xffffffff810e311e
>> [Fri May 9 11:00:42 2014] RSP: 0018:ffff880416f21c78 EFLAGS: 00000a96
>> [Fri May 9 11:00:42 2014] RAX: ffff88041f0048a0 RBX: ffff88041a1edc00 RCX: 0000000040c384f8
>> [Fri May 9 11:00:42 2014] RDX: 0000000040c384f7 RSI: 0000000000000000 RDI: ffff88041d858720
>> [Fri May 9 11:00:42 2014] RBP: 0000000000000008 R08: 0000000000010b60 R09: ffff88042fb10b60
>> [Fri May 9 11:00:42 2014] R10: 8000000000000000 R11: ffffea000f26a840 R12: ffff88041f0048a0
>> [Fri May 9 11:00:42 2014] R13: ffff88026628e000 R14: ffff88041d858720 R15: ffff88041a1edc10
>> [Fri May 9 11:00:42 2014] FS: 0000000000000000(0000) GS:ffff88042fb00000(0000) knlGS:0000000000000000
>> [Fri May 9 11:00:42 2014] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [Fri May 9 11:00:42 2014] CR2: 0000035fb5abf850 CR3: 000000000138a000 CR4: 00000000000006b0
>> [Fri May 9 11:00:42 2014] Stack:
>> [Fri May 9 11:00:42 2014] 0000000000000000 ffffffff818dde60 ffff88041a1ed400 ffff8802cffca570
>> [Fri May 9 11:00:42 2014] ffff8802cffca080 ffff880416eb4200 ffff8802cffca080 ffffffff81052750
>> [Fri May 9 11:00:42 2014] 0000000000000000 0000000000000001 ffff88038e6260d8 ffff8802cffca598
>> [Fri May 9 11:00:42 2014] Call Trace:
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff81052750
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff81036e10
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff810371e8
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff810449cc
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff8100241f
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff81002a89
>> [Fri May 9 11:00:42 2014] [] ? 0xffffffff8137c212
>> [Fri May 9 11:00:42 2014] Code: e9 68 fd 01 00 0f 1f 84 00 00 00 00 00 48 8b 43 18 48 8b 7b 10 48 8b 40 30 f0 ff 88 30 01 00 00 71 09 f0 ff 80 30 01 00 00 cd 04 <0f> b7 00 89 c2 66 81 e2 00 b0 66 81 fa 00 20 0f 84 53 ff ff ff
>
>
> And things are segfaulting randomly. These machines have been running
> 3.11.7-hardened-r1 since 2014-01-03 without issue until now -- all of
> our servers have. So the timing seems a little coincidental.
>
> If it's not hardware (two different machines...), does this look like a
> kernel bug? Should I upgrade over the weekend and pray?
>
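
For anyone triaging a similar report, the kernel build and the affected process can be read straight off the quoted dmesg: the "CPU: ... Comm: ..." line that follows each "PAX: refcount overflow" line names both. The following is only a rough sketch of pulling those fields out with a script. The function name `summarize_pax_reports` and both regular expressions are assumptions modelled on the two line shapes quoted above, not on any documented log format, and other kernel versions may print these lines differently.

```python
import re

# Assumed line shapes, copied from the dmesg excerpt quoted in this thread.
PAX_RE = re.compile(
    r"PAX: refcount overflow detected in: (?P<comm>[^:]+):(?P<pid>\d+), "
    r"uid/euid: \d+/\d+"
)
CPU_RE = re.compile(
    r"CPU: \d+ PID: (?P<pid>\d+) Comm: \S+ "
    r"(?:Not tainted|Tainted: \S+) (?P<kernel>\S+) #\d+"
)


def summarize_pax_reports(dmesg_text):
    """Return (comm, pid, kernel) for each PAX refcount report found.

    A report is counted when a PAX line is followed by a CPU line
    carrying the same PID (hypothetical pairing rule for this sketch).
    """
    reports = []
    pending = None  # (comm, pid) from the most recent PAX line
    for line in dmesg_text.splitlines():
        pax = PAX_RE.search(line)
        if pax:
            pending = (pax.group("comm"), int(pax.group("pid")))
            continue
        cpu = CPU_RE.search(line)
        if cpu and pending and int(cpu.group("pid")) == pending[1]:
            reports.append((pending[0], pending[1], cpu.group("kernel")))
            pending = None
    return reports


# Two lines taken from the quoted trace, with the mail quoting stripped.
SAMPLE = (
    "[Fri May 9 11:00:42 2014] PAX: refcount overflow detected in: "
    "syslog-ng:21823, uid/euid: 0/0\n"
    "[Fri May 9 11:00:42 2014] CPU: 2 PID: 21823 Comm: syslog-ng "
    "Not tainted 3.11.7-hardened-r1 #1\n"
)

if __name__ == "__main__":
    for comm, pid, kernel in summarize_pax_reports(SAMPLE):
        print(f"{comm} (pid {pid}) on kernel {kernel}")
```

Fed the full `dmesg` output instead of `SAMPLE`, this would list one entry per report, which makes it easier to compare the kernel build and affected processes across the two machines.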