From: Kerin Millar
To: gentoo-server@lists.gentoo.org
Subject: Re: [gentoo-server] DoS Analysis and Prevention
Date: Sun, 28 Jul 2013 15:01:30 +0100
Message-ID: <51F5243A.2080208@fastmail.co.uk>

On 15/04/2013 16:07, Christian Parpart wrote:
> Hey all,
>
> we hit some nice traffic last night that took our main gateway down.
> Pacemaker was configured to fail over to our second one, but that one
> died as well.
>
> In a little post-analysis, I found the following in the logs:
>
> Apr 14 21:42:11 cesar1 kernel: [27613652.439846] BUG: soft lockup - CPU#4 stuck for 22s! [swapper/4:0]
> Apr 14 21:42:11 cesar1 kernel: [27613652.440319] Stack:
> Apr 14 21:42:11 cesar1 kernel: [27613652.440446] Call Trace:
> Apr 14 21:42:11 cesar1 kernel: [27613652.440595]
> Apr 14 21:42:12 cesar1 kernel: [27613652.440828]
> Apr 14 21:42:12 cesar1 kernel: [27613652.440979] Code: c1 51 da 03 81 48 c7 c2 4e da 03 81 e9 dd fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 <89> c2
> Apr 14 21:42:12 cesar1 CRON[13599]: nss_ldap: could not connect to any LDAP server as cn=admin,dc=rz,dc=dawanda,dc=com - Can't contact LDAP server
> Apr 14 21:42:12 cesar1 CRON[13599]: nss_ldap: could not search LDAP server - Server is unavailable
> Apr 14 21:42:24 cesar1 crmd: [7287]: ERROR: process_lrm_event: LRM operation management-gateway-ip1_stop_0 (917) Timed Out (timeout=20000ms)
> Apr 14 21:42:48 cesar1 kernel: [27613688.611501] BUG: soft lockup - CPU#7 stuck for 22s! [named:32166]
> Apr 14 21:42:48 cesar1 kernel: [27613688.611914] Stack:
> Apr 14 21:42:48 cesar1 kernel: [27613688.612036] Call Trace:
> Apr 14 21:42:48 cesar1 kernel: [27613688.612200]
> Apr 14 21:42:48 cesar1 kernel: [27613688.612408]
> Apr 14 21:42:48 cesar1 kernel: [27613688.612626] Code: c1 51 da 03 81 48 c7 c2 4e da 03 81 e9 dd fe ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 55 b8 00 00 01 00 48 89 e5 f0 0f c1 07 <89> c2
> Apr 14 21:42:55 cesar1 kernel: [27613695.946295] BUG: soft lockup - CPU#0 stuck for 21s! [ksoftirqd/0:3]
> Apr 14 21:42:55 cesar1 kernel: [27613695.946785] Stack:
> Apr 14 21:42:55 cesar1 kernel: [27613695.946917] Call Trace:
> Apr 14 21:42:55 cesar1 kernel: [27613695.947137] Code: c4 00 00 81 a8 44 e0 ff ff ff 01 00 00 48 63 80 44 e0 ff ff a9 00 ff ff 07 74 36 65 48 8b 04 25 c8 c4 00 00 83 a8 44 e0 ff ff 01 <5d> c3
>
> We're using irqbalance so that incoming traffic doesn't hit only the
> first CPU with ethernet card hardware interrupts (a lesson learned
> from the last, much more intensive DDoS).

Using irqbalance is wise. You could also try receive packet steering
[1] [2]:

#!/bin/bash

iface='eth*'
flow=16384

# Size the global flow table, enabling receive flow steering (RFS)
echo $flow > /proc/sys/net/core/rps_sock_flow_entries

# For each receive queue, rewrite the CPU mask so that all CPUs are
# eligible steering targets, and set the per-queue flow count
queues=(/sys/class/net/${iface}/queues/rx-*)
for rx in "${queues[@]}"; do
    echo $(sed -e 's/0/f/g' < $rx/rps_cpus) > $rx/rps_cpus
    echo $flow > $rx/rps_flow_cnt
done

I have found this to be beneficial on systems running networking
applications that are subject to a high load, but not on systems that
simply forward packets and process them entirely in kernel space.

> However, since this did not help, I'd like to find out what else we
> can do. Our gateway has to do NAT and has a few other iptables rules
> that it needs in order to run OpenStack behind it, so I can't just
> drop it.
>
> Regarding the logs, I can see that something caused the CPU cores to
> get stuck for a number of different processes.
> Has anyone ever encountered the error messages I quoted above, or
> does anyone know

I used to encounter them, but they cleared up at some point during the
3.4 (longterm) kernel series. If you are also using the 3.4 series, I
would advise upgrading if running < 3.4.51. If you are not using a
longterm kernel, consider doing so unless there is a feature in a later
kernel that you cannot do without. In my experience, the later 'stable'
kernels have a tendency to introduce serious regressions.

> of other things one might want to do in order to prevent huge volumes
> of unsolicited incoming traffic from bringing a Linux node down?

If you can, talk to your upstream provider to see whether there is a
way in which such traffic can be throttled before it reaches you.

Be sure to use good quality NICs. In particular, they should support
multiqueue and adjustable interrupt coalescing (preferably on a dynamic
basis). For what it's worth, I'm using Intel 82576 based cards for busy
hosts; these support dynamic interrupt throttling. Even without such a
feature, some cards allow their behaviour to be altered with ethtool -C.
Google will turn up a lot of information on this topic.
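To give a rough idea, coalescing can be inspected and adjusted along
these lines. This is only a sketch: eth0 is a stand-in for your
interface, the value of 100 is arbitrary, and not every driver accepts
every parameter, so check the read-out first.

# Show the current interrupt coalescing settings
ethtool -c eth0

# Prefer the driver's adaptive moderation, where offered
ethtool -C eth0 adaptive-rx on

# Failing that, raise rx-usecs so that more packets are batched per
# interrupt (trading a little latency for throughput under load)
ethtool -C eth0 rx-usecs 100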
I should add that the stability of the driver is of paramount
importance. Though my Intel cards have been solid, the igb driver
bundled with the 3.4 kernel is not, which took me a long time to figure
out. I now use a local ebuild to compile the igb driver from upstream.
Not only did it improve performance, but it also resolved all of the
stability issues that I had experienced up until then.

In the event that you are also using the igb driver, ensure that it is
configured optimally for multiqueue. Here's an example for the upstream
driver (my NIC has 4 ports, each with 8 queues):

# cat /etc/modprobe.d/igb.conf
options igb RSS=8,8,8,8

Enable I/OAT if your hardware supports it. Some hardware supports it
but fails to expose a BIOS option to enable it, in which case you can
try using dca_force [3] (YMMV). Similarly, make use of x2APIC if
supported, but do not use the IOMMU provided by Intel as of Nehalem
(boot with intel_iommu=off if in doubt).

Consider fine-tuning the settings in sysctl.conf, especially those
pertaining to buffer sizes/limits. I would consider this essential if
operating at gigabit speeds or higher. Examples are widespread, such as
in section 3.1 of the Mellanox performance tuning guide [4].
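By way of illustration only, such tuning usually revolves around
entries of the following sort. The values shown are assumptions for a
gigabit link rather than recommendations; derive your own from the
guides and measure before and after.

# /etc/sysctl.conf - illustrative buffer limits for ~1GbE
net.core.rmem_max = 16777216
net.core.wmem_max = 16777216
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_wmem = 4096 65536 16777216
net.core.netdev_max_backlog = 30000

Apply with sysctl -p.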
--Kerin

[1] https://lwn.net/Articles/361440/
[2] http://thread.gmane.org/gmane.linux.network/179883/focus=179976
[3] https://github.com/ice799/dca_force
[4] http://www.mellanox.com/related-docs/prod_software/Performance_Tuning_Guide_for_Mellanox_Network_Adapters_rev_1_0.pdf