Message-ID: <54FA0191.6090900@gmail.com>
Date: Fri, 06 Mar 2015 21:35:45 +0200
From: Alan McKinnon
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Strange network behaviour: NIC goes down, DHCP lease renewal fails (WORKED AROUND)
In-Reply-To: <20150306194535.3727ccb6@marcec.fritz.box>

On 06/03/2015 20:45, Marc Joliet wrote:
> First of all, thanks to everybody who responded so far.
>
> I wanted to preface my reply to Alan by mentioning that the local sysadmin made
> changes to the DHCP server that appear to have worked around whatever the issue
> is.
>
> I don't fully understand the error analysis (something to do with the DHCP
> client reaching a particular state and sending DHCP packets that something
> in-between it and the DHCP server doesn't like and that might result in vendor
> dependent behaviour), but what the DHCP server now does is tell the client to
> use the broadcast address as the DHCP server address (which is weird, because
> the DHCP clients always switch to the broadcast address after a timeout, but of
> course I'm no DHCP expert). The affected PCs have been working normally all
> day today.
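
For reference, the "particular state" and the "switch to the broadcast
address after a timeout" mentioned in the quote match the lease timers in
RFC 2131: a client renews by unicast to its server from T1 onwards and falls
back to broadcast from T2 onwards. The sketch below only illustrates that
timing, assuming the RFC default values (T1 = 0.5 and T2 = 0.875 of the lease
time); a real server can override both with options 58 and 59, and the names
Lease and request_destination are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

BROADCAST = "255.255.255.255"


@dataclass
class Lease:
    """Hypothetical container for the two lease facts the sketch needs."""
    server: str      # address of the DHCP server that granted the lease
    duration: float  # lease time in seconds

    @property
    def t1(self) -> float:
        # RFC 2131 default renewal time (assumed; option 58 can override it)
        return 0.5 * self.duration

    @property
    def t2(self) -> float:
        # RFC 2131 default rebinding time (assumed; option 59 can override it)
        return 0.875 * self.duration


def request_destination(lease: Lease, elapsed: float) -> Tuple[str, Optional[str]]:
    """Return (client state, destination of the next DHCP message)."""
    if elapsed < lease.t1:
        return ("BOUND", None)              # nothing to send yet
    if elapsed < lease.t2:
        return ("RENEWING", lease.server)   # unicast DHCPREQUEST to the server
    if elapsed < lease.duration:
        return ("REBINDING", BROADCAST)     # broadcast DHCPREQUEST
    return ("INIT", BROADCAST)              # lease expired, start over
```

With a one-hour lease this gives unicast renewal attempts from 30 minutes
in and broadcast attempts from 52.5 minutes in, so a path that drops only one
of the two kinds of traffic would show up as a failure in only one of those
windows.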
In light of what you say below:

I'd be interested to hear what your sysadmin has to say; DHCP is one of
those things that JustWork(tm) - it runs over plain UDP (ports 67 and 68)
and there is nothing funny about it at all. The only thing normally between
your NIC and the DHCP server is a switch, so that's what I'd be looking at.

>
> So the current resolution is "it works", but we still don't understand (or at
> least me and my boss don't) what the underlying issue is. Hence I'm still
> curious what people who know these technologies better than me think.
>
> Also, I suppose it was confusing to say that the switch never saw the packets.
> The way this was determined was by post-mortem log inspection; AFAIK we didn't
> do any live inspection on the switch. Based on the workaround, the conclusion
> we came to is that the switch must have dropped the packets (for whatever
> reason) without logging that it did.
>
> On Fri, 6 Mar 2015 08:01:44 +0200
> Alan McKinnon wrote:
>
> [...]
>> I've seen similar things many times myself (but never on Intel network
>> kit so far)
>>
>> A lot of reading and Googling usually leads to the solution:
>>
>> - firmware upgrade for the hardware
>
> OK, I can look into that.
>
>> - use the correct driver (this is often non-obvious)
>> - try the in-kernel driver vs any out-of-tree vendor driver
>
> All PCs run with the e1000e in-kernel module. I think the Fedora systems run
> 3.18.7, so it's about as current as it can be, too. Could it really be that the
> kernel selects the wrong driver?
>
>> - apply driver parameters designed to work around buggy hardware (this
>> often involves much reading)
>
> I will also consider that. I see that the kernel sources contain
> documentation for the e1000e driver that I can look at.

I wasn't aware you had e1000e hardware - those are about as reliable as
they come. I've used many of them and never had the slightest trouble at
all.
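
On the question of whether the kernel picked the wrong driver: which driver
is bound to a NIC can be read straight out of sysfs. A minimal sketch,
assuming a Linux box with sysfs mounted at /sys; the function name
bound_driver and the interface name eth0 are placeholders for this example,
not anything from the affected machines.

```python
import os
from typing import Optional


def bound_driver(iface: str, sysfs_root: str = "/sys") -> Optional[str]:
    """Return the name of the kernel driver bound to a network interface.

    Reads the <sysfs_root>/class/net/<iface>/device/driver symlink, whose
    target directory is named after the driver. Returns None when the
    interface has no such link (for example a virtual interface, or an
    interface name that does not exist).
    """
    link = os.path.join(sysfs_root, "class", "net", iface, "device", "driver")
    if not os.path.islink(link):
        return None
    # The link target ends in the driver name, e.g. .../drivers/e1000e
    return os.path.basename(os.readlink(link))
```

Calling bound_driver("eth0") on one of the affected PCs would be expected to
give "e1000e" if the in-kernel module is really the one in use.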
By all means study up on firmware and driver options - if you don't know
much about that area it's very illuminating to find out more. But based on
experience I'd say the chances of finding an oddity with e1000e are slim,
and I'd be looking at a misconfigured switch. There are some strange
switches out there that let you make crazy configurations, e.g. a blanket
drop of all broadcast traffic on one or more ports. That's where I'd be
looking first.

-- 
Alan McKinnon
alan.mckinnon@gmail.com