public inbox for gentoo-user@lists.gentoo.org
From: "J. Roeleveld" <joost@antarean.org>
To: gentoo-user@lists.gentoo.org
Subject: Re: [gentoo-user] Networking trouble
Date: Thu, 15 Oct 2015 15:54:34 +0200
Message-ID: <1637330.AMsFmt32R0@andromeda>
In-Reply-To: <561FAA59.5080707@gc-24.de>

On Thursday, October 15, 2015 03:30:01 PM hw wrote:
> Hi,
> 
> I have a Xen host with some HV guests which becomes unreachable via
> the network after an apparently random amount of time.  I have already
> switched the network card to see if that would make a difference,
> and with the card currently installed, it worked fine for over 20 days
> until it became unreachable again.  Before switching the network card,
> it would run a week or two before becoming unreachable.  The previous
> card was the on-board BCM5764M, which uses the tg3 driver.
> 
> There are messages like this in the log file:
> 
> 
> Oct 14 20:58:02 moonflo kernel: ------------[ cut here ]------------
> Oct 14 20:58:02 moonflo kernel: WARNING: CPU: 10 PID: 0 at net/sched/sch_generic.c:303 dev_watchdog+0x259/0x270()
> Oct 14 20:58:02 moonflo kernel: NETDEV WATCHDOG: enp55s4 (r8169): transmit queue 0 timed out
> Oct 14 20:58:02 moonflo kernel: Modules linked in: arc4 ecb md4 hmac nls_utf8 cifs fscache xt_physdev br_netfilter iptable_filter ip_tables xen_pciback xen_gntalloc xen_gntdev bridge stp llc zfs(PO) nouveau snd_hda_codec_realtek snd_hda_codec_generic zunicode(PO) zavl(PO) zcommon(PO) znvpair(PO) spl(O) zlib_deflate video backlight drm_kms_helper ttm snd_hda_intel snd_hda_controller snd_hda_codec snd_pcm snd_timer snd soundcore r8169 mii xts aesni_intel glue_helper lrw gf128mul ablk_helper cryptd aes_x86_64 sha256_generic hid_generic usbhid uhci_hcd usb_storage ehci_pci ehci_hcd usbcore usb_common
> Oct 14 20:58:02 moonflo kernel: CPU: 10 PID: 0 Comm: swapper/10 Tainted: P           O    4.0.5-gentoo #3
> Oct 14 20:58:02 moonflo kernel: Hardware name: Hewlett-Packard HP Z800 Workstation/0AECh, BIOS 786G5 v03.57 07/15/2013
> Oct 14 20:58:02 moonflo kernel:  ffffffff8175a77d ffff880124d43d98 ffffffff814da8d8 0000000000000001
> Oct 14 20:58:02 moonflo kernel:  ffff880124d43de8 ffff880124d43dd8 ffffffff81088850 ffff880124d43dd8
> Oct 14 20:58:02 moonflo kernel:  0000000000000000 ffff8800d45f2000 0000000000000001 ffff8800d5294880
> Oct 14 20:58:02 moonflo kernel: Call Trace:
> Oct 14 20:58:02 moonflo kernel:  <IRQ>  [<ffffffff814da8d8>] dump_stack+0x45/0x57
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff81088850>] warn_slowpath_common+0x80/0xc0
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810888d1>] warn_slowpath_fmt+0x41/0x50
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff812b31c5>] ? add_interrupt_randomness+0x35/0x1e0
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff8145b819>] dev_watchdog+0x259/0x270
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff8145b5c0>] ? dev_graft_qdisc+0x80/0x80
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff8145b5c0>] ? dev_graft_qdisc+0x80/0x80
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810d4047>] call_timer_fn.isra.30+0x17/0x70
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810d42a6>] run_timer_softirq+0x176/0x2b0
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff8108bd0a>] __do_softirq+0xda/0x1f0
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff8108c04e>] irq_exit+0x7e/0xa0
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff8130e075>] xen_evtchn_do_upcall+0x35/0x50
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff814e1e8e>] xen_do_hypervisor_callback+0x1e/0x40
> Oct 14 20:58:02 moonflo kernel:  <EOI>  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810459e0>] ? xen_safe_halt+0x10/0x20
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff81053979>] ? default_idle+0x9/0x10
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810542da>] ? arch_cpu_idle+0xa/0x10
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff810bd170>] ? cpu_startup_entry+0x190/0x2f0
> Oct 14 20:58:02 moonflo kernel:  [<ffffffff81047cd5>] ? cpu_bringup_and_idle+0x25/0x40
> Oct 14 20:58:02 moonflo kernel: ---[ end trace 98d961bae351244d ]---
> Oct 14 20:58:02 moonflo kernel: r8169 0000:37:04.0 enp55s4: link up
> 
> 
> After that, there are lots of messages about the link being up, one message
> every 12 seconds.  When you unplug the network cable, you get a message that
> the link is down, and no message when you plug it in again.
> 
> I was hoping that switching the network card (to one that uses a different
> driver) might solve the problem, but it did not.  Now I can only guess that
> the network card goes to sleep and sometimes cannot be woken up again.
> 
> I tried to reduce the connection speed to 100Mbit and found that accessing
> the VMs (via RDP) becomes too slow to use them.  So I disabled the power
> management of the network card (through sysfs) and will have to see if the
> problem persists.
> 
> We'll be getting decent network cards in a couple days, but since the
> problem doesn't seem to be related to a particular card/model/manufacturer,
> that might not fix it, either.
> 
> This problem seems to only occur on machines that operate as a xen server.
> Other machines, identical Z800s, not running xen, run just fine.
> 
> What would you suggest?

More info is required:

- Which version of Xen are you running?
- Does this only occur with HVM guests?
- Which network driver are you using inside the guests?
- Can you connect to the "local" console of a guest?
- If yes, does the guest still have no network connectivity?

I saw the same symptoms on my lab machine. There, they were related to:
- Not using the correct drivers inside the HVM guests
- The switch hardware not keeping its MAC/IP/port lists long enough
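
It may also help to pin down exactly when the watchdog timeouts occur, so
they can be compared against guest activity or switch events. Below is a
minimal sketch in Python that pulls the "NETDEV WATCHDOG" lines out of a
syslog-style file and counts them per interface and driver. The function
names and the log path mentioned afterwards are my own assumptions, and the
pattern is only modelled on the one message quoted above, so it will likely
need adjusting for other log formats.

```python
import re
from collections import Counter

# Modelled on the quoted message, one log entry per line:
#   Oct 14 20:58:02 moonflo kernel: NETDEV WATCHDOG: enp55s4 (r8169): transmit queue 0 timed out
WATCHDOG_RE = re.compile(
    r"^(?P<stamp>\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2})\s+\S+\s+kernel:\s+"
    r"NETDEV WATCHDOG:\s+(?P<iface>\S+)\s+\((?P<driver>[^)]+)\):\s+"
    r"transmit queue\s+(?P<queue>\d+)\s+timed out"
)


def find_watchdog_timeouts(lines):
    """Return a (timestamp, interface, driver, queue) tuple per timeout line."""
    events = []
    for line in lines:
        match = WATCHDOG_RE.search(line)
        if match:
            events.append(
                (match["stamp"], match["iface"], match["driver"], int(match["queue"]))
            )
    return events


def summarize(events):
    """Count timeouts per (interface, driver) pair."""
    return Counter((iface, driver) for _, iface, driver, _ in events)


def main(path):
    """Print every timeout found in the log at `path`, then a per-NIC count."""
    with open(path, errors="replace") as log:
        events = find_watchdog_timeouts(log)
    for stamp, iface, driver, queue in events:
        print(f"{stamp}  {iface} ({driver})  queue {queue}")
    for (iface, driver), count in summarize(events).items():
        print(f"{iface} ({driver}): {count} timeout(s)")
```

Calling main() with the path to the host's kernel log (for example
/var/log/messages, if that is where your syslog writes) lists each timeout
with its timestamp, which makes it easier to see whether they cluster around
particular times.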

--
Joost


