
Re: [Xen-users] outgoing domU network dies after 135-194 minutes



I did some more testing and found that:
* it always seems to die after 2h15m-2h45m
* when it dies, it dies for all domUs and on all ports at the same time
(or at least within 1 minute)
* external traffic to & from dom0 works fine all the time
* traffic peth0->vif1.0->domU eth0 works fine (tcpdump in domU shows
packets)
* traffic domU ->eth0 ->vif1.0 dies directly: the eth0 TX counter doesn't
change and tcpdump on vif1.0 shows no outgoing traffic (only incoming)
* I restarted one domU after an hour but it died at the same time as the
others, so it seems tied to the uptime of dom0

* besides the fact that firewall rules don't change over time, I have
everything open
* all domU are paravirtualized
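
To pin down exactly when the stall happens relative to uptime, the
counters could be sampled periodically. The following is only a minimal
sketch, assuming the kernel exposes per-interface counters under
/sys/class/net/<iface>/statistics. The function names and the SYSFS_NET
and UPTIME_FILE variables are made up for illustration. Run inside a
domU against eth0, "tx-stalled" matches the symptom above; on dom0 the
directions are mirrored for a vif (packets sent by the guest are counted
as RX on its vif).

```shell
#!/bin/sh
# Sketch only: log when an interface keeps receiving but stops transmitting.
# SYSFS_NET and UPTIME_FILE are illustrative overrides; the defaults are the
# standard Linux locations.
SYSFS_NET="${SYSFS_NET:-/sys/class/net}"
UPTIME_FILE="${UPTIME_FILE:-/proc/uptime}"

# Print "<rx_packets> <tx_packets>" for interface $1.
read_counters() {
    rx=$(cat "$SYSFS_NET/$1/statistics/rx_packets") || return 1
    tx=$(cat "$SYSFS_NET/$1/statistics/tx_packets") || return 1
    echo "$rx $tx"
}

# Print whole minutes since boot (the first field of the uptime file is seconds).
uptime_minutes() {
    awk '{ print int($1 / 60) }' "$UPTIME_FILE"
}

# classify RX_OLD TX_OLD RX_NEW TX_NEW
# "tx-stalled": RX advanced while TX did not.
# "idle": neither counter moved. Anything else is reported as "ok".
classify() {
    if [ "$3" -gt "$1" ] && [ "$4" -eq "$2" ]; then
        echo "tx-stalled"
    elif [ "$3" -eq "$1" ] && [ "$4" -eq "$2" ]; then
        echo "idle"
    else
        echo "ok"
    fi
}

# watch_iface IFACE [INTERVAL_SECONDS]: sample forever, one log line per sample.
watch_iface() {
    iface=$1
    interval=${2:-60}
    counters=$(read_counters "$iface") || return 1
    set -- $counters
    old_rx=$1
    old_tx=$2
    while sleep "$interval"; do
        counters=$(read_counters "$iface") || return 1
        set -- $counters
        state=$(classify "$old_rx" "$old_tx" "$1" "$2")
        echo "$(date '+%Y-%m-%d %H:%M:%S') up=$(uptime_minutes)min $iface rx=$1 tx=$2 $state"
        old_rx=$1
        old_tx=$2
    done
}
```

Something like `watch_iface eth0 60 >> /tmp/txwatch.log` (the log path is
arbitrary) would then leave a timestamped trail. A single "tx-stalled"
sample on a quiet guest proves nothing by itself; a long unbroken run of
them is what would line up with the failure.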

I'm at a loss as to where to look. I have started to move some things over
to a second system, but it can't handle a full failover (not enough disk
and no backup tape), so I need to figure out what's going on here.

What common software could affect all 4 domUs on all 5 network ports
(vif[1-4].*) but not dom0?

/ps

On Wed, 2008-05-21 at 07:56 -0400, Peter Sjoberg wrote:
> I have an openSUSE 10.3 system running Xen 3.1.0, and it's been running fine
> for a few months. 
> This past weekend it suddenly started to act up, and after some
> troubleshooting I can now say that it seems like the guests (domU) lose
> the outgoing network pipe. From the console I can see that the TX
> counter is stuck at the same value, but there are no errors. It behaves as if
> whatever I try to connect to isn't there.
> I can reboot the guest but the problem stays: TX stays at 0 while RX
> counts up.
> Rebooting the host (dom0) solves the problem for a few hours (seems to be
> 2-6h).
> 
> I tried to look for what the problem could be but don't know where to
> look. The closest I got was narrowing it down to this: no network traffic
> is sent out from any domU, and once it happens the domU
> MAC is nowhere to be found outside the domU (checked brctl showmacs &
> on the switch).
> What bothers me most is that it worked fine up until Sunday. I was even
> out of town for a few days before, so I didn't change anything.
> Also, why does it work for a while after reboot?
> 
> My setup is not that strange. I have one domU as a firewall and another in
> two DMZs, so I have my own network-bridge script that calls the stock
> openSUSE script:
> 
> for i in $(seq 0 4); do
>       $dir/network-bridge "$@" vifnum=$i netdev=eth$i bridge=xenbr$i
>       /usr/sbin/ethtool -K eth$i tx off
> done
> 
> and this gives
> # brctl show
> bridge name     bridge id               STP enabled     interfaces
> xenbr0          8000.fefffffff000       no              vif0.0
>                                                         peth0
>                                                         vif2.0
>                                                         vif4.0
> xenbr1          8000.fefffffff001       no              vif0.1
>                                                         peth1
>                                                         vif2.1
>                                                         vif3.0
> xenbr2          8000.fefffffff002       no              vif0.2
>                                                         peth2
>                                                         vif1.0
>                                                         vif2.2
> xenbr3          8000.fefffffff003       no              vif0.3
>                                                         peth3
>                                                         vif2.3
> xenbr4          8000.00508bcfd44d       no              eth4
>                                                         vif2.4
> The kernel and Xen I'm running are stock openSUSE.
> 
> # xm info
> host                   : enterprise
> release                : 2.6.22.17-0.1-xen
> version                : #1 SMP 2008/02/10 20:01:04 UTC
> machine                : x86_64
> nr_cpus                : 2
> nr_nodes               : 1
> sockets_per_node       : 1
> cores_per_socket       : 2
> threads_per_core       : 1
> cpu_mhz                : 2611
> hw_caps                : 178bfbff:ebd3fbff:00000000:00000010:00002001:00000000:0000001f
> total_memory           : 4031
> free_memory            : 0
> max_free_memory        : 1106
> max_para_memory        : 1102
> max_hvm_memory         : 1091
> xen_major              : 3
> xen_minor              : 1
> xen_extra              : .0_15042-51.3
> xen_caps               : xen-3.0-x86_64 xen-3.0-x86_32p hvm-3.0-x86_32 hvm-3.0-x86_32p hvm-3.0-x86_64
> xen_scheduler          : credit
> xen_pagesize           : 4096
> platform_params        : virt_start=0xffff800000000000
> xen_changeset          : 15042
> cc_compiler            : gcc version 4.2.1 (SUSE Linux)
> cc_compile_by          : abuild
> cc_compile_domain      : suse.de
> cc_compile_date        : Thu Dec 20 19:57:34 UTC 2007
> xend_config_format     : 4
> 
> 
> So, where should I look for problems?
> 
> /ps
> 
> 
> _______________________________________________
> Xen-users mailing list
> Xen-users@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-users
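
The bridge check mentioned in the quoted message (looking for the domU
MAC with brctl showmacs) could be repeated across all bridges with
something like the sketch below. It assumes the usual `brctl showmacs`
column layout (port no, mac addr, is local?, ageing timer). The function
names find_mac and check_bridges are made up for illustration.

```shell
#!/bin/sh
# Sketch only: look for one MAC address in `brctl showmacs` output.

# find_mac MAC: read `brctl showmacs <bridge>` output on stdin and print
# "port=<n> local=<yes|no>" for each matching entry; return 1 if none match.
find_mac() {
    awk -v want="$1" '
        BEGIN { want = tolower(want); found = 0 }
        # NR > 1 skips the header line; field 2 is the MAC address.
        NR > 1 && tolower($2) == want {
            printf "port=%s local=%s\n", $1, $3
            found = 1
        }
        END { exit (found ? 0 : 1) }
    '
}

# check_bridges MAC BRIDGE...: report per bridge whether the MAC is known.
check_bridges() {
    mac=$1
    shift
    for br in "$@"; do
        if hit=$(brctl showmacs "$br" | find_mac "$mac"); then
            echo "$br: $hit"
        else
            echo "$br: $mac not learned"
        fi
    done
}
```

With a placeholder guest MAC (not taken from this setup), a call such as
`check_bridges 00:16:3e:00:00:01 xenbr0 xenbr1 xenbr2 xenbr3 xenbr4`
would show on which bridge, if any, the address is currently learned.
Running it before and after the failure would show whether the entry
simply ages out once the guest stops transmitting.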




 

