Re: [Xen-devel] new netfront and occasional receive path lockup
On 09/10/2010 04:50 AM, Pasi Kärkkäinen wrote:
> On Wed, Aug 25, 2010 at 08:51:09AM +0800, Xu, Dongxiao wrote:
>> Hi Christophe,
>>
>> Thanks for finding and checking the problem.
>> I will try to reproduce the issue and check what caused the problem.
>>
> Hello,
>
> Was this issue resolved? Some users have been complaining about
> "network freezing up" issues recently on ##xen on irc..

Yeah, I'll add a command-line parameter to disable smartpoll (and leave
it off by default).

    J

> -- Pasi
>
>> Thanks,
>> Dongxiao
>>
>> Jeremy Fitzhardinge wrote:
>>> On 08/22/2010 09:43 AM, Christophe Saout wrote:
>>>> Hi,
>>>>
>>>> I've been playing with some of the new pvops code, namely DomU guest
>>>> code. What I've been observing on one of the virtual machines is that
>>>> the network (vif) is dying after about ten to sixty minutes of uptime.
>>>> The unfortunate thing here is that I can only reproduce it on a
>>>> production VM and have been unlucky so far to trigger the bug on a
>>>> test machine. While this has not been tragic - rebooting fixed the
>>>> issue, unfortunately I can't spend very much time on debugging after
>>>> the issue pops up.
>>> Ah, OK. I've seen this a couple of times as well. And it just
>>> happened to me then...
>>>
>>>
>>>> Now, what is happening is that the receive path goes dead. The DomU
>>>> can send packets to Dom0 and those are visible using tcpdump on the
>>>> Dom0 on the virtual interface, but not the other way around.
>>> I hadn't got to that level of diagnosis, but I can confirm that
>>> that's what seems to be happening here too.
>>>
>>>> Now, I have done more than one change at a time (I'd like to avoid
>>>> going into pinning it down since I can only reproduce it on a
>>>> production machine, as I said, so suggestions are welcome), but my
>>>> suspicion is that it might have to do with the new "smart polling"
>>>> feature in xen/netfront. Note that I have also updated Dom0 to pull
>>>> in the latest dom0/backend and netback changes, just to make sure
>>>> it's not due to an issue that has been fixed there, but I'm still
>>>> seeing the same.
>>> I agree. I think I started seeing this once I merged smartpoll into
>>> netfront.
>>>
>>> J
>>>
>>>> The production machine is a machine that doesn't have much network
>>>> load, but deals with a lot of small network requests (DNS and smtp
>>>> mostly). A workload which is hard to reproduce on the test machine.
>>>> Heavy network load (NFS, FTP and so on) for days hasn't triggered the
>>>> problem. Also, segmentation offloading and similar settings don't
>>>> have any effect.
>>>>
>>>> The machine has 2 physical and the VM 2 virtual CPUs, DomU has
>>>> PREEMPT enabled.
>>>>
>>>> I've been looking at the code, if there might be a race condition
>>>> somewhere, something like where one could run into a situation where
>>>> the hrtimer doesn't run and Dom0 believes the DomU should be polling
>>>> and doesn't emit an interrupt or something, but I'm afraid I don't
>>>> know enough to judge this (I mean, there are spinlocks which look
>>>> safe to me).
>>>>
>>>> Do you have any suggestions what to try? I can trigger the issue on
>>>> the production VM again, but debugging should not take more than a
>>>> few minutes if it happens. Access is only possible via the console.
>>>> Neither Dom0 nor the guest show anything unusual in the kernel
>>>> message and continue to behave normally after the network goes dead
>>>> (also able to shut down the guest normally).
>>>>
>>>> Thanks,
>>>> Christophe

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
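
For readers following the thread, the failure mode Christophe suspects
(the polling timer stops while Dom0 still believes the DomU is polling,
so no interrupt is sent) can be illustrated with a toy model. This is
only a sketch of the hypothesis and is not the netfront/netback code:
the class names, the frontend_polling flag and buggy_stop_timer() are
all made up for illustration, and nothing here confirms that this is
the actual cause.

```python
from collections import deque


class ToyRing:
    """Toy stand-in for the shared RX ring (not the real Xen ring layout)."""

    def __init__(self):
        self.queue = deque()
        self.frontend_polling = False  # hypothetical "frontend is polling" flag


class ToyBackend:
    """Plays the Dom0 side: queues packets, notifies only if not polling."""

    def __init__(self, ring):
        self.ring = ring

    def deliver(self, packet):
        self.ring.queue.append(packet)
        # Interrupt is suppressed while the frontend claims to be polling.
        return not self.ring.frontend_polling


class ToyFrontend:
    """Plays the DomU side: drains on interrupt, then polls on a timer."""

    def __init__(self, ring):
        self.ring = ring
        self.timer_armed = False
        self.received = []

    def drain(self):
        count = len(self.ring.queue)
        while self.ring.queue:
            self.received.append(self.ring.queue.popleft())
        return count

    def on_interrupt(self):
        self.drain()
        self.ring.frontend_polling = True
        self.timer_armed = True

    def on_timer(self):
        if not self.timer_armed:
            return
        if self.drain() == 0:
            # Idle: stop the timer AND tell the backend to interrupt again.
            self.timer_armed = False
            self.ring.frontend_polling = False

    def buggy_stop_timer(self):
        # Hypothetical failure: the timer stops but the flag stays set.
        self.timer_armed = False


def simulate(buggy):
    ring = ToyRing()
    backend, frontend = ToyBackend(ring), ToyFrontend(ring)
    if backend.deliver("p1"):
        frontend.on_interrupt()
    if buggy:
        frontend.buggy_stop_timer()
    else:
        frontend.on_timer()
    if backend.deliver("p2"):
        frontend.on_interrupt()
    frontend.on_timer()
    return frontend.received, len(ring.queue)


print("healthy:", simulate(buggy=False))
print("buggy:  ", simulate(buggy=True))
```

In the buggy run the second packet stays queued indefinitely: the
backend never raises an interrupt and the frontend never polls, which
matches the symptom of a dead receive path while transmit is unaffected.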