
Re: [Xen-devel] [RFC] e1000: Don't save writes to ICS/ICR masked by IMS



On Thu, Sep 15, 2016 at 2:15 AM, Denis V. Lunev <den@xxxxxxxxxx> wrote:
> On 09/13/2016 11:59 PM, Konrad Rzeszutek Wilk wrote:
> > On Thu, Sep 01, 2016 at 10:57:48AM -0700, Ed Swierk wrote:
> >> Windows 8, 10 and Server 2012 guests hang intermittently while booting
> >> on Xen 4.5.3 with 1 vCPU and 4 e1000 vNICs, shortly after the Windows
> >> logo appears and the little dots start spinning.
> >>
> >> Running strace on qemu shows its main thread doing the following every
> >> couple of milliseconds:
> >>
> >>  ppoll([..., {fd=30</dev/xen/evtchn>, events=POLLIN|POLLERR|POLLHUP},
> >>         ...], ...) = 1 ([{fd=30, revents=POLLIN}], ...)
> >>  read(30</dev/xen/evtchn>, "^\0\0\0", 4) = 4
> >>  write(30</dev/xen/evtchn>, "^\0\0\0", 4) = 4
> >>  ioctl(30</dev/xen/evtchn>, IOCTL_EVTCHN_NOTIFY, 0x7f1f9449d310) = 0
> >>  clock_gettime(CLOCK_MONOTONIC, {6937, 449468262}) = 0
> >>  clock_gettime(CLOCK_MONOTONIC, {6937, 449582903}) = 0
> >>  gettimeofday({1472251376, 673434}, NULL) = 0
> >>  clock_gettime(CLOCK_MONOTONIC, {6937, 449856205}) = 0
> >>  gettimeofday({1472251376, 673679}, NULL) = 0
> >>
> >> The event channel (identified by '^' or 94 in this example) is always
> >> the third of the domain's four channels.
> >>
> >> Two recent qemu patches (http://git.qemu.org/?p=qemu.git;h=9596ef7c and
> >> http://git.qemu.org/?p=qemu.git;h=74004e8c) seem to address similar
> >> issues, but don't help in this case.
> >>
> >> The proposed fix from
> >> https://bugzilla.redhat.com/show_bug.cgi?id=874406#c78 makes the hang
> >> go away. It's not clear to me why it works, or if it's just papering
> >> over a bug elsewhere, or if there are any possible side effects.
> > CC-ing Denis.
> >
> >
> > Is the fix below based on reading the spec or more of instrumenting?
> >
> > Thanks.
>
> hmm. I have looked in our older code (completely separate
> from QEMU). It does not have this trick.
>
> 2012r2 (as far as I remember) has a bug in the driver
> when LSC interrupt was raised unexpectedly during driver
> initialization. The original bug was about TXQE. Can you
> pls confirm which interrupt causes the storm?

How can I tell which interrupt is firing? Instrument the QEMU e1000 code?
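
If instrumenting is the way to go, one possible approach (a sketch only, not
tested against this hang) would be to log the raw ICR and IMS values wherever
the e1000 model raises its interrupt, for instance in set_interrupt_cause() in
hw/net/e1000.c, and then decode the logged values offline. The helper below
assumes such logging already exists; decode_causes() and firing_causes() are
hypothetical names, and only the bit values are taken from the Intel 8254x
register definitions (the same ones Linux's e1000_hw.h uses).

```python
# Cause bits shared by the e1000 ICR, ICS, IMS and IMC registers.
# Bit values follow the Intel 8254x register definitions.
ICR_BITS = {
    0x00000001: "TXDW",    # transmit descriptor written back
    0x00000002: "TXQE",    # transmit queue empty
    0x00000004: "LSC",     # link status change
    0x00000008: "RXSEQ",   # receive sequence error
    0x00000010: "RXDMT0",  # receive descriptor minimum threshold reached
    0x00000040: "RXO",     # receiver overrun
    0x00000080: "RXT0",    # receiver timer interrupt
}


def decode_causes(value):
    """Return the names of the cause bits set in a raw register value."""
    names = [name for bit, name in sorted(ICR_BITS.items()) if value & bit]
    # Report bits this table does not cover instead of silently dropping them.
    unknown = value & ~sum(ICR_BITS)
    if unknown:
        names.append(f"unknown(0x{unknown:x})")
    return names


def firing_causes(icr, ims):
    """Causes that are both pending (ICR) and unmasked (IMS).

    Only these can actually assert the interrupt line to the guest.
    """
    return decode_causes(icr & ims)


if __name__ == "__main__":
    # Hypothetical logged values: TXQE and LSC pending, only LSC unmasked.
    print(firing_causes(0x06, 0x04))
```

Running the decoder over every logged (ICR, IMS) pair during the storm should
show whether LSC, TXQE or something else is the cause that keeps repeating.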

--Ed

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
