Re: [Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On 09/24/2010 08:50 PM, Jeremy Fitzhardinge wrote:
> On 09/24/2010 12:14 AM, Andrew Jones wrote:
>> On 09/23/2010 08:36 PM, Jeremy Fitzhardinge wrote:
>>> On 09/23/2010 09:38 AM, Paolo Bonzini wrote:
>>>> On 09/23/2010 06:23 PM, Jeremy Fitzhardinge wrote:
>>>>>> Any developments with this? I've got a report of the exact same
>>>>>> warnings on RHEL6 guest. See
>>>>>>
>>>>>> https://bugzilla.redhat.com/show_bug.cgi?id=632802
>>>>>>
>>>>>> RHEL6 doesn't have the 'Move blkif_interrupt into a tasklet' patch, so
>>>>>> that can be ruled out. Unfortunately I don't have this reproducing on a
>>>>>> test machine, so it's difficult to debug. The report I have showed that
>>>>>> in at least one case it occurred on boot up, right after initting the
>>>>>> block device. I'm trying to get confirmation if that's always the case.
>>>>>>
>>>>>> Thanks in advance for any pointers you might have.
>>>>> Yes, I see it even after reverting that change as well. However I only
>>>>> see it on my domain with an XFS filesystem, but I haven't dug any deeper
>>>>> to see if that's relevant.
>>>>>
>>>>> Do you know when this appeared? Is it recent? What changes are in the
>>>>> rhel6 kernel in question?
>>>> It's got pretty much everything in stable-2.6.32.x, up to the 16 patch
>>>> blkfront series you posted last July. There are some RHEL-specific
>>>> workarounds for PV-on-HVM, but for PV domains everything matches
>>>> upstream.
>>> Have you tried bisecting to see when this particular problem appeared?
>>> It looks to me like something is accidentally re-enabling interrupts -
>>> perhaps a stack overrun is corrupting the "flags" argument between a
>>> spin_lock_irqsave()/restore pair.
>>>
>> Unfortunately I don't have a test machine where I can do a bisection
>> (yet). I'm looking for one. I only have this one report so far, and it's
>> on a production machine.
>
> The report says that its repeatedly killing the machine though? In my
> testing, it seems to hit the warning once at boot, but is OK after that
> (not that I'm doing anything very stressful on the domain).
>

It looks like the crash is from failing to read swap due to a bad page
map. It's possibly another issue, but I wanted to try and clean this
issue up first to see what happens.

>>> Is it only on 32-bit kernels?
>>>
>> This one report I have is a 32b guest on a 64b host.
>
> Is it using XFS by any chance? So far I've traced the re-enable to
> xfs_buf_bio_end_io(). However, my suspicion is that it might be related
> to the barrier changes we did.
>

I'll check on the xfs and let you know.

> J
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxx
> http://lists.xensource.com/xen-devel