
Re: [Xen-devel] Re: [PATCH][HVM] fix VNIF restore failure on HVM guest with heavy workload



On 11/4/07 08:20, "Zhai, Edwin" <edwin.zhai@xxxxxxxxx> wrote:

>> What happens if an interrupt is being processed during save/restore? It
>> would be nice to know what the underlying bug is!
> 
> If a pseudo PCI interrupt occurs after xen_suspend on cpu0, there is
> definitely a crash. I copied this code from the original PV driver code.

Yeah, but in that case: (a) it's for a different reason [make sure no
interrupt handler runs that might look at machine addresses in page tables,
mainly]; and (b) it's backed up by the fact that all other CPUs have been
offlined or stop_machined().

Do you have a crash oops message? I'm just a little concerned we may end up
masking a real save/restore bug here, which we may as well fix while you can
repro.

> SMP is a headache for PV driver save/restore on HVM. Even if we disable
> interrupts on all CPUs, a PV driver on another CPU may still access
> low-level services after xen_suspend on cpu0. smp_suspend is used for PV
> drivers in a PV domain, which is not suitable for HVM because we need
> transparency to the guest.
> 
> Do we need a lightweight stop_machine_run in this case, i.e. make the other
> CPUs sleep?

I'm thinking disable_irq() of the pci-platform irq, coupled with a
smp_call_function() to make the other CPUs spin with interrupts disabled.
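
A rough user-space sketch of that rendezvous, in case it helps the
discussion. It is not kernel code and not the actual patch: pthreads stand in
for the secondary CPUs, and every name in it (NR_OTHER_CPUS,
spin_until_released, suspend_with_rendezvous) is made up for illustration.
In the kernel, the platform IRQ would be masked with disable_irq() first,
the spin function would be handed to smp_call_function(), and each CPU would
wrap its spin loop in local_irq_disable()/local_irq_enable().

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

#define NR_OTHER_CPUS 3            /* hypothetical number of secondary "CPUs" */

static atomic_int  parked;         /* secondaries currently spinning */
static atomic_bool release_cpus;   /* set by "cpu0" once suspend is done */

/* Stand-in for the function a kernel would pass to smp_call_function(). */
static void *spin_until_released(void *unused)
{
    (void)unused;
    /* kernel: local_irq_disable() would go here */
    atomic_fetch_add(&parked, 1);          /* check in with cpu0 */
    while (!atomic_load(&release_cpus))
        ;                                  /* kernel: cpu_relax() */
    /* kernel: local_irq_enable() would go here */
    return NULL;
}

/*
 * Park every secondary, run the critical step, then release them.
 * Returns how many secondaries were parked during the critical step,
 * or -1 if a thread could not be created.
 */
static int suspend_with_rendezvous(void)
{
    pthread_t t[NR_OTHER_CPUS];
    int i, created = 0, seen;

    atomic_store(&parked, 0);
    atomic_store(&release_cpus, false);

    /* kernel: disable_irq() on the platform IRQ before the rendezvous */
    for (i = 0; i < NR_OTHER_CPUS; i++) {
        if (pthread_create(&t[i], NULL, spin_until_released, NULL) != 0)
            break;
        created++;
    }
    if (created != NR_OTHER_CPUS) {
        /* Could not start everyone: let the ones we did start go. */
        atomic_store(&release_cpus, true);
        for (i = 0; i < created; i++)
            pthread_join(t[i], NULL);
        return -1;
    }

    /* Wait until every secondary has checked in and is spinning. */
    while (atomic_load(&parked) != NR_OTHER_CPUS)
        ;

    /* Critical step: the suspend itself would happen here. */
    seen = atomic_load(&parked);
    assert(seen == NR_OTHER_CPUS);

    /* Resume: let the secondaries fall out of their spin loops. */
    atomic_store(&release_cpus, true);
    for (i = 0; i < NR_OTHER_CPUS; i++)
        pthread_join(t[i], NULL);
    /* kernel: enable_irq() on the platform IRQ after the release */
    return seen;
}
```

The point of the check-in counter is that cpu0 does not start the critical
step until it knows every other CPU is already spinning, which is what
closes the window for a PV driver on another CPU to touch low-level services
mid-suspend.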

 -- Keir




 

