
[Xen-devel] Re: A race condition introduced by changeset 15175: Re-init hypercall stubs page after HVM save/restore

Keir Fraser <keir.fraser <at> eu.citrix.com> writes:

> Hi Dexuan,
> Are you really sure that this is the problem? The suspend_lock was
> introduced specifically to solve this problem. Note that the BSP takes this
> lock before messing with the hypercall page.
>  -- Keir

I'm also looking at this now (I'm on 3.1.4, BTW). I see both the hang and the
panic. It appears the hang occurs because the "master" vcpu is trying to catch
the other vcpus right at the cpu_relax() so that it can grab the lock in write
mode. With many vcpus that just doesn't happen. I'm not sure I like this design
very much; I'm going to try to modify it a bit.
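
To make the starvation argument concrete, here is a toy model, offered as a sketch only. It is not the PV driver code and not the real Linux rwlock: the lock is reduced to a reader count, time is counted in abstract ticks, and each simulated AP is assumed to start its loop one tick after the previous one. The names toy_rwlock, toy_write_trylock and first_writer_window are made up for this illustration. The function reports the first tick at which a write trylock would succeed, i.e. the first moment when every AP is in its cpu_relax() gap at once.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model (assumption): a reader-preferring rwlock reduced to a count. */
struct toy_rwlock {
    int readers;
};

/* The writer can only get in while no reader holds the lock. */
static bool toy_write_trylock(const struct toy_rwlock *l)
{
    return l->readers == 0;
}

/*
 * Each simulated AP repeats a loop of `period` ticks: it holds the read
 * lock for the first (period - gap) ticks, then sits outside the lock for
 * `gap` ticks (the cpu_relax() window). AP number `ap` is shifted by `ap`
 * ticks relative to AP 0 (an assumed, fixed stagger).
 *
 * Returns the first tick in [0, max_ticks) at which a write trylock would
 * succeed, or -1 if there is no such tick.
 */
static int first_writer_window(int nr_aps, int period, int gap, int max_ticks)
{
    assert(nr_aps >= 0 && period > 0 && gap >= 0 && gap <= period);

    for (int t = 0; t < max_ticks; t++) {
        struct toy_rwlock lock = { 0 };

        /* Count how many APs are inside the read-locked region at tick t. */
        for (int ap = 0; ap < nr_aps; ap++) {
            int phase = (t + ap) % period;
            if (phase < period - gap)
                lock.readers++;
        }

        if (toy_write_trylock(&lock))
            return t;
    }
    return -1;
}
```

With period 10 and gap 2, one or two APs still leave a window for the writer, but three or more staggered APs never do, however long the writer keeps trying. Real timing is not this regular, so the model only shows the shape of the problem: the more vcpus there are, the less likely they are all outside the read lock at the same instant.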


> On 7/10/08 11:08, "Cui, Dexuan" <dexuan.cui <at> intel.com> wrote:
> > For an SMP Linux HVM guest with PV drivers inserted, when we do save/restore
> > (or LiveMigration) for the guest, it might panic after it's restored.
> > The panic point is inside ap_suspend():
> >  ....
> >     while (info->do_spin) {
> >         cpu_relax();
> >         read_lock(&suspend_lock);
> >         HYPERVISOR_yield();      ----> guest might panic on the invocation of this function.
> >         read_unlock(&suspend_lock);
> >     }
> > ...
> > 

Xen-devel mailing list


