[Xen-devel] Re: A race condition introduced by changeset 15175: Re-init hypercall stubs page after HVM save/restore

Hi Dexuan,

Are you really sure that this is the problem? The suspend_lock was
introduced specifically to solve this problem. Note that the BSP takes this
lock before messing with the hypercall page.

 -- Keir

On 7/10/08 11:08, "Cui, Dexuan" <dexuan.cui@xxxxxxxxx> wrote:

> For an SMP Linux HVM guest with PV drivers inserted, when we do save/restore
> (or live migration) for the guest, it might panic after it's restored.
> The panic point is inside ap_suspend():
>  ....
>     while (info->do_spin) {
>         cpu_relax();
>         read_lock(&suspend_lock);
>         HYPERVISOR_yield();      /* guest might panic on the invocation of this function */
>         read_unlock(&suspend_lock);
>     }
> ...
> The root cause is that an AP might be invoking the hypercall while the BSP
> is asking the hypervisor to re-initialize the hypercall page, just after the
> guest has been restored!
> What's the purpose of re-initializing the hypercall page here? Is it to
> improve compatibility in case the source and target hosts have different
> hypercall stub code?
> PS: I'm using c/s 18353 to debug the issue (the latest xen-unstable.hg's S/R
> and L/M are broken by 18383).
> Thanks,
> -- Dexuan
