
RE: [Xen-devel] [PATCH] Fix softlockup issue after vcpu hotplug


> Hi, Simon,
>       Your case should be different from what I saw, which may be fixed
> by the original patch I posted, which however doesn't apply to the
> latest. In the 2.6.16 version, it's do_timer that calls softlockup_tick
> instead of run_local_timers. So the check on "stolen > 5s" comes a bit
> too late and still allows the warning to fire, though adjusted later.
> Could you try the attached patch to see whether it fixes your live
> migration case?

So, I tried this last night. I don't see any problems following live
migration, but I am still seeing soft lockups, all of which are related
to cases where there is a large stolen value. I haven't looked at all
the logs yet, but I did see a couple of things:

1. There were a ton of occasions when the test for stolen > 5s fired but
   the value of stolen was actually negative - is a -ve stolen value
   expected? I think the patch needs to be modified to define
   stolen_threshold as s64 instead of u64 if this is expected...

2. Following save/restore, I see absolutely massive positive values of
   stolen, of the order of the time the domain was saved (seems
   reasonable), but then I immediately see a soft lockup even though we
   touched the watchdog. Shouldn't this patch also fix soft lockup after
   save/restore?

3. I actually saw a bunch of cases where there was a mongo stolen value
   during apparently normal operation (in the ones I've looked at, the
   system as a whole was not particularly stressed); I need to work on
   exactly why the domain is not being scheduled, but in the meantime,
   shouldn't this patch stop the incorrect soft lockup in DomU when the
   hypervisor fails to schedule the domain for a long period? (not
   exactly related to VCPU hotplug I
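
To illustrate point 1, here is a minimal sketch of how an unsigned
threshold could make the "stolen > 5s" test fire for a negative stolen
value. It assumes stolen is a signed nanosecond count; the names
threshold_u64, threshold_s64, exceeds_unsigned and exceeds_signed are
made up for illustration and are not taken from the patch.

```c
#include <stdint.h>

#define NSEC_PER_SEC 1000000000LL

/* Hypothetical thresholds: 5 seconds in nanoseconds, unsigned vs signed. */
static const uint64_t threshold_u64 = 5 * NSEC_PER_SEC;
static const int64_t  threshold_s64 = 5 * NSEC_PER_SEC;

/*
 * Comparing a signed value against a u64 converts the signed value to
 * unsigned first (the cast below just makes that conversion explicit).
 * A small negative stolen value becomes a huge positive one, so the
 * test fires.
 */
static int exceeds_unsigned(int64_t stolen)
{
    return (uint64_t)stolen > threshold_u64;
}

/* With a signed threshold, a negative stolen value never exceeds 5s. */
static int exceeds_signed(int64_t stolen)
{
    return stolen > threshold_s64;
}
```

With stolen = -1000000 (-1ms), exceeds_unsigned() reports the threshold
as exceeded while exceeds_signed() does not; for ordinary positive values
the two agree. Whether a negative stolen value should be possible at all
is a separate question.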

