[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] x86/hvm: fix corrupt ACPI PM-Timer during live migration



On 02/16/2013 01:45 AM, Jan Beulich wrote:
On 14.02.13 at 07:09, Kouya Shimura <kouya@xxxxxxxxxxxxxx> wrote:
@@ -244,21 +245,13 @@ static int handle_pmt_io(
static int pmtimer_save(struct domain *d, hvm_domain_context_t *h)
{
     PMTState *s = &d->arch.hvm_domain.pl_time.vpmt;
-    uint32_t x, msb = s->pm.tmr_val & TMR_VAL_MSB;
     int rc;

     spin_lock(&s->lock);

-    /* Update the counter to the guest's current time.  We always save
-     * with the domain paused, so the saved time should be after the
-     * last_gtime, but just in case, make sure we only go forwards */

So you lose this property of guaranteeing no backward move.

Exactly.
But the backward move is impossible except delay_for_missed_ticks mode.
Since the monotonicity of hvm_get_guest_time() is guaranteed for other modes.
About the problem of delay_for_missed_ticks mode, I slightly mentioned
in the previous mail.

The backward move can be happend as the following:
1) pt_freeze_time     ... real-time:100 guest-time:100   last_gtime:90
2) pmt_timer_callback ... real-time:110 guest-time:110   last_gtime:110
3) pt_thaw_time       ... real-time:120 guest-time:100   last_gtime:110
4) pmt_update_time    ... real-time:125 guest-time:105 < last_gtime:110

So, it can be happend not only in pmtimer_save() but also
in handle_pmt_io().

Maybe pmt_udpate_time() should check the backward move.



-    x = ((s->vcpu->arch.hvm_vcpu.guest_time - s->last_gtime) * s->scale) >> 32;
-    if ( x < 1UL<<31 )
-        s->pm.tmr_val += x;
-    if ( (s->pm.tmr_val & TMR_VAL_MSB) != msb )
-        s->pm.pm1a_sts |= TMR_STS;
     /* No point in setting the SCI here because we'll already have saved the
      * IRQ and *PIC state; we'll fix it up when we restore the domain */
+    pmt_update_time(s, 0);

And using this function you also have the new side effect of
s->last_gtime being updated.

Perhaps the new parameter should be renamed (to, say,
"saving"), and then allow suppressing all these behavioral
changes.


The new side effect is also a bug fix, I think.
Since updating s->pm.tmr_val and s->last_gtime should be
done at the same time. Updating only s->pm.tmr_val is wrong.
Or else, when the vm is resumed on the same machine (migration failure,
check-pointing, etc), the PM-timer value is double counted.



Also, in delay_for_missed_ticks mode you now use a slightly
different time for updating s->pm - did you double check that
this is not going to be a problem? Or else, the flag above could
similarly be used to circumvent this, or hvm_get_guest_time()
could be made return the frozen time (I suppose, but didn't
verify - as it appears to be an assumption already before your
patch -, that pt_freeze_time() runs before pmtimer_save()).

Modifying hvm_get_guest_time() is my thought, too.
But I don't like the idea because the delay_for_missed_ticks mode
is unusual. I wonder who uses it.

Let me think it over.

Thanks,
Kouya


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.