
[Xen-devel] RE: Bizarre pv kernel ultra-high frequency rdtsc?!?



One big clue: looking at /proc/interrupts inside
the PV guest, timer0 interrupts are arriving at
about 300K/second.

I don't remember well how timer interrupts are handled
in a PV guest. Could this high frequency be happening
because the Linux-side PV code is setting a timer,
or because the Xen-side interrupt delivery code is
getting confused?
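
One way to turn two readings of /proc/interrupts into a rate like the
figure above is sketched here in C. It is only an illustration:
irq_line_total() and irq_rate() are hypothetical helper names, and the
sample line format shown in the comment is an assumption about what the
guest prints.

```c
#include <ctype.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/*
 * Sum the per-CPU counters on one /proc/interrupts line.
 * Assumed line shape: "256:   300000   12   Dynamic-irq  timer0"
 * Counting stops at the first token that is not purely numeric.
 */
static uint64_t irq_line_total(const char *line)
{
    const char *p = strchr(line, ':');
    uint64_t total = 0;

    if (p == NULL)
        return 0;
    p++;

    for (;;) {
        char *end;
        uint64_t n;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;

        n = strtoull(p, &end, 10);
        /* A token such as "8259A" starts with digits but is not a counter. */
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;

        total += n;
        p = end;
    }
    return total;
}

/* Interrupts per second between two samples taken `seconds` apart. */
static double irq_rate(uint64_t before, uint64_t after, double seconds)
{
    return (double)(after - before) / seconds;
}
```

Sampling the timer0 line twice, one second apart, and passing both
totals to irq_rate() gives the per-second figure directly.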

> -----Original Message-----
> From: Dan Magenheimer 
> Sent: Friday, November 20, 2009 4:45 PM
> To: Keir Fraser; Jeremy Fitzhardinge
> Cc: Xen-Devel (E-mail)
> Subject: Bizarre pv kernel ultra-high frequency rdtsc?!?
> 
> 
> Hi Jeremy/Keir (and any other PV time experts out there) --
> 
> Working on tsc_mode stuff I've run into a roadblock where
> there is some time-related interaction between xen and a
> PV kernel that I don't understand.  I'm hoping you
> might provide a clue.  There's also a reasonable chance
> that this might be uncovering a significant bug that's
> been around a while, but was never noticed as anything
> other than a vague, barely noticeable slowdown because
> rdtsc was unemulated (and "fast").
> 
> The problem:
> 
> In order to preserve TSC across save/restore/migrate, I
> have implemented a "tsc offset" (and also a "tsc scale"
> but that isn't used yet).
> 
> The result is that the PV kernel starts doing rdtsc's at
> a VERY high frequency (1 MILLION / sec).  I suspect this
> may be a variation of what Jeremy reported at one point
> when emulated rdtsc was first in-tree, but seemed to go away.
> 
> By adding some debug code (and confirmed with xenctx)
> I can see that the millions of rdtsc's are half in
> get_nsec_offset() and half in do_gettimeofday() (presumably
> inlined from get_usec_offset()).  This is a 32-bit 2.6.18-based
> PV kernel, not upstream.  Poring through the 2.6.18 PV time
> code, I can find several places where an essentially infinite
> loop might happen if the version fields are wacko, but
> none where the timestamp contents make any difference
> in control flow, so I don't see how modifying these
> values (by adding the offset) could cause a behavioral
> change in Linux, but obviously a big change is happening!
> 
> I can reproduce the problem with a very simple patch
> on xen-unstable that adds a fake fixed offset in the
> three places I add the "tsc offset", see attached.
> By changing BIG_OFFSET to 0, in this patch, the
> frequency of rdtsc's becomes normal again.
> 
> Suspecting some interaction with wallclock time, I
> tried shutting off ntpd and with/without independent
> wallclock in the PV guest.  No difference.
> 
> I also added debug code to see if the Xen-side code
> was churning through version numbers... it is not.
> 
> Any ideas?  (And, sorry, I know you're on a trans-
> hemisphere trip right now.)
> 
> Thanks,
> Dan
> 
> diff -r bec27eb6f72c xen/arch/x86/time.c
> --- a/xen/arch/x86/time.c     Sat Nov 14 10:32:59 2009 +0000
> +++ b/xen/arch/x86/time.c     Fri Nov 20 16:58:18 2009 -0500
> @@ -813,6 +813,8 @@ s_time_t get_s_time(void)
>  #define version_update_begin(v) (((v)+1)|1)
>  #define version_update_end(v)   ((v)+1)
>  
> +#define BIG_OFFSET 10000000000ULL
> +
>  static void __update_vcpu_system_time(struct vcpu *v, int force)
>  {
>      struct cpu_time       *t;
> @@ -827,7 +829,7 @@ static void __update_vcpu_system_time(st
>  
>      /* Don't bother unless timestamps have changed or we are forced. */
>      if ( !force && (u->tsc_timestamp == (v->domain->arch.vtsc
> -                                         ? t->stime_local_stamp
> +                                         ? t->stime_local_stamp - BIG_OFFSET
>                                           : t->local_tsc_stamp)) )
>          return;
>  
> @@ -835,8 +837,8 @@ static void __update_vcpu_system_time(st
>  
>      if ( v->domain->arch.vtsc )
>      {
> -        _u.tsc_timestamp     = t->stime_local_stamp;
> -        _u.system_time       = t->stime_local_stamp;
> +        _u.tsc_timestamp     = t->stime_local_stamp - BIG_OFFSET;
> +        _u.system_time       = t->stime_local_stamp - BIG_OFFSET;
>          _u.tsc_to_system_mul = 0x80000000u;
>          _u.tsc_shift         = 1;
>      }
> @@ -1598,6 +1600,8 @@ void pv_soft_rdtsc(struct vcpu *v, struc
>  
>      spin_unlock(&v->domain->arch.vtsc_lock);
>  
> +    now -= BIG_OFFSET;
> +
>      regs->eax = (uint32_t)now;
>      regs->edx = (uint32_t)(now >> 32);
>  }
> 
>
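
For readers following the patch, here is a minimal sketch in C of how a
PV guest could turn the published fields into a time value. It is an
illustration under assumptions, not the 2.6.18 code: the
time_info_sketch layout and the scale_delta() and guest_now_ns() helpers
are hypothetical names, the TSC value is passed in as a parameter
instead of being read with rdtsc, and the memory barriers a real reader
needs are only marked in comments. The two version macros are copied
from the patch above.

```c
#include <stdint.h>

/* Writer-side version handling, as in the patch context. */
#define version_update_begin(v) (((v)+1)|1)  /* next odd value: update in progress */
#define version_update_end(v)   ((v)+1)      /* back to even: update complete */

/* Hypothetical, simplified mirror of the per-vcpu time record. */
struct time_info_sketch {
    uint32_t version;            /* odd while the hypervisor is updating */
    uint64_t tsc_timestamp;      /* TSC value at the last update */
    uint64_t system_time;        /* nanoseconds at the last update */
    uint32_t tsc_to_system_mul;  /* 32-bit binary fraction */
    int8_t   tsc_shift;          /* pre-shift applied to the TSC delta */
};

/* Compute ((delta << shift) * mul) >> 32 without a 128-bit type. */
static uint64_t scale_delta(uint64_t delta, uint32_t mul, int8_t shift)
{
    uint64_t hi, lo;

    if (shift < 0)
        delta >>= -shift;
    else
        delta <<= shift;

    hi = (delta >> 32) * mul;
    lo = (delta & 0xffffffffu) * mul;
    return hi + (lo >> 32);
}

/* Retry until a consistent snapshot (even, unchanged version) is seen. */
static uint64_t guest_now_ns(const volatile struct time_info_sketch *t,
                             uint64_t tsc)
{
    uint32_t ver;
    uint64_t ns;

    do {
        ver = t->version;
        /* a real reader needs a read barrier here */
        ns = t->system_time +
             scale_delta(tsc - t->tsc_timestamp,
                         t->tsc_to_system_mul, t->tsc_shift);
        /* ... and another read barrier here */
    } while ((ver & 1) || ver != t->version);

    return ns;
}
```

With tsc_to_system_mul = 0x80000000u and tsc_shift = 1, as set in the
vtsc branch, scale_delta() returns its input unchanged (for deltas below
2^63), so one emulated TSC tick is one nanosecond. In this model a fixed
offset subtracted from both the emulated rdtsc result and tsc_timestamp
cancels in the delta and shows up only through system_time. The sketch
does not establish whether that relates to the behavior reported.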

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

