Re: [Xen-devel] [PATCH v9] new config option vtsc_tolerance_khz to avoid TSC emulation
On 07/06/18 14:08, Olaf Hering wrote:
> Add an option to control when vTSC emulation will be activated for a
> domU with tsc_mode=default. Without such option each TSC access from
> domU will be emulated, which causes a significant perfomance drop for
> workloads that make use of rdtsc.
>
> One option to avoid the TSC option is to run domUs with tsc_mode=native.
> This has the drawback that migrating a domU from a "2.3GHz" class host
> to a "2.4GHz" class host may change the rate at wich the TSC counter
> increases, the domU may not be prepared for that.
>
> With the new option the host admin can decide how a domU should behave
> when it is migrated across systems of the same class. Since there is
> always some jitter when Xen calibrates the cpu_khz value, all hosts of
> the same class will most likely have slightly different values. As a
> result vTSC emulation is unavoidable. Data collected during the incident
> which triggered this change showed a jitter of up to 200 KHz across
> systems of the same class.

Do you have any further details of the systems involved? If they are
identical systems, they should all have the same real TSC frequency, and
it's a known issue that Xen isn't very good at working out the frequency.

TBH, fixing that would be far better overall.

>
> Existing padding fields are reused to store vtsc_khz_tolerance as u16.
> The padding is sent as zero in write_tsc_info to the receving host.
> The padding is undefined if the changed code runs as receiver.

I'm not sure what you mean by this final sentence.

> handle_tsc_info has no code to verify that padding is indeed zero. Due
> to the lack of a version field it is impossible to know if the sender
> already has the newly introduced vtsc_tolerance field. In the worst
> case the receiving domU will get an unemulated TSC.

The lack of padding verification is deliberate, for forwards
compatibility.

Why does the sending code matter? One way or another, if the field is
0, the option wasn't present or wasn't configured. Neither of these
situations affects the decision-making that the receiving side needs to
perform.

>
> Signed-off-by: Olaf Hering <olaf@xxxxxxxxx>
> Reviewed-by: Wei Liu <wei.liu2@xxxxxxxxxx> (v07/v08)
> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> (v08)

I'm still -0.5 for this patch. I can appreciate why you want it, but it
is a gross hack which only works when you don't skew time more than NTP
in the guest can cope with. My gut feeling is that there will be other,
more subtle fallout.

As for the implementation itself, a few trivial comments.

> diff --git a/docs/man/xen-tscmode.pod.7 b/docs/man/xen-tscmode.pod.7
> index 3bbc96f201..122ae36679 100644
> --- a/docs/man/xen-tscmode.pod.7
> +++ b/docs/man/xen-tscmode.pod.7
> @@ -99,6 +99,9 @@ whether or not the VM has been saved/restored/migrated
>
> =back
>
> +If the tsc_mode is set to "default" the decision to emulate TSC can be
> +tweaked further with the "vtsc_tolerance_khz" option.
> +
> To understand this in more detail, the rest of this document must
> be read.
>
> @@ -211,6 +214,19 @@ is emulated. Note that, though emulated, the "apparent"
> TSC frequency
> will be the TSC frequency of the initial physical machine, even after
> migration.
>
> +Since the calibration of the TSC frequency may not be 100% accurate, the
> +exact value of the frequency can change even across reboots.

It can change across reboots for other reasons, e.g. firmware settings.

I'd phrase this as "Since the calibration of the TSC frequency isn't
100% accurate, the value measured by Xen can vary across reboots".

> This means
> +also several otherwise identical systems can have a slightly different
> +TSC frequency. As a result TSC access will be emulated if a domU is
> +migrated from one host to another, identical host. To avoid the
> +performance impact of TSC emulation a certain tolerance of the measured
> +host TSC frequency can be specified with "vtsc_tolerance_khz". If the
> +measured "cpu_khz" value is within the tolerance range, TSC access
> +remains native. Otherwise it will be emulated. This allows to migrate
> +domUs between identical hardware. If the domU will be migrated to a
> +different kind of hardware, say from a "2.3GHz" to a "2.5GHz" system,
> +TSC will be emualted to maintain the TSC frequency expected by the domU.
> +
> For environments where both TSC-safeness AND highest performance
> even across migration is a requirement, application code can be specially
> modified to use an algorithm explicitly designed into Xen for this purpose.
> diff --git a/docs/man/xl.cfg.pod.5.in b/docs/man/xl.cfg.pod.5.in
> index 47d88243b1..995277794f 100644
> --- a/docs/man/xl.cfg.pod.5.in
> +++ b/docs/man/xl.cfg.pod.5.in
> @@ -1898,6 +1898,16 @@ determined in a similar way to that of B<default> TSC
> mode.
>
> Please see B<xen-tscmode(7)> for more information on this option.
>
> +=item B<vtsc_tolerance_khz="KHZ">
> +
> +B<(x86 only, relevant only for tsc_mode=default)>
> +When a domU is started, the CPU frequency of the host is used by the domU for
> +TSC related time measurement. Once the domU is either migrated or
> +saved/restored on another host that CPU frequency has to be emulated to avoid
> +timedrift. To avoid the performance penalty of the TSC emulation, allow a
> +certain amount of jitter of the measured CPU frequency on the hosts the domU
> +is supposed to run on. Default value is 0, i.e. no tolerance.

In one of these two paragraphs, I think there needs to be a warning
about clock drift in the guest.

> +
> =item B<localtime=BOOLEAN>
>
> Set the real time clock to local time or to UTC. False (0) by default,
> diff --git a/docs/specs/libxc-migration-stream.pandoc
> b/docs/specs/libxc-migration-stream.pandoc
> index 73421ff393..0d0f17edb1 100644
> --- a/docs/specs/libxc-migration-stream.pandoc
> +++ b/docs/specs/libxc-migration-stream.pandoc
> @@ -3,7 +3,7 @@
> Andrew Cooper <<andrew.cooper3@xxxxxxxxxx>>
> Wen Congyang <<wency@xxxxxxxxxxxxxx>>
> Yang Hongyang <<hongyang.yang@xxxxxxxxxxxx>>
> -% Revision 2
> +% Revision 3
>
> Introduction
> ============
> @@ -472,7 +472,7 @@ XEN\_DOMCTL\_{get,set}tscinfo hypercall sub-ops.
> +------------------------+------------------------+
> | nsec |
> +------------------------+------------------------+
> - | incarnation | (reserved) |
> + | incarnation | tolerance | (reserved) |
> +------------------------+------------------------+
>
> --------------------------------------------------------------------
> @@ -485,6 +485,8 @@ khz TSC frequency, in kHz.
> nsec Elapsed time, in nanoseconds.
>
> incarnation Incarnation.
> +
> +tolerance Amount of Jitter the domU can handle after migration

Measurement units?

> --------------------------------------------------------------------
>
> \clearpage
> diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
> index c342d00732..4a9c43b718 100644
> --- a/xen/arch/x86/time.c
> +++ b/xen/arch/x86/time.c
> @@ -2148,8 +2153,25 @@ void tsc_set_info(struct domain *d,
>          * When a guest is created, gtsc_khz is passed in as zero, making
>          * d->arch.tsc_khz == cpu_khz. Thus no need to check incarnation.
>          */
> +        disable_vtsc = d->arch.tsc_khz == cpu_khz;
> +
> +        if ( tsc_mode == TSC_MODE_DEFAULT && gtsc_khz &&
> +             d->arch.vtsc_tolerance_khz )
> +        {
> +            long khz_diff;
> +
> +            khz_diff = ABS((long)(cpu_khz - gtsc_khz));
> +            disable_vtsc = khz_diff <= d->arch.vtsc_tolerance_khz;
> +
> +            printk(XENLOG_G_INFO "d%d: host has %lu kHz,"
> +                   " domU expects %u kHz,"
> +                   " difference of %ld is %s tolerance of %u\n",
> +                   d->domain_id, cpu_khz, gtsc_khz, khz_diff,
> +                   disable_vtsc ? "within" : "outside",
> +                   d->arch.vtsc_tolerance_khz);
> +        }

Newline here.

>          if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() &&
> -             (d->arch.tsc_khz == cpu_khz ||
> +             (disable_vtsc ||
>                (is_hvm_domain(d) &&
>                 hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) )

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
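For readers who want to play with the numbers discussed in the thread, here
is a minimal standalone sketch of the tolerance comparison and of the
resulting guest clock drift. It is not the patch's code: the helper names
tsc_within_tolerance() and tsc_drift_ppm() are hypothetical, the parameters
stand in for cpu_khz, gtsc_khz and vtsc_tolerance_khz, and the other
conditions from tsc_set_info() (tsc_mode, host_tsc_is_safe(), HVM TSC
scaling) are deliberately left out.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the patch): true when the host frequency
 * is within tolerance_khz of the frequency the guest expects, i.e. the
 * case where the patch would leave the TSC unemulated.
 */
static bool tsc_within_tolerance(uint32_t host_khz, uint32_t guest_khz,
                                 uint16_t tolerance_khz)
{
    /* Widen before subtracting so the difference cannot wrap. */
    int64_t diff = (int64_t)host_khz - (int64_t)guest_khz;

    if ( diff < 0 )
        diff = -diff;

    /* With a tolerance of 0 only an exact match passes. */
    return diff <= (int64_t)tolerance_khz;
}

/*
 * Hypothetical helper: rate error, in parts per million (rounded down),
 * that a guest sees when it keeps assuming guest_khz while the TSC really
 * ticks at host_khz. Returns 0 if guest_khz is 0.
 */
static uint64_t tsc_drift_ppm(uint32_t host_khz, uint32_t guest_khz)
{
    int64_t diff = (int64_t)host_khz - (int64_t)guest_khz;

    if ( guest_khz == 0 )
        return 0;
    if ( diff < 0 )
        diff = -diff;

    return ((uint64_t)diff * 1000000u) / guest_khz;
}
```

With the 200 kHz jitter mentioned in the commit message and a guest that
expects 2300000 kHz, tsc_within_tolerance(2300200, 2300000, 200) holds and
tsc_drift_ppm(2300200, 2300000) comes out at 86 ppm. That drift is what the
guest's time synchronisation would then have to absorb, which is the concern
raised in the review.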