[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v13] tolerate jitter in cpu_khz calculation to avoid TSC emulation
>>> On 13.03.19 at 09:28, <olaf@xxxxxxxxx> wrote: > --- a/xen/arch/x86/time.c > +++ b/xen/arch/x86/time.c > @@ -43,6 +43,23 @@ static char __initdata opt_clocksource[10]; > string_param("clocksource", opt_clocksource); > > unsigned long __read_mostly cpu_khz; /* CPU clock frequency in kHz. */ > +/* > + * NTP implementations running within the domU can handle a certain > + * difference of the system clockspeed, compared to an external > + * clocksource. This is ususally described as "drift". How much drift an > + * OS can handle is described in its documentation. For NTP in Linux the > + * value is 500 PPM, which is the lowest compared to other OS. > + */ > +#define VTSC_NTP_PPM_TOLERANCE 500UL > +/* > + * The measurement of cpu_khz is not accurate. Its accuracy depends on the > + * hardware. A bunch of systems with supposedly identical frequencies will > + * measure different frequencies, which will also vary accross reboots. > + * This variable tries to cover a range of frequencies seen in the wild. > + * The range is substracted from the PPM value above. subtracted I also don't think the inaccuracy is only because of measurement being imprecise. I don't think any two crystals will ever provide the exact same frequencies. > @@ -1885,6 +1902,27 @@ void __init early_time_init(void) > printk("Detected %lu.%03lu MHz processor.\n", > cpu_khz / 1000, cpu_khz % 1000); > > + /* > + * How many kHz (in other words: drift) is ntpd in domU expected to > handle? > + * freq tolerated freq difference > + * ------- = ------------------------- > + * Million Million + PPM > + */ > + tmp = 1000 * 1000; > + tmp += VTSC_NTP_PPM_TOLERANCE; > + tmp *= cpu_khz; > + tmp /= 1000 * 1000; > + > + tmp -= cpu_khz; > + > + /* > + * Reduce the theoretical upper limit by the assumed measuring > inaccuracy. > + */ > + if ( tmp >= VTSC_MEASUREMENT_INACCURACY_RANGE_KHZ ) > + tmp -= VTSC_MEASUREMENT_INACCURACY_RANGE_KHZ; The discontinuity is still there, and so far you've failed to explain why a discontinuity is what you want here. > @@ -2193,6 +2231,8 @@ int tsc_set_info(struct domain *d, > > switch ( tsc_mode ) > { > + bool disable_vtsc; > + > case TSC_MODE_DEFAULT: > case TSC_MODE_ALWAYS_EMULATE: > d->arch.vtsc_offset = get_s_time() - elapsed_nsec; > @@ -2201,13 +2241,30 @@ int tsc_set_info(struct domain *d, > > /* > * In default mode use native TSC if the host has safe TSC and > - * host and guest frequencies are the same (either "naturally" or > - * - for HVM/PVH - via TSC scaling). > + * host and guest frequencies are (almost) the same (either > "naturally" > + * or - for HVM/PVH - via TSC scaling). > * When a guest is created, gtsc_khz is passed in as zero, making > * d->arch.tsc_khz == cpu_khz. Thus no need to check incarnation. > */ > + disable_vtsc = d->arch.tsc_khz == cpu_khz; > + > + if ( tsc_mode == TSC_MODE_DEFAULT && gtsc_khz && vtsc_tolerance_khz ) > + { > + long khz_diff; > + > + khz_diff = ABS(((long)cpu_khz - gtsc_khz)); This could easily be the initializer of the variable. And there's one too many pair of parentheses. I also don't see why the variable needs to be of a signed type. Then again maybe the ABS() would better move ... > + disable_vtsc = khz_diff <= vtsc_tolerance_khz; ... here anyway, such that ... > + printk(XENLOG_G_INFO "d%d: host has %lu kHz," > + " domU expects %u kHz," > + " difference of %ld is %s tolerance of %u\n", > + d->domain_id, cpu_khz, gtsc_khz, khz_diff, > + disable_vtsc ? "within" : "outside", > + vtsc_tolerance_khz); ... this logs the properly signed value? > + } > + > if ( tsc_mode == TSC_MODE_DEFAULT && host_tsc_is_safe() && > - (d->arch.tsc_khz == cpu_khz || > + (disable_vtsc || > (is_hvm_domain(d) && > hvm_get_tsc_scaling_ratio(d->arch.tsc_khz))) ) > { I'm sorry, but I continue to object to this adjustment getting done both by default _and_ not in a per-guest manner. As said before, you can't demand guests to run NTP, and hence you can't expect them to get along with a few hundred kHz jump in observed TSC frequency. Whether the performance drop due to vTSC use is better or worse is a policy decision, which we should leave to the admin. Hence the feature needs to be off by default, and there needs to be at least a host-wide control to enable it; a per-guest control would be better. IOW I explicitly do not agree with the last sentence of the commit message. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |