[Xen-devel] [PATCH v2 6/6] x86/time: implement PVCLOCK_TSC_STABLE_BIT
When using TSC as clocksource we rely solely on the TSC for updating the vcpu time infos (pvti). Right now, each vCPU takes its tsc_timestamp at a different instant, i.e. at EPOCH + delta. This delta depends on when the CPU calibrates with CPU 0 (the master), and will likely differ and vary across vCPUs. This means that each vCPU's pvti doesn't account for its calibration error, which could lead to time going backwards: a time read on vCPU B immediately after one on vCPU A may be smaller. While this doesn't happen often, I was able to observe (for clocksource=tsc) around 50 warps of < 100 ns in an hour.

This patch proposes relying on host TSC synchronization and passing it through to the guest when running on a TSC-safe platform. In time_calibration we retrieve the platform time in ns and the counter read by the clocksource that was used to compute system time. We introduce a new rendezvous function which doesn't require synchronization between master and slave CPUs: it just reads the calibration_rendezvous struct and writes the stime and stamp to the cpu_calibration struct to be used later on. On a platform with a constant and reliable TSC, we can guarantee that the time read on vCPU B right after A is bigger, independently of the CPU calibration error. Since pvclock time infos are then monotonic as seen by any vCPU, set PVCLOCK_TSC_STABLE_BIT, which in turn enables usage of the vDSO on Linux. IIUC, this is similar to how it's implemented on KVM.

Note that PVCLOCK_TSC_STABLE_BIT is set only when CPU hotplug isn't meant to be performed on the host, which is either when max vcpus and num_present_cpu are the same or when the "nocpuhotplug" command line parameter is used. This is because a newly hotplugged CPU may not satisfy the condition of having all TSCs synchronized.
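To illustrate the warp described above, here is a minimal sketch. It is not part of the patch. The struct and function names (pvti_sketch, pvti_read_ns) are hypothetical, and the scaling is simplified to a single 32.32 fixed-point multiplier, whereas the real pvclock ABI also carries a tsc_shift field. It only shows how a guest extrapolates time from a (system_time, tsc_timestamp) pair, and why per-CPU pairs carrying different calibration errors can yield a smaller reading on a second vCPU.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical, simplified subset of a per-vCPU time info record. */
struct pvti_sketch {
    uint64_t tsc_timestamp;  /* TSC value at which system_time was sampled */
    uint64_t system_time;    /* nanoseconds corresponding to tsc_timestamp */
    uint32_t tsc_to_ns_mul;  /* ns per TSC tick, 32.32 fixed point (assumed) */
};

/*
 * Extrapolate the current time in ns from a TSC reading:
 *   ns = system_time + ((tsc - tsc_timestamp) * mul) >> 32
 * The plain 64-bit multiply only holds for small deltas; a real
 * implementation would use a wider intermediate.
 */
static uint64_t pvti_read_ns(const struct pvti_sketch *p, uint64_t tsc)
{
    uint64_t delta;

    assert(tsc >= p->tsc_timestamp);
    delta = tsc - p->tsc_timestamp;

    return p->system_time + ((delta * p->tsc_to_ns_mul) >> 32);
}
```

With a multiplier of 0x80000000 (0.5 ns per tick), take vCPU A with the pair (tsc_timestamp=1000, system_time=500) and vCPU B with its own pair (1400, 690), where 690 is 10 ns short of the ideal 700 because of calibration error. A read on A at tsc=2000 gives 1000 ns, while a read on B two ticks later at tsc=2002 gives 991 ns, so time appears to go backwards. If B instead uses the same pair as A, the read at tsc=2002 gives 1001 ns.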
Signed-off-by: Joao Martins <joao.m.martins@xxxxxxxxxx>
---
Cc: Keir Fraser <keir@xxxxxxx>
Cc: Jan Beulich <jbeulich@xxxxxxxx>
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Perhaps "cpuhotplugsafe" would be a better name, since potentially
hardware could guarantee TSCs are synchronized on hotplug?

Changes since v1:
 - Change approach to follow Andrew's guideline to skip
   std_rendezvous. And doing so by introducing a
   nop_rendezvous
 - Change commit message reflecting the change above.
 - Use TSC_STABLE_BIT only if cpu hotplug isn't possible.
 - Add command line option to override it if no cpu hotplug is
   intended.
---
 xen/arch/x86/time.c | 43 +++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 43 insertions(+)

diff --git a/xen/arch/x86/time.c b/xen/arch/x86/time.c
index 123aa42..1dcd4af 100644
--- a/xen/arch/x86/time.c
+++ b/xen/arch/x86/time.c
@@ -43,6 +43,10 @@
 static char __initdata opt_clocksource[10];
 string_param("clocksource", opt_clocksource);
 
+/* opt_nocpuhotplug: Set if CPU hotplug isn't meant to be used */
+static bool_t __initdata opt_nocpuhotplug;
+boolean_param("nocpuhotplug", opt_nocpuhotplug);
+
 unsigned long __read_mostly cpu_khz; /* CPU clock frequency in kHz. */
 DEFINE_SPINLOCK(rtc_lock);
 unsigned long pit0_ticks;
@@ -435,6 +439,7 @@ uint64_t ns_to_acpi_pm_tick(uint64_t ns)
  * PLATFORM TIMER 4: TSC
  */
 static bool_t clocksource_is_tsc;
+static bool_t use_tsc_stable_bit;
 static u64 tsc_freq;
 static unsigned long tsc_max_warp;
 static void tsc_check_reliability(void);
@@ -468,6 +473,11 @@ static int __init init_tsctimer(struct platform_timesource *pts)
 
     pts->frequency = tsc_freq;
     clocksource_is_tsc = tsc_reliable;
+    use_tsc_stable_bit = clocksource_is_tsc &&
+        ((nr_cpu_ids == num_present_cpus()) || opt_nocpuhotplug);
+
+    if ( clocksource_is_tsc && !use_tsc_stable_bit )
+        printk(XENLOG_INFO "TSC: CPU Hotplug intended, not setting stable bit\n");
 
     return tsc_reliable;
 }
@@ -950,6 +960,8 @@ static void __update_vcpu_system_time(struct vcpu *v, int force)
 
     _u.tsc_timestamp = tsc_stamp;
     _u.system_time = t->stime_local_stamp;
+    if ( use_tsc_stable_bit )
+        _u.flags |= PVCLOCK_TSC_STABLE_BIT;
 
     if ( is_hvm_domain(d) )
         _u.tsc_timestamp += v->arch.hvm_vcpu.cache_tsc_offset;
@@ -1431,6 +1443,22 @@ static void time_calibration_std_rendezvous(void *_r)
     raise_softirq(TIME_CALIBRATE_SOFTIRQ);
 }
 
+/*
+ * Rendezvous function used when clocksource is TSC and
+ * no CPU hotplug will be performed.
+ */
+static void time_calibration_nop_rendezvous(void *_r)
+{
+    struct cpu_calibration *c = &this_cpu(cpu_calibration);
+    struct calibration_rendezvous *r = _r;
+
+    c->local_tsc_stamp = r->master_tsc_stamp;
+    c->stime_local_stamp = get_s_time();
+    c->stime_master_stamp = r->master_stime;
+
+    raise_softirq(TIME_CALIBRATE_SOFTIRQ);
+}
+
 static void (*time_calibration_rendezvous_fn)(void *) =
     time_calibration_std_rendezvous;
 
@@ -1440,6 +1468,13 @@ static void time_calibration(void *unused)
         .semaphore = ATOMIC_INIT(0)
     };
 
+    if ( use_tsc_stable_bit )
+    {
+        local_irq_disable();
+        r.master_stime = read_platform_stime(&r.master_tsc_stamp);
+        local_irq_enable();
+    }
+
     cpumask_copy(&r.cpu_calibration_map, &cpu_online_map);
 
     /* @wait=1 because we must wait for all cpus before freeing @r. */
@@ -1555,6 +1590,14 @@ static int __init verify_tsc_reliability(void)
 
             init_percpu_time();
 
+            /*
+             * We won't do CPU Hotplug and TSC clocksource is being used which
+             * means we have a reliable TSC, plus we don't sync with any other
+             * clocksource so no need for rendezvous.
+             */
+            if ( use_tsc_stable_bit )
+                time_calibration_rendezvous_fn = time_calibration_nop_rendezvous;
+
             init_timer(&calibration_timer, time_calibration, NULL, 0);
             set_timer(&calibration_timer, NOW() + EPOCH);
         }
-- 
2.1.4

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel