Re: [Xen-devel] [PATCH v3 5/6] x86/time: implement PVCLOCK_TSC_STABLE_BIT



>>> On 30.08.16 at 14:26, <joao.m.martins@xxxxxxxxxx> wrote:
> On 08/29/2016 11:06 AM, Jan Beulich wrote:
>>>>> On 26.08.16 at 17:44, <joao.m.martins@xxxxxxxxxx> wrote:
>>> On 08/25/2016 11:37 AM, Jan Beulich wrote:
>>>>>>> On 24.08.16 at 14:43, <joao.m.martins@xxxxxxxxxx> wrote:
>>>>> This patch proposes relying on host TSC synchronization and
>>>>> passthrough to the guest, when running on a TSC-safe platform. On
>>>>> time_calibration we retrieve the platform time in ns and the counter
>>>>> read by the clocksource that was used to compute system time. We
>>>>> introduce a new rendezvous function which doesn't require
>>>>> synchronization between master and slave CPUs and just reads the
>>>>> calibration_rendezvous struct and writes the stime and stamp
>>>>> to the cpu_calibration struct to be used later on. We can guarantee,
>>>>> on a platform with a constant and reliable TSC, that the time read on
>>>>> vCPU B right after A is bigger, independently of the vCPU calibration
>>>>> error. Since pvclock time infos are monotonic as seen by any vCPU, set
>>>>> PVCLOCK_TSC_STABLE_BIT, which then enables usage of VDSO on Linux.
>>>>> IIUC, this is similar to how it's implemented on KVM.
>>>>
>>>> Without any tools side change, how is it guaranteed that a guest
>>>> which observed the stable bit won't get migrated to a host not
>>>> providing that guarantee?
>>> Do you want to prevent migration in such cases? The worst that can
>>> happen is that the guest might need to fall back to a system call if
>>> this bit is 0, and would keep doing so for as long as the bit stays 0.
>> 
>> Whether migration needs preventing I'm not sure; all I was trying
>> to indicate is that there seem to be pieces missing wrt migration.
>> As to the guest falling back to a system call - are guest kernels and
>> (as far as affected) applications required to cope with the flag
>> changing from 1 to 0 behind their back?
> It's expected they cope with this bit changing AFAIK. The vdso code (i.e.
> applications) always checks this bit on every read to decide whether to
> fall back to a system call. The same goes for the pvclock code in the
> guest kernel in both Linux and FreeBSD, which checks it on every read to
> decide whether or not to skip the monotonicity checks.

Okay, but please make sure this is called out at least in the commit
message, if not in a code comment.
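
For illustration, the per-read check described above could look roughly
like the following. This is a minimal sketch, not the actual Linux or
FreeBSD code: the struct layout and the names time_info_sketch,
can_use_vdso_fast_path, read_clock_sketch and last_value_ns are
hypothetical, and the version/seqlock protocol, the TSC delta scaling and
the atomic update of the last value are all left out. Only the flag value
(bit 0 of the flags field) is taken from the pvclock ABI.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bit 0 of the pvclock flags field. */
#define PVCLOCK_TSC_STABLE_BIT (1u << 0)

/* Hypothetical, heavily simplified stand-in for the per-vCPU time record. */
struct time_info_sketch {
    uint64_t system_time_ns; /* time computed from this vCPU's record */
    uint8_t  flags;          /* may change between any two reads */
};

/* Last value handed out; a real kernel would update this atomically. */
static uint64_t last_value_ns;

/* vdso side: re-evaluated on every read, never cached. */
static bool can_use_vdso_fast_path(const struct time_info_sketch *info)
{
    assert(info != NULL);
    return (info->flags & PVCLOCK_TSC_STABLE_BIT) != 0;
}

/* Kernel side: skip the monotonicity clamp only while the bit is set. */
static uint64_t read_clock_sketch(const struct time_info_sketch *info)
{
    uint64_t now;

    assert(info != NULL);
    now = info->system_time_ns;

    if (info->flags & PVCLOCK_TSC_STABLE_BIT)
        return now;

    /* Bit clear: never return a value older than one already handed out. */
    if (now < last_value_ns)
        now = last_value_ns;
    else
        last_value_ns = now;

    return now;
}
```

Because the flag is tested on each call, a 1 -> 0 transition (e.g. after
migration) only costs the guest the fast path from the next read onwards.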

>> Perhaps even
>> worse than the multi-node consideration here is hyper-threading, as
>> that makes it fundamentally impossible that all threads within one core
>> execute the same operation at exactly the same time. Not to speak of
>> the various odd cache effects which I did observe while doing the
>> measurements for my series (e.g. the second thread speculating the
>> TSC reads much farther than the primary ones, presumably because
>> the primary ones first needed to get the I-cache populated).
> Hmmm, not sure how we could cope with TSC HT issues. In this patch, we
> propagate TSC reads from the platform timer on CPU 0 into the other CPUs,
> so it would probably be non-visible, as there aren't TSC reads being done
> on multiple threads at approximately the same time?

Right - much depends on parameters the values of which we don't
even have an idea of. Like how frequently hyperthreads get
switched within a core.

>>> Other than the things above I am not sure how to go about this :( Should
>>> we start adjusting the TSCs if we find disparities or skew is observed in
>>> the long run? Or allow only TSCs on vCPUs of the same package to expose
>>> this flag? Hmm, what's your take on this? Appreciate your feedback.
>> 
>> At least as an initial approach requiring affinities to be limited to a
>> single socket would seem like a good compromise, provided HT
>> aspects don't have a bad effect (in which case also excluding HT
>> may be required). I'd also be fine with command line options
>> allowing to further relax that, but a simple "clocksource=tsc"
>> should imo result in a setup which from all we can tell will work as
>> intended.
> Sounds reasonable, so unless command line options are specified we
> disallow TSC to be clocksource on multi-socket systems. WRT command line
> options, how about extending the "tsc" parameter to accept another
> possible value such as "global" or "socketsafe"? Current values are
> "unstable" and "skewed".

What about "stable", "stable:socket" (and then perhaps also
"stable:node")?
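
To make the suggestion concrete, parsing such values might be sketched as
below. This is only an assumption about how it could be done, not existing
Xen code: the enum tsc_opt_sketch and the function parse_tsc_sketch are
hypothetical names, and what each "stable" variant would actually permit
is left open here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result of parsing the value given to "tsc=". */
enum tsc_opt_sketch {
    TSC_OPT_INVALID,
    TSC_OPT_UNSTABLE,      /* existing value */
    TSC_OPT_SKEWED,        /* existing value */
    TSC_OPT_STABLE,        /* proposed: "stable" */
    TSC_OPT_STABLE_SOCKET, /* proposed: "stable:socket" */
    TSC_OPT_STABLE_NODE    /* proposed: "stable:node" */
};

static enum tsc_opt_sketch parse_tsc_sketch(const char *s)
{
    static const char prefix[] = "stable";
    const size_t plen = sizeof(prefix) - 1;

    assert(s != NULL);

    if (!strcmp(s, "unstable"))
        return TSC_OPT_UNSTABLE;
    if (!strcmp(s, "skewed"))
        return TSC_OPT_SKEWED;

    /* Everything else must start with "stable". */
    if (strncmp(s, prefix, plen))
        return TSC_OPT_INVALID;
    s += plen;

    if (*s == '\0')
        return TSC_OPT_STABLE;
    if (!strcmp(s, ":socket"))
        return TSC_OPT_STABLE_SOCKET;
    if (!strcmp(s, ":node"))
        return TSC_OPT_STABLE_NODE;

    /* Unknown suffix such as "stable:foo" or "stablex". */
    return TSC_OPT_INVALID;
}
```

Using a "stable:" prefix keeps the scope qualifier optional, so further
scopes could be added later without introducing a new top-level value.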

Jan
