
Re: [PATCH v4 3/3] x86/time: avoid reading the platform timer in rendezvous functions



On 21.04.2021 12:06, Jan Beulich wrote:
> On 20.04.2021 18:12, Roger Pau Monné wrote:
>> On Thu, Apr 01, 2021 at 11:55:10AM +0200, Jan Beulich wrote:
>>> Reading the platform timer isn't cheap, so we'd better avoid it when the
>>> resulting value is of no interest to anyone.
>>>
>>> The consumer of master_stime, obtained by
>>> time_calibration_{std,tsc}_rendezvous() and propagated through
>>> this_cpu(cpu_calibration), is local_time_calibration(). With
>>> CONSTANT_TSC the latter function uses an early exit path, which doesn't
>>> explicitly use the field. While this_cpu(cpu_calibration) (including the
>>> master_stime field) gets propagated to this_cpu(cpu_time).stamp on that
>>> path, both structures' fields get consumed only by the !CONSTANT_TSC
>>> logic of the function.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>> ---
>>> v4: New.
>>> ---
>>> I realize there's some risk associated with potential new uses of the
>>> field down the road. What would people think about compiling time.c a
>>> 2nd time into a dummy object file, with a conditional enabled to force
>>> assuming CONSTANT_TSC, and with that conditional used to suppress
>>> presence of the field as well as all audited uses of it (i.e. in
>>> particular that large part of local_time_calibration())? Unexpected new
>>> users of the field would then cause build time errors.
>>
>> Wouldn't that add quite a lot of churn to the file itself in the form
>> of pre-processor conditionals?
> 
> Possibly - I didn't try yet, simply because of fearing this might
> not be liked even without presenting it in patch form.
> 
>> Could we instead set master_stime to an invalid value that would make
>> the consumers explode somehow?
> 
> No idea whether there is any such "reliable" value.
> 
>> I know there might be new consumers, but those should be able to
>> figure whether the value is sane by looking at the existing ones.
> 
> This could be the hope, yes. But the effort of auditing the code to
> confirm the potential of optimizing this (after vaguely getting the
> impression there might be room) was non-negligible (in fact I did
> three runs just to be really certain). This in particular means
> that I'm in no way certain that looking at existing consumers would
> point out the possible pitfall.
> 
>> Also, since this is only done on the BSP on the last iteration I
>> wonder if it really makes such a difference performance-wise to
>> warrant all this trouble.
> 
> By "all this trouble", do you mean the outlined further steps or
> the patch itself? In the latter case, while it's only the BSP to
> read the value, all other CPUs are waiting for the BSP to get its
> part done. So the extra time it takes to read the platform clock
> affects the overall duration of the rendezvous, and hence the time
> not "usefully" spent by _all_ of the CPUs.

Ping? Your answer here has a significant effect on the disposition
of this change.

Jan
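
The cost argument quoted above (only the BSP reads the platform timer,
but every other CPU waits in the rendezvous until the BSP is done) can
be put as a rough model. This is only an illustrative sketch, not Xen
code: the function names and the parameters sync_ns and
platform_read_ns are hypothetical, and no measured values are implied.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical cost model, not taken from xen/arch/x86/time.c.
 *
 * All CPUs stay in the rendezvous until the BSP has finished, so any
 * extra delay on the BSP is paid once per participating CPU.
 */
static uint64_t rendezvous_cpu_time_ns(unsigned int nr_cpus,
                                       uint64_t sync_ns,
                                       uint64_t platform_read_ns)
{
    assert(nr_cpus > 0);

    /* Each CPU spends the synchronisation time plus the BSP's read. */
    return (uint64_t)nr_cpus * (sync_ns + platform_read_ns);
}

/* Total CPU time no longer spent waiting if the read is skipped. */
static uint64_t rendezvous_saving_ns(unsigned int nr_cpus,
                                     uint64_t sync_ns,
                                     uint64_t platform_read_ns)
{
    return rendezvous_cpu_time_ns(nr_cpus, sync_ns, platform_read_ns) -
           rendezvous_cpu_time_ns(nr_cpus, sync_ns, 0);
}
```

Under this model the saving grows linearly with the number of CPUs
taking part in the rendezvous, even though only one of them performs
the read.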



 

