Re: [PATCH v4 3/3] x86/time: avoid reading the platform timer in rendezvous functions
On 21.04.2021 12:06, Jan Beulich wrote:
> On 20.04.2021 18:12, Roger Pau Monné wrote:
>> On Thu, Apr 01, 2021 at 11:55:10AM +0200, Jan Beulich wrote:
>>> Reading the platform timer isn't cheap, so we'd better avoid it when the
>>> resulting value is of no interest to anyone.
>>>
>>> The consumer of master_stime, obtained by
>>> time_calibration_{std,tsc}_rendezvous() and propagated through
>>> this_cpu(cpu_calibration), is local_time_calibration(). With
>>> CONSTANT_TSC the latter function uses an early exit path, which doesn't
>>> explicitly use the field. While this_cpu(cpu_calibration) (including the
>>> master_stime field) gets propagated to this_cpu(cpu_time).stamp on that
>>> path, both structures' fields get consumed only by the !CONSTANT_TSC
>>> logic of the function.
>>>
>>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>>> ---
>>> v4: New.
>>> ---
>>> I realize there's some risk associated with potential new uses of the
>>> field down the road. What would people think about compiling time.c a
>>> 2nd time into a dummy object file, with a conditional enabled to force
>>> assuming CONSTANT_TSC, and with that conditional used to suppress
>>> presence of the field as well as all audited uses of it (i.e. in
>>> particular that large part of local_time_calibration())? Unexpected new
>>> users of the field would then cause build time errors.
>>
>> Wouldn't that add quite a lot of churn to the file itself in the form
>> of pre-processor conditionals?
>
> Possibly - I didn't try yet, simply because of fearing this might
> not be liked even without presenting it in patch form.
>
>> Could we instead set master_stime to an invalid value that would make
>> the consumers explode somehow?
>
> No idea whether there is any such "reliable" value.
>
>> I know there might be new consumers, but those should be able to
>> figure whether the value is sane by looking at the existing ones.
>
> This could be the hope, yes. But the effort of auditing the code to
> confirm the potential of optimizing this (after vaguely getting the
> impression there might be room) was non-negligible (in fact I did
> three runs just to be really certain). This in particular means
> that I'm in no way certain that looking at existing consumers would
> point out the possible pitfall.
>
>> Also, since this is only done on the BSP on the last iteration I
>> wonder if it really makes such a difference performance-wise to
>> warrant all this trouble.
>
> By "all this trouble", do you mean the outlined further steps or
> the patch itself? In the latter case, while it's only the BSP to
> read the value, all other CPUs are waiting for the BSP to get its
> part done. So the extra time it takes to read the platform clock
> affects the overall duration of the rendezvous, and hence the time
> not "usefully" spent by _all_ of the CPUs.

Ping? Your answer here has a significant effect on the disposition
of this change.

Jan
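
For readers following the thread, the idea under discussion (skip the expensive platform timer read when nothing on the CONSTANT_TSC path consumes master_stime) can be sketched roughly as below. This is not the patch and not Xen code: `calibration_sim`, `read_platform_timer_sim()`, `rendezvous_master_sim()` and the `platform_reads` counter are hypothetical stand-ins made up for illustration, and the value returned by the simulated timer is arbitrary.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical counter: how often the "expensive" timer was read. */
static unsigned int platform_reads;

/* Hypothetical stand-in for a slow platform timer read (e.g. HPET/PIT). */
static uint64_t read_platform_timer_sim(void)
{
    ++platform_reads;
    return 123456789u; /* arbitrary illustrative value */
}

/* Simplified, assumed layout; not Xen's actual calibration structure. */
struct calibration_sim {
    uint64_t local_tsc;
    uint64_t master_stime; /* only meaningful when !constant_tsc */
};

/*
 * Sketch of the master (BSP) side of a calibration rendezvous.
 * With a constant TSC the platform timer value has no consumer,
 * so the read is skipped and the field is left at zero.
 */
static void rendezvous_master_sim(struct calibration_sim *c,
                                  bool constant_tsc, uint64_t tsc_now)
{
    assert(c != NULL);
    c->local_tsc = tsc_now;
    c->master_stime = constant_tsc ? 0 : read_platform_timer_sim();
}
```

Since all other CPUs wait for the master inside the rendezvous, any time saved in the function above is saved on every CPU taking part, which is the argument made in the reply quoted above. Writing zero in the skipped case is merely this sketch's choice; as noted in the thread, there may be no value that reliably makes an unexpected consumer fail.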