Re: [PATCH v4 3/3] x86/time: avoid reading the platform timer in rendezvous functions
On Wed, Apr 21, 2021 at 12:06:34PM +0200, Jan Beulich wrote:
> On 20.04.2021 18:12, Roger Pau Monné wrote:
> > On Thu, Apr 01, 2021 at 11:55:10AM +0200, Jan Beulich wrote:
> >> Reading the platform timer isn't cheap, so we'd better avoid it when the
> >> resulting value is of no interest to anyone.
> >>
> >> The consumer of master_stime, obtained by
> >> time_calibration_{std,tsc}_rendezvous() and propagated through
> >> this_cpu(cpu_calibration), is local_time_calibration(). With
> >> CONSTANT_TSC the latter function uses an early exit path, which doesn't
> >> explicitly use the field. While this_cpu(cpu_calibration) (including the
> >> master_stime field) gets propagated to this_cpu(cpu_time).stamp on that
> >> path, both structures' fields get consumed only by the !CONSTANT_TSC
> >> logic of the function.
> >>
> >> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> >> ---
> >> v4: New.
> >> ---
> >> I realize there's some risk associated with potential new uses of the
> >> field down the road. What would people think about compiling time.c a
> >> 2nd time into a dummy object file, with a conditional enabled to force
> >> assuming CONSTANT_TSC, and with that conditional used to suppress
> >> presence of the field as well as all audited used of it (i.e. in
> >> particular that large part of local_time_calibration())? Unexpected new
> >> users of the field would then cause build time errors.
> >
> > Wouldn't that add quite a lot of churn to the file itself in the form
> > of pre-processor conditionals?
>
> Possibly - I didn't try yet, simply because of fearing this might
> not be liked even without presenting it in patch form.
>
> > Could we instead set master_stime to an invalid value that would make
> > the consumers explode somehow?
>
> No idea whether there is any such "reliable" value.
>
> > I know there might be new consumers, but those should be able to
> > figure whether the value is sane by looking at the existing ones.
>
> This could be the hope, yes. But the effort of auditing the code to
> confirm the potential of optimizing this (after vaguely getting the
> impression there might be room) was non-negligible (in fact I did
> three runs just to be really certain). This in particular means
> that I'm in no way certain that looking at existing consumers would
> point out the possible pitfall.
>
> > Also, since this is only done on the BSP on the last iteration I
> > wonder if it really makes such a difference performance-wise to
> > warrant all this trouble.
>
> By "all this trouble", do you mean the outlined further steps or
> the patch itself?

Yes, either the further steps or the fact that we would have to be
careful to not introduce new users of master_stime that expect it to
be set when CONSTANT_TSC is true.

> In the latter case, while it's only the BSP to
> read the value, all other CPUs are waiting for the BSP to get its
> part done. So the extra time it takes to read the platform clock
> affects the overall duration of the rendezvous, and hence the time
> not "usefully" spent by _all_ of the CPUs.

Right, but that's only during the time rendezvous, which doesn't
happen that often. And I guess that just the rendezvous of all CPUs is
the biggest hit in terms of performance.

While I don't think I would have done the work myself, I guess there's
no reason to block it. In any case I would prefer if such
performance-related changes come with some proof that they do indeed
make a difference, or else we might just be making the code more
complicated for no concrete performance benefit.

Thanks, Roger.
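
For illustration only, the "invalid value" idea from the exchange above could look roughly like the sketch below. This is not code from the patch or from Xen's time.c: the struct layout, the STIME_POISON sentinel, and every function name are assumptions made up for the example, and whether a genuinely "reliable" invalid value exists for the real field is exactly the open question raised in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Signed 64-bit nanosecond count, modelled on Xen's s_time_t. */
typedef int64_t s_time_t;

/* Hypothetical sentinel: an assumption for this sketch, not a Xen constant. */
#define STIME_POISON INT64_MIN

/* Simplified stand-in for the per-CPU calibration record (assumed layout). */
struct calibration_sketch {
    uint64_t local_tsc;
    s_time_t local_stime;
    s_time_t master_stime;
};

/* Counts how often the (notionally expensive) platform timer gets read. */
static unsigned int platform_reads;

/* Stand-in for a slow platform timer read; returns a fake timestamp. */
static s_time_t read_platform_stime_sketch(void)
{
    ++platform_reads;
    return 1000 * (s_time_t)platform_reads;
}

/*
 * Rendezvous side: skip the platform timer read when the value would not be
 * consumed (constant TSC), and store the poison value instead.
 */
static void record_master_stime(struct calibration_sketch *c, bool constant_tsc)
{
    c->master_stime = constant_tsc ? STIME_POISON
                                   : read_platform_stime_sketch();
}

/* True if the field holds a real timestamp rather than the poison value. */
static bool master_stime_valid(const struct calibration_sketch *c)
{
    return c->master_stime != STIME_POISON;
}

/* Consumer side: "explode" (here: assert) if the field was never set. */
static s_time_t consume_master_stime(const struct calibration_sketch *c)
{
    assert(master_stime_valid(c));
    return c->master_stime;
}
```

With constant_tsc set, record_master_stime() leaves platform_reads untouched and a later consume_master_stime() call trips the assertion, which is the "explode somehow" behaviour asked about. The sketch only catches consumers that go through the accessor, so it does not give the build-time guarantee of the compile-twice proposal.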