
Re: [Xen-devel] Ongoing/future speculative mitigation work



On Thu, Oct 25, 2018 at 11:02 AM George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
>
> On 10/25/2018 05:55 PM, Andrew Cooper wrote:
> > On 24/10/18 16:24, Tamas K Lengyel wrote:
> >>> A solution to this issue was proposed, whereby Xen synchronises siblings
> >>> on vmexit/entry, so we are never executing code in two different
> >>> privilege levels.  Getting this working would make it safe to continue
> >>> using hyperthreading even in the presence of L1TF.  Obviously, it's going
> >>> to come with a perf hit, but compared to disabling hyperthreading, all it's
> >>> got to do is beat a 60% perf hit to make it the preferable option for
> >>> making your system L1TF-proof.
> >> Could you shed some light on what tests were done where that 60%
> >> performance hit was observed? We have performed intensive stress-tests
> >> to confirm this but according to our findings turning off
> >> hyper-threading is actually improving performance on all machines we
> >> tested thus far.
> >
> > Aggregate inter- and intra-host disk and network throughput, which is a
> > reasonable approximation of a load of webserver VMs on a single
> > physical server.  Small packet IO was hit worst, as it has a very high
> > vcpu context switch rate between dom0 and domU.  Disabling HT means you
> > have half the number of logical cores to schedule on, which doubles the
> > mean time to next timeslice.
> >
> > In principle, for a fully optimised workload, HT gets you ~30% extra due
> > to increased utilisation of the pipeline functional units.  Some
> > resources are statically partitioned, while some are competitively
> > shared, and it's now been well proven that actions on one thread can have
> > a large effect on others.
> >
> > Two arbitrary vcpus are not an optimised workload.  If the perf
> > improvement you get from not competing in the pipeline is greater than
> > the perf loss from Xen's reduced capability to schedule, then disabling
> > HT would be an improvement.  I can certainly believe that this might be
> > the case for Qubes style workloads where you are probably not very
> > overprovisioned, and you probably don't have long running IO and CPU
> > bound tasks in the VMs.
>
> As another data point, I think it was MSCI who said they always disabled
> hyperthreading, because they also found that their workloads ran slower
> with HT than without.  Presumably they were doing massive number
> crunching, such that each thread was waiting on the ALU a significant
> portion of the time anyway; at which point the superscalar scheduling
> and/or reduction in cache efficiency would have brought performance from
> "no benefit" down to "negative benefit".
>

Thanks for the insights. Indeed, we are primarily concerned with
performance of Qubes-style workloads which may range from
no-oversubscription to heavily oversubscribed. It's not a workload we
can predict or optimize beforehand, so we are looking for a default
that would be 1) safe and 2) performant in the most general case
possible.
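
To make the trade-off described above concrete, here is a toy model in
Python. It is only a sketch under stated assumptions, not a measurement
and not how Xen's scheduler works: it assumes ideal round-robin
scheduling, assumes vcpus are spread across physical cores before any
siblings are doubled up, and folds all sibling interaction into a single
hypothetical `smt_gain` parameter (about +0.3 for the "fully optimised"
case mentioned above, possibly negative for two arbitrary vcpus). The
function names and the 30 ms timeslice are illustrative choices.

```python
def round_robin_period_ms(runnable_vcpus, logical_cpus, timeslice_ms):
    """Time for every runnable vcpu to receive one timeslice.

    Assumes ideal round-robin: each round runs `logical_cpus` vcpus at once.
    """
    rounds = -(-runnable_vcpus // logical_cpus)  # ceiling division
    return rounds * timeslice_ms


def aggregate_throughput(cores, runnable_vcpus, smt_gain, smt_enabled):
    """Throughput in units of "one vcpu running alone on a core".

    smt_gain (hypothetical parameter) is the extra throughput a core
    delivers when both siblings are busy: 0.3 means 1.3x, a negative
    value means the two threads hurt each other.
    """
    if not smt_enabled:
        # One thread per core; extra vcpus simply wait.
        return float(min(runnable_vcpus, cores))
    busy_threads = min(runnable_vcpus, 2 * cores)
    shared_cores = max(0, busy_threads - cores)      # cores running two vcpus
    single_cores = min(busy_threads, cores) - shared_cores
    return single_cores + shared_cores * (1 + smt_gain)


if __name__ == "__main__":
    # Halving the logical CPUs doubles the round-robin period.
    print(round_robin_period_ms(16, 8, 30), round_robin_period_ms(16, 4, 30))
    # 4 cores, 8 runnable vcpus: HT wins with a positive gain, loses with a negative one.
    for gain in (0.3, -0.1):
        on = aggregate_throughput(4, 8, gain, smt_enabled=True)
        off = aggregate_throughput(4, 8, gain, smt_enabled=False)
        print(gain, on, off)
```

In this model disabling HT only wins on raw throughput when `smt_gain`
is negative, but it always lengthens the wait for a timeslice once the
host is oversubscribed. That matches the argument above that the right
default depends on how oversubscribed the host is.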

Tamas
