Re: [PATCH][4.15] Performance regression due to XSA-336
On 24.03.2021 22:05, Boris Ostrovsky wrote:
>
> (Re-sending with Stephen added)
>
>
> While running performance tests with recent XSAs backports to our product we've
> discovered significant regression in TPCC performance. With a particular guest
> kernel the numbers dropped by as much as 40%.

While the change is more intrusive than one would like at this point, an
up-to-40% regression imo makes this at least a change to be considered
for 4.15. I will admit though that before next week I won't get around
to look at this in any more detail than just having read through this
cover letter. But perhaps someone else might find time earlier.

> We've narrowed that down to XSA-336 patch, specifically to the pt_migrate rwlock,
> and even more specifically to this lock being taken in pt_update_irq().
>
> We have quite a large guest (92 VCPUs) doing lots of VMEXITs and the theory is
> that lock's cnts atomic is starting to cause lots of coherence traffic. As a
> quick test of this replacing pt_vcpu_lock() in pt_update_irq() with just
> spin_lock(&v->arch.hvm_vcpu.tm_lock) gets us almost all performance back.
>
> Stephen Brennan came up with new locking algorithm, I just coded it up.
>
> A couple of notes:
>
> * We have only observed the problem and tested this patch for performance on
>   a fairly old Xen version. However, vpt code is almost identical and I expect
>   upstream to suffer from the same issue.
>
> * Stephen provided the following (slightly edited by me) writeup explaining the
>   locking algorithm. I would like to include it somewhere but not sure what the
>   right place would be. Commit message perhaps?

If nowhere else, then definitely in the commit message. But perhaps it could
(also) sit in some form right ahead of pt_lock() / pt_unlock()?

Jan