Re: [PATCH] x86/pv: Split pv_hypercall() in two
On 11.10.2021 20:05, Andrew Cooper wrote:
> The is_pv_32bit_vcpu() conditionals hide four lfences, with two taken on any
> individual path through the function. There is very little code common
> between compat and native, and context-dependent conditionals predict very
> badly for a period of time after context switch.
>
> Move do_entry_int82() from pv/traps.c into pv/hypercall.c, allowing
> _pv_hypercall() to be static and forced inline. The delta is:
>
>   add/remove: 0/0 grow/shrink: 1/1 up/down: 300/-282 (18)
>   Function                                     old     new   delta
>   do_entry_int82                                50     350    +300
>   pv_hypercall                                 579     297    -282
>
> which is tiny, but the perf implications are large:
>
>   Guest | Naples | Milan  | SKX    | CFL-R  |
>   ------+--------+--------+--------+--------+
>   pv64  |  17.4% |  15.5% |   2.6% |   4.5% |
>   pv32  |   1.9% |  10.9% |   1.4% |   2.5% |
>
> These are percentage improvements in raw TSC deltas for a xen_version
> hypercall, with obvious outliers excluded. Therefore, it is an idealised best
> case improvement.
>
> The pv64 path uses `syscall`, while the pv32 path uses `int $0x82` so
> necessarily has higher overhead. Therefore, dropping the lfences is less of
> an overall improvement.
>
> I don't know why the Naples pv32 improvement is so small, but I've double
> checked the numbers and they're correct. There's something we're doing which
> is a large overhead in the pipeline.
>
> On the Intel side, both systems are writing to MSR_SPEC_CTRL on
> entry/exit (SKX using the retrofitted microcode implementation, CFL-R using
> the hardware implementation), while SKX is suffering further from XPTI for
> Meltdown protection.
>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
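
For readers following the thread, the shape of the restructuring described in the commit message can be sketched roughly as below. This is not the actual Xen code: handle_native(), handle_compat(), dispatch(), entry_native() and entry_compat() are hypothetical names invented for illustration, and the always_inline attribute assumes GCC or Clang. The idea is that a common body is forced inline into two entry points, each passing a compile-time constant, so that each entry point keeps only its own path and the runtime compat/native conditional disappears.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the native and compat hypercall handling. */
static long handle_native(unsigned long arg) { return (long)arg + 1; }
static long handle_compat(unsigned int arg)  { return (long)arg + 2; }

/*
 * Common body, forced inline (GCC/Clang attribute).  'compat' is a constant
 * at every call site below, so after inlining the compiler can discard the
 * branch that is not taken instead of testing it at runtime.
 */
static inline __attribute__((always_inline))
long dispatch(unsigned long arg, bool compat)
{
    if ( compat )
        return handle_compat((unsigned int)arg);

    return handle_native(arg);
}

/* Two separate entry points, each specialised for one kind of guest. */
long entry_native(unsigned long arg) { return dispatch(arg, false); }
long entry_compat(unsigned long arg) { return dispatch(arg, true); }
```

With this layout, a call such as entry_native(41) only ever executes the native path; the cost is that the shared body is duplicated in both entry points, which is the kind of small size growth the bloat-o-meter output above reports.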