[PATCH] xen/livepatch: Make check_for_livepatch_work() faster in the common case
When livepatching is enabled, this function is used all the time.  Really do
check the fastpath first, and annotate it likely() as this is the right answer
100% of the time (to many significant figures).

This cuts out 3 pointer dereferences in the "nothing to do" path, and it seems
the optimiser has an easier time too.  Bloat-o-meter reports:

  add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-57 (-57)
  Function                                     old     new   delta
  check_for_livepatch_work.cold               1201    1183     -18
  check_for_livepatch_work                    1021     982     -39

which isn't too shabby for no logical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
CC: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
CC: Wei Liu <wl@xxxxxxx>

I'm still a little disappointed with the code generation.  GCC still chooses
to set up the full stack frame (6 regs, +3 more slots) intermixed with the
per-cpu calculations.

In isolation, GCC can check the boolean without creating a stack frame:

  <work_to_to>:
    48 89 e2                mov    %rsp,%rdx
    48 8d 05 de e1 37 00    lea    0x37e1de(%rip),%rax        # ffff82d0405b6068 <per_cpu__work_to_do>
    48 81 ca ff 7f 00 00    or     $0x7fff,%rdx
    8b 4a c1                mov    -0x3f(%rdx),%ecx
    48 8d 15 45 aa 39 00    lea    0x39aa45(%rip),%rdx        # ffff82d0405d28e0 <__per_cpu_offset>
    48 8b 14 ca             mov    (%rdx,%rcx,8),%rdx
    0f b6 04 02             movzbl (%rdx,%rax,1),%eax
    c3                      retq

but I can't find a way to convince GCC that it would be worth not setting up a
stack frame in the common case, and having a few extra mov reg/reg's later in
the uncommon case.

I haven't tried manually splitting the function into a check() and a do()
function.  Views on whether that might be acceptable?  At a guess, do() would
need to be a static noinline to avoid it turning back into what it currently
is.
---
 xen/common/livepatch.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 1209fea2566c..b6275339f663 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1706,15 +1706,15 @@ void check_for_livepatch_work(void)
     s_time_t timeout;
     unsigned long flags;
 
+    /* Fast path: no work to do. */
+    if ( likely(!per_cpu(work_to_do, cpu)) )
+        return;
+
     /* Only do any work when invoked in truly idle state. */
     if ( system_state != SYS_STATE_active ||
          !is_idle_domain(current->sched_unit->domain) )
         return;
 
-    /* Fast path: no work to do. */
-    if ( !per_cpu(work_to_do, cpu ) )
-        return;
-
     smp_rmb();
     /* In case we aborted, other CPUs can skip right away. */
     if ( !livepatch_work.do_work )

base-commit: 49818cde637b5ec20383e46b71f93b2e7d867686
-- 
2.30.2
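
For the check()/do() split floated in the notes above, a minimal standalone sketch of the shape it might take follows. This is not Xen code and not part of the patch: the `likely()` and `noinline` macros are local stand-ins assumed to behave like Xen's, the `work_to_do[]` array is a hypothetical substitute for the real per-CPU variable, and `do_livepatch_work()` and `slow_path_calls` are names invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins (assumptions), not Xen's actual definitions. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define noinline    __attribute__((__noinline__))

#define NR_CPUS 4

/* Hypothetical substitute for the per-CPU work_to_do flag. */
static bool work_to_do[NR_CPUS];

/* Counts slow-path entries so the behaviour can be observed. */
static unsigned int slow_path_calls;

/*
 * Hypothetical do() half.  In the real function this is where the
 * idle-state checks, smp_rmb() and the patching work would live.
 * noinline keeps the compiler from folding it back into the caller,
 * which would bring the large stack frame back with it.
 */
static noinline void do_livepatch_work(unsigned int cpu)
{
    assert(cpu < NR_CPUS);
    slow_path_calls++;
    work_to_do[cpu] = false;
}

/*
 * check() half: a single flag test, annotated likely() because
 * "nothing to do" is the overwhelmingly common answer.
 */
static inline void check_for_livepatch_work(unsigned int cpu)
{
    assert(cpu < NR_CPUS);

    /* Fast path: no work to do. */
    if ( likely(!work_to_do[cpu]) )
        return;

    do_livepatch_work(cpu);
}
```

With this shape the fast path is small enough that a compiler has the option of emitting it without a full frame, but whether GCC actually does so for the real function would need checking against the generated code.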