|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [PATCH] xen/livepatch: Make check_for_livepatch_work() faster in the common case
When livepatching is enabled, this function is used all the time. Really do
check the fastpath first, and annotate it likely() as this is the right answer
100% of the time (to many significant figures).
This cuts out 3 pointer dereferences in the "nothing to do path", and it seems
the optimiser has an easier time too. Bloat-o-meter reports:
add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-57 (-57)
Function old new delta
check_for_livepatch_work.cold 1201 1183 -18
check_for_livepatch_work 1021 982 -39
which isn't too shabby for no logical change.
Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
CC: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
CC: Wei Liu <wl@xxxxxxx>
I'm still a little disappointed with the code generation. GCC still chooses
to set up the full stack frame (6 regs, +3 more slots) intermixed with the
per-cpu calculations.
In isolation, GCC can check the boolean without creating a stack frame:
<work_to_to>:
48 89 e2 mov %rsp,%rdx
48 8d 05 de e1 37 00 lea 0x37e1de(%rip),%rax #
ffff82d0405b6068 <per_cpu__work_to_do>
48 81 ca ff 7f 00 00 or $0x7fff,%rdx
8b 4a c1 mov -0x3f(%rdx),%ecx
48 8d 15 45 aa 39 00 lea 0x39aa45(%rip),%rdx #
ffff82d0405d28e0 <__per_cpu_offset>
48 8b 14 ca mov (%rdx,%rcx,8),%rdx
0f b6 04 02 movzbl (%rdx,%rax,1),%eax
c3 retq
but I can't find a way to convince GCC that it would be worth not setting up a
stack frame in in the common case, and having a few extra mov reg/reg's later
in the uncommon case.
I haven't tried manually splitting the function into a check() and a do()
function. Views on whether that might be acceptable? At a guess, do() would
need to be a static noinline to avoid it turning back into what it currently
is.
---
xen/common/livepatch.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 1209fea2566c..b6275339f663 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1706,15 +1706,15 @@ void check_for_livepatch_work(void)
s_time_t timeout;
unsigned long flags;
+ /* Fast path: no work to do. */
+ if ( likely(!per_cpu(work_to_do, cpu)) )
+ return;
+
/* Only do any work when invoked in truly idle state. */
if ( system_state != SYS_STATE_active ||
!is_idle_domain(current->sched_unit->domain) )
return;
- /* Fast path: no work to do. */
- if ( !per_cpu(work_to_do, cpu ) )
- return;
-
smp_rmb();
/* In case we aborted, other CPUs can skip right away. */
if ( !livepatch_work.do_work )
base-commit: 49818cde637b5ec20383e46b71f93b2e7d867686
--
2.30.2
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |