[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[PATCH] xen/livepatch: Make check_for_livepatch_work() faster in the common case


  • To: Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Fri, 22 Dec 2023 22:00:45 +0000
  • Authentication-results: esa3.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>, Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>, "Jan Beulich" <JBeulich@xxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 22 Dec 2023 22:01:10 +0000
  • Ironport-data: A9a23:tmoGEqAe5NEh8xVW/3Ljw5YqxClBgxIJ4kV8jS/XYbTApDlw0zZVy DcWXzjVP6qOMDTzL9lwbIm/9kgAsZCBm4U2QQY4rX1jcSlH+JHPbTi7wuUcHAvJd5GeExg3h yk6QoOdRCzhZiaE/n9BCpC48D8kk/nOH+KgYAL9EngZbRd+Tys8gg5Ulec8g4p56fC0GArIs t7pyyHlEAbNNwVcbCRMt8pvlDs15K6p4WlC5ARlDRx2lAS2e0c9Xcp3yZ6ZdxMUcqEMdsamS uDKyq2O/2+x13/B3fv8z94X2mVTKlLjFVDmZkh+AsBOsTAbzsAG6Y4pNeJ0VKtio27hc+ada jl6ncfYpQ8BZsUgkQmGOvVSO3kW0aZuoNcrLZUj2CA6IoKvn3bEmp1T4E8K0YIw69xXO1hy1 sEhAxckTDeF3uWG7ruLRbw57igjBJGD0II3v3hhyXfSDOo8QICFSKLPjTNa9G5u3IYUR6+YP pdIL2U3BPjDS0Qn1lM/IZQyhuq3wFL4dCVVsgm9rqsr+WnDigd21dABNfKMIIzbG5QExhrwS mTuzVWmAQ0gNsel+ze79VmTgPXAxx6nYddHfFG/3qEz2wDCroAJMzUGWF3+rfSnh0qWX9NEN 1dS6icotbI19kGgUp/6RRLQiH2DuAQVV5xPEuk5wAaXw6HQ7kCSAW1sZhxrZcEitcQ2bSc3z VLPlNTsbRRwtJWFRHTb8a2bxQ5eIgBMczVEP3VdC1JYsp+8+Onfky4jUP5yKo6Pt+HWJAjb5 Bmoiy5ngIU9hP4EgvDTEU/8v968mnTYZldqvl6NATv/v14RWWKzW2C/BbHmARd8wGWxFADpU IAswZT20Qz3JcjleNaxaOsMBqq1wP2OLSfRh1Vid7F4qGz2oSH7J9wBumgnTKuMDirjUWa4C HI/RCsLvMMDVJdURf4fj32N5zQCkvG7SIWNugH8ZdtSeJlhHDJrDwk3DXN8K1vFyRB2+YlmY MfzTCpZJSpCYUiR5GbsFrh1PH5C7nxW+F4/srihlEv+iefOOiTLIVrHWXPXBt0EAGq/iF292 75i2wGikn2zjMWWjvHrzLMu
  • Ironport-hdrordr: A9a23:ZIgAgqD0V6tEll/lHemg55DYdb4zR+YMi2TC1yhKJyC9Ffbo8P xG/c5rsSMc5wxwZJhNo7y90cq7MBbhHPxOkOos1N6ZNWGM0gaVxelZnO3fKlbbehEWmNQz6U 4ZSdkdNOHN
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

When livepatching is enabled, this function is used all the time.  Really do
check the fastpath first, and annotate it likely() as this is the right answer
100% of the time (to many significant figures).

This cuts out 3 pointer dereferences in the "nothing to do path", and it seems
the optimiser has an easier time too.  Bloat-o-meter reports:

  add/remove: 0/0 grow/shrink: 0/2 up/down: 0/-57 (-57)
  Function                                     old     new   delta
  check_for_livepatch_work.cold               1201    1183     -18
  check_for_livepatch_work                    1021     982     -39

which isn't too shabby for no logical change.

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
CC: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
CC: Wei Liu <wl@xxxxxxx>

I'm still a little disappointed with the code generation.  GCC still chooses
to set up the full stack frame (6 regs, +3 more slots) intermixed with the
per-cpu calculations.

In isolation, GCC can check the boolean without creating a stack frame:

  <work_to_to>:
    48 89 e2                mov    %rsp,%rdx
    48 8d 05 de e1 37 00    lea    0x37e1de(%rip),%rax        # 
ffff82d0405b6068 <per_cpu__work_to_do>
    48 81 ca ff 7f 00 00    or     $0x7fff,%rdx
    8b 4a c1                mov    -0x3f(%rdx),%ecx
    48 8d 15 45 aa 39 00    lea    0x39aa45(%rip),%rdx        # 
ffff82d0405d28e0 <__per_cpu_offset>
    48 8b 14 ca             mov    (%rdx,%rcx,8),%rdx
    0f b6 04 02             movzbl (%rdx,%rax,1),%eax
    c3                      retq

but I can't find a way to convince GCC that it would be worth not setting up a
stack frame in in the common case, and having a few extra mov reg/reg's later
in the uncommon case.

I haven't tried manually splitting the function into a check() and a do()
function.  Views on whether that might be acceptable?  At a guess, do() would
need to be a static noinline to avoid it turning back into what it currently
is.
---
 xen/common/livepatch.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/xen/common/livepatch.c b/xen/common/livepatch.c
index 1209fea2566c..b6275339f663 100644
--- a/xen/common/livepatch.c
+++ b/xen/common/livepatch.c
@@ -1706,15 +1706,15 @@ void check_for_livepatch_work(void)
     s_time_t timeout;
     unsigned long flags;
 
+    /* Fast path: no work to do. */
+    if ( likely(!per_cpu(work_to_do, cpu)) )
+        return;
+
     /* Only do any work when invoked in truly idle state. */
     if ( system_state != SYS_STATE_active ||
          !is_idle_domain(current->sched_unit->domain) )
         return;
 
-    /* Fast path: no work to do. */
-    if ( !per_cpu(work_to_do, cpu ) )
-        return;
-
     smp_rmb();
     /* In case we aborted, other CPUs can skip right away. */
     if ( !livepatch_work.do_work )

base-commit: 49818cde637b5ec20383e46b71f93b2e7d867686
-- 
2.30.2




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.