
[xen master] x86/mwait-idle: enable interrupts before C1 on Xeons



commit b17e0ec72eded037297f34a233655aad23f64711
Author:     Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
AuthorDate: Wed Feb 2 10:28:29 2022 +0100
Commit:     Jan Beulich <jbeulich@xxxxxxxx>
CommitDate: Wed Feb 2 10:28:29 2022 +0100

    x86/mwait-idle: enable interrupts before C1 on Xeons
    
    Enable local interrupts before requesting C1 on the last two generations
    of Intel Xeon platforms: Sky Lake, Cascade Lake, Cooper Lake, Ice Lake.
    This decreases average C1 interrupt latency by about 5-10%, as measured
    with the 'wult' tool.
    
    The '->enter()' function of the driver enters C-states with local
    interrupts disabled by executing the 'monitor' and 'mwait' pair of
    instructions. If an interrupt happens, the CPU exits the C-state and
    continues executing instructions after 'mwait'. It does not jump to
    the interrupt handler, because local interrupts are disabled. The
    cpuidle subsystem enables interrupts a bit later, after doing some
    housekeeping.
    
    With this patch, we enable local interrupts before requesting C1. In
    this case, if the CPU wakes up because of an interrupt, it will jump
    to the interrupt handler right away. The cpuidle housekeeping will be
    done after the pending interrupt(s) are handled.
    
    Enabling interrupts before entering a C-state has a measurable impact
    for faster C-states, like C1. Deeper, but slower C-states like C6 do
    not really benefit from this sort of change, because their latency is
    a lot higher compared to the delay added by cpuidle housekeeping.
    
    This change was also tested with cyclictest and dbench. In case of Ice
    Lake, the average cyclictest latency decreased by 5.1%, and the average
    'dbench' throughput increased by about 0.8%. Both tests were run for 4
    hours with only C1 enabled (all other idle states, including 'POLL',
    were disabled). CPU frequency was pinned to HFM, and uncore frequency
    was pinned to the maximum value. The other platforms had similar
    single-digit percentage improvements.
    
    It is worth noting that this patch affects 'cpuidle' statistics a tiny
    bit.  Before this patch, C1 residency did not include the interrupt
    handling time, but with this patch, it will include it. This is similar
    to what happens in case of the 'POLL' state, which also runs with
    interrupts enabled.
    
    Suggested-by: Len Brown <len.brown@xxxxxxxxx>
    Signed-off-by: Artem Bityutskiy <artem.bityutskiy@xxxxxxxxxxxxxxx>
    [Linux commit: c227233ad64c77e57db738ab0e46439db71822a3]
    
    We don't have a pointer into cpuidle_state_table[] readily available.
    To compensate, propagate the flag into struct acpi_processor_cx.
    
    Unlike Linux we want to
    - disable IRQs again after MWAITing, as subsequently invoked functions
      assume so,
    - avoid enabling IRQs if cstate_restore_tsc() is not a no-op, to avoid
      interfering with, in particular, the time rendezvous.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
---
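Illustration only, not part of the commit: the interplay between the
init-time flag propagation and the idle entry path can be sketched in
user space roughly as below. Only CPUIDLE_FLAG_IRQ_ENABLE and the field
name irq_enable_early come from the patch; the model_* names, the
trimmed-down structures and the boolean standing in for the CPU's
interrupt flag are hypothetical, and nothing here executes MWAIT.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag value as defined by the patch; everything else is a stand-in. */
#define CPUIDLE_FLAG_IRQ_ENABLE 0x8000u

/* Hypothetical, trimmed-down stand-ins for the Xen structures. */
struct model_state { uint32_t flags; };
struct model_cx { bool irq_enable_early; };

/* Hypothetical model of one CPU's local interrupt flag. */
static bool model_irqs_on;

/*
 * Init-time propagation: honour the table flag only when the TSC keeps
 * running in C-states, i.e. when restoring the TSC after wakeup is a no-op.
 */
static void model_cx_init(struct model_cx *cx, const struct model_state *st,
                          bool nonstop_tsc)
{
    cx->irq_enable_early =
        (st->flags & CPUIDLE_FLAG_IRQ_ENABLE) && nonstop_tsc;
}

/*
 * Idle entry: returns whether interrupts were enabled while "MWAITing",
 * i.e. whether a wakeup interrupt would be handled right away instead of
 * after the post-wakeup housekeeping.
 */
static bool model_enter(const struct model_cx *cx)
{
    bool irqs_on_during_mwait;

    model_irqs_on = false;                /* idle path starts with IRQs off */

    if (cx->irq_enable_early)
        model_irqs_on = true;             /* models local_irq_enable() */

    irqs_on_during_mwait = model_irqs_on; /* models the MWAIT itself */

    model_irqs_on = false;                /* models local_irq_disable() */

    return irqs_on_during_mwait;
}
```

In this model a state carrying the flag on a nonstop-TSC system reports
interrupts enabled during the wait, while the same state without a
nonstop TSC, or a state without the flag, does not. In every case the
function returns with interrupts disabled again, which is the
Xen-specific requirement noted in the description.
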
 xen/arch/x86/cpu/mwait-idle.c | 22 +++++++++++++++++++---
 xen/include/xen/cpuidle.h     |  1 +
 2 files changed, 20 insertions(+), 3 deletions(-)

diff --git a/xen/arch/x86/cpu/mwait-idle.c b/xen/arch/x86/cpu/mwait-idle.c
index 12524eefd4..24d073d315 100644
--- a/xen/arch/x86/cpu/mwait-idle.c
+++ b/xen/arch/x86/cpu/mwait-idle.c
@@ -107,6 +107,11 @@ static const struct cpuidle_state {
 } *cpuidle_state_table;
 
 #define CPUIDLE_FLAG_DISABLED          0x1
+/*
+ * Enable interrupts before entering the C-state. On some platforms and for
+ * some C-states, this may measurably decrease interrupt latency.
+ */
+#define CPUIDLE_FLAG_IRQ_ENABLE                0x8000
 /*
  * Set this flag for states where the HW flushes the TLB for us
  * and so we don't need cross-calls to keep it consistent.
@@ -539,7 +544,7 @@ static struct cpuidle_state __read_mostly skl_cstates[] = {
 static struct cpuidle_state __read_mostly skx_cstates[] = {
        {
                .name = "C1",
-               .flags = MWAIT2flg(0x00),
+               .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
                .exit_latency = 2,
                .target_residency = 2,
        },
@@ -561,7 +566,7 @@ static struct cpuidle_state __read_mostly skx_cstates[] = {
 static const struct cpuidle_state icx_cstates[] = {
        {
                .name = "C1",
-               .flags = MWAIT2flg(0x00),
+               .flags = MWAIT2flg(0x00) | CPUIDLE_FLAG_IRQ_ENABLE,
                .exit_latency = 1,
                .target_residency = 1,
        },
@@ -842,9 +847,15 @@ static void mwait_idle(void)
 
        update_last_cx_stat(power, cx, before);
 
-       if (cpu_is_haltable(cpu))
+       if (cpu_is_haltable(cpu)) {
+               if (cx->irq_enable_early)
+                       local_irq_enable();
+
                mwait_idle_with_hints(cx->address, MWAIT_ECX_INTERRUPT_BREAK);
 
+               local_irq_disable();
+       }
+
        after = alternative_call(cpuidle_get_tick);
 
        cstate_restore_tsc();
@@ -1335,6 +1346,11 @@ static int mwait_idle_cpu_init(struct notifier_block *nfb,
                cx->latency = cpuidle_state_table[cstate].exit_latency;
                cx->target_residency =
                        cpuidle_state_table[cstate].target_residency;
+               if ((cpuidle_state_table[cstate].flags &
+                    CPUIDLE_FLAG_IRQ_ENABLE) &&
+                   /* cstate_restore_tsc() needs to be a no-op */
+                   boot_cpu_has(X86_FEATURE_NONSTOP_TSC))
+                       cx->irq_enable_early = true;
 
                dev->count++;
        }
diff --git a/xen/include/xen/cpuidle.h b/xen/include/xen/cpuidle.h
index af50d37bb7..bd24a31e12 100644
--- a/xen/include/xen/cpuidle.h
+++ b/xen/include/xen/cpuidle.h
@@ -42,6 +42,7 @@ struct acpi_processor_cx
     u8 idx;
     u8 type;         /* ACPI_STATE_Cn */
     u8 entry_method; /* ACPI_CSTATE_EM_xxx */
+    bool irq_enable_early;
     u32 address;
     u32 latency;
     u32 target_residency;
--
generated by git-patchbot for /home/xen/git/xen.git#master



 

