
[Xen-changelog] [xen master] x86: suppress event check IPI to MWAITing CPUs



commit 9a727a813e9b25003e433b3dc3fa47e621f9e238
Author:     Jan Beulich <jbeulich@xxxxxxxx>
AuthorDate: Thu Sep 18 14:43:49 2014 +0200
Commit:     Jan Beulich <jbeulich@xxxxxxxx>
CommitDate: Thu Sep 18 14:43:49 2014 +0200

    x86: suppress event check IPI to MWAITing CPUs
    
    Mass wakeups (via vlapic_ipi()) can take enormous amounts of time,
    especially when many of the remote pCPUs are in deep C-states. For
    64-vCPU Windows Server 2012 R2 guests on Ivy Bridge hardware,
    accumulated times of over 2ms were observed (average 1.1ms).
    Considering that Windows broadcasts IPIs from its timer interrupt,
    which at least at certain times can run at 1kHz, it is clear that this
    can't result in good guest behavior. In fact, on said hardware, guests
    with significantly more than 40 vCPUs simply hung when e.g. Server
    Manager was started.
    
    Recognizing that writes to softirq_pending() already wake remote CPUs
    from MWAIT (the field is co-located on the same cache line as
    mwait_wakeup()), we can avoid sending IPIs to CPUs we know are in a
    (deep) C-state entered via MWAIT.
    
    With this, average broadcast times for a 64-vCPU guest went down to a
    measured maximum of 255us (which is still quite a lot).
    
    One aspect worth noting is that cpumask_raise_softirq() is brought in
    sync with cpu_raise_softirq() here: neither now attempts to raise a
    self-IPI on the processing CPU.
    
    Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
    Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
    Reviewed-by: Tim Deegan <tim@xxxxxxx>
---
 xen/arch/x86/acpi/cpu_idle.c  |   12 +++++++++++-
 xen/common/softirq.c          |    9 ++++++---
 xen/include/asm-arm/softirq.h |    2 ++
 xen/include/asm-x86/softirq.h |    2 ++
 4 files changed, 21 insertions(+), 4 deletions(-)

diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
index 136c0b6..f72719c 100644
--- a/xen/arch/x86/acpi/cpu_idle.c
+++ b/xen/arch/x86/acpi/cpu_idle.c
@@ -330,6 +330,16 @@ void cpuidle_wakeup_mwait(cpumask_t *mask)
     cpumask_andnot(mask, mask, &target);
 }
 
+bool_t arch_skip_send_event_check(unsigned int cpu)
+{
+    /*
+     * This relies on softirq_pending() and mwait_wakeup() to access data
+     * on the same cache line.
+     */
+    smp_mb();
+    return !!cpumask_test_cpu(cpu, &cpuidle_mwait_flags);
+}
+
 void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
 {
     unsigned int cpu = smp_processor_id();
@@ -349,7 +359,7 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
      * Timer deadline passing is the event on which we will be woken via
      * cpuidle_mwait_wakeup. So check it now that the location is armed.
      */
-    if ( expires > NOW() || expires == 0 )
+    if ( (expires > NOW() || expires == 0) && !softirq_pending(cpu) )
     {
         cpumask_set_cpu(cpu, &cpuidle_mwait_flags);
         __mwait(eax, ecx);
diff --git a/xen/common/softirq.c b/xen/common/softirq.c
index 195f8ff..ea86671 100644
--- a/xen/common/softirq.c
+++ b/xen/common/softirq.c
@@ -70,12 +70,14 @@ void open_softirq(int nr, softirq_handler handler)
 
 void cpumask_raise_softirq(const cpumask_t *mask, unsigned int nr)
 {
-    int cpu;
+    unsigned int cpu, this_cpu = smp_processor_id();
     cpumask_t send_mask;
 
     cpumask_clear(&send_mask);
     for_each_cpu(cpu, mask)
-        if ( !test_and_set_bit(nr, &softirq_pending(cpu)) )
+        if ( !test_and_set_bit(nr, &softirq_pending(cpu)) &&
+             cpu != this_cpu &&
+             !arch_skip_send_event_check(cpu) )
             cpumask_set_cpu(cpu, &send_mask);
 
     smp_send_event_check_mask(&send_mask);
@@ -84,7 +86,8 @@ void cpumask_raise_softirq(const cpumask_t *mask, unsigned int nr)
 void cpu_raise_softirq(unsigned int cpu, unsigned int nr)
 {
     if ( !test_and_set_bit(nr, &softirq_pending(cpu))
-         && (cpu != smp_processor_id()) )
+         && (cpu != smp_processor_id())
+         && !arch_skip_send_event_check(cpu) )
         smp_send_event_check_cpu(cpu);
 }
 
diff --git a/xen/include/asm-arm/softirq.h b/xen/include/asm-arm/softirq.h
index 35e578a..976e0eb 100644
--- a/xen/include/asm-arm/softirq.h
+++ b/xen/include/asm-arm/softirq.h
@@ -3,6 +3,8 @@
 
 #define NR_ARCH_SOFTIRQS       0
 
+#define arch_skip_send_event_check(cpu) 0
+
 #endif /* __ASM_SOFTIRQ_H__ */
 /*
  * Local variables:
diff --git a/xen/include/asm-x86/softirq.h b/xen/include/asm-x86/softirq.h
index 9d8e2e1..7225dea 100644
--- a/xen/include/asm-x86/softirq.h
+++ b/xen/include/asm-x86/softirq.h
@@ -9,4 +9,6 @@
 #define PCI_SERR_SOFTIRQ       (NR_COMMON_SOFTIRQS + 4)
 #define NR_ARCH_SOFTIRQS       5
 
+bool_t arch_skip_send_event_check(unsigned int cpu);
+
 #endif /* __ASM_SOFTIRQ_H__ */
--
generated by git-patchbot for /home/xen/git/xen.git#master

_______________________________________________
Xen-changelog mailing list
Xen-changelog@xxxxxxxxxxxxx
http://lists.xensource.com/xen-changelog