Re: [PATCH 4/6] x86/idle: Implement a new MWAIT IPI-elision algorithm
On 03/07/2025 5:36 pm, Roger Pau Monné wrote:
> On Wed, Jul 02, 2025 at 03:41:19PM +0100, Andrew Cooper wrote:
>> In order to elide IPIs, we must be able to identify whether a target CPU
>> is in MWAIT at the point it is woken up.  i.e. the store to wake it up
>> must also identify the state.
>>
>> Create a new in_mwait variable beside __softirq_pending, so we can use a
>> CMPXCHG to set the softirq while also observing the status safely.
>> Implement an x86 version of arch_pend_softirq() which does this.
>>
>> In mwait_idle_with_hints(), advertise in_mwait, with an explanation of
>> precisely what it means.  X86_BUG_MONITOR can be accounted for simply by
>> not advertising in_mwait.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Anthony PERARD <anthony.perard@xxxxxxxxxx>
>> CC: Michal Orzel <michal.orzel@xxxxxxx>
>> CC: Julien Grall <julien@xxxxxxx>
>> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>
>> This is modelled after Linux's TIF_NEED_RESCHED (single-bit equivalent of
>> all of __softirq_pending), and TIF_POLLING_NRFLAG (arch-neutral
>> "in_mwait").
>>
>> In Linux, they're both in the same flags field, which adds complexity.
>> In Xen, __softirq_pending is already unsigned long for everything other
>> than x86, so adding an arch-neutral "in_mwait" would need wider changes.
>> ---
>>  xen/arch/x86/acpi/cpu_idle.c       | 20 +++++++++++++++++-
>>  xen/arch/x86/include/asm/hardirq.h | 14 +++++++++++-
>>  xen/arch/x86/include/asm/softirq.h | 34 ++++++++++++++++++++++++++++++
>>  3 files changed, 66 insertions(+), 2 deletions(-)
>>
>> diff --git a/xen/arch/x86/acpi/cpu_idle.c b/xen/arch/x86/acpi/cpu_idle.c
>> index caa0fef0b3b1..13040ef467a0 100644
>> --- a/xen/arch/x86/acpi/cpu_idle.c
>> +++ b/xen/arch/x86/acpi/cpu_idle.c
>> @@ -439,7 +439,21 @@ __initcall(cpu_idle_key_init);
>>  void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>  {
>>      unsigned int cpu = smp_processor_id();
>> -    const unsigned int *this_softirq_pending = &softirq_pending(cpu);
>> +    irq_cpustat_t *stat = &irq_stat[cpu];
>> +    const unsigned int *this_softirq_pending = &stat->__softirq_pending;
>> +
>> +    /*
>> +     * By setting in_mwait, we promise to other CPUs that we'll notice
>> +     * changes to __softirq_pending without being sent an IPI.  We
>> +     * achieve this by either not going to sleep, or by having hardware
>> +     * notice on our behalf.
>> +     *
>> +     * Some errata exist where MONITOR doesn't work properly, and the
>> +     * workaround is to force the use of an IPI.  Cause this to happen
>> +     * by simply not advertising ourselves as being in_mwait.
>> +     */
>> +    alternative_io("movb $1, %[in_mwait]",
>> +                   "", X86_BUG_MONITOR,
>> +                   [in_mwait] "=m" (stat->in_mwait));
>>
>>      monitor(this_softirq_pending, 0, 0);
>>      smp_mb();
>> @@ -452,6 +466,10 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>          mwait(eax, ecx);
>>          spec_ctrl_exit_idle(info);
>>      }
>> +
>> +    alternative_io("movb $0, %[in_mwait]",
>> +                   "", X86_BUG_MONITOR,
>> +                   [in_mwait] "=m" (stat->in_mwait));
>
> Isn't it a bit overkill to use alternatives here when you could have a
> conditional based on a global variable to decide whether to set/clear
> in_mwait?

I view it differently.  Why should the common case suffer overhead (an
extra memory read and conditional branch) in a common path, just to work
around 3 buggy pieces of hardware?
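Stripped of the alternative_io plumbing, the waiter-side protocol the
hunk above implements looks roughly like the sketch below.  This is a
simplified illustration, not the literal patch: plain stores stand in
for the X86_BUG_MONITOR alternatives, the spec_ctrl handling is omitted,
and mwait_idle_sketch is a made-up name.

    void mwait_idle_sketch(unsigned int eax, unsigned int ecx)
    {
        irq_cpustat_t *stat = &irq_stat[smp_processor_id()];

        /*
         * Advertise in_mwait: from this point, a remote CPU storing into
         * __softirq_pending may skip the wake-up IPI, because the armed
         * MONITOR observes the write on our behalf.
         */
        stat->in_mwait = true;

        monitor(&stat->__softirq_pending, 0, 0);
        smp_mb();    /* order the arming against the re-check below */

        /*
         * Re-check after arming.  A softirq raised before MONITOR
         * executed wouldn't trigger a wakeup, so must be caught here;
         * one raised after MONITOR causes MWAIT to fall straight through.
         */
        if ( !stat->__softirq_pending )
            mwait(eax, ecx);

        stat->in_mwait = false;    /* wake-ups require an IPI again */
    }

The smp_mb() is what pairs this with the remote CMPXCHG in
arch_pend_softirq(): either the remote store is seen by the re-check, or
the armed MONITOR sees it and MWAIT doesn't sleep on it.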
>>  }
>>
>>  static void acpi_processor_ffh_cstate_enter(struct acpi_processor_cx *cx)
>> diff --git a/xen/arch/x86/include/asm/hardirq.h b/xen/arch/x86/include/asm/hardirq.h
>> index f3e93cc9b507..1647cff04dc8 100644
>> --- a/xen/arch/x86/include/asm/hardirq.h
>> +++ b/xen/arch/x86/include/asm/hardirq.h
>> @@ -5,7 +5,19 @@
>>  #include <xen/types.h>
>>
>>  typedef struct {
>> -    unsigned int __softirq_pending;
>> +    /*
>> +     * The layout is important.  Any CPU can set bits in
>> +     * __softirq_pending, but in_mwait is a status bit owned by the CPU.
>> +     * softirq_mwait_raw must cover both, and must be in a single
>> +     * cacheline.
>> +     */
>> +    union {
>> +        struct {
>> +            unsigned int __softirq_pending;
>> +            bool in_mwait;
>
> Given the usage in assembly where you store 0 and 1, this might better
> be a uint8_t then?

We have loads of asm code which accesses bools like this.  C guarantees
that sizeof(bool) >= sizeof(char).

>
>> +        };
>> +        uint64_t softirq_mwait_raw;
>> +    };
>
> This could be a named union type ...
>
>> +
>>      unsigned int __local_irq_count;
>>      unsigned int nmi_count;
>>      unsigned int mce_count;
>> diff --git a/xen/arch/x86/include/asm/softirq.h b/xen/arch/x86/include/asm/softirq.h
>> index e4b194f069fb..069e5716a68d 100644
>> --- a/xen/arch/x86/include/asm/softirq.h
>> +++ b/xen/arch/x86/include/asm/softirq.h
>> @@ -1,6 +1,8 @@
>>  #ifndef __ASM_SOFTIRQ_H__
>>  #define __ASM_SOFTIRQ_H__
>>
>> +#include <asm/system.h>
>> +
>>  #define NMI_SOFTIRQ            (NR_COMMON_SOFTIRQS + 0)
>>  #define TIME_CALIBRATE_SOFTIRQ (NR_COMMON_SOFTIRQS + 1)
>>  #define VCPU_KICK_SOFTIRQ      (NR_COMMON_SOFTIRQS + 2)
>> @@ -9,4 +11,36 @@
>>  #define HVM_DPCI_SOFTIRQ       (NR_COMMON_SOFTIRQS + 4)
>>  #define NR_ARCH_SOFTIRQS       5
>>
>> +/*
>> + * Ensure softirq @nr is pending on @cpu.  Return true if an IPI can be
>> + * skipped, false if the IPI cannot be skipped.
>> + *
>> + * We use a CMPXCHG covering both __softirq_pending and in_mwait, in
>> + * order to set softirq @nr while also observing in_mwait in a
>> + * race-free way.
>> + */
>> +static always_inline bool arch_pend_softirq(unsigned int nr, unsigned int cpu)
>> +{
>> +    uint64_t *ptr = &irq_stat[cpu].softirq_mwait_raw;
>> +    uint64_t old, new;
>
> ... so that you also use it here?

No, it can't.  The use of softirq_pending() in common code requires that
there be no intermediate field names, and I'm not untangling that mess in
a series wanting backporting.

~Andrew
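The quoting above cuts arch_pend_softirq() off at its declarations.  A
plausible completion of the CMPXCHG loop, inferred from the commit
message rather than taken from the patch, might look like this sketch.
It assumes x86's little-endian layout, where in_mwait is the byte at
offset 4 of softirq_mwait_raw, i.e. bit 32 of the raw view:

    #define IN_MWAIT (1UL << 32)  /* in_mwait byte, above __softirq_pending */

    static always_inline bool arch_pend_softirq(unsigned int nr,
                                                unsigned int cpu)
    {
        uint64_t *ptr = &irq_stat[cpu].softirq_mwait_raw;
        uint64_t prev, old, new;

        old = ACCESS_ONCE(*ptr);

        for ( ;; )
        {
            new = old | (1UL << nr);

            if ( new == old )
                /* Softirq already pending; whoever set it handled the IPI. */
                return true;

            prev = cmpxchg(ptr, old, new);
            if ( prev == old )
                break;

            old = prev;    /* Raced with another update; retry. */
        }

        /*
         * The CMPXCHG succeeded, so 'new' is the value now in memory.  If
         * the target had advertised in_mwait at that instant, its armed
         * MONITOR observes our store and the IPI can be elided.
         */
        return new & IN_MWAIT;
    }

The point of folding both fields into one 64-bit access is that the
softirq bit is set and in_mwait sampled in a single atomic operation, so
there is no window in which the target can enter MWAIT unnoticed.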