Re: [PATCH 4/6] x86/idle: Implement a new MWAIT IPI-elision algorithm
On 03/07/2025 10:01 am, Jan Beulich wrote:
> On 02.07.2025 16:41, Andrew Cooper wrote:
>> In order to elide IPIs, we must be able to identify whether a target CPU
>> is in MWAIT at the point it is woken up.  i.e. the store to wake it up
>> must also identify the state.
>>
>> Create a new in_mwait variable beside __softirq_pending, so we can use a
>> CMPXCHG to set the softirq while also observing the status safely.
>> Implement an x86 version of arch_pend_softirq() which does this.
>>
>> In mwait_idle_with_hints(), advertise in_mwait, with an explanation of
>> precisely what it means.  X86_BUG_MONITOR can be accounted for simply by
>> not advertising in_mwait.
>>
>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> ---
>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> CC: Anthony PERARD <anthony.perard@xxxxxxxxxx>
>> CC: Michal Orzel <michal.orzel@xxxxxxx>
>> CC: Julien Grall <julien@xxxxxxx>
>> CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
>>
>> This is modelled after Linux's TIF_NEED_RESCHED (a single-bit equivalent
>> of all of __softirq_pending), and TIF_POLLING_NRFLAG (an arch-neutral
>> "in_mwait").
>>
>> In Linux, they're both in the same flags field, which adds complexity.
>> In Xen, __softirq_pending is already unsigned long for everything other
>> than x86, so adding an arch-neutral "in_mwait" would need wider changes.
> Why would the flag need to be arch-neutral?

Because it's not about mwait; it's about the ability to notice the rising
edge of TIF_NEED_RESCHED, and is implemented by more than just x86 in
Linux.

>> + */
>> +    alternative_io("movb $1, %[in_mwait]",
>> +                   "", X86_BUG_MONITOR,
>> +                   [in_mwait] "=m" (stat->in_mwait));
>>
>>     monitor(this_softirq_pending, 0, 0);
>>     smp_mb();
> Unlike alternative(), alternative_io() has no memory clobber.  To the
> compiler the two resulting asm()-s therefore have no dependency
> operand-wise[1].  The sole ordering criterion then is that they're both
> volatile asm()-s.
> It not being explicitly said (anywhere that I could find) that volatile
> guarantees such ordering, I wonder if we wouldn't better have an
> operand-based dependency between the two.  Otoh it may well be that we
> rely on such ordering elsewhere already, in which case having one more
> instance probably would be okay(ish).
>
> [1] It's actually worse than that, I think: monitor() also doesn't
> specify the location as a (memory) input.

The GCC bugzilla has plenty of statements that volatiles (which have
survived optimisation passes) can't be reordered.

>> @@ -452,6 +466,10 @@ void mwait_idle_with_hints(unsigned int eax, unsigned int ecx)
>>             mwait(eax, ecx);
>>             spec_ctrl_exit_idle(info);
>>         }
>> +
>> +    alternative_io("movb $0, %[in_mwait]",
>> +                   "", X86_BUG_MONITOR,
>> +                   [in_mwait] "=m" (stat->in_mwait));
> This doesn't strictly need to be an alternative, does it?  We can save a
> memory write in the buggy case, but that likely makes at most a marginal
> difference.

It doesn't *need* to be an alternative.  It could be:

    if ( !cpu_bug_monitor )
        ACCESS_ONCE(stat->in_mwait) = true;

but getting rid of the branch is an advantage too.

>> --- a/xen/arch/x86/include/asm/hardirq.h
>> +++ b/xen/arch/x86/include/asm/hardirq.h
>> @@ -5,7 +5,19 @@
>>  #include <xen/types.h>
>>
>>  typedef struct {
>> -    unsigned int __softirq_pending;
>> +    /*
>> +     * The layout is important.  Any CPU can set bits in
>> +     * __softirq_pending, but in_mwait is a status bit owned by the
>> +     * CPU.  softirq_mwait_raw must cover both, and must be in a
>> +     * single cacheline.
>> +     */
>> +    union {
>> +        struct {
>> +            unsigned int __softirq_pending;
>> +            bool in_mwait;
>> +        };
>> +        uint64_t softirq_mwait_raw;
>> +    };
> To guard against someone changing e.g.
> __softirq_pending to unsigned long while ignoring the comment, how about
> adding a BUILD_BUG_ON() checking both parts of the union are the same
> size (which of course would require naming either the union field within
> the containing struct or its struct member)?

That turns out to be very hard because of the definition of
softirq_pending() being common.  More below.

>> @@ -9,4 +11,36 @@
>>  #define HVM_DPCI_SOFTIRQ       (NR_COMMON_SOFTIRQS + 4)
>>  #define NR_ARCH_SOFTIRQS       5
>>
>> +/*
>> + * Ensure softirq @nr is pending on @cpu.  Return true if an IPI can be
>> + * skipped, false if the IPI cannot be skipped.
>> + *
>> + * We use a CMPXCHG covering both __softirq_pending and in_mwait, in
>> + * order to set softirq @nr while also observing in_mwait in a
>> + * race-free way.
>> + */
>> +static always_inline bool arch_pend_softirq(unsigned int nr, unsigned int cpu)
>> +{
>> +    uint64_t *ptr = &irq_stat[cpu].softirq_mwait_raw;
>> +    uint64_t old, new;
>> +    unsigned int softirq = 1U << nr;
>> +
>> +    old = ACCESS_ONCE(*ptr);
>> +
>> +    do {
>> +        if ( old & softirq )
>> +            /* Softirq already pending, nothing to do. */
>> +            return true;
>> +
>> +        new = old | softirq;
>> +
>> +    } while ( (old = cmpxchg(ptr, old, new)) != new );
> Don't you mean
>
>     } while ( (new = cmpxchg(ptr, old, new)) != old );
>
> here

No.  I'm pretty sure I don't.

> (with new latched ahead of the loop and old set from new inside it),
> like we have it elsewhere when we avoid the use of yet another variable,
> e.g. in inc_linear_{entries,uses}()?

Eww, no.  Having new and old backwards like that is borderline
obfuscation, and is unnecessary cognitive complexity for the reader.

>> +    /*
>> +     * We have caused the softirq to become pending.  If in_mwait was
>> +     * set, the target CPU will notice the modification and act on it.
>> +     */
>> +    return new & (1UL << 32);
> Hmm, this requires the layout to allow for even less re-arrangement than
> the comment ahead of the union says: You strictly require in_mwait to
> follow __softirq_pending immediately, and the (assembly) write to
> strictly write 1 into the field.  Could I talk you into at least
>
>     return new & (1UL << (offsetof(..., in_mwait) * 8));
>
> ?  This way the field being inspected would also be mentioned in the
> access itself (i.e. become grep-able beyond the mentioning in the
> comment).  And I sincerely dislike hard-coded literal numbers when they
> (relatively) easily can be expressed another way.

The nice way to do this would be a named union and a proper field, but
that doesn't work because of how softirq_pending() is declared.

I suppose I can use an offsetof() expression.

~Andrew