[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] x86/flush: use APIC ALLBUT destination shorthand when possible
On Fri, Jan 03, 2020 at 01:08:20PM +0100, Jan Beulich wrote: > On 24.12.2019 13:44, Roger Pau Monne wrote: > > @@ -227,14 +233,47 @@ void flush_area_mask(const cpumask_t *mask, const > > void *va, unsigned int flags) > > if ( (flags & ~FLUSH_ORDER_MASK) && > > !cpumask_subset(mask, cpumask_of(cpu)) ) > > { > > + bool cpus_locked = false; > > + > > spin_lock(&flush_lock); > > cpumask_and(&flush_cpumask, mask, &cpu_online_map); > > cpumask_clear_cpu(cpu, &flush_cpumask); > > flush_va = va; > > flush_flags = flags; > > - send_IPI_mask(&flush_cpumask, INVALIDATE_TLB_VECTOR); > > + > > + /* > > + * Prevent any CPU hot{un}plug while sending the IPIs if we are to > > use > > + * a shorthand, also refuse to use a shorthand if not all CPUs are > > + * online or have been parked. > > + */ > > + if ( system_state > SYS_STATE_smp_boot && !cpu_overflow && > > + (cpus_locked = get_cpu_maps()) && > > + (park_offline_cpus || > > Why is it relevant whether we park offline CPUs, or whether we've > even brought up all of the ones a system has? An IPI, in particular > a broadcast one, shouldn't have any issue getting delivered if some > of the nominal recipients don't listen, should it? (The use of > cpu_online_map that was already there above is a sign - but not a > proof, as it may itself be buggy - that the set of online CPUs > fluctuating behind this function's back ought to not be a problem.) I've tried it myself, and if not all CPUs are onlined when the shorthand is used the box would just reboot. This matches the description at: https://lwn.net/Articles/793065/ Of the Linux shorthand implementation. > Further a question on lock nesting: Since the commit message > doesn't say anything in this regard, did you check there are no > TLB flush invocations with the get_cpu_maps() lock held? The CPU maps lock is a recursive one, so it should be fine to attempt a TLB flush with the lock already held. > Even if > you did and even if there are none, I think the function should > then get a comment attached to the effect of this lock order > inversion risk. (For example, it isn't obvious to me that no user > of stop_machine() would ever want to do any kind of TLB flushing.) > > Overall I wonder whether your goal couldn't be achieved without > the extra locking and without the special conditions. Hm, this so far has worked fine on all the boxes that I've tried. I'm happy to change it to a simpler approach, but I think the conditions and locking are required for this to work correctly. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |