[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 5/7] x86/irq: deal with old_cpu_mask for interrupts in movement in fixup_irqs()
On 12.06.2024 10:47, Roger Pau Monné wrote: > On Tue, Jun 11, 2024 at 02:45:09PM +0200, Jan Beulich wrote: >> On 10.06.2024 16:20, Roger Pau Monne wrote: >>> Given the current logic it's possible for ->arch.old_cpu_mask to get out of >>> sync: if a CPU set in old_cpu_mask is offlined and then onlined >>> again without old_cpu_mask having been updated the data in the mask will no >>> longer be accurate, as when brought back online the CPU will no longer have >>> old_vector configured to handle the old interrupt source. >>> >>> If there's an interrupt movement in progress, and the to be offlined CPU >>> (which >>> is the call context) is in the old_cpu_mask clear it and update the mask, >>> so it >>> doesn't contain stale data. >> >> This imo is too __cpu_disable()-centric. In the code you cover the >> smp_send_stop() case afaict, where it's all _other_ CPUs which are being >> offlined. As we're not meaning to bring CPUs online again in that case, >> dealing with the situation likely isn't needed. Yet the description should >> imo at least make clear that the case was considered. > > What about adding the following paragraph: Sounds good, just maybe one small adjustment: > Note that when the system is going down fixup_irqs() will be called by > smp_send_stop() from CPU 0 with a mask with only CPU 0 on it, > effectively asking to move all interrupts to the current caller (CPU > 0) which is the only CPU online. In that case we don't care to "... the only CPU to remain online." > migrate interrupts that are in the process of being moved, as it's > likely we won't be able to move all interrupts to CPU 0 due to vector > shortage anyway. > >> >>> @@ -2589,6 +2589,28 @@ void fixup_irqs(const cpumask_t *mask, bool verbose) >>> affinity); >>> } >>> >>> + if ( desc->arch.move_in_progress && >>> + !cpumask_test_cpu(cpu, &cpu_online_map) && >> >> This part of the condition is, afaict, what covers (excludes) the >> smp_send_stop() case. Might be nice to have a brief comment here, thus >> also clarifying ... > > Would you be fine with: > > if ( desc->arch.move_in_progress && > /* > * Only attempt to migrate if the current CPU is going > * offline, otherwise the whole system is going down and > * leaving stale interrupts is fine. > */ > !cpumask_test_cpu(cpu, &cpu_online_map) && > cpumask_test_cpu(cpu, desc->arch.old_cpu_mask) ) Sure, this is even more verbose (i.e. better) than I was after. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |