Xen project Mailing List

Re: [PATCH v3 for-4.19 2/3] x86/irq: handle moving interrupts in _assign_irq_vector()

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Tue, 18 Jun 2024 13:22:42 +0200

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>

Delivery-date: Tue, 18 Jun 2024 11:22:53 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Jun 17, 2024 at 03:31:13PM +0200, Jan Beulich wrote: > On 13.06.2024 18:56, Roger Pau Monne wrote: > > Currently there's logic in fixup_irqs() that attempts to prevent > > _assign_irq_vector() from failing, as fixup_irqs() is required to evacuate > > all > > interrupts from the CPUs not present in the input mask. The current logic > > in > > fixup_irqs() is incomplete, as it doesn't deal with interrupts that have > > move_cleanup_count > 0 and a non-empty ->arch.old_cpu_mask field. > > > > Instead of attempting to fixup the interrupt descriptor in fixup_irqs() so > > that > > _assign_irq_vector() cannot fail, introduce logic in _assign_irq_vector() > > to deal with interrupts that have either move_{in_progress,cleanup_count} > > set > > and no remaining online CPUs in ->arch.cpu_mask. > > > > If _assign_irq_vector() is requested to move an interrupt in the state > > described above, first attempt to see if ->arch.old_cpu_mask contains any > > valid > > CPUs that could be used as fallback, and if that's the case do move the > > interrupt back to the previous destination. Note this is easier because the > > vector hasn't been released yet, so there's no need to allocate and setup a > > new > > vector on the destination. > > > > Due to the logic in fixup_irqs() that clears offline CPUs from > > ->arch.old_cpu_mask (and releases the old vector if the mask becomes empty) > > it > > shouldn't be possible to get into _assign_irq_vector() with > > ->arch.move_{in_progress,cleanup_count} set but no online CPUs in > > ->arch.old_cpu_mask. > > > > However if ->arch.move_{in_progress,cleanup_count} is set and the interrupt > > has > > also changed affinity, it's possible the members of ->arch.old_cpu_mask are > > no > > longer part of the affinity set, move the interrupt to a different CPU part > > of > > the provided mask and keep the current ->arch.old_{cpu_mask,vector} for the > > pending interrupt movement to be completed. > > > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > > Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > > > --- a/xen/arch/x86/irq.c > > +++ b/xen/arch/x86/irq.c > > @@ -544,7 +544,58 @@ static int _assign_irq_vector(struct irq_desc *desc, > > const cpumask_t *mask) > > } > > > > if ( desc->arch.move_in_progress || desc->arch.move_cleanup_count ) > > - return -EAGAIN; > > + { > > + /* > > + * If the current destination is online refuse to shuffle. Retry > > after > > + * the in-progress movement has finished. > > + */ > > + if ( cpumask_intersects(desc->arch.cpu_mask, &cpu_online_map) ) > > + return -EAGAIN; > > + > > + /* > > + * Due to the logic in fixup_irqs() that clears offlined CPUs from > > + * ->arch.old_cpu_mask it shouldn't be possible to get here with > > + * ->arch.move_{in_progress,cleanup_count} set and no online CPUs > > in > > + * ->arch.old_cpu_mask. > > + */ > > + ASSERT(valid_irq_vector(desc->arch.old_vector)); > > + ASSERT(cpumask_intersects(desc->arch.old_cpu_mask, > > &cpu_online_map)); > > + > > + if ( cpumask_intersects(desc->arch.old_cpu_mask, mask) ) > > + { > > + /* > > + * Fallback to the old destination if moving is in progress > > and the > > + * current destination is to be offlined. This is only > > possible if > > + * the CPUs in old_cpu_mask intersect with the affinity mask > > passed > > + * in the 'mask' parameter. > > + */ > > + desc->arch.vector = desc->arch.old_vector; > > I'm a little puzzled that you use desc->arch.old_vector here, but ... old_vector can't be used here, as old_vector == desc->arch.vector at this point. The name of the variable is IMO a bit misleading, as it's value only becomes the old_vector once a new vector is assigned. It would be more appropriate to name it current_vector IMO. > > + cpumask_and(desc->arch.cpu_mask, desc->arch.old_cpu_mask, > > mask); > > + > > + /* Undo any possibly done cleanup. */ > > + for_each_cpu(cpu, desc->arch.cpu_mask) > > + per_cpu(vector_irq, cpu)[desc->arch.vector] = irq; > > + > > + /* Cancel the pending move and release the current vector. */ > > + desc->arch.old_vector = IRQ_VECTOR_UNASSIGNED; > > + cpumask_clear(desc->arch.old_cpu_mask); > > + desc->arch.move_in_progress = 0; > > + desc->arch.move_cleanup_count = 0; > > + if ( desc->arch.used_vectors ) > > + { > > + ASSERT(test_bit(old_vector, desc->arch.used_vectors)); > > + clear_bit(old_vector, desc->arch.used_vectors); > > ... old_vector here. Since we have the latter, uniformly using it might > be more consistent. Keep in mind that old_vector a cache of the value of desc->arch.vector at the start of the function. > I realize though that irq_to_vector() has cases where > it wouldn't return desc->arch.old_vector; I think, however, that in those > case we can't make it here. Still I'm not going to insist on making the > adjustment. Happy to make it though while committing, should you agree. > > Also I'm not happy to see another instance of this pattern appear. In > x86-specific code this is inefficient, as {set,clear}_bit resolve to the > same insn as test_and_{set,clear}_bit(). Therefore imo more efficient > would be > > if (!test_and_clear_bit(old_vector, desc->arch.used_vectors)) > ASSERT_UNREACHABLE(); > > (and then the two if()s folded). > > I've been meaning to propose a patch to the other similar sites ... Oh, indeed. IIRC I've copied this pattern from somewhere else without noticing. I'm happy for you to fold the test_and_clear_bit() call into the parent if condition. Thanks, Roger.

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.