|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 7/7] x86/irq: forward pending interrupts to new destination in fixup_irqs()
On Tue, Jun 11, 2024 at 03:50:42PM +0200, Jan Beulich wrote:
> On 10.06.2024 16:20, Roger Pau Monne wrote:
> > fixup_irqs() is used to evacuate interrupts from to be offlined CPUs. Given
> > the CPU is to become offline, the normal migration logic used by Xen where
> > the
> > vector in the previous target(s) is left configured until the interrupt is
> > received on the new destination is not suitable.
> >
> > Instead attempt to do as much as possible in order to prevent loosing
> > interrupts. If fixup_irqs() is called from the CPU to be offlined (as is
> > currently the case) attempt to forward pending vectors when interrupts that
> > target the current CPU are migrated to a different destination.
> >
> > Additionally, for interrupts that have already been moved from the current
> > CPU
> > prior to the call to fixup_irqs() but that haven't been delivered to the new
> > destination (iow: interrupts with move_in_progress set and the current CPU
> > set
> > in ->arch.old_cpu_mask) also check whether the previous vector is pending
> > and
> > forward it to the new destination.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > Changes since v1:
> > - Rename to apic_irr_read().
> > ---
> > xen/arch/x86/include/asm/apic.h | 5 +++++
> > xen/arch/x86/irq.c | 37 ++++++++++++++++++++++++++++++++-
> > 2 files changed, 41 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/include/asm/apic.h
> > b/xen/arch/x86/include/asm/apic.h
> > index d1cb001fb4ab..7bd66dc6e151 100644
> > --- a/xen/arch/x86/include/asm/apic.h
> > +++ b/xen/arch/x86/include/asm/apic.h
> > @@ -132,6 +132,11 @@ static inline bool apic_isr_read(uint8_t vector)
> > (vector & 0x1f)) & 1;
> > }
> >
> > +static inline bool apic_irr_read(unsigned int vector)
> > +{
> > + return apic_read(APIC_IRR + (vector / 32 * 0x10)) & (1U << (vector %
> > 32));
> > +}
> > +
> > static inline u32 get_apic_id(void)
> > {
> > u32 id = apic_read(APIC_ID);
> > diff --git a/xen/arch/x86/irq.c b/xen/arch/x86/irq.c
> > index 54eabd23995c..ed262fb55f4a 100644
> > --- a/xen/arch/x86/irq.c
> > +++ b/xen/arch/x86/irq.c
> > @@ -2601,7 +2601,7 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> >
> > for ( irq = 0; irq < nr_irqs; irq++ )
> > {
> > - bool break_affinity = false, set_affinity = true;
> > + bool break_affinity = false, set_affinity = true, check_irr =
> > false;
> > unsigned int vector, cpu = smp_processor_id();
> > cpumask_t *affinity = this_cpu(scratch_cpumask);
> >
> > @@ -2649,6 +2649,25 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> > !cpumask_test_cpu(cpu, &cpu_online_map) &&
> > cpumask_test_cpu(cpu, desc->arch.old_cpu_mask) )
> > {
> > + /*
> > + * This to be offlined CPU was the target of an interrupt
> > that's
> > + * been moved, and the new destination target hasn't yet
> > + * acknowledged any interrupt from it.
> > + *
> > + * We know the interrupt is configured to target the new CPU at
> > + * this point, so we can check IRR for any pending vectors and
> > + * forward them to the new destination.
> > + *
> > + * Note the difference between move_in_progress or
> > + * move_cleanup_count being set. For the later we know the new
> > + * destination has already acked at least one interrupt from
> > this
> > + * source, and hence there's no need to forward any stale
> > + * interrupts.
> > + */
>
> I'm a little confused by this last paragraph: It talks about a difference,
> yet ...
>
> > + if ( apic_irr_read(desc->arch.old_vector) )
> > + send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)),
> > + desc->arch.vector);
>
> ... in the code being commented there's no difference visible. Hmm, I guess
> this is related to the enclosing if(). Maybe this could be worded a little
> differently, e.g. starting with "Note that for the other case -
> move_cleanup_count being non-zero - we know ..."?
Hm, I see. Yes, the difference is that for interrupts that have
move_cleanup_count set we don't forward pending interrupts in IRR on
this CPU. I put this here because I think it's more naturally
arranged with the rest of the comment. I can pull the whole comment
ahead if the if() if that's better.
> > @@ -2689,11 +2708,27 @@ void fixup_irqs(const cpumask_t *mask, bool verbose)
> > if ( desc->handler->disable )
> > desc->handler->disable(desc);
> >
> > + /*
> > + * If the current CPU is going offline and is (one of) the
> > target(s) of
> > + * the interrupt signal to check whether there are any pending
> > vectors
> > + * to be handled in the local APIC after the interrupt has been
> > moved.
> > + */
>
> After reading this a number of times, I think there wants to be a comma
> between
> "interrupt" and "signal". Or am I getting wrong what is being meant?
Indeed.
> > + if ( !cpu_online(cpu) && cpumask_test_cpu(cpu,
> > desc->arch.cpu_mask) )
> > + check_irr = true;
> > +
> > if ( desc->handler->set_affinity )
> > desc->handler->set_affinity(desc, affinity);
> > else if ( !(warned++) )
> > set_affinity = false;
> >
> > + if ( check_irr && apic_irr_read(vector) )
> > + /*
> > + * Forward pending interrupt to the new destination, this CPU
> > is
> > + * going offline and otherwise the interrupt would be lost.
> > + */
> > + send_IPI_mask(cpumask_of(cpumask_any(desc->arch.cpu_mask)),
> > + desc->arch.vector);
> > +
> > if ( desc->handler->enable )
> > desc->handler->enable(desc);
> >
>
> Down from here, after the loop, there's a 1ms window where latched but not
> yet delivered interrupts can be received. How's that playing together with
> the changes you're making? Aren't we then liable to get two interrupts, one
> at the old and one at the new source, in unknown order?
I was mistakenly thinking that clear_local_APIC() would block
interrupt delivery, but that's not the case, so yes, interrupts should
still be delivered in the window below.
Let me test without this last patch.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |