
Re: [PATCH for-4.22] char/ns16550: bound execution time of ns16550_interrupt()


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Thu, 25 Jun 2026 15:07:29 +0200
  • Cc: Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 25 Jun 2026 13:07:44 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Jun 25, 2026 at 01:31:26PM +0200, Jan Beulich wrote:
> On 25.06.2026 12:08, Roger Pau Monné wrote:
> > On Wed, Jun 24, 2026 at 10:01:36AM +0200, Jan Beulich wrote:
> >> On 23.06.2026 17:54, Roger Pau Monné wrote:
> >>> On Tue, Jun 23, 2026 at 04:27:12PM +0200, Jan Beulich wrote:
> >>>> On 23.06.2026 16:16, Roger Pau Monné wrote:
> >>>>> On Tue, Jun 23, 2026 at 03:44:06PM +0200, Jan Beulich wrote:
> >>>>>> On 23.06.2026 12:31, Roger Pau Monne wrote:
> >>>>>>> +    if ( uart->force_polling )
> >>>>>>> +        return;
> >>>>>>
> >>>>>> As the IRQ was disabled, is this even possible? I.e. should this be some
> >>>>>> kind of assertion or alike?
> >>>>>
> >>>>> Hm, I wasn't setting IRQ_DISABLED before, and hence needed this guard.
> >>>>> But now with IRQ_DISABLED being set in ->status do_IRQ() should filter
> >>>>> any stray interrupts.  I will attempt to add an ASSERT_UNREACHABLE()
> >>>>> here.
> >>>>
> >>>> Simply ASSERT(!uart->force_polling) should do here? It is not wrong to
> >>>> run the code below in release builds in such an event. If we kept getting
> >>>> interrupts (perhaps at a high frequency) we'd be in trouble anyway.
> >>>
> >>> No, I'm afraid I can't do it like that, I can't put an ASSERT there,
> >>> because we can still get into ns16550_interrupt() after the interrupt
> >>> has been disabled.  In do_IRQ() we have the following loop:
> >>>
> >>>     while ( desc->status & IRQ_PENDING )
> >>>     {
> >>>         desc->status &= ~IRQ_PENDING;
> >>>         spin_unlock_irq(&desc->lock);
> >>>
> >>>         tsc_in = tb_init_done ? get_cycles() : 0;
> >>>         action->handler(irq, action->dev_id);
> >>>         TRACE_TIME(TRC_HW_IRQ_HANDLED, irq, tsc_in, get_cycles());
> >>>
> >>>         spin_lock_irq(&desc->lock);
> >>>     }
> >>>
> >>> So if the device is generating further interrupts in the window with
> >>> IRQs enabled (while we execute the handler), we will keep looping
> >>> around this, without taking into account the setting of IRQ_DISABLED.
> >>
> >> Ah yes.
> >>
> >>> This is something that we might want to fix, so that the loop is bound
> >>> by IRQ_PENDING being set, and IRQ_DISABLED not, ie:
> >>>
> >>>     while ( (desc->status & (IRQ_PENDING | IRQ_DISABLED)) == IRQ_PENDING )
> >>
> >> Or perhaps ahead of the loop
> >>
> >>     desc->status &= ~IRQ_REPLAY;
> >>
> >>     if ( desc->status & IRQ_DISABLED )
> >>         goto out;
> >>
> >>     desc->status |= IRQ_PENDING;
> >>
> >>     /*
> >>      * Since we set PENDING, if another processor is handling a different
> >>      * instance of this same irq, the other processor will take care of it.
> >>      */
> >>     if ( desc->status & IRQ_INPROGRESS )
> >>         goto out;
> >>
> >>     desc->status |= IRQ_INPROGRESS;
> >>
> >> thus also having the comment no longer describe only part of the
> >> conditional.
> > 
> > I think this is racy.  An interrupt hitting in the window with
> > interrupts enabled ahead of the handler having set IRQ_DISABLED will
> > still set IRQ_PENDING, and thus the loop would get executed a further
> > time, and the handler called after IRQ_DISABLED having been set.
> 
> Hmm, I don't quite agree with how you put it, but I think I see what you mean.
> There's one question here, though: If PENDING is set first, and DISABLED only
> later, shouldn't that IRQ instance still be handled? If so, ...
> 
> > I think we need an extra condition in the loop, I see no way this can
> > be solved only by dealing with the concurrent setting of IRQ_PENDING.
> 
> ... such an extra condition would be wrong. If not, yes, I agree.

But PENDING is always set, regardless of whether the IRQ is disabled.
The normal flow in do_IRQ() is:

    desc->status |= IRQ_PENDING;

    /*
     * Since we set PENDING, if another processor is handling a different
     * instance of this same irq, the other processor will take care of it.
     */
    if ( desc->status & (IRQ_DISABLED | IRQ_INPROGRESS) )
        goto out;

I think it's valid to have both PENDING and DISABLED set with the
current logic.  In fact, the code in ack_edge_ioapic_irq() relies on
having both PENDING and DISABLED set to mask the source, as the
->disable hook for edge triggered IO-APIC pins is a no-op.
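
To illustrate, here is a standalone toy model of the flow above.  It is
only a sketch, not Xen code: the flag values, struct toy_desc and the
toy_*() helpers are made up for illustration, locking is left out, and the
"concurrent" interrupt is simulated by the handler setting IRQ_PENDING
itself.  The check_disabled parameter selects between the current loop
condition and the stricter one suggested earlier in the thread.

```c
#include <stdbool.h>

/* Illustrative flag values only; not the real Xen definitions. */
#define IRQ_INPROGRESS (1u << 0)
#define IRQ_DISABLED   (1u << 1)
#define IRQ_PENDING    (1u << 2)

struct toy_desc {
    unsigned int status;
    unsigned int handler_calls;        /* total handler invocations */
    unsigned int calls_after_disable;  /* invocations seeing IRQ_DISABLED */
};

/*
 * Hypothetical handler: on its first invocation it gives up on interrupts
 * (sets IRQ_DISABLED), and the device is assumed to raise another
 * interrupt in the same window (simulated by setting IRQ_PENDING again).
 */
static void toy_handler(struct toy_desc *desc)
{
    if ( desc->status & IRQ_DISABLED )
        desc->calls_after_disable++;

    if ( desc->handler_calls++ == 0 )
    {
        desc->status |= IRQ_DISABLED;  /* handler switches to polling */
        desc->status |= IRQ_PENDING;   /* simulated concurrent interrupt */
    }
}

/* Simplified model of the do_IRQ() flow quoted above, without locking. */
static void toy_do_irq(struct toy_desc *desc, bool check_disabled)
{
    /* PENDING is set unconditionally, even if the IRQ is disabled. */
    desc->status |= IRQ_PENDING;

    if ( desc->status & (IRQ_DISABLED | IRQ_INPROGRESS) )
        return;

    desc->status |= IRQ_INPROGRESS;

    while ( check_disabled
            ? (desc->status & (IRQ_PENDING | IRQ_DISABLED)) == IRQ_PENDING
            : (desc->status & IRQ_PENDING) != 0 )
    {
        desc->status &= ~IRQ_PENDING;
        toy_handler(desc);
    }

    desc->status &= ~IRQ_INPROGRESS;
}

/* Run one simulated interrupt and return the resulting descriptor state. */
static struct toy_desc toy_run(bool check_disabled)
{
    struct toy_desc desc = { 0 };

    toy_do_irq(&desc, check_disabled);

    return desc;
}
```

In this model toy_run(false) calls the handler a second time after
IRQ_DISABLED has been set, while toy_run(true) stops after the first call
and leaves both PENDING and DISABLED set in the status field.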

We could likely change all this to be more straightforward, but as
with the serial interrupt handling I would rather not do that change
during a code freeze.

Thanks, Roger.



 

