[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] xen/arm: GIC: unable to inject hw irq warning



On Mon, 8 Sep 2014, Stefano Stabellini wrote:
> On Sat, 6 Sep 2014, Vijay Kilari wrote:
> > On Thu, Sep 4, 2014 at 4:19 AM, Stefano Stabellini
> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > > On Wed, 3 Sep 2014, Vijay Kilari wrote:
> > >> Hi Stefano,
> > >>
> > >>  I face following warning messages for LPI/ITS interrupt.
> > >> Can you throw some light on this warning?
> > >>
> > >> (XEN) gic.c:394:d0v0 unable to inject hw irq=8736 into d0v0: already
> > >> active in LR0
> > >> (XEN) gic.c:394:d0v0 unable to inject hw irq=8736 into d0v0: already
> > >> active in LR0
> > >> (XEN) gic.c:394:d0v0 unable to inject hw irq=8736 into d0v0: already
> > >> active in LR1
> > >> (XEN) gic.c:394:d0v0 unable to inject hw irq=8736 into d0v0: already
> > >> active in LR1
> > >> (XEN) gic.c:394:d0v0 unable to inject hw irq=8736 into d0v0: already
> > >> active in LR1
> > >> (XEN) gic.c:394:d0v0 unable to inject hw irq=8736 into d0v0: already
> > >> active in LR0
> > >
> > > Hello Vijay,
> > > the warning is for a condition that should not happen. All the following
> > > must be true for the warning to trigger:
> > >
> > > - the irq is an hardware irq (p->desc != NULL)
> > >
> > > - the irq is active in the LR register (GICH_LR_ACTIVE)
> > >
> > > - we need to inject a second irq while the first one is still active
> > >
> > > The GICv2 specification explicitely states that it is not possible to
> > > set an hardware irq as pending and active in the LR registers (5-175).
> > 
> >    Here in the code we are checking only for LR active state(0x2) and not
> > for ''pending and active' (0x3) state.
> 
> True. If the irq is GICH_LR_ACTIVE we do not set it GICH_LR_PENDING for
> hardware interrupts.
> 
> 
> > > Regardless from the GICv2 specification, it shouldn't be possible to
> > > receive a second hardware interrupt in Xen while the guest is still
> > > handling the first one. So the question is: how is it possible that xen
> > > is even trying to inject a second irq while the first one is still
> > > active?  I would check how GIC_IRQ_GUEST_QUEUED was set.
> > 
> > Where in GICv2 specification it is men>tioned?.
> 
> It is explained in the general handling of interrupts, 3.2.
> 
> "When a processor takes the interrupt exception, it reads the GICC_IAR of its 
> CPU interface to acknowledge
> the interrupt. This read returns an Interrupt ID, and for an SGI, the source 
> processor ID, that the processor
> uses to select the correct interrupt handler. When it recognizes this read, 
> the GIC changes the state of the
> interrupt as follows:
> * if the pending state of the interrupt persists when the interrupt becomes 
> active, or if the interrupt is
> generated again, from pending to active and pending.
> * otherwise, from pending to active"
> 
> Later in the same chapter:
> 
> "A CPU interface never signals to the connected processor any interrupt
> that is active and pending. It only signals interrupts that are pending
> and have sufficient priority"
> 
> 
> > In case of GICv3 LPI's are edge triggered and there is no active state.
> > Once the hypervisor has EOI'd the LPI, it will be delivered again if
> > the device generates a new edge. See 4.3.2 and 4.1.5
> 
> Thanks for the pointer, very interesting!
> The spec also says that LPIs are never ACTIVE, so how is it possible
> that the irq is ACTIVE in the LR register in your case?
> Maybe only physical LPIs are never active but virtual LPIs can be
> active? That is plausible but very confusing. Could you please run a few
> tests to confirm this theory?
> 
> If that is correct then I think we'll have to make a few changes to
> accommodate LPIs:
> 
> - do not set the HW bit in GICH_LRs for LPIs, because the corresponding
> physical irq doesn't need to be deactivated anyway
> 
> - once we remove the HW bit, LPIs become purely virtual irqs at the
> GICH_LR level, therefore we can set them ACTIVE & PENDING in GICH_LR
> from gic_update_one_lr
> 
> See the attached patch as reference, not even compile tested.
> If the LPI is already PENDING in GICH_LRs, it is still possible to loose
> an interrupt but I think that is acceptable (interrupt coalescing).

One more thing: unfortunately a misconfigured device could cause an
interrupt storm and the hypervisor would be completely unprotected
against it.

It is not necessary to implement this in first instance, a comment in
the code would suffice, but I think that eventually we would need to:

- realize that Xen received an LPI that is already pending in an LR
register
- ask for a maintenance interrupt for the virtual LPI
- deactivate the physical LPI

Afterward upon receiving the maintenance interrupt:
- reactivate the physical LPI

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.