[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3] x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping



On Wed, Oct 30, 2024 at 11:57:39AM +0100, Jan Beulich wrote:
> On 30.10.2024 11:09, Roger Pau Monné wrote:
> > On Wed, Oct 30, 2024 at 10:41:40AM +0100, Jan Beulich wrote:
> >> On 29.10.2024 18:48, Roger Pau Monné wrote:
> >>> On Tue, Oct 29, 2024 at 05:43:24PM +0100, Jan Beulich wrote:
> >>>> On 29.10.2024 12:03, Roger Pau Monne wrote:
> >>>>> @@ -273,6 +293,13 @@ void __ioapic_write_entry(
> >>>>>      {
> >>>>>          __io_apic_write(apic, 0x11 + 2 * pin, eu.w2);
> >>>>>          __io_apic_write(apic, 0x10 + 2 * pin, eu.w1);
> >>>>> +        /*
> >>>>> +         * Called in clear_IO_APIC_pin() before io_apic_pin_eoi is 
> >>>>> allocated.
> >>>>> +         * Entry will be updated once the array is allocated and 
> >>>>> there's a
> >>>>> +         * write against the pin.
> >>>>> +         */
> >>>>> +        if ( io_apic_pin_eoi )
> >>>>> +            io_apic_pin_eoi[apic][pin] = e.vector;
> >>>>
> >>>> The comment here looks a little misleading to me. clear_IO_APIC_pin() 
> >>>> calls
> >>>> here to, in particular, set the mask bit. With the mask bit the vector 
> >>>> isn't
> >>>> meaningful anyway (and indeed clear_IO_APIC_pin() sets it to zero, at 
> >>>> which
> >>>> point recording IRQ_VECTOR_UNASSIGNED might be better than the bogus 
> >>>> vector
> >>>> 0x00).
> >>>
> >>> Note that clear_IO_APIC_pin() performs the call to
> >>> __ioapic_write_entry() with raw == false, at which point
> >>> __ioapic_write_entry() will call iommu_update_ire_from_apic() if IOMMU
> >>> IR is enabled.  The cached 'vector' value will be the IOMMU entry
> >>> offset for the AMD-Vi case, as the IOMMU code will perform the call to
> >>> __ioapic_write_entry() with raw == true.
> >>>
> >>> What matters is that the cached value matches what's written in the
> >>> IO-APIC RTE, and the current logic ensures this.
> >>>
> >>> What's the benefit of using IRQ_VECTOR_UNASSIGNED if the result is
> >>> reading the RTE and finding that vector == 0?
> >>
> >> It's not specifically the vector == 0 case alone. Shouldn't we leave
> >> the latched vector alone when writing an RTE with the mask bit set?
> > 
> > I'm not sure what's the benefit of the extra logic to detect such
> > cases, just to avoid a write to the io_apic_pin_eoi matrix.
> 
> Perhaps the largely theoretical concern towards having stale data
> somewhere. Yet ...
> 
> >> Any still pending EOI (there should be none aiui) can't possibly
> >> target the meaningless vector / index in such an RTE. Perhaps it was
> >> wrong to suggest to overwrite (with IRQ_VECTOR_UNASSIGNED) what we
> >> have on record.
> >>
> >> Yet at the same time there ought to be a case where the recorded
> >> indeed moves back to IRQ_VECTOR_UNASSIGNED.
> > 
> > The only purpose of the io_apic_pin_eoi matrix is to cache what's
> > currently in the RTE entry 'vector' field.  I don't think we should
> > attempt to add extra logic as to whether the entry is valid, or
> > masked.  Higher level layers should already take care of that.  The
> > only purpose of the logic added in this patch is to ensure the EOI is
> > performed using what's in the RTE vector field for the requested pin.
> > Anything else is out of scope IMO.
> > 
> > Another option, which would allow to make the matrix store uint8_t
> > elements would be to initialize it at allocation with the RTE vector
> > fields currently present, IOW: do a raw read of every RTE and set the
> > fetched vector field in io_apic_pin_eoi.  Would that be better to you,
> > as also removing the need to ever store IRQ_VECTOR_UNASSIGNED?
> 
> ... yes, that may make sense (and eliminate my concern there).
> 
> I wonder whether the allocation of the array then wouldn't better be
> moved earlier, to enable_IO_APIC(), such that clear_IO_APIC_pin()
> already can suitably update it. In fact, since that function writes
> zero[1], no extra reads would then be needed at all, and the array could
> simply start out all zeroed.

I agree with the suggestion to allocate and setup the io_apic_pin_eoi
matrix in enable_IO_APIC().  However, I'm not sure I follow your
suggestion about the matrix starting as all zeroes being a sane state.

I think we need to do the raw RTE reads in enable_IO_APIC() before
calling clear_IO_APIC(), otherwise clear_IO_APIC_pin() can call
__io_apic_eoi() before any __ioapic_write_entry() has been performed,
and hence the state of the RTE.vector field could possibly be out of
sync with the initial value in io_apic_pin_eoi, and the EOI not take
effect.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.