
Re: [PATCH v3] x86/io-apic: fix directed EOI when using AMD-Vi interrupt remapping


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Thu, 31 Oct 2024 09:37:28 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Willi Junga <xenproject@xxxxxx>, David Woodhouse <dwmw@xxxxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Thu, 31 Oct 2024 08:37:42 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 30.10.2024 13:26, Roger Pau Monné wrote:
> On Wed, Oct 30, 2024 at 11:57:39AM +0100, Jan Beulich wrote:
>> On 30.10.2024 11:09, Roger Pau Monné wrote:
>>> On Wed, Oct 30, 2024 at 10:41:40AM +0100, Jan Beulich wrote:
>>>> On 29.10.2024 18:48, Roger Pau Monné wrote:
>>>>> On Tue, Oct 29, 2024 at 05:43:24PM +0100, Jan Beulich wrote:
>>>>>> On 29.10.2024 12:03, Roger Pau Monne wrote:
>>>>>>> @@ -273,6 +293,13 @@ void __ioapic_write_entry(
>>>>>>>      {
>>>>>>>          __io_apic_write(apic, 0x11 + 2 * pin, eu.w2);
>>>>>>>          __io_apic_write(apic, 0x10 + 2 * pin, eu.w1);
>>>>>>> +        /*
>>>>>>> +         * Called in clear_IO_APIC_pin() before io_apic_pin_eoi is allocated.
>>>>>>> +         * Entry will be updated once the array is allocated and there's a
>>>>>>> +         * write against the pin.
>>>>>>> +         */
>>>>>>> +        if ( io_apic_pin_eoi )
>>>>>>> +            io_apic_pin_eoi[apic][pin] = e.vector;
>>>>>>
>>>>>> The comment here looks a little misleading to me. clear_IO_APIC_pin() calls
>>>>>> here to, in particular, set the mask bit. With the mask bit the vector isn't
>>>>>> meaningful anyway (and indeed clear_IO_APIC_pin() sets it to zero, at which
>>>>>> point recording IRQ_VECTOR_UNASSIGNED might be better than the bogus vector
>>>>>> 0x00).
>>>>>
>>>>> Note that clear_IO_APIC_pin() performs the call to
>>>>> __ioapic_write_entry() with raw == false, at which point
>>>>> __ioapic_write_entry() will call iommu_update_ire_from_apic() if IOMMU
>>>>> IR is enabled.  The cached 'vector' value will be the IOMMU entry
>>>>> offset for the AMD-Vi case, as the IOMMU code will perform the call to
>>>>> __ioapic_write_entry() with raw == true.
>>>>>
>>>>> What matters is that the cached value matches what's written in the
>>>>> IO-APIC RTE, and the current logic ensures this.
>>>>>
>>>>> What's the benefit of using IRQ_VECTOR_UNASSIGNED if the result is
>>>>> reading the RTE and finding that vector == 0?
>>>>
>>>> It's not specifically the vector == 0 case alone. Shouldn't we leave
>>>> the latched vector alone when writing an RTE with the mask bit set?
>>>
>>> I'm not sure what's the benefit of the extra logic to detect such
>>> cases, just to avoid a write to the io_apic_pin_eoi matrix.
>>
>> Perhaps the largely theoretical concern towards having stale data
>> somewhere. Yet ...
>>
>>>> Any still pending EOI (there should be none aiui) can't possibly
>>>> target the meaningless vector / index in such an RTE. Perhaps it was
>>>> wrong to suggest to overwrite (with IRQ_VECTOR_UNASSIGNED) what we
>>>> have on record.
>>>>
>>>> Yet at the same time there ought to be a case where the recorded
>>>> value indeed moves back to IRQ_VECTOR_UNASSIGNED.
>>>
>>> The only purpose of the io_apic_pin_eoi matrix is to cache what's
>>> currently in the RTE entry 'vector' field.  I don't think we should
>>> attempt to add extra logic as to whether the entry is valid, or
>>> masked.  Higher level layers should already take care of that.  The
>>> only purpose of the logic added in this patch is to ensure the EOI is
>>> performed using what's in the RTE vector field for the requested pin.
>>> Anything else is out of scope IMO.
>>>
>>> Another option, which would allow the matrix to store uint8_t
>>> elements, would be to initialize it at allocation with the RTE vector
>>> fields currently present, IOW: do a raw read of every RTE and set the
>>> fetched vector field in io_apic_pin_eoi.  Would that be better to you,
>>> as it also removes the need to ever store IRQ_VECTOR_UNASSIGNED?
>>
>> ... yes, that may make sense (and eliminate my concern there).
>>
>> I wonder whether the allocation of the array then wouldn't better be
>> moved earlier, to enable_IO_APIC(), such that clear_IO_APIC_pin()
>> already can suitably update it. In fact, since that function writes
>> zero[1], no extra reads would then be needed at all, and the array could
>> simply start out all zeroed.
> 
> I agree with the suggestion to allocate and set up the io_apic_pin_eoi
> matrix in enable_IO_APIC().  However, I'm not sure I follow your
> suggestion about the matrix starting as all zeroes being a sane state.
> 
> I think we need to do the raw RTE reads in enable_IO_APIC() before
> calling clear_IO_APIC(), otherwise clear_IO_APIC_pin() can call
> __io_apic_eoi() before any __ioapic_write_entry() has been performed,
> and hence the state of the RTE.vector field could possibly be out of
> sync with the initial value in io_apic_pin_eoi, and the EOI not take
> effect.

Oh, you're right of course. That's a (side) effect of wanting to always
use the cached value in __io_apic_eoi(), and hence never reading the RTE
there.
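
To spell the ordering problem out, here is a toy model. It is not Xen code:
every toy_* name is made up for illustration, and the "hardware" is reduced
to a vector field plus a Remote IRR bit per pin. It assumes only what was
said above, namely that a directed EOI takes effect on RTEs whose vector
field matches the written value, and that the EOI path uses the cached value
without reading the RTE.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define NR_PINS 4

/* Toy RTE: only the two fields that matter for this argument. */
struct toy_rte {
    uint8_t vector;
    bool remote_irr;
};

static struct toy_rte toy_rtes[NR_PINS]; /* stands in for the IO-APIC */
static uint8_t toy_pin_eoi[NR_PINS];     /* stands in for the cached vectors */

/* Directed EOI: clears Remote IRR on every RTE with a matching vector. */
static void toy_hw_eoi(uint8_t vector)
{
    for (unsigned int pin = 0; pin < NR_PINS; pin++)
        if (toy_rtes[pin].vector == vector)
            toy_rtes[pin].remote_irr = false;
}

/* EOI for a pin always uses the cached value and never reads the RTE. */
static void toy_eoi_pin(unsigned int pin)
{
    assert(pin < NR_PINS);
    toy_hw_eoi(toy_pin_eoi[pin]);
}

/* The "raw read of every RTE" step: copy current vectors into the cache. */
static void toy_seed_cache_from_rtes(void)
{
    for (unsigned int pin = 0; pin < NR_PINS; pin++)
        toy_pin_eoi[pin] = toy_rtes[pin].vector;
}

/* Pin 1 is left by "firmware" with a non-zero vector and Remote IRR set. */
static void toy_reset(uint8_t firmware_vector)
{
    memset(toy_rtes, 0, sizeof(toy_rtes));
    memset(toy_pin_eoi, 0, sizeof(toy_pin_eoi)); /* array starts all zeroed */
    toy_rtes[1].vector = firmware_vector;
    toy_rtes[1].remote_irr = true;
}

/* Returns true if the EOI issued for pin 1 cleared its Remote IRR bit. */
static bool toy_eoi_takes_effect(bool seed_cache)
{
    toy_reset(0x30);
    if (seed_cache)
        toy_seed_cache_from_rtes();
    toy_eoi_pin(1);
    return !toy_rtes[1].remote_irr;
}
```

With the cache merely zeroed, toy_eoi_takes_effect(false) reports that the
EOI missed pin 1, because the cached 0 does not match the vector still
sitting in the RTE. Seeding the cache from the RTEs first makes the same EOI
hit.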

Jan



 

