[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/ioapic: sanitize IO-APIC pins before enabling the local APIC


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 12 Jul 2023 16:50:34 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=7fmrr+0oe+hp5dkJ3y2M5RfCMw4ULlntis3R15ewqVQ=; b=kcAWuq66bgl05LwOK7j3skje9BFFlBOR8nw2YBC6gaLb9YOiFRYcU3aEDdWMxFLuad5BT38uj10UGoGn3re67AgFjDka/qKUF9ke8nYI2kvh7/0L+OulrjtIaSP1+tU94qT5gjMo73TWDKHa0r5YpNgQoZYPVm2xs4VPLn9VaAGdpjgQ/CGql0qMmDMUrYtr56aszsUDTIER6Iwe8H2eVq0SQCm6cyA+DeCaujVqQk0oH219zymiPON1aFx3Fus60TsGOBkAoP9ejZVhummuOT8bSWj+uXBPA9RhYTPYU3i4iZj0E2Mpe0n8wX6SG4SJOXLIE8Nc9fN7KBQK76jBTg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=IhH4ZVNKSYuNHiLDblrALwKOuprX4FxB+AFU8/TjagABSspGhPqPdgpyi1/I9AyOLImYV9dPsMNeTUs2qGDnJUKqAWiIy9T7Ex4jgLpWWrLHlEhu3lQ5zRDdxGkvmb+uEbqAztrwl3aiZbsMUC9xPot47nhB0dxN4J1JnxSv+bcJLfo/yfcOx8AAtEBb+jN28KPrVCMARanoHsCAKJy6UlKXJ424QpfsZFQz7ysv83CYToqXDc5ZopzUbF2EEQY8mZkl5i3SkWf7XHULoWJ1mWrcJx5Y+X1jv3ab4ELdSP/Zw5c6GW32NfN4WDudb8Q+3NSokVRg5q2TSHIcMdYB4Q==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 12 Jul 2023 14:50:59 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 12.07.2023 15:53, Roger Pau Monné wrote:
> On Wed, Jul 12, 2023 at 12:07:43PM +0200, Jan Beulich wrote:
>> On 11.07.2023 15:02, Roger Pau Monné wrote:
>>> On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote:
>>>> On 10.07.2023 16:43, Roger Pau Monné wrote:
>>>>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote:
>>>>>> On 07.07.2023 11:53, Roger Pau Monne wrote:
>>>>>>> The current logic to init the local APIC and the IO-APIC does init the
>>>>>>> former first before doing any kind of sanitation on the IO-APIC pin
>>>>>>> configuration.  It's already noted on enable_IO_APIC() that Xen
>>>>>>> shouldn't trust the IO-APIC being empty at bootup.
>>>>>>>
>>>>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen
>>>>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and
>>>>>>> with a vector of 0 (all fields of the RTE are zeroed).  Once the local
>>>>>>> APIC is enabled periodic injections from such pin cause a storm of
>>>>>>> errors:
>>>>>>>
>>>>>>> APIC error on CPU0: 00(40), Received illegal vector
>>>>>>> APIC error on CPU0: 40(40), Received illegal vector
>>>>>>> APIC error on CPU0: 40(40), Received illegal vector
>>>>>>> APIC error on CPU0: 40(40), Received illegal vector
>>>>>>> APIC error on CPU0: 40(40), Received illegal vector
>>>>>>> APIC error on CPU0: 40(40), Received illegal vector
>>>>>>>
>>>>>>> That prevents Xen from booting.
>>>>>>
>>>>>> And I expect no RTE is in ExtInt mode, so one might conclude that
>>>>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit
>>>>>> of course there's then the question whether to change the RTE
>>>>>> rather than masking it. What do ACPI tables say?
>>>>>
>>>>> There's one relevant override:
>>>>>
>>>>> [668h 1640   1]                Subtable Type : 02 [Interrupt Source 
>>>>> Override]
>>>>> [669h 1641   1]                       Length : 0A
>>>>> [66Ah 1642   1]                          Bus : 00
>>>>> [66Bh 1643   1]                       Source : 00
>>>>> [66Ch 1644   4]                    Interrupt : 00000002
>>>>> [670h 1648   2]        Flags (decoded below) : 0000
>>>>>                                     Polarity : 0
>>>>>                                 Trigger Mode : 0
>>>>>
>>>>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is
>>>>> connected to.
>>>>
>>>> Then wouldn't we be better off converting that RTE to ExtInt? That
>>>> would allow PIC IRQs (not likely to exist, but still) to work
>>>> (without undue other side effects afaics), whereas masking this RTE
>>>> would not.
>>>
>>> I'm kind of worry of trying to automate this logic.  Should we always
>>> convert pin 0 to ExtInt if it's found unmasked, with and invalid
>>> vector and there's a source override entry for the IRQ?
>>>
>>> It seems weird to infer this just from the fact that pin 0 is all
>>> zeroed out.
>>
>> As you say in the earlier paragraph, it wouldn't be just that. It's
>> unmasked + bad vector + present source override (indicating that
>> nothing except perhaps a PIC is connected to the pin).
> 
> I can do this as a separate patch, but not really here IMO.  The
> purpose of this patch is strictly to perform the IO-APIC sanitation
> ahead of enabling the local APIC.  I will add a followup patch to the
> series, albeit I'm unconvinced we want to infer IO-APIC pin
> configuration based on firmware miss configurations.

Hmm. The question really is which of the changes we want to backport.
That one should be first. While it's largely guesswork, I'd be more
inclined to take the change that has less of an effect overall.

That said I can see that Linux have their enable_IO_APIC() calls
also ahead of setup_IO_APIC() (but after connect_bsp_APIC() and
setup_local_APIC()). IOW with your change we'd do the masking yet
earlier than them. This may of course even be advantageous, but
there may also be good reasons for the sequence they're using.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.