[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/ioapic: sanitize IO-APIC pins before enabling the local APIC


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Wed, 12 Jul 2023 17:41:52 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=G/IRc1XeWGs0W6e4UVFb3G6LChKbmKKuFcLi12nb4ZE=; b=F3G9SNj6sEAqAikU/Y/yCYoT0ZQtuUAzYc59D/uOBOfchMsu7vcr5Ulz4qFw7s3aZgrcCVnyW12m44awKgitWxsii/nRUhiIT+5OUjJvVjeL3LXk3M1BkwLub02jtLKqxgzqsIEbUxOpLAVIkJDnPXjVXOrgvu2Dx35tqIr3rTTlfwApP3zOw7uykH2TSlIGXvD9kdAQyP1/o0ZWKImeVJxb47FsfcCR1cY2xQZYz1vE4APrdPYdbLVku1QgKI/O03Z+93p+EDmYp7gBE5oVv+IV1RE09cif72U6/taS4GofSugY0otQ2IE0RFtHRwl3bYWRvH+78N5XxZvKhAjXPw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Y0Yoc+jFwM7A3ZMVBAZDYqyp4MggeRSlBqk6tODErs5iTFlB1d/QBiRYohd8xYjPRwnoNvVjZAEq2QtP639U1WTXIxsSl/VeSEWH95IFUzRTgmXWupX8op4HJLfSAJSwpCpk+mm5OmfblqPz+arcubEwPBRK0ZfAuSpmAJJnzeYDIRP+HNVRlzFa/oHdxjJFgiMNpZaiMnzre0PdzuhiAzi2YEQbBIVvJkwLnbMzlTdrU/AamzNLamN6SjcBo+v3UbZ3kRQ8H4KPZbkjeQQQe1yS71sRUDSRJa3O9OkPTiszccohqrybdcvEO3Z8mBoHkjMzHi7173idp9yPAAdmOQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Wed, 12 Jul 2023 15:42:30 +0000
  • Ironport-data: A9a23:CVwKSqjopMKLEfHfo8w44UPjX161RhEKZh0ujC45NGQN5FlHY01je htvWW2DP6zeNjegfNx3Pd7ioRwAscTSndYwSgo9pSowFyIb9cadCdqndUqhZCn6wu8v7q5Ex 55HNoSfdpBcolv0/ErF3m3J9CEkvU2wbuOgTrWCYmYpHlUMpB4J0XpLg/Q+jpNjne+3CgaMv cKai8DEMRqu1iUc3lg8sspvkzsx+qyr0N8klgZmP6sT4waEzyJ94K83fsldEVOpGuG4IcbiL wrz5OnR1n/U+R4rFuSknt7TGqHdauePVeQmoiM+t5mK2nCulARrukoIHKN0hXNsoyeIh7hMJ OBl7vRcf+uL0prkw4zxWzEAe8130DYvFLXveRBTuuTLp6HKnueFL1yDwyjaMKVBktubD12i+ tQ1EWs/QUGC3Nmk0e6KFOQ8254jPs3SadZ3VnFIlVk1DN4AaLWaG+DmwIEd2z09wMdTAfzZe swVLyJ1awjNaAFOPVFRD48imOCvhT/0dDgwRFC9/PJrpTSMilEhluG1abI5efTTLSlRtlyfq W/cuXzwHzkRNcCFyCrD+XWp7gPKtXqiCN9MRefmrZaGhnWTnmUpGU0QeWKYsMOB1kGDeehiI HIbr39GQa8asRbDosPGdx+yrWOAvxUcc8FNCOB84waIooLL5y6JC25CSSROAPQ2uclzSTE02 1uhm9LyGScpoLCTUWia9LqfsXW1Iyd9EIMZTSoNTA9A79y9pog210rLVow6SPfzicDpEzbtx TzMtDI5m7gYkc8M0eO84EzDhDWv4JPOS2bZ+znqY45s1SshDKbNWmBiwQOzASpoRGpBcmS8g Q==
  • Ironport-hdrordr: A9a23:J86FJ6q2Q33ZIAFCXkHXZFgaV5rveYIsimQD101hICG9Evb0qy nOpoV/6faQslwssR4b9uxoVJPvfZq+z+8W3WByB9eftWDd0QPFEGgL1+DfKlbbak7DH4BmtJ uJc8JFeafN5VoRt7eG3OFveexQvOVu88qT9JjjJ28Gd3APV0n5hT0JcjpyFCdNNW57LKt8Lr WwzOxdqQGtfHwGB/7LfUXsD4D41rv2fIuNW29+OyIa
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Jul 12, 2023 at 04:50:34PM +0200, Jan Beulich wrote:
> On 12.07.2023 15:53, Roger Pau Monné wrote:
> > On Wed, Jul 12, 2023 at 12:07:43PM +0200, Jan Beulich wrote:
> >> On 11.07.2023 15:02, Roger Pau Monné wrote:
> >>> On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote:
> >>>> On 10.07.2023 16:43, Roger Pau Monné wrote:
> >>>>> On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote:
> >>>>>> On 07.07.2023 11:53, Roger Pau Monne wrote:
> >>>>>>> The current logic to init the local APIC and the IO-APIC does init the
> >>>>>>> former first before doing any kind of sanitation on the IO-APIC pin
> >>>>>>> configuration.  It's already noted on enable_IO_APIC() that Xen
> >>>>>>> shouldn't trust the IO-APIC being empty at bootup.
> >>>>>>>
> >>>>>>> At XenServer we have a system where the IO-APIC 0 is handed to Xen
> >>>>>>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and
> >>>>>>> with a vector of 0 (all fields of the RTE are zeroed).  Once the local
> >>>>>>> APIC is enabled periodic injections from such pin cause a storm of
> >>>>>>> errors:
> >>>>>>>
> >>>>>>> APIC error on CPU0: 00(40), Received illegal vector
> >>>>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>>>> APIC error on CPU0: 40(40), Received illegal vector
> >>>>>>>
> >>>>>>> That prevents Xen from booting.
> >>>>>>
> >>>>>> And I expect no RTE is in ExtInt mode, so one might conclude that
> >>>>>> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit
> >>>>>> of course there's then the question whether to change the RTE
> >>>>>> rather than masking it. What do ACPI tables say?
> >>>>>
> >>>>> There's one relevant override:
> >>>>>
> >>>>> [668h 1640   1]                Subtable Type : 02 [Interrupt Source 
> >>>>> Override]
> >>>>> [669h 1641   1]                       Length : 0A
> >>>>> [66Ah 1642   1]                          Bus : 00
> >>>>> [66Bh 1643   1]                       Source : 00
> >>>>> [66Ch 1644   4]                    Interrupt : 00000002
> >>>>> [670h 1648   2]        Flags (decoded below) : 0000
> >>>>>                                     Polarity : 0
> >>>>>                                 Trigger Mode : 0
> >>>>>
> >>>>> So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is
> >>>>> connected to.
> >>>>
> >>>> Then wouldn't we be better off converting that RTE to ExtInt? That
> >>>> would allow PIC IRQs (not likely to exist, but still) to work
> >>>> (without undue other side effects afaics), whereas masking this RTE
> >>>> would not.
> >>>
> >>> I'm kind of worry of trying to automate this logic.  Should we always
> >>> convert pin 0 to ExtInt if it's found unmasked, with and invalid
> >>> vector and there's a source override entry for the IRQ?
> >>>
> >>> It seems weird to infer this just from the fact that pin 0 is all
> >>> zeroed out.
> >>
> >> As you say in the earlier paragraph, it wouldn't be just that. It's
> >> unmasked + bad vector + present source override (indicating that
> >> nothing except perhaps a PIC is connected to the pin).
> > 
> > I can do this as a separate patch, but not really here IMO.  The
> > purpose of this patch is strictly to perform the IO-APIC sanitation
> > ahead of enabling the local APIC.  I will add a followup patch to the
> > series, albeit I'm unconvinced we want to infer IO-APIC pin
> > configuration based on firmware miss configurations.
> 
> Hmm. The question really is which of the changes we want to backport.
> That one should be first. While it's largely guesswork, I'd be more
> inclined to take the change that has less of an effect overall.

My views would be the other way around I think.  Current code already
has a comment to notice that IO-APIC pins might be wrongly unmasked,
and there's also logic for masking them when the IO-APICs are
initialized.  The fact that such logic is placed after the local APIC
has been initialized is IMO a bug, as having unmasked unconfigured
IO-APIC pins when the local APIC is enabled should be avoided.

> That said I can see that Linux have their enable_IO_APIC() calls
> also ahead of setup_IO_APIC() (but after connect_bsp_APIC() and
> setup_local_APIC()). IOW with your change we'd do the masking yet
> earlier than them. This may of course even be advantageous, but
> there may also be good reasons for the sequence they're using.

Linux calls enable_IO_APIC() before setting up the local
APIC LVT Error vector (that's done in end_local_APIC_setup()).  That
logic seems to be introduced by commit:

1c69524c2e5b x86: clear IO_APIC before enabing apic error vector.

Might it be less controversial to do like Linux does and call
enable_IO_APIC() before the local APIC ESR is setup?

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.