[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/ioapic: sanitize IO-APIC pins before enabling the local APIC


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 11 Jul 2023 15:02:13 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=3U/MC3s1bATOC1m3p28D0fmyspNQvfKwMStXKvWIuSY=; b=B43fnfCDAuUcdg4TIciSiVd/y6TkK39kPOwyd2TRvqiSQ/iYIFSP6P0xD6lbZvI1PfbMoBAwfiSU2eRMrCPlrcC+vNcS04WK3/mk0m8sWY2Il4GQungh1pjOcaw7nFgMdIsGB4Hbf5JiPH7Psdq7cybFtki0dMaTQf9MDBxM32iAoWSEuC/jZ1pjFB9aQ6kNzSNjP9714N+ThCh1rkQ464I5zDwR/crnf+C0SM5OyAaN5jX15Kr2KzHrSOnJNxzNlFY41pfqAZn5gImCmv/24Hf/sI5UslSBrnBK4P2sbAL6JlMxchzqxH/whP0XiMw3/o/xemNIH21guZdQM8UIzA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Y61AyWRNiHKmeNeZv7qLw/O7cnPbIlf3wp/J6mVxJDur+D91Sn6u1foPsaZKjenwVKsYuZH1oEAzMmasfJfKVpLvFcN/g/7TCcrpGLAyDE9KKrdIvjVknyMERoWBcEFX2dhrzdLz0pL9AybyjUg22X76uuGEkJyQkMV8mBEvZJpCqlmWpMpG7WJf7I90NX+Ishr8uIyncKdeSWYajfnWy73Qn7HcWOP9Ir5BayLjQHQy6r0WvVRwTQu6vJtoMByqtKF7BpHSin5EUMXv+C3cwXaWloHTzf0F3jU5ygCG0LmsTmDDsZNiTxLoddkhphLCaszmbMgUvL1pJj9IlAxYug==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 11 Jul 2023 13:02:56 +0000
  • Ironport-data: A9a23:OCFDEaDjQiuahBVW/w/iw5YqxClBgxIJ4kV8jS/XYbTApG8mhjwDx 2ZMCmCGP62IajOmfop+Oo+18R5VuJbdnNBiQQY4rX1jcSlH+JHPbTi7wuUcHAvJd5GeExg3h yk6QoOdRCzhZiaE/n9BCpC48T8nk/nOHuGmYAL9EngZbRd+Tys8gg5Ulec8g4p56fC0GArIs t7pyyHlEAbNNwVcbCRMsspvlDs15K6p4GxB5wRnDRx2lAS2e0c9Xcp3yZ6ZdxMUcqEMdsamS uDKyq2O/2+x13/B3fv8z94X2mVTKlLjFVDmZkh+AsBOsTAbzsAG6Y4pNeJ0VKtio27hc+ada jl6ncfYpQ8BZsUgkQmGOvVSO3kW0aZuoNcrLZUj2CA6IoKvn3bEmp1T4E8K0YIw27tvDkMQp NskFCkqbUCHqseK5p6iRbw57igjBJGD0II3nFhFlGucJ9B2BJfJTuPN+MNS2yo2ioZWB/HCa sEFaD1pKhPdfxlIPVRRA5U79AuqriCnL3sE9xTI+OxuvTm7IA9ZidABNPLPfdOHX4NNl1uwr WPa5WXpRBodMbRzzBLcqy/13LSewHmTtIQ6BKaI68dv2wWo220TFRkRUkOd+fy5hRvrMz5YA wlOksY0loAw/kG2Stj2XzWjvWWJ+BUbXrJ4DOkS+AyLjK3O7G6xFmUCCzJMdtEinMs3XiAxk E+EmcvzAj5iu6HTTmiSnop4thu3MCkRaGUENSkNSFJf58G5+d9iyBXSUtxkDai5yMXvHi39y CyLqy54gKgPickM1OOw+lWvby+Qm6UlhzUdvm3/Nl9JJCsgDGJ5T+REMWTm0Ms=
  • Ironport-hdrordr: A9a23:V1eNCKpxOjDkXps7HENaaHQaV5oMeYIsimQD101hICG9E/bo7/ xG+c5x6faaslgssR0b9OxpFsG7IE80tqQFhLX5Xo3SPzUO2lHYVb2KhLGKq1fd8kvFh4xgPM xbHJSWZuedMbE0t7ef3OAUKadG/PCXtIqTraP1yXN1SAFjbKttqz1+Fh2QHiRNJDWuQaBJcq ah2g==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Jul 11, 2023 at 12:53:31PM +0200, Jan Beulich wrote:
> On 10.07.2023 16:43, Roger Pau Monné wrote:
> > On Mon, Jul 10, 2023 at 12:56:27PM +0200, Jan Beulich wrote:
> >> On 07.07.2023 11:53, Roger Pau Monne wrote:
> >>> The current logic to init the local APIC and the IO-APIC does init the
> >>> former first before doing any kind of sanitation on the IO-APIC pin
> >>> configuration.  It's already noted on enable_IO_APIC() that Xen
> >>> shouldn't trust the IO-APIC being empty at bootup.
> >>>
> >>> At XenServer we have a system where the IO-APIC 0 is handed to Xen
> >>> with pin 0 unmasked, set to Fixed delivery mode, edge triggered and
> >>> with a vector of 0 (all fields of the RTE are zeroed).  Once the local
> >>> APIC is enabled periodic injections from such pin cause a storm of
> >>> errors:
> >>>
> >>> APIC error on CPU0: 00(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>> APIC error on CPU0: 40(40), Received illegal vector
> >>>
> >>> That prevents Xen from booting.
> >>
> >> And I expect no RTE is in ExtInt mode, so one might conclude that
> >> firmware meant to set up RTE 0 that way. Mainly as a remark, albeit
> >> of course there's then the question whether to change the RTE
> >> rather than masking it. What do ACPI tables say?
> > 
> > There's one relevant override:
> > 
> > [668h 1640   1]                Subtable Type : 02 [Interrupt Source 
> > Override]
> > [669h 1641   1]                       Length : 0A
> > [66Ah 1642   1]                          Bus : 00
> > [66Bh 1643   1]                       Source : 00
> > [66Ch 1644   4]                    Interrupt : 00000002
> > [670h 1648   2]        Flags (decoded below) : 0000
> >                                     Polarity : 0
> >                                 Trigger Mode : 0
> > 
> > So IRQ 0 -> GSI 2, so it's likely pin 0 is the one the i8259 is
> > connected to.
> 
> Then wouldn't we be better off converting that RTE to ExtInt? That
> would allow PIC IRQs (not likely to exist, but still) to work
> (without undue other side effects afaics), whereas masking this RTE
> would not.

I'm kind of worry of trying to automate this logic.  Should we always
convert pin 0 to ExtInt if it's found unmasked, with and invalid
vector and there's a source override entry for the IRQ?

It seems weird to infer this just from the fact that pin 0 is all
zeroed out.

> >>> --- a/xen/arch/x86/apic.c
> >>> +++ b/xen/arch/x86/apic.c
> >>> @@ -1476,6 +1476,10 @@ int __init APIC_init_uniprocessor (void)
> >>>          return -1;
> >>>      }
> >>>  
> >>> +    if ( smp_found_config && !skip_ioapic_setup && nr_ioapics )
> >>> +        /* Sanitize the IO-APIC pins before enabling the local APIC. */
> >>> +        sanitize_IO_APIC();
> >>
> >> I'm a little puzzled by the smp_found_config part of the check here,
> >> but not in smp_prepare_cpus(). What's the reason for (a) the check
> >> and (b) the difference?
> > 
> > This just mimics what gates the call to setup_IO_APIC() in that same
> > function.  It makes no sense to call sanitize_IO_APIC() if
> > setup_IO_APIC() is not called, and I wasn't planning on changing the
> > logic that gates the call setup_IO_APIC() in this patch.
> > 
> > I did note the difference with smp_prepare_cpus(), and I think we
> > should look at unifying those paths, but didn't want to do it as part
> > of this fix.
> 
> Well, consistency is one valid goal. But masking RTEs may need to be
> done more aggressively than setting up the IO-APICs. In particular
> if we're not to use them, we still want to mask all RTEs. Otherwise
> we're likely to observe each IRQ to arrive via two separate routes
> (once through the 8259 and once from an unmasked IO-APIC pin).

So avoid the smp_found_config check here and use the same condition as in
smp_prepare_cpus(), I will adjust the patch to that effect.

I do wonder why we don't simply mandate IO-APIC usage and get rid of
the code to handle the 8259.  Is it only to cope with systems that
have a broken IO-APIC configuration?  Because I don't think there are
x86 64bit systems without an IO-APIC?

> >> Isn't checking nr_ioapics sufficient in this
> >> regard? (b) probably could be addressed by moving the code to the
> >> beginning of verify_local_APIC(), immediately ahead of which you
> >> insert both call sites. (At which point the function may want naming
> >> verify_IO_APIC() for consistency, but that's surely minor.)
> > 
> > I wanted the call to sanitize_IO_APIC() to be done at the same level
> > where the call to setup_IO_APIC() is done, because it makes the logic
> > clearer.
> 
> Hmm, I see.
> 
> >> As to also checking skip_ioapic_setup - wouldn't the unmasked pin be
> >> similarly troublesome in that case?
> > 
> > skip_ioapic_setup is set when the IO-APIC address in the MADT is
> > invalid, so in that case attempting to access IO-APIC registers in
> > that case won't lead to anything good.
> 
> Of course, as I did say as well. Still if for some IO-APICs we know
> their addresses, we would still be able to deal with those (if we
> were to stick to this mask-all-RTEs-early model).

The issue is that ioapic_init_mappings() will refuse to process
further IO-APIC entries once an entry with address 0 is found.  We
could change that, but I would likely do it as an improvement rather
than tie it to this change.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.