[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/IOMMU: mark IOMMU / intremap not in use when ACPI tables are missing


  • To: Jan Beulich <jbeulich@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Date: Wed, 20 Oct 2021 21:01:51 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=nTLVC7cS/lefeTZA/B6tP3IwJisz4pj7fwvDBwl2HGc=; b=GxGajN6bVtYljpv3PE+la7c+0DIEHTpS2+PQHfnD5/5mK2RgC5ot6qt1eE7CsTPaaKNHlDh/OjnbmdULg0+ErORwj8+INEgztBlm/KXpfskStNKqnUvhF0NwI6B+tJCH0YdTqqhkHpaIguOEfLTlApPmYHHelgAI9tO81QdgfX5kIpavJMWtLs54O19xh81v8nPVncpIbUhKxOpP/AXbV9KrPja8O7Guf3FPvs++lJxWlyyE2CrREF4WFFrpSWwzZfQLOPovHQHJp+XcX9w5hyll2yoJhurVb3FCBzgumQweeVC81RyQv9ay7JyovqROOGmDSVc8xL6ForjG9UcCAw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=eJQPsfhKanwB4ZCHzfqAHOq8Ct8h7w43IY3rU4qeOsFxVitD9/ew61DV2S9wGC+ncdrcO0Tew8u6boLj0JCIsBrIUUQRia0XxF21d5ddoYI9RdNf6RP0+s9MlDPSGRQWP5Cx8Z9I4NlTBtAm/Q8B1/g8rnoc4V4X2zEIsVcDvF3PRKSAK2Hq06gwTX2WIpeZsKJwT4npau4TUQVel4NN904j2USxa8+nW6cViRAL7y1TMCEHvcKfkr7QtAgWN2ybXOfs+2qgTlILpgeMEYvgwoAvNlOdeDc4NRMGd6dqpu3sG/nly/fTPm3/zfflCnB1YD3hAz+YgmiXIu4IJk4+qA==
  • Authentication-results: esa5.hc3370-68.iphmx.com; dkim=pass (signature verified) header.i=@citrix.onmicrosoft.com
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Paul Durrant <paul@xxxxxxx>
  • Delivery-date: Wed, 20 Oct 2021 20:02:18 +0000
  • Ironport-data: A9a23:PaU4iahjhVs4pxrY7Ejl3ydlX161lBcKZh0ujC45NGQN5FlHY01je htvXT/VaPqDNDamKYgnaYTlpk8OscSAndNqHQc9rX81En8b9cadCdqndUqhZCn6wu8v7a5EA 2fyTvGacajYm1eF/k/F3oAMKRCQ7InQLlbGILes1htZGEk0F0/NtTo5w7Rg29Yy2YDja++wk YiaT/P3aQfNNwFcagr424rbwP+4lK2v0N+wlgVWicFj5DcypVFMZH4sDfjZw0/DaptVBoaHq 9Prl9lVyI97EyAFUbtJmp6jGqEDryW70QKm0hK6UID66vROS7BbPg/W+5PwZG8O4whlkeydx /1RrZe5GAt5N5bhs9ZAbAAHSX1UE6RZreqvzXiX6aR/zmXDenrohf5vEFs3LcsT/eMf7WNmr KJCbmpXN1ba2rzwkOnTpupE36zPKOHCOo8Ft24m5jbeFfs8GrjIQrnQ5M8e1zA17ixLNauDP ZFEMGQ3BPjGS0JBHw0sFb0kptryrFmvXw1oq32KuZNitgA/yyQuieOwYbI5YOeiVchT20qVu G/C12D4GQ0BcsySzyKf9XChjfOJmjn0MKoQHrCl8v9hgHWI23ceThYRUDOTufCkjmauVtQZL FYbkgIxqYAi+UrtScPyNyBUu1bd4ERaAYAJVbRntkfdkcI4/jp1GEAYVjVIOdB2vfQbbg536 UbWxY/DKzVw5ej9pW2myp+Yqja7OC4wJGAEZDMZQQZt3+QPsL3fnTqUEY49SP/dYsndXGiqm WjT/XdWa6A71JZTj82GEUb7byVAT3QjZjU+4RnLRSqb5wd9aZ/Ni2eAuAWDs6gowGp0SDC8U Jk4dyq2sL9m4XKlznXlrAAx8FeBvart3Nr02gYHInXZ327xk0NPhKgJiN2EGG9nM9wfZRjia 1LJtAVa6fd7ZSXxMfEpMtrrV5xzlMAM8OgJsNiOMrKihbAqLGe6ENxGPxbMjwgBbmB9+U3AB XtrWZn1VitLYUiW5DG3W/0cwdcWKtMWngvuqWTA503/i9K2PSfNIZ9caQfmRr1pvcus/VSOm /4CZpTi9vmqeLCnCsUh2dVIdg5iwLlSLc2elvG7gcbZf1E8Rz14WqSPqV7jEqQ895loei7z1 ijVcmdTyUblhG2BLgOPa3t5b6joU4o5pnU+VRHA9370s5T6SYrwvqoZabUterwrqL5qwfJuF qFXcMScGPVfDD/A/m1FP5X6qYVjcjWthB6PYHX5MGRuIcY4Slyb4MLgcyvu6DIKUni9u/whr uDyzQjcW5cCGVhvVZ6EdPK1wlqtlnEBg+YuDVDQK9xedRy0oohnIiD8lNEtJMQIJUmRzzeWz V/OUxwZufPMs8k+99yQ3fKIqIKgEu1fGEtGHjaEsebqZHeCpmf6mN1OSueFezzZRVjYwqT6a LUH1ez4Pd0GgE1O79h2HYF0wP9s/NDovbJbkFhpRS2Zc1SxB7p8CXCaxs0T5LZVz7pUtAbqC EKC/t5WZeeANM//SQNDIQMkaqKI1O0OmymU5vMweR2o6Chy9buBcENTIxjT13ANcOoraNsok bU7pcobyw2jkR57YN+Jgxdd+3mIMnFdAb4ssYsXAdOzhwcmor2YjUcw1sMiDEmzVuhx
  • Ironport-hdrordr: A9a23:z1Ht1qhegw3bCOF2aT2xoCmkCHBQX0h13DAbv31ZSRFFG/FwyP rBoB1L73DJYWgqNE3I+erhBEGBKUmskaKdkrNhQ4tKOzOWx1dATbsSkbcKpgeAJ8SQzJ8n6U 4NSdkZNDS0NykGsS+Y2njKLz9D+qj+zEnAv463pB0BPGIaCdAT0+46MHf9LqQffng3OXNTLu vk2iMonUvERZ1aVLXAOpFTNNKz1uEj2aiWLSIuNloC0k2jnDmo4Ln1H1yx2QofaSpGxfMH/X LemwL0y62/u7XjoyWsmVP73tBzop/M29FDDMuDhow8LSjtsB+hYMBEV6eZtD44jemz4BIBkc XKoT0nI8NvgkmhMV2dkF/I4U3NwTwu43jtxRuxhmbim9XwQHYAB89IletiA1DkwntlmOs5/L NA3mqfuZYSJwjHhj7B69/BUAwvvlaooFI5+NRjzEB3YM87Uvt8vIYf9ERaHNMrByTh8r0qF+ FoEYX1+OtWS1WHdHrU11MfgOBEZk5DWytuf3Jy/vB8i1Nt7TdEJgojtY0id047hdAAo8Iu3Z WDDkxq/Is+BvP+I5gNXdvpevHHflAldyi8eV56EW6XYZ3vBEi93KIfwI9Frt1CK6Z4gafbpv z6ISVlXCgJChrTNfE=
  • Ironport-sdr: Blg05HS9F4XGiJQNbbREtmISYgCn4OI91WC1puiHP2+SHhe3rOYo2rfCCoU0sZnGKNQuAcjQBN lBZVb5Hv8zU3NHfknCZpSQvFbnXSto0C2+ZnzYY8JYmJjIGTt/HKpOovsDEbslMBexgReVE64Y 3hoIuHccmbFv5b/eHEFtxNxJZ+xdj7J4mze5p/1ia6cZ9f609TklY6uP9QJcnz8/CvYm0yVo4O KnK66qPl418JMlO5CTXnpeWIWQ74E6z3plw9kH4PUM+PxTURrPzdIq1SsJLLdKz9QP8KO2AxBp 4uqDCPLo9d9Q0MT8nGVfzQiZ
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 20/10/2021 11:36, Jan Beulich wrote:
> x2apic_bsp_setup() gets called ahead of iommu_setup(), and since x2APIC
> mode (physical vs clustered) depends on iommu_intremap, that variable
> needs to be set to off as soon as we know we can't / won't enable the
> IOMMU, i.e. in particular when
> - parsing of the respective ACPI tables failed,
> - "iommu=off" is in effect, but not "iommu=no-intremap".
> Move the turning off of iommu_intremap from AMD specific code into
> acpi_iommu_init(), accompanying it by clearing of iommu_enable.
>
> Take the opportunity and also skip ACPI table parsing altogether when
> "iommu=off" is in effect anyway.
>
> Reported-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> I've deliberately not added a Fixes: tag here, as I'm of the opinion
> that d8bd82327b0f ("AMD/IOMMU: obtain IVHD type to use earlier") only
> uncovered a pre-existing anomaly.

I agree it uncovered a pre-existing issue, but that doesn't mean a Fixes
tag should be omitted.  That change very concretely regressed booting on
some systems in their pre-existing configuration.

The commit message needs to spell out a link to d8bd82327b0f, but it's
fine to say "that commit broke it by violating an unexpected ordering
dependency, but isn't really the source of the bug".

> This particularly applies to the "iommu=off" aspect.

There should be at least two Fixes tags, but I suspect trying to trace
the history of this mess is not a productive use of time.

>  (This now allows me to remove an item from my TODO
> list: I was meaning to figure out why one of my systems wouldn't come
> up properly with "iommu=off", and I had never thought of combining this
> with "no-intremap".)
>
> Arguably "iommu=off" should turn off subordinate features in common
> IOMMU code, but doing so in parse_iommu_param() would be wrong (as
> there might be multiple "iommu=" to parse). This could be placed in
> iommu_supports_x2apic(), but see the next item.

I don't think we make any claim or implication that passing the same
option twice works.  The problem here is the nested structure of
options, and the variable doing double duty.

>
> While the change here deals with apic_x2apic_probe() as called from
> x2apic_bsp_setup(), the check_x2apic_preenabled() path looks to be
> similarly affected. That call occurs before acpi_boot_init(), which is
> what calls acpi_iommu_init(). The ordering in setup.c is in part
> relatively fragile, which is why for the moment I'm still hesitant to
> move the generic_apic_probe() call down. Plus I don't have easy access
> to a suitable system to test this case. Thoughts?

I've written these thoughts before, but IOMMU handling it a catastrophic
mess.  It needs burning to the ground and redoing from scratch.

We currently have two ways of turning on the IOMMU, depending on what
firmware does, and plenty ways of crashing Xen with cmdline options
which should work, and the legacy xAPIC startup routine is after
interrupts have been enabled, leading to an attempt to rewrite
interrupts in place to remap them.  This in particular has lead to XSAs
due to trusting registers which can't be trusted, and the rewrite is
impossible to do safely.

The correct order is:
1) Parse DMAR/IVRS (but do not configure anything), MADT, current APIC
setting and cmdline arguments
2) Figure out whether we want to be in xAPIC or x2APIC mode, and whether
we need intremap.  Change the LAPIC setting
3) Configure the IOMMUs.  In particular, their interrupt needs to be
after step 2
4) Start up Xen's general IRQ infrastructure.

It's a fair chunk of work, but it will vastly simplify the boot logic
and let us delete the infinite flushing loops out of the IOMMU logic,
and we don't need any logic which has to second guess itself based on
what happened earlier on boot.

> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -41,6 +41,23 @@ enum iommu_intremap __read_mostly iommu_
>  bool __read_mostly iommu_intpost;
>  #endif
>  
> +void __init acpi_iommu_init(void)
> +{
> +    if ( iommu_enable )
> +    {
> +        int ret = acpi_dmar_init();
> +
> +        if ( ret == -ENODEV )
> +            ret = acpi_ivrs_init();
> +
> +        if ( ret )
> +            iommu_enable = false;
> +    }
> +
> +    if ( !iommu_enable )
> +        iommu_intremap = iommu_intremap_off;
> +}

This does fix my issue, so preferably with the Fixes tag reinstated,

Acked-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Tested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>

However, I don't think skipping parsing is a sensible move.  Intremap is
utterly mandatory if during boot, we discover that our APIC ID is >254,
and iommu=no-intremap must be ignored in this case, or if the MADT says
we have CPUs beyond that limit and the user hasn't specified nr_cpus=1
or equivalent.

This applies to a future world with a sane boot order, but Xen needs to
know hardware_support_{dma,int}remapping (-> must parse the ACPI tables)
ahead of choosing whether to turn the facilities on or not.  Fixing this
removes the double duty from variables.

~Andrew




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.