[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v8 1/6] AMD/IOMMU: obtain IVHD type to use earlier


  • To: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 20 Oct 2021 10:17:08 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=cZuykuWKCYt+OY/Hy7gp+VtXnNos7YGottIWuVTw06w=; b=U2La5K/xYy4gc2EZEGh691bnbuMbfsI3dXfbevj4db8tg982JpN4KcGodKJgX4tH9y/JrdvVnqImVFxJkZCJtt65oWbFpmF6ZjOs3aoBcTLrRdhGdhI3ZUdWmisKTa0ibwOquOgSSnVvcz1OyqNpr5p7bl8xkBA6JMph+XOwwPcb3Tvh9zyIDlyQbuWfmJ8YTrfUl+l529C2A3aJnn0uzX5G5ODJkkVKXynSLjtO3QD53QzBOvkTbEPPcXdlK4xzCAVSRVaZnNwonBawu2Bz5P/xp3Gjn1s/rgRKlAFVBy2MX0+jIL+vQd+jv0uq9ZwUhxGJE9qBlXX7dbhB/imAVw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=XLOHohl66XEz6C38AWdog+3QSlh7/zOqw5c/tAsiB7ZrVZx6mAxcoZ5kQ/6wi+NOC4eu+WSy6upei5xKB7RMDB0MnajiVsKC34tVAiWEw2YrPNpgRLywSTOII6kP2rknGYR7vlQOYHlaCpdFHVA8Oe1EA70qP/GWiljtC2Gbk1Izp2cdPmD0rsqHxjR6ubAKR+wqu8sqIzCumG54KHbtXTlbEL1JINVQd7872e1N+ZILcOOfAUEKuOA5xZo1FOj7ikax4ltEVfnjT7/9z4PboDa0t3ELHQ2r8INJocePRCFW3+xRo5qAAr3dIsQ4GVJYepmLKSZpoyEEGQ7rIxOWYA==
  • Authentication-results: lists.xenproject.org; dkim=none (message not signed) header.d=none;lists.xenproject.org; dmarc=none action=none header.from=suse.com;
  • Cc: Paul Durrant <paul@xxxxxxx>, Ian Jackson <iwj@xxxxxxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 20 Oct 2021 08:17:21 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 20.10.2021 01:34, Andrew Cooper wrote:
> On 22/09/2021 15:36, Jan Beulich wrote:
>> Doing this in amd_iommu_prepare() is too late for it, in particular, to
>> be used in amd_iommu_detect_one_acpi(), as a subsequent change will want
>> to do. Moving it immediately ahead of amd_iommu_detect_acpi() is
>> (luckily) pretty simple, (pretty importantly) without breaking
>> amd_iommu_prepare()'s logic to prevent multiple processing.
>>
>> This involves moving table checksumming, as
>> amd_iommu_get_supported_ivhd_type() ->  get_supported_ivhd_type() will
>> now be invoked before amd_iommu_detect_acpi()  -> detect_iommu_acpi(). In
>> the course of doing so stop open-coding acpi_tb_checksum(), seeing that
>> we have other uses of this originally ACPI-private function elsewhere in
>> the tree.
>>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> I'm afraid this breaks booting on Skylake Server.  Yes, really - I
> didn't believe the bisection at first either.

I'll be able to debug this, as by disabling VT-d on my Skylake I can
repro. But ...

> From a bit of debugging, I've found:
> 
> (XEN) *** acpi_dmar_init() => -19
> (XEN) *** amd_iommu_get_supported_ivhd_type() => -19
> 
> So VT-d is disabled in firmware.  Oops, but something we should cope with.
> 
> Then we fall into acpi_ivrs_init(), and take the new-in-this-patch early
> exit with -ENOENT too.
> 
> It turns out ...
> 
>> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
>> @@ -179,9 +179,17 @@ static int __must_check amd_iommu_setup_
>>  
>>  int __init acpi_ivrs_init(void)
>>  {
>> +    int rc;
>> +
>>      if ( !iommu_enable && !iommu_intremap )
>>          return 0;
>>  
>> +    rc = amd_iommu_get_supported_ivhd_type();
>> +    if ( rc < 0 )
>> +        return rc;
>> +    BUG_ON(!rc);
>> +    ivhd_type = rc;
>> +
>>      if ( (amd_iommu_detect_acpi() !=0) || (iommu_found() == 0) )
>>      {
>>          iommu_intremap = iommu_intremap_off;
>>
> 
> ... we're relying on this path (now skipped) to set iommu_intremap away
> from iommu_intremap_full in the "no IOMMU anywhere to be found" case.

... this picture here looks incomplete, since in iommu_hardware_setup()
we have

    if ( !iommu_enabled )
        iommu_intremap = iommu_intremap_off;

which I don't see how it could be bypassed. Booting here fails because
of the AHCI driver not being able to obtain control of the disk, but
checking in a working setup I see it use MSI, which can't possibly be
affected by an early-boot-only wrong setting of iommu_intremap. (I can
easily believe that we have early IO-APIC setup logic going wrong when
this remains mistakenly set.)

What I'd like to avoid though is to add yet another custom writing of
iommu_intremap_off in acpi_ivrs_init(). I'd prefer to find a better
place for it, so I will want to do some debugging first. If all else
fails, the setting should at least be moved into the caller of the
function - after all switching around the order of the
acpi_{dmar,ivrs}_init() calls in acpi_iommu_init() shouldn't have any
negative effect.

Jan

> This explains why I occasionally during failure get spew about:
> 
> (XEN) CPU0: No irq handler for vector 7a (IRQ -2147483648, LAPIC)
> [   17.117518] xhci_hcd 0000:00:14.0: Error while assigning device slot ID
> [   17.121114] xhci_hcd 0000:00:14.0: Max number of devices this xHCI
> host supports is 64.
> [   17.125198] usb usb1-port2: couldn't allocate usb_device
> [  248.317462] INFO: task kworker/u32:0:7 blocked for more than 120 seconds.
> 
> and eventually (gone 400s) get dumped in a dracut shell.
> 
> Booting with an explicit iommu=no-intremap, which clobbers
> iommu_intremap during cmdline parsing, recovers the system.
> 
> This variable controls a whole lot of magic with interrupt handling.  It
> should default to 0, not 2, and only become nonzero when an IOMMU is
> properly established.  It also shouldn't be serving double duty as "what
> the user wants" ahead of determining the system capabilities.
> 
> And not to open another can of worms, but our entire way of working
> explodes if there are devices on the system not covered by an IOMMU.
> 
> ~Andrew
> 




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.