
Re: PCI pass-through problem for SN570 NVME SSD


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 4 Jul 2022 18:05:52 +0200
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxx>, "G.R." <firemeteor@xxxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Mon, 04 Jul 2022 16:06:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 04.07.2022 17:33, Roger Pau Monné wrote:
> On Mon, Jul 04, 2022 at 10:51:53PM +0800, G.R. wrote:
>> On Mon, Jul 4, 2022 at 9:09 PM Roger Pau Monné <roger.pau@xxxxxxxxxx> wrote:
>>>>
>>>> 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a (prog-if 
>>>> 02 [NVM Express])
>>>>       Subsystem: Sandisk Corp Device 501a
>>>>       Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- 
>>>> Stepping- SERR- FastB2B- DisINTx+
>>>>       Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- 
>>>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>>       Latency: 0, Cache Line Size: 64 bytes
>>>>       Interrupt: pin A routed to IRQ 16
>>>>       NUMA node: 0
>>>>       IOMMU group: 13
>>>>       Region 0: Memory at a2600000 (64-bit, non-prefetchable) [size=16K]
>>>>       Region 4: Memory at a2604000 (64-bit, non-prefetchable) [size=256]
>>>
>>> I think I'm slightly confused, the overlapping happens at:
>>>
>>> (XEN) d1: GFN 0xf3078 (0xa2616,0,5,7) -> (0xa2504,0,5,7) not permitted
>>>
>>> So it's MFNs 0xa2616 and 0xa2504, yet none of those are in the BAR
>>> ranges of this device.
>>>
>>> Can you paste the lspci -vvv output for any other device you are also
>>> passing through to this guest?
>>>
>>
>> I just realized that the addresses may change between environments.
>> In previous email threads I used a cached dump from a Linux
>> environment running outside the hypervisor.
>> Sorry for the confusion; refreshing with a Xen dom0 dump.
>>
>> The other device I pass through is a SATA controller, and I think
>> it's what you are looking for.
>> Both a2616 and a2504 are found below!
>>
>> 00:17.0 SATA controller: Intel Corporation Cannon Lake PCH SATA AHCI
>> Controller (rev 10) (prog-if 01 [AHCI 1.0])
>>         DeviceName: Onboard - SATA
>>         Subsystem: Gigabyte Technology Co., Ltd Cannon Lake PCH SATA
>> AHCI Controller
>>         Control: I/O+ Mem+ BusMaster- SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx-
>>         Status: Cap+ 66MHz+ UDF- FastB2B+ ParErr- DEVSEL=medium
>> >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Interrupt: pin A routed to IRQ 16
>>         Region 0: Memory at a2610000 (32-bit, non-prefetchable) [size=8K]
>>         Region 1: Memory at a2616000 (32-bit, non-prefetchable) [size=256]
>>         Region 2: I/O ports at 4090 [size=8]
>>         Region 3: I/O ports at 4080 [size=4]
>>         Region 4: I/O ports at 4060 [size=32]
>>
>> 05:00.0 Non-Volatile memory controller: Sandisk Corp Device 501a
>> (prog-if 02 [NVM Express])
>>         Subsystem: Sandisk Corp Device 501a
>>         Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop-
>> ParErr- Stepping- SERR- FastB2B- DisINTx-
>>         Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort-
>> <TAbort- <MAbort- >SERR- <PERR- INTx-
>>         Latency: 0, Cache Line Size: 64 bytes
>>         Interrupt: pin A routed to IRQ 11
>>         Region 0: Memory at a2500000 (64-bit, non-prefetchable) [size=16K]
>>         Region 4: Memory at a2504000 (64-bit, non-prefetchable) [size=256]
> 
> Right, so hvmloader attempts to place a BAR from 05:00.0 and a BAR
> from 00:17.0 into the same page, which is bad behavior.  It might be
> sensible to share a page when both BARs belong to the same device,
> but not when they belong to different devices.
> 
> I think the following patch:
> 
> https://lore.kernel.org/xen-devel/20200117110811.43321-1-roger.pau@xxxxxxxxxx/
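
(For illustration: the clash Roger describes can be checked arithmetically.
The sketch below is hypothetical helper code, not hvmloader's actual
placement logic; it flags BARs from different devices that land in the
same 4 KiB page frame, which is what the `(0xa2616,...)` vs `(0xa2504,...)`
message above reports for guest frame 0xf3078. The guest base addresses
used in the second example are made up to reproduce the packing.)

```python
PAGE_SHIFT = 12  # second-stage translation works on 4 KiB pages

def frames(base, size):
    """Page frame numbers occupied by a BAR at 'base' of 'size' bytes."""
    return set(range(base >> PAGE_SHIFT, ((base + size - 1) >> PAGE_SHIFT) + 1))

def cross_device_clashes(bars):
    """bars: list of (device, base, size).  Return device pairs whose
    BARs occupy a common page frame; sharing within one device is
    ignored, since that might be tolerable."""
    clashes = []
    for i, (dev_a, base_a, size_a) in enumerate(bars):
        for dev_b, base_b, size_b in bars[i + 1:]:
            if dev_a != dev_b and frames(base_a, size_a) & frames(base_b, size_b):
                clashes.append((dev_a, dev_b))
    return clashes

# Host layout from the lspci dumps: the two 256-byte BARs sit in
# distinct host frames (0xa2616 and 0xa2504), so no clash there.
host = [("00:17.0", 0xa2616000, 0x100), ("05:00.0", 0xa2504000, 0x100)]
print(cross_device_clashes(host))  # []

# Hypothetical guest placement packing both BARs back to back: both
# then fall into guest frame 0xf3078, reproducing the reported clash.
guest = [("00:17.0", 0xf3078000, 0x100), ("05:00.0", 0xf3078100, 0x100)]
print(cross_device_clashes(guest))  # [('00:17.0', '05:00.0')]
```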

Hmm, yes, we definitely want to revive that one. Having gone through
the discussion again, I think what is needed is suitable checking in
the tool stack and in Xen for proper alignment. Unless, of course,
non-page-aligned BARs could be adjusted "on the fly" by some
interaction with the kernel (perhaps at pci-assignable-add time), in
which case only Xen would need a (final) check added. If we can't
adjust things "on the fly", then users need clear direction as to
what they must do in order to be able to assign a given device to a
guest.
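
The alignment check sketched above could look roughly like this (Python
for illustration only; the real check would live in the tool stack
and/or Xen, and all names here are made up):

```python
PAGE_SIZE = 1 << 12  # 4 KiB

def bar_is_page_isolated(base, size):
    """Conservative check: a memory BAR occupies whole pages
    exclusively only if it starts on a page boundary and its size is a
    multiple of the page size.  (A sub-page BAR may still be safe when
    nothing else shares its page, but that can't be judged from the
    BAR alone.)"""
    return base % PAGE_SIZE == 0 and size % PAGE_SIZE == 0

def bars_needing_realignment(memory_bars):
    """memory_bars: list of (bar_index, base, size) for one device.
    Return the BAR indices that would need realignment before the
    device can be assigned to a guest."""
    return [idx for idx, base, size in memory_bars
            if not bar_is_page_isolated(base, size)]

# The NVMe device above: Region 0 (16K, page-aligned) passes, while
# Region 4 (256 bytes) would be flagged.
print(bars_needing_realignment([(0, 0xa2500000, 0x4000),
                                (4, 0xa2504000, 0x100)]))  # [4]
```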

Jan
