[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC XEN PATCH 2/6] vpci: accept BAR writes if dom0 is PVH


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>
  • From: Huang Rui <ray.huang@xxxxxxx>
  • Date: Wed, 22 Mar 2023 20:33:23 +0800
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=amd.com; dmarc=pass action=none header.from=amd.com; dkim=pass header.d=amd.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=trD0uMveClH44RZ/CLLdUab3NTJGL6cza4lbn4P8FHY=; b=DS2znbWwDxRop1cJCY1t7sEebqOS0Dn6i1Gdlj7FmUM+08FqxQH0I6OVsQ5SqMwBryCz0XeFW7sRSwsRbXqHKSxfYiHh4U5/v60/UYJjqdet4TQxp3CXHpD3OQk2T9HYjSbaYV7GP0z8GXUBUvGowzltt3dABX2YlGUtknCWHUieo23jAYfLOBRsCfHmNPH407SXx68mRHACHYD+bihwOgpQAiZDextshOsK+o8dINQ+NTMwy05Uc7Gy5nLIPwaEpB/dBrJX83y5Kla7owWaa2im47019+PfkAz82lKWnPBEN2TJMxN3rEJfmG4KCZOMgT+uXQKikU2Hcp1+tQJDfQ==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=F3q3uiVSktwODPLcBG8M36XdOK/RhOTd5VGvm9wLy1W4I25jKTTjiffUpxS6vhJsVp9TpQQvRECG14AuqxVhaXGsEqsSbY6oOudXwXVlIPHbieo+qZkBM3pScjXI4bSkwp8wjEcYTSI9slXCo8elC/Ci810YdlHPVRvjhRBlSfuubmf7/3frye+f7bhGTAe9LNpnuyu+M63DgdPfw8ddXDizfoH9Oz9xbi0znwLi3cSTyLt1FWDK+oQRFdjVAdpnyUAnsIqo2d3XxPOYhlX4vSZ76GIsWmSXk9VmPCM4vDVrAEeRpswwgaLUj8B4fQNaropPHHTQjkvVJ+lHsNpEMA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=amd.com;
  • Cc: "Deucher, Alexander" <Alexander.Deucher@xxxxxxx>, "Koenig, Christian" <Christian.Koenig@xxxxxxx>, "Hildebrand, Stewart" <Stewart.Hildebrand@xxxxxxx>, Xenia Ragiadakou <burzalodowa@xxxxxxxxx>, "Huang, Honglei1" <Honglei1.Huang@xxxxxxx>, "Zhang, Julia" <Julia.Zhang@xxxxxxx>, "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 22 Mar 2023 12:34:17 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Wed, Mar 22, 2023 at 05:34:41PM +0800, Roger Pau Monné wrote:
> On Wed, Mar 22, 2023 at 03:28:58PM +0800, Huang Rui wrote:
> > On Tue, Mar 21, 2023 at 09:03:58PM +0800, Huang Rui wrote:
> > > On Tue, Mar 21, 2023 at 08:27:21PM +0800, Jan Beulich wrote:
> > > > On 21.03.2023 12:49, Huang Rui wrote:
> > > > > Thanks, but we found if dom0 is PV domain, the passthrough device will
> > > > > access this function to write the real bar.
> > > > 
> > > > Can you please be quite a bit more detailed about this? The specific 
> > > > code
> > > > paths taken (in upstream software) to result in such would of of 
> > > > interest.
> > > > 
> > > 
> > > yes, please wait for a moment. let me capture a trace dump in my side.
> > > 
> > 
> > Sorry, we are wrong that if Xen PV dom0, bar_write() won't be called,
> > please ignore above information.
> > 
> > While xen is on initialization on PVH dom0, it will add all PCI devices in
> > the real bus including 0000:03:00.0 (VGA device: GPU) and 0000:03:00.1
> > (Audio device).
> > 
> > Audio is another function in the pcie device, but we won't use it here. So
> > we will remove it after that.
> > 
> > Please see below xl dmesg:
> > 
> > (XEN) PCI add device 0000:03:00.0
> > (XEN) d0v0 bar_write Ray line 391 0000:03:00.1 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:03:00.1 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:03:00.1 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:03:00.1 bar->enabled 0
> > (XEN) PCI add device 0000:03:00.1
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 391 0000:04:00.0 bar->enabled 0
> > (XEN) d0v0 bar_write Ray line 406 0000:04:00.0 bar->enabled 0
> > (XEN) PCI add device 0000:04:00.0
> > 
> > ...
> > 
> > (XEN) PCI add device 0000:07:00.7
> > (XEN) arch/x86/hvm/svm/svm.c:2017:d0v0 RDMSR 0xc0010058 unimplemented
> > (XEN) arch/x86/hvm/svm/svm.c:2017:d0v0 RDMSR 0xc0011020 unimplemented
> > (XEN) PCI remove device 0000:03:00.1
> > 
> > We run below script to remove audio
> > 
> > echo -n "1" > /sys/bus/pci/devices/0000:03:00.1/remove
> > 
> > (XEN) arch/x86/hvm/svm/svm.c:2017:d0v0 RDMSR 0xc001029b unimplemented
> > (XEN) arch/x86/hvm/svm/svm.c:2017:d0v0 RDMSR 0xc001029a unimplemented
> > 
> > Then we will run "xl pci-assignable-add 03:00.0" to assign GPU as
> > passthrough. At this moment, the real bar is trying to be written.
> > 
> > (XEN) d0v7 bar_write Ray line 391 0000:03:00.0 bar->enabled 1
> > (XEN) d0v7 bar_write Ray line 406 0000:03:00.0 bar->enabled 1
> > (XEN) Xen WARN at drivers/vpci/header.c:408
> > (XEN) ----[ Xen-4.18-unstable  x86_64  debug=y  Not tainted ]----
> > (XEN) CPU:    8
> > (XEN) RIP:    e008:[<ffff82d040263cb9>] 
> > drivers/vpci/header.c#bar_write+0xc0/0x1ce
> > (XEN) RFLAGS: 0000000000010202   CONTEXT: hypervisor (d0v7)
> > (XEN) rax: ffff8303fc36d06c   rbx: ffff8303f90468b0   rcx: 0000000000000010
> > (XEN) rdx: 0000000000000002   rsi: ffff8303fc36a020   rdi: ffff8303fc36a018
> > (XEN) rbp: ffff8303fc367c18   rsp: ffff8303fc367be8   r8:  0000000000000001
> > (XEN) r9:  ffff8303fc36a010   r10: 0000000000000001   r11: 0000000000000001
> > (XEN) r12: 00000000d0700000   r13: ffff8303fc6d9230   r14: ffff8303fc6d9270
> > (XEN) r15: 0000000000000000   cr0: 0000000080050033   cr4: 00000000003506e0
> > (XEN) cr3: 00000003fc3c4000   cr2: 00007f180f6371e8
> > (XEN) fsb: 00007fce655edbc0   gsb: ffff88822f3c0000   gss: 0000000000000000
> > (XEN) ds: 0000   es: 0000   fs: 0000   gs: 0000   ss: 0000   cs: e008
> > (XEN) Xen code around <ffff82d040263cb9> 
> > (drivers/vpci/header.c#bar_write+0xc0/0x1ce):
> > (XEN)  b6 53 14 f6 c2 02 74 02 <0f> 0b 48 8b 03 45 84 ff 0f 85 ec 00 00 00 
> > 48 b9
> > (XEN) Xen stack trace from rsp=ffff8303fc367be8:
> > (XEN)    00000024fc367bf8 ffff8303f9046a50 0000000000000000 0000000000000004
> > (XEN)    0000000000000004 0000000000000024 ffff8303fc367ca0 ffff82d040263683
> > (XEN)    00000300fc367ca0 d070000003003501 00000024d0700000 ffff8303fc6d9230
> > (XEN)    0000000000000000 0000000000000000 0000002400000004 0000000000000000
> > (XEN)    0000000000000000 0000000000000000 0000000000000004 00000000d0700000
> > (XEN)    0000000000000024 0000000000000000 ffff82d040404bc0 ffff8303fc367cd0
> > (XEN)    ffff82d0402c60a8 0000030000000001 ffff8303fc367d88 0000000000000000
> > (XEN)    ffff8303fc610800 ffff8303fc367d30 ffff82d0402c54da ffff8303fc367ce0
> > (XEN)    ffff8303fc367fff 0000000000000004 ffff830300000004 00000000d0700000
> > (XEN)    ffff8303fc610800 ffff8303fc367d88 0000000000000001 0000000000000000
> > (XEN)    0000000000000000 ffff8303fc367d58 ffff82d0402c5570 0000000000000004
> > (XEN)    ffff8304065ea000 ffff8303fc367e28 ffff8303fc367dd0 ffff82d0402b5357
> > (XEN)    0000000000000cfc ffff8303fc621000 0000000000000000 0000000000000000
> > (XEN)    0000000000000cfc 00000000d0700000 0000000400000001 0001000000000000
> > (XEN)    0000000000000004 0000000000000004 0000000000000000 ffff8303fc367e44
> > (XEN)    ffff8304065ea000 ffff8303fc367e10 ffff82d0402b56d6 0000000000000000
> > (XEN)    ffff8303fc367e44 0000000000000004 0000000000000cfc ffff8304065e6000
> > (XEN)    0000000000000000 ffff8303fc367e30 ffff82d0402b6bcc ffff8303fc367e44
> > (XEN)    0000000000000001 ffff8303fc367e70 ffff82d0402c5e80 d070000040203490
> > (XEN)    000000000000007b ffff8303fc367ef8 ffff8304065e6000 ffff8304065ea000
> > (XEN) Xen call trace:
> > (XEN)    [<ffff82d040263cb9>] R drivers/vpci/header.c#bar_write+0xc0/0x1ce
> > (XEN)    [<ffff82d040263683>] F vpci_write+0x123/0x26c
> > (XEN)    [<ffff82d0402c60a8>] F 
> > arch/x86/hvm/io.c#vpci_portio_write+0xa0/0xa7
> > (XEN)    [<ffff82d0402c54da>] F hvm_process_io_intercept+0x203/0x26f
> > (XEN)    [<ffff82d0402c5570>] F hvm_io_intercept+0x2a/0x4c
> > (XEN)    [<ffff82d0402b5357>] F 
> > arch/x86/hvm/emulate.c#hvmemul_do_io+0x29b/0x5eb
> > (XEN)    [<ffff82d0402b56d6>] F 
> > arch/x86/hvm/emulate.c#hvmemul_do_io_buffer+0x2f/0x6a
> > (XEN)    [<ffff82d0402b6bcc>] F hvmemul_do_pio_buffer+0x33/0x35
> > (XEN)    [<ffff82d0402c5e80>] F handle_pio+0x70/0x1b7
> > (XEN)    [<ffff82d04029dc7f>] F svm_vmexit_handler+0x10ba/0x18aa
> > (XEN)    [<ffff82d0402034e5>] F svm_stgi_label+0x8/0x18
> > (XEN)
> > (XEN) d0v7 bar_write Ray line 391 0000:03:00.0 bar->enabled 1
> > (XEN) d0v7 bar_write Ray line 406 0000:03:00.0 bar->enabled 1
> 

Hi Jan, Roger,

> As said by Jan, it's hard to figure out where are the printks placed without a
> diff of your changes.

I attached the diff of my prints below, and I want to figure out why the
Bar_write() is called while we use pci-assignable-add to assign passthrough 
device in PVH dom0.


diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 918d11fbce..35447aff2a 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -388,12 +388,14 @@ static void cf_check bar_write(
     else
         val &= PCI_BASE_ADDRESS_MEM_MASK;
 
+    gprintk(XENLOG_WARNING, "%s Ray line %d %pp bar->enabled %d\n", __func__, 
__LINE__, &pdev->sbdf , bar->enabled);
     /*
      * Xen only cares whether the BAR is mapped into the p2m, so allow BAR
      * writes as long as the BAR is not mapped into the p2m.
      */
     if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )
     {
+        gprintk(XENLOG_WARNING, "%s Ray line %d %pp bar->enabled %d\n", 
__func__, __LINE__, &pdev->sbdf , bar->enabled);
         /* If the value written is the current one avoid printing a warning. */
         if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) )
             gprintk(XENLOG_WARNING,
@@ -401,7 +403,9 @@ static void cf_check bar_write(
                     &pdev->sbdf, bar - pdev->vpci->header.bars + hi);
         return;
     }
-
+    gprintk(XENLOG_WARNING, "%s Ray line %d %pp bar->enabled %d\n", __func__, 
__LINE__, &pdev->sbdf , bar->enabled);
+    if (bar->enabled)
+       WARN_ON(1);
 
     /*
      * Update the cached address, so that when memory decoding is enabled

> 
> So far the above seems to be expected, as we currently don't handle BAR
> register writes with memory decoding enabled.
> 
> Given the change proposed in this patch, can you check whether `bar->enabled 
> ==
> true` but the PCI command register has the memory decoding bit unset?


I traced that while we do pci-assignable-add, we will follow below trace to
bind the passthrough device.

pciassignable_add()->libxl_device_pci_assignable_add()->libxl__device_pci_assignable_add()->pciback_dev_assign()

Then kernel xen-pciback driver want to add virtual configuration spaces. In
this phase, the bar_write() in xen hypervisor will be called. I still need
a bit more time to figure the exact reason. May I know where the
xen-pciback driver would trigger a hvm_io_intercept to xen hypervisor?

[  309.719049] xen_pciback: wants to seize 0000:03:00.0
[  462.911251] pciback 0000:03:00.0: xen_pciback: probing...
[  462.911256] pciback 0000:03:00.0: xen_pciback: seizing device
[  462.911257] pciback 0000:03:00.0: xen_pciback: pcistub_device_alloc
[  462.911261] pciback 0000:03:00.0: xen_pciback: initializing...
[  462.911263] pciback 0000:03:00.0: xen_pciback: initializing config
[  462.911265] pciback 0000:03:00.0: xen-pciback: initializing virtual 
configuration space
[  462.911268] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x00
[  462.911271] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x02
[  462.911284] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x04
[  462.911286] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x3c
[  462.911289] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x3d
[  462.911291] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x0c
[  462.911294] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x0d
[  462.911296] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x0f
[  462.911301] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x10
[  462.911306] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x14
[  462.911309] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x18
[  462.911313] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x1c
[  462.911317] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x20
[  462.911321] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x24
[  462.911325] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x30
[  462.911358] pciback 0000:03:00.0: Found capability 0x1 at 0x50
[  462.911361] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x50
[  462.911363] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x52
[  462.911368] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x54
[  462.911371] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x56
[  462.911373] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0x57
[  462.911386] pciback 0000:03:00.0: Found capability 0x5 at 0xa0
[  462.911388] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0xa0
[  462.911391] pciback 0000:03:00.0: xen-pciback: added config field at offset 
0xa2
[  462.911405] pciback 0000:03:00.0: xen_pciback: enabling device
[  462.911412] pciback 0000:03:00.0: enabling device (0006 -> 0007)
[  462.911658] Already setup the GSI :28
[  462.911668] Already map the GSI :28 and IRQ: 115
[  462.911684] pciback 0000:03:00.0: xen_pciback: save state of device
[  462.912154] pciback 0000:03:00.0: xen_pciback: resetting (FLR, D3, etc) the 
device
[  463.954998] pciback 0000:03:00.0: xen_pciback: reset device


> 
> If so it would mean Xen state got out-of-sync with the hardware state, and we
> would need to figure out where it happened.  Is there any backdoor in the AMD
> GPU that allows to disable memory decoding without using the PCI command
> register?
> 

I don't think we have any backdoor.

Thanks,
Ray



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.