[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] vpci: Add resizable bar support



On Fri, Nov 15, 2024 at 03:04:22AM +0000, Chen, Jiqian wrote:
> On 2024/11/15 01:36, Roger Pau Monné wrote:
> > On Thu, Nov 14, 2024 at 04:52:18PM +0100, Roger Pau Monné wrote:
> >> On Thu, Nov 14, 2024 at 06:11:46AM +0000, Chen, Jiqian wrote:
> >>> On 2024/11/13 18:30, Roger Pau Monné wrote:
> >>>> On Wed, Nov 13, 2024 at 10:00:33AM +0000, Chen, Jiqian wrote:
> >>>>> On 2024/11/13 17:30, Roger Pau Monné wrote:
> >>>>>> On Wed, Nov 13, 2024 at 04:00:27PM +0800, Jiqian Chen wrote:
> >>>>>>> Some devices, like discrete GPU of amd, support resizable bar 
> >>>>>>> capability,
> >>>>>>> but vpci of Xen doesn't support this feature, so they fail to resize 
> >>>>>>> bars
> >>>>>>> and then cause probing failure.
> >>>>>>>
> >>>>>>> According to PCIe spec, each bar that support resizing has two 
> >>>>>>> registers,
> >>>>>>> PCI_REBAR_CAP and PCI_REBAR_CTRL, so add these two registers and their
> >>>>>>> corresponding handler into vpci.
> >>>>>>>
> >>>>>>> PCI_REBAR_CAP is RO, only provide reading.
> >>>>>>>
> >>>>>>> PCI_REBAR_CTRL only has bar size is RW, so add write function to 
> >>>>>>> support
> >>>>>>> setting the new size.
> >>>>>>
> >>>>>> I think the logic to handle resizable BAR could be much simpler.  Some
> >>>>>> time ago I've made a patch to add support for it, but due to lack of
> >>>>>> hardware on my side to test it I've never submitted it.
> >>>>>>
> >>>>>> My approach would be to detect the presence of the
> >>>>>> PCI_EXT_CAP_ID_REBAR capability in init_header(), and if the
> >>>>>> capability is present force the sizing of BARs each time they are
> >>>>>> mapped in modify_bars().  I don't think we need to trap accesses to
> >>>>>> the capability itself, as resizing can only happen when memory
> >>>>>> decoding is not enabled for the device.  It's enough to fetch the size
> >>>>>> of the BARs ahead of each enabling of memory decoding.
> >>>>>>
> >>>>>> Note that memory decoding implies mapping the BARs into the p2m, which
> >>>>>> is already an expensive operation, the extra sizing is unlikely to
> >>>>>> make much of a difference performance wise.
> >>>>>>
> >>>>>> I've found the following on my git tree and rebased on top of staging:
> >>>>> OK.
> >>>>> Do you need me to validate your patch in my environment?
> >>>>
> >>>> Yes please, I have no way to test it.  Let's see what others think
> >>>> about the different approaches.
> >>> There are some errors with your method.
> >>> I attached the dmesg and xl dmesg logs.
> >>> From the dmesg logs, it seems that 0000:03:00.0 has addresse overlap with 
> >>> 0000:03:00.1
> >>
> >> Do you have the output of lspci with the BAR sizes/positions before
> >> and after the resizing?
> >>
> >>>
> >>> I think there is a place that needs to be modified regarding your method,
> >>> although this modification does not help with the above-mentioned errors,
> >>> it is that whether to support resizing is specific to which bar, rather 
> >>> than just determining whether there is a Rebar capability.
> >>
> >> Do we really need such fine-grained information?  It should be
> >> harmless (even if not strictly necessary) to size all BARs on the
> >> device before enabling memory decoding, even if some of them do not
> >> support resizing.
> >>
> >> I might have to provide a patch with additional messages to see what's
> >> going on.
> > 
> > One nit that I've noticed with the patch I gave you previously is that
> > the check for a size change in modify_bars() should be done ahead of
> > pci_check_bar(), otherwise the check is possibly using an outdated
> > size.
> > 
> > I've also added a debug message to notify when a BAR register is
> > written and report the new address.  This is done unconditionally, but
> > if you think it's too chatty you can limit to only printing for the
> > device that has the ReBAR capability.
> Errors are the same.
> Attached the dmesg, xl dmesg, patch and lspci output.
> I will also continue to debug your method on my side to try to get some 
> findings.

Hello,

I've been looking at the output, and it all seems fine, except the
03:00.0 device that becomes broken at some point, note the lspci
output lists [virtual] next to the resource sizes.  This means reading
for the registers returned 0, so the position and sizes are provided
from the internal OS information.

I'm assuming the patch you sent to the list doesn't lead to such
errors, in which case I can only guess that fetching the size of the
BARs in modify_bars() causes issues with the device.

To confirm this, can you try the following patch on top of your
original change?  This adds an extra pci_size_mem_bar() when the BARs
are resized.  From my reading of the PCI specification sizing the BARs
after having changed the size through the ReBAR capability is allowed.

Thanks, Roger.

diff --git a/xen/drivers/vpci/rebar.c b/xen/drivers/vpci/rebar.c
index 84dbd84b0745..e371ba0ef92a 100644
--- a/xen/drivers/vpci/rebar.c
+++ b/xen/drivers/vpci/rebar.c
@@ -40,6 +40,15 @@ static void cf_check rebar_ctrl_write(const struct pci_dev 
*pdev,
                        PCI_REBAR_CTRL_BAR_UNIT;
 
     pci_conf_write32(pdev->sbdf, reg, val);
+
+{
+    uint64_t addr, size;
+
+    pci_size_mem_bar(pdev->sbdf, PCI_BASE_ADDRESS_0 + index * 4,
+                     &addr, &size, 0);
+
+    ASSERT(size == bars[index].size);
+}
 }
 
 static int cf_check init_rebar(struct pci_dev *pdev)



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.