[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] vpci: Add resizable bar support
On 2024/11/26 17:47, Jan Beulich wrote: > On 26.11.2024 07:02, Chen, Jiqian wrote: >> On 2024/11/25 20:47, Roger Pau Monné wrote: >>> On Mon, Nov 25, 2024 at 03:44:52AM +0000, Chen, Jiqian wrote: >>>> On 2024/11/21 17:52, Roger Pau Monné wrote: >>>>> On Thu, Nov 21, 2024 at 03:05:14AM +0000, Chen, Jiqian wrote: >>>>>> On 2024/11/20 17:01, Roger Pau Monné wrote: >>>>>>> On Wed, Nov 20, 2024 at 03:01:57AM +0000, Chen, Jiqian wrote: >>>>>>>> The only difference between our methods is the timing of updating the >>>>>>>> size. >>>>>>>> Yours is later than mine because you updated the size when the driver >>>>>>>> re-enabled memory decoding, while I updated the size in time when >>>>>>>> driver resize it. >>>>>>> >>>>>>> Indeed, my last guess is the stale cached size is somehow used in my >>>>>>> approach, and that leads to the failures. One last (possibly dummy?) >>>>>>> thing to try might be to use your patch to detect writes to the resize >>>>>>> control register, but update the BAR sizes in modify_bars(), while >>>>>>> keeping the traces of when the operations happen. >>>>>>> >>>>>> This can work, combine our method, use my patch to detect and write the >>>>>> size into hardware register, and use your patch to update bar[i].size in >>>>>> modify_bars(). >>>>>> Attached the combined patch and the xl dmesg. >>>>> >>>>> This is even weirder, so the attached patch works fine? The only >>>>> difference with my proposal is that you trap the CTRL registers, but >>>>> the sizing is still done in modify_bars(). >>>>> >>>>> What happens if (based on the attached patch) you change >>>>> rebar_ctrl_write() to: >>>>> >>>>> static void cf_check rebar_ctrl_write(const struct pci_dev *pdev, >>>>> unsigned int reg, >>>>> uint32_t val, >>>>> void *data) >>>>> { >>>>> pci_conf_write32(pdev->sbdf, reg, val); >>>>> } >>>>> >>>> If I change rebar_ctrl_write() to: >>>> static void cf_check rebar_ctrl_write(const struct pci_dev *pdev, >>>> unsigned int reg, >>>> uint32_t val, >>>> void *data) >>>> { >>>> printk("cjq_debug %pp: bar ctrl write reg %u, val %x\n", &pdev->sbdf, >>>> reg, val); >>>> pci_conf_write32(pdev->sbdf, reg, val); >>>> } >>>> >>>> I can see three time prints, it can't work. >>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 520, val d40 >>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 520, val d40 >>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 528, val 102 >>>> >>>> If I change rebar_ctrl_write() to: >>>> static void cf_check rebar_ctrl_write(const struct pci_dev *pdev, >>>> unsigned int reg, >>>> uint32_t val, >>>> void *data) >>>> { >>>> if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) >>>> return; >>>> printk("cjq_debug %pp: bar ctrl write reg %u, val %x\n", &pdev->sbdf, >>>> reg, val); >>>> pci_conf_write32(pdev->sbdf, reg, val); >>>> } >>>> >>>> I can only see one time print: >>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 520, val d40 >>>> >>>> The check prevented the two times incorrect write actions. >>>> if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) >>>> return; >>>> >>>> And why my original patch can work too, the check: >>>> + ctrl = pci_conf_read32(pdev->sbdf, reg); >>>> + if ( ctrl == val ) >>>> + return; >>>> happened to play the same role as PCI_COMMAND_MEMORY check. >>> >>> Thank you very much for figuring this out. So in the end it's a bug >>> in the driver that plays with PCI_REBAR_CTRL with memory decoding >>> enabled. >> Yes, I think. >> During driver initiation, it calls pci_rebar_set_size to resize BARs, >> after that, it calls pci_restore_state->pci_restore_rebar_state to restore >> BARs, >> the problem is when calling pci_restore_rebar_state, memory deoding is >> enabled state. >> I will discuss with my colleagues internally whether this needs to be >> modified in amdgpu driver. > > Why would memory decoding be enabled at that time? pci_restore_config_space() > specifically takes care of restoring CMD only after restoring BARs. And > pci_restore_config_space() is invoked by pci_restore_state() quite a bit > later than pci_restore_rebar_state(). So the driver must (wrongly?) be > enabling decoding earlier on? I got some information from my colleague, driver save and restore the device's state immediately without disable decoding since the state are the same to fix a bug of driver. So, it is driver's problem, not Xen or Roger's method. But as Roger said, it is better to trap Rebar_ctrl to prevent similar problems. > > Jan -- Best regards, Jiqian Chen.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |