
Re: [PATCH] vpci: Add resizable bar support


  • To: "Chen, Jiqian" <Jiqian.Chen@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 26 Nov 2024 10:47:43 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Tue, 26 Nov 2024 09:48:03 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 26.11.2024 07:02, Chen, Jiqian wrote:
> On 2024/11/25 20:47, Roger Pau Monné wrote:
>> On Mon, Nov 25, 2024 at 03:44:52AM +0000, Chen, Jiqian wrote:
>>> On 2024/11/21 17:52, Roger Pau Monné wrote:
>>>> On Thu, Nov 21, 2024 at 03:05:14AM +0000, Chen, Jiqian wrote:
>>>>> On 2024/11/20 17:01, Roger Pau Monné wrote:
>>>>>> On Wed, Nov 20, 2024 at 03:01:57AM +0000, Chen, Jiqian wrote:
>>>>>>> The only difference between our methods is the timing of the size
>>>>>>> update.
>>>>>>> Yours happens later than mine: you update the size when the driver
>>>>>>> re-enables memory decoding, while I update it as soon as the driver
>>>>>>> resizes the BAR.
>>>>>>
>>>>>> Indeed, my last guess is the stale cached size is somehow used in my
>>>>>> approach, and that leads to the failures.  One last (possibly dummy?)
>>>>>> thing to try might be to use your patch to detect writes to the resize
>>>>>> control register, but update the BAR sizes in modify_bars(), while
>>>>>> keeping the traces of when the operations happen.
>>>>>>
>>>>> This works. Combining our methods, I use my patch to detect the write
>>>>> and put the size into the hardware register, and your patch to update
>>>>> bar[i].size in modify_bars().
>>>>> Attached are the combined patch and the xl dmesg.
>>>>
>>>> This is even weirder, so the attached patch works fine?  The only
>>>> difference with my proposal is that you trap the CTRL registers, but
>>>> the sizing is still done in modify_bars().
>>>>
>>>> What happens if (based on the attached patch) you change
>>>> rebar_ctrl_write() to:
>>>>
>>>> static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>>>>                                       unsigned int reg,
>>>>                                       uint32_t val,
>>>>                                       void *data)
>>>> {
>>>>     pci_conf_write32(pdev->sbdf, reg, val);
>>>> }
>>>>
>>> If I change rebar_ctrl_write() to:
>>> static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>>>                                       unsigned int reg,
>>>                                       uint32_t val,
>>>                                       void *data)
>>> {
>>>     printk("cjq_debug %pp: bar ctrl write reg %u, val %x\n", &pdev->sbdf,
>>>            reg, val);
>>>     pci_conf_write32(pdev->sbdf, reg, val);
>>> }
>>>
>>> I see three prints, and it doesn't work.
>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 520, val d40
>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 520, val d40
>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 528, val 102
>>>
>>> If I change rebar_ctrl_write() to:
>>> static void cf_check rebar_ctrl_write(const struct pci_dev *pdev,
>>>                                       unsigned int reg,
>>>                                       uint32_t val,
>>>                                       void *data)
>>> {
>>>     if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )
>>>         return;
>>>     printk("cjq_debug %pp: bar ctrl write reg %u, val %x\n", &pdev->sbdf,
>>>            reg, val);
>>>     pci_conf_write32(pdev->sbdf, reg, val);
>>> } 
>>>
>>> I see only one print:
>>> (XEN) cjq_debug 0000:03:00.0: bar ctrl write reg 520, val d40
>>>
>>> The check prevented the two incorrect writes:
>>>     if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY )
>>>         return;
>>>
>>> This is also why my original patch works. Its check:
>>> +    ctrl = pci_conf_read32(pdev->sbdf, reg);
>>> +    if ( ctrl == val )
>>> +        return;
>>> happened to play the same role as the PCI_COMMAND_MEMORY check.
>>
>> Thank you very much for figuring this out.  So in the end it's a bug
>> in the driver that plays with PCI_REBAR_CTRL with memory decoding
>> enabled.
> Yes, I think so.
> During driver initialization, it calls pci_rebar_set_size to resize the BARs;
> after that, it calls pci_restore_state->pci_restore_rebar_state to restore
> the BARs.
> The problem is that memory decoding is enabled when
> pci_restore_rebar_state is called.
> I will discuss internally with my colleagues whether this needs to be
> changed in the amdgpu driver.

Why would memory decoding be enabled at that time? pci_restore_config_space()
specifically takes care of restoring CMD only after restoring BARs. And
pci_restore_config_space() is invoked by pci_restore_state() quite a bit
later than pci_restore_rebar_state(). So the driver must (wrongly?) be
enabling decoding earlier on?

Jan
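
For readers following the thread, here is a minimal sketch of the guard
discussed above: a write to the ReBAR control register is dropped while memory
decoding is enabled. This is not Xen code. struct toy_dev and the toy_*
helpers are hypothetical stand-ins for a device's config space and for the
pci_conf_* accessors. Only the PCI_COMMAND offset and the PCI_COMMAND_MEMORY
bit value are the standard ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PCI_COMMAND         0x04  /* offset of the command register */
#define PCI_COMMAND_MEMORY  0x2   /* memory space decoding enable bit */

/* Hypothetical stand-in for one device's 4 KiB extended config space. */
struct toy_dev {
    uint8_t cfg[4096];
};

static uint16_t toy_read16(const struct toy_dev *d, unsigned int reg)
{
    assert(reg + 2 <= sizeof(d->cfg));
    return (uint16_t)(d->cfg[reg] | (d->cfg[reg + 1] << 8));
}

static void toy_write16(struct toy_dev *d, unsigned int reg, uint16_t val)
{
    assert(reg + 2 <= sizeof(d->cfg));
    d->cfg[reg] = (uint8_t)(val & 0xff);
    d->cfg[reg + 1] = (uint8_t)(val >> 8);
}

static uint32_t toy_read32(const struct toy_dev *d, unsigned int reg)
{
    assert(reg + 4 <= sizeof(d->cfg));
    return (uint32_t)d->cfg[reg] |
           ((uint32_t)d->cfg[reg + 1] << 8) |
           ((uint32_t)d->cfg[reg + 2] << 16) |
           ((uint32_t)d->cfg[reg + 3] << 24);
}

static void toy_write32(struct toy_dev *d, unsigned int reg, uint32_t val)
{
    assert(reg + 4 <= sizeof(d->cfg));
    for ( unsigned int i = 0; i < 4; i++ )
        d->cfg[reg + i] = (uint8_t)(val >> (8 * i));
}

/*
 * Guarded control write, modelled on the check quoted in the thread:
 * refuse the write while memory decoding is on.
 * Returns true if the value was forwarded to the (toy) hardware.
 */
static bool toy_rebar_ctrl_write(struct toy_dev *d, unsigned int reg,
                                 uint32_t val)
{
    if ( toy_read16(d, PCI_COMMAND) & PCI_COMMAND_MEMORY )
        return false;

    toy_write32(d, reg, val);
    return true;
}
```

With decoding off, a call such as toy_rebar_ctrl_write(&d, 520, 0xd40) is
forwarded. Once PCI_COMMAND_MEMORY is set in the toy command register, later
control writes are dropped, which is the behaviour the one-print trace in the
quoted text suggests.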



 

