[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 15/16] xen/arm: vpci: check guest range


  • To: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
  • Date: Thu, 28 Sep 2023 14:28:11 -0400
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=citrix.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=pbgVN9siqoU0orFJZhdhbTsHBoi/nFYOXk23kfEX5EY=; b=bd9fUaRASPZ3/zdorGehgS4ooMEzM1g9j+LvNtfGYuvvSv0HgM1VqXf/GeL+btAtMEL9uXhHmiXDFpmxMsRXVlUNs9GMLkZ75bcaHyUHC1iOF5kO57ac1JCpQAPF0/Tbg/7xVROGBzq4lQlNwT+82L/ZFAs3qeN+lrBf2nQPktbJUsRB3S1ks4+BgEUc5DCxtvCXK69Il/Bt6Glp5xO48TmoENQHW9rqGmi2AL7e/c3cAcwp3oF1j2JmYKXjxai+1IAZKvX7xJYC0mdZMePZC8n+zf59saIXRPAl/yXaURt4wVpe9rTH8Gb8nu1Z4DI4IMn1oeR8W5524sVlPO+Z9g==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=DDgwsMqY5K+bAzHnxMvwAOsPj7M5hgQRJsk+wRAt0FgNsg/2zNBpV0sqWRxThI8CAZ5shZiEZYuJFHSRZVmVX6ezqjoWoflTrl4IMtor7Vgy7ZbfAO1EbqLI0KBoWNxQe+tCpZzDZTPE061mUNLtA/6ISmsVmfi7zSkiH3fAQ4Fa/GOntcuWF1jTlvCM+Tx95Tj4a4g40Q3W5TJ/HwCFN5iAg8V/Q1oMI7/Lr46fJjmiu3EUuUA62nPxeMcHH5XP3rgK5x4zWZ5RDiaX9BnA8C3Gp4lduw8f+7B3hPoVT+vQGNpFH+o/J7D+IVTvoKCvw4dyr97ozeYFyaNKYcJXGg==
  • Cc: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 28 Sep 2023 18:28:14 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>


On 9/28/23 04:28, Roger Pau Monné wrote:
> Caution: This message originated from an External Source. Use proper caution 
> when opening attachments, clicking links, or responding.
> 
> 
> On Wed, Sep 27, 2023 at 02:03:30PM -0400, Stewart Hildebrand wrote:
>> On 9/26/23 11:48, Roger Pau Monné wrote:
>>> On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
>>>> On 9/26/23 04:07, Roger Pau Monné wrote:
>>>>> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
>>>>>> On 9/22/23 04:44, Roger Pau Monné wrote:
>>>>>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
>>>>>>>> From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
>>>>>>>>
>>>>>>>> Skip mapping the BAR if it is not in a valid range.
>>>>>>>>
>>>>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
>>>>>>>> ---
>>>>>>>>  xen/drivers/vpci/header.c | 9 +++++++++
>>>>>>>>  1 file changed, 9 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>>>>>> index 1d243eeaf9..dbabdcbed2 100644
>>>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev 
>>>>>>>> *pdev, uint16_t cmd, bool rom_only)
>>>>>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>>>>>>>>              continue;
>>>>>>>>
>>>>>>>> +#ifdef CONFIG_ARM
>>>>>>>> +        if ( !is_hardware_domain(pdev->domain) )
>>>>>>>> +        {
>>>>>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
>>>>>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR + 
>>>>>>>> GUEST_VPCI_MEM_SIZE)) )
>>>>>>>> +                continue;
>>>>>>>> +        }
>>>>>>>> +#endif
>>>>>>>
>>>>>>> Hm, I think this should be in a hook similar to pci_check_bar() that
>>>>>>> can be implemented per-arch.
>>>>>>>
>>>>>>> IIRC at least on x86 we allow the guest to place the BARs whenever it
>>>>>>> wants, would such placement cause issues to the hypervisor on Arm?
>>>>>>
>>>>>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM. 
>>>>>> In my haste I also forgot about the prefetchable range starting at 
>>>>>> GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably 
>>>>>> throw this patch out.
>>>>>>
>>>>>> Now that I've had some more time to investigate, I believe the check in 
>>>>>> this patch is more or less redundant to the existing check in 
>>>>>> map_range() added in baa6ea700386 ("vpci: add permission checks to 
>>>>>> map_range()").
>>>>>>
>>>>>> The issue is that during initialization bar->guest_addr is zeroed, and 
>>>>>> this initial value of bar->guest_addr will fail the permissions check in 
>>>>>> map_range() and crash the domain. When the guest writes a new valid BAR, 
>>>>>> the old invalid address remains in the rangeset to be mapped. If we 
>>>>>> simply remove the old invalid BAR from the rangeset, that seems to fix 
>>>>>> the issue. So something like this:
>>>>>
>>>>> It does seem to me we are missing a proper cleanup of the rangeset
>>>>> contents in some paths then.  In the above paragraph you mention "the
>>>>> old invalid address remains in the rangeset to be mapped", how does it
>>>>> get in there in the first place, and why is the rangeset not emptied
>>>>> if the mapping failed?
>>>>
>>>> Back in ("vpci/header: handle p2m range sets per BAR") I added a v->domain 
>>>> == pdev->domain check near the top of vpci_process_pending() as you 
>>>> appropriately suggested.
>>>>
>>>> +    if ( v->domain != pdev->domain )
>>>> +    {
>>>> +        read_unlock(&v->domain->pci_lock);
>>>> +        return false;
>>>> +    }
>>>>
>>>> I have also reverted this patch ("xen/arm: vpci: check guest range").
>>>>
>>>> The sequence of events leading to the old value remaining in the rangeset 
>>>> are:
>>>>
>>>> # xl pci-assignable-add 01:00.0
>>>> drivers/vpci/vpci.c:vpci_deassign_device()
>>>>     deassign 0000:01:00.0 from d0
>>>> # grep pci domu.cfg
>>>> pci = [ "01:00.0" ]
>>>> # xl create domu.cfg
>>>> drivers/vpci/vpci.c:vpci_deassign_device()
>>>>     deassign 0000:01:00.0 from d[IO]
>>>> drivers/vpci/vpci.c:vpci_assign_device()
>>>>     assign 0000:01:00.0 to d1
>>>>     bar->guest_addr is initialized to zero because of the line: pdev->vpci 
>>>> = xzalloc(struct vpci);
>>>> drivers/vpci/header.c:init_bars()
>>>> drivers/vpci/header.c:modify_bars()
>>>
>>> I think I've commented this on another patch, but why is the device
>>> added with memory decoding enabled?  I would expect the FLR performed
>>> before assigning would leave the device with memory decoding disabled?
>>
>> It seems the device is indeed being assigned to the domU with memory 
>> decoding enabled, but I'm not entirely sure why. The device I'm testing with 
>> doesn't support FLR, but it does support pm bus reset:
>> # cat /sys/bus/pci/devices/0000\:01\:00.0/reset_method
>> pm bus
>>
>> As I understand it, libxl__device_pci_reset() should still be able to issue 
>> a reset in this case.
> 
> Maybe pciback is somehow restoring part of the previous state?  I
> have no insight in what state we expect the device to be handled by
> pciback, but this needs investigation in order to know what to expect.

Yep, during "xl pci-assignable-add ..." pciback resets the device and restores 
the state, including whether memory decoding is enabled.

drivers/xen/xen-pciback/pci_stub.c:pcistub_init_device():

        /* We need the device active to save the state. */
        dev_dbg(&dev->dev, "save state of device\n");
        pci_save_state(dev);
        dev_data->pci_saved_state = pci_store_saved_state(dev);
        if (!dev_data->pci_saved_state)
                dev_err(&dev->dev, "Could not store PCI conf saved state!\n");
        else {
                dev_dbg(&dev->dev, "resetting (FLR, D3, etc) the device\n");
                __pci_reset_function_locked(dev);
                pci_restore_state(dev);
        }
        /* Now disable the device (this also ensures some private device
         * data is setup before we export)
         */
        dev_dbg(&dev->dev, "reset device\n");
        xen_pcibk_reset_device(dev);

That last function, xen_pcibk_reset_device(), clears the bus master enable bit 
in the command register for devices with PCI_HEADER_TYPE_NORMAL (not a reset 
contrary to the function name).

xl create should reset the device again, but, similarly, this also seems to 
restore the state.

> Can you paste the full contents of the command register for this
> device?
Start of day (PCIe controller and bridge initialized, no device BARs or 
anything have been programmed yet): 0x0000
After dom0 boot, device is in use: 0x0006
After pci-assignable-add: 0x0002
After echo 1 > /sys/bus/pci/devices/0000\:01\:00.0/reset: 0x0002
After xl create, domU booted: 0x0006

Should mapping bars should be conditional on PCI_COMMAND_MASTER, not 
PCI_COMMAND_MEMORY? E.g.:

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 9cf701b3c464..9ce1793d64b8 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -1162,7 +1162,7 @@ static int cf_check init_bars(struct pci_dev *pdev)
             goto fail;
     }

-    return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, cmd, false) : 0;
+    return (cmd & PCI_COMMAND_MASTER) ? modify_bars(pdev, cmd, false) : 0;

  fail:
     pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd);



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.