Re: [PATCH v9 15/16] xen/arm: vpci: check guest range
On 9/26/23 11:48, Roger Pau Monné wrote:
> On Tue, Sep 26, 2023 at 11:27:48AM -0400, Stewart Hildebrand wrote:
>> On 9/26/23 04:07, Roger Pau Monné wrote:
>>> On Mon, Sep 25, 2023 at 05:49:00PM -0400, Stewart Hildebrand wrote:
>>>> On 9/22/23 04:44, Roger Pau Monné wrote:
>>>>> On Tue, Aug 29, 2023 at 11:19:47PM +0000, Volodymyr Babchuk wrote:
>>>>>> From: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
>>>>>>
>>>>>> Skip mapping the BAR if it is not in a valid range.
>>>>>>
>>>>>> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@xxxxxxx>
>>>>>> ---
>>>>>>  xen/drivers/vpci/header.c | 9 +++++++++
>>>>>>  1 file changed, 9 insertions(+)
>>>>>>
>>>>>> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
>>>>>> index 1d243eeaf9..dbabdcbed2 100644
>>>>>> --- a/xen/drivers/vpci/header.c
>>>>>> +++ b/xen/drivers/vpci/header.c
>>>>>> @@ -345,6 +345,15 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only)
>>>>>>               bar->enabled == !!(cmd & PCI_COMMAND_MEMORY) )
>>>>>>              continue;
>>>>>>
>>>>>> +#ifdef CONFIG_ARM
>>>>>> +        if ( !is_hardware_domain(pdev->domain) )
>>>>>> +        {
>>>>>> +            if ( (start_guest < PFN_DOWN(GUEST_VPCI_MEM_ADDR)) ||
>>>>>> +                 (end_guest >= PFN_DOWN(GUEST_VPCI_MEM_ADDR +
>>>>>> +                                        GUEST_VPCI_MEM_SIZE)) )
>>>>>> +                continue;
>>>>>> +        }
>>>>>> +#endif
>>>>>
>>>>> Hm, I think this should be in a hook similar to pci_check_bar() that
>>>>> can be implemented per-arch.
>>>>>
>>>>> IIRC at least on x86 we allow the guest to place the BARs wherever it
>>>>> wants, would such placement cause issues to the hypervisor on Arm?
>>>>
>>>> Hm. I wrote this patch in a hurry to make v9 of this series work on ARM.
>>>> In my haste I also forgot about the prefetchable range starting at
>>>> GUEST_VPCI_PREFETCH_MEM_ADDR, but that won't matter as we can probably
>>>> throw this patch out.
>>>>
>>>> Now that I've had some more time to investigate, I believe the check in
>>>> this patch is more or less redundant to the existing check in map_range()
>>>> added in baa6ea700386 ("vpci: add permission checks to map_range()").
>>>>
>>>> The issue is that during initialization bar->guest_addr is zeroed, and
>>>> this initial value of bar->guest_addr will fail the permissions check in
>>>> map_range() and crash the domain. When the guest writes a new valid BAR,
>>>> the old invalid address remains in the rangeset to be mapped. If we simply
>>>> remove the old invalid BAR from the rangeset, that seems to fix the issue.
>>>> So something like this:
>>>
>>> It does seem to me we are missing a proper cleanup of the rangeset
>>> contents in some paths then. In the above paragraph you mention "the
>>> old invalid address remains in the rangeset to be mapped", how does it
>>> get in there in the first place, and why is the rangeset not emptied
>>> if the mapping failed?
>>
>> Back in ("vpci/header: handle p2m range sets per BAR") I added a
>> v->domain == pdev->domain check near the top of vpci_process_pending()
>> as you appropriately suggested.
>>
>> +    if ( v->domain != pdev->domain )
>> +    {
>> +        read_unlock(&v->domain->pci_lock);
>> +        return false;
>> +    }
>>
>> I have also reverted this patch ("xen/arm: vpci: check guest range").
>>
>> The sequence of events leading to the old value remaining in the rangeset
>> are:
>>
>> # xl pci-assignable-add 01:00.0
>>   drivers/vpci/vpci.c:vpci_deassign_device()
>>     deassign 0000:01:00.0 from d0
>> # grep pci domu.cfg
>> pci = [ "01:00.0" ]
>> # xl create domu.cfg
>>   drivers/vpci/vpci.c:vpci_deassign_device()
>>     deassign 0000:01:00.0 from d[IO]
>>   drivers/vpci/vpci.c:vpci_assign_device()
>>     assign 0000:01:00.0 to d1
>>     bar->guest_addr is initialized to zero because of the line:
>>       pdev->vpci = xzalloc(struct vpci);
>>   drivers/vpci/header.c:init_bars()
>>   drivers/vpci/header.c:modify_bars()
>
> I think I've commented this on another patch, but why is the device
> added with memory decoding enabled? I would expect the FLR performed
> before assigning would leave the device with memory decoding disabled?

It seems the device is indeed being assigned to the domU with memory decoding
enabled, but I'm not entirely sure why. The device I'm testing with doesn't
support FLR, but it does support pm bus reset:

# cat /sys/bus/pci/devices/0000\:01\:00.0/reset_method
pm bus

As I understand it, libxl__device_pci_reset() should still be able to issue a
reset in this case.

> Otherwise we might have to force init_bars() to assume memory decoding
> to be disabled, IOW: memory decoding would be set as disabled in the
> guest cmd view, and leave the physical device cmd as-is. We might
> also consider switching memory decoding off unconditionally for domUs
> on the physical device.

I did a quick test and it works as expected with my apparently quirky test
case:

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index de29e5322d34..0ad0ad947759 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -790,7 +790,12 @@ static int cf_check init_bars(struct pci_dev *pdev)
 
     /* Disable memory decoding before sizing. */
     if ( cmd & PCI_COMMAND_MEMORY )
+    {
+        if ( !is_hwdom )
+            cmd &= ~PCI_COMMAND_MEMORY;
+
         pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY);
+    }
 
     header->guest_cmd = cmd & ~PCI_COMMAND_MEMORY;

>
>> BAR0: start 0xe0000, end 0xe000f, start_guest 0x0, end_guest 0xf
>> The range { 0-f } is added to the BAR0 rangeset for d1
>>   drivers/vpci/header.c:defer_map()
>>     raise_softirq(SCHEDULE_SOFTIRQ);
>>   drivers/vpci/header.c:vpci_process_pending()
>>     vpci_process_pending() returns because v->domain != pdev->domain
>>     (i.e. d0 != d1)
>
> I don't think we can easily handle BAR mappings during device
> assignment with vPCI, because that would require adding some kind of
> continuation support which we don't have ATM. Might be better to just
> switch memory decoding unconditionally off at init_bars() for domUs as
> that's the easier solution right now that would allow us to move
> forward.

+1
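
[The snippet referred to by "So something like this:" earlier in the thread
is not preserved in this archive. Purely as an illustration of the idea being
described there — assuming the per-BAR rangeset introduced by the series is
bar->mem, and that bar->guest_addr still holds the previous (possibly
all-zero) guest address at the point the guest writes a new BAR value — a
minimal sketch could look like:

    /*
     * Sketch only: drop whatever is still queued in the BAR's rangeset for
     * its previous guest address (e.g. the zeroed bar->guest_addr left over
     * from device assignment), so a stale, unmappable range cannot reach
     * map_range() and fail the permission check there.
     */
    static int drop_stale_bar_range(struct vpci_bar *bar)
    {
        return rangeset_remove_range(bar->mem, PFN_DOWN(bar->guest_addr),
                                     PFN_DOWN(bar->guest_addr +
                                              bar->size - 1));
    }

The helper name, its call site, and the field names are assumptions based on
the surrounding discussion, not the patch actually posted in the thread.]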
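
[Also for illustration: the per-arch hook shape suggested near the top of the
thread (something similar to pci_check_bar()) might look roughly like the
sketch below. The helper name and its exact placement are assumptions, not
code from this series; an x86 implementation could simply return true so
guests keep the freedom to place BARs wherever they want:

    /* Hypothetical Arm-side helper, consulted per BAR by modify_bars(). */
    static bool bar_in_guest_vpci_range(const struct domain *d,
                                        unsigned long start_gfn,
                                        unsigned long end_gfn)
    {
        if ( is_hardware_domain(d) )
            return true;

        /* Non-prefetchable vPCI window exposed to Arm domUs. */
        if ( start_gfn >= PFN_DOWN(GUEST_VPCI_MEM_ADDR) &&
             end_gfn < PFN_DOWN(GUEST_VPCI_MEM_ADDR + GUEST_VPCI_MEM_SIZE) )
            return true;

        /* Prefetchable window, which the original patch did not cover. */
        if ( start_gfn >= PFN_DOWN(GUEST_VPCI_PREFETCH_MEM_ADDR) &&
             end_gfn < PFN_DOWN(GUEST_VPCI_PREFETCH_MEM_ADDR +
                                GUEST_VPCI_PREFETCH_MEM_SIZE) )
            return true;

        return false;
    }

modify_bars() would then skip (continue over) any BAR for which the hook
returns false, instead of open-coding a CONFIG_ARM check in common code.]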