[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] pci: a workaround for nonstandard PCI devices whose PBA shares
On Thu, Apr 05, 2018 at 10:34:39AM +0100, Roger Pau Monné wrote: >On Wed, Apr 04, 2018 at 11:29:39PM +0800, Chao Gao wrote: >> ... the same page with other registers which are not relevant to MSI-X. Xen >> marks pages where PBA resides as read-only. When assigning such devices to >> guest, device driver writes MSI-X irrelevant registers on those pages would >> lead to an EPT violation and the guest is destroyed because no handler is >> registered for those address range. In order to make guest capable to use >> such >> kind of devices, trapping very frequent write accesses is not a good idea for >> it would significantly impact the performance. >> >> This patch provides a workaround with caveat. Specifically, an option is >> introduced to specify a list of devices. For those devices, Xen doesn't >> control the access right to pages where PBA resides. Hence, guest device >> driver is able to write those pages and functions well. Note that adding an >> untrusted device to this option may endanger security of the entire system. >> >> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx> >> --- >> docs/misc/xen-command-line.markdown | 10 +++++++++ >> xen/arch/x86/msi.c | 7 ++++-- >> xen/drivers/passthrough/pci.c | 45 >> +++++++++++++++++++++++++++++++++++-- >> xen/include/asm-x86/msi.h | 1 + >> 4 files changed, 59 insertions(+), 4 deletions(-) >> >> diff --git a/docs/misc/xen-command-line.markdown >> b/docs/misc/xen-command-line.markdown >> index b353352..e382513 100644 >> --- a/docs/misc/xen-command-line.markdown >> +++ b/docs/misc/xen-command-line.markdown >> @@ -1423,6 +1423,16 @@ Defaults to booting secondary processors. >> >> > Default: `on` >> >> +### pba\_quirk > >pba_write_allowed would be better, pba_quirk is too generic IMO. > >> +> `= List of [<seg>:]<bus>:<device>.<function> >> + >> +Specify a list of SBDF of devices. When assigning devices in this list to >> +guest, reading or writing the page where MSI-X PBA resides are allowed. >> +This option provides a workaround for nonstandard PCI devices whose >> +MSI-X PBA shares the same 4K-byte page with other registers. Note that >> +adding an untrusted device to this option would undermine security of >> +the entire system. >> + >> ### pci >> > `= {no-}serr | {no-}perr` >> >> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c >> index 5567990..2abf2cf 100644 >> --- a/xen/arch/x86/msi.c >> +++ b/xen/arch/x86/msi.c >> @@ -992,7 +992,9 @@ static int msix_capability_init(struct pci_dev *dev, >> if ( rangeset_add_range(mmio_ro_ranges, msix->table.first, >> msix->table.last) ) >> WARN(); >> - if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first, >> + >> + if ( !msix->pba_quirk_enabled && >> + rangeset_add_range(mmio_ro_ranges, msix->pba.first, >> msix->pba.last) ) >> WARN(); > >This will work fine as long as the PBA is not in the same page as the >MSI-X table. In such case you will also need changes to QEMU (see >pci_msix_write), so that writes to the memory in the same page as the >MSI-X/PBA tables are forwarded to the underlying hardware. > >You should add least add something like: > >if ( msix->pba_quirk_enabled && > msix->table.first <= msix->pba.last && > msix->pba.first <= msix->table.last ) >{ > printk("PBA write not allowed to dev %04x:%02x:%02x.%u due to MSI-X table > overlap\n"); > return -ENXIO; >} > >Or similar if the QEMU side is not fixed. > >Note that in order to fix the QEMU side you would probably have to add >a flag to xl 'pci' config option and pass it to both QEMU and Xen. Thanks for your comments. First of all, I don't intend to also support devices which has MSI-X table, MSI-X PBA and other MSI-X irrelevant registers in the same page. Because as you said, it cleary violates MSI-X spec. IMO, we can extend the workaround when we found such a device. I also had the same concern with yours. But after careful thinking, I found it wouldn't be a problem. If MSI-X table resides the same pages with MSI-X PBA, we will mark the pages as RO when handling MSI-X table. As a consequence, guest isn't able to write MSI-X table directly in this case. Hence, it won't affect MSI-X table emulation and furthermore the quirk won't override the decision, marking the page RO, made for other reasons. > >> >> @@ -1139,7 +1141,8 @@ static void _pci_cleanup_msix(struct arch_msix *msix) >> if ( rangeset_remove_range(mmio_ro_ranges, msix->table.first, >> msix->table.last) ) >> WARN(); >> - if ( rangeset_remove_range(mmio_ro_ranges, msix->pba.first, >> + if ( !msix->pba_quirk_enabled && >> + rangeset_remove_range(mmio_ro_ranges, msix->pba.first, >> msix->pba.last) ) >> WARN(); >> } >> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c >> index 1db69d5..cd765ef 100644 >> --- a/xen/drivers/passthrough/pci.c >> +++ b/xen/drivers/passthrough/pci.c >> @@ -184,6 +184,38 @@ static int __init parse_phantom_dev(const char *str) >> } >> custom_param("pci-phantom", parse_phantom_dev); >> >> +static struct pba_quirk_dev { >> + uint32_t sbdf; >> +} pba_quirk_devs[8]; > >We have a sbdf type now, see 514f58. > >Also, I would prefer that you use a list here. I know it's not likely >to have a huge number of devices in the system that require this >quirk, but I also see no reason to limit this to 8 (or any other >arbitrary value). Ok. I will use a list and the new sbdf type. > >> +static unsigned int nr_pba_quirk_devs; >> + >> +static int __init parse_pba_quirk(const char *str) >> +{ >> + unsigned int seg, bus, dev, func; >> + >> + for ( ; ; ) >> + { >> + if ( nr_pba_quirk_devs >= ARRAY_SIZE(pba_quirk_devs) ) >> + return -E2BIG; >> + >> + str = parse_pci(str, &seg, &bus, &dev, &func); >> + if ( !str ) >> + return -EINVAL; >> + >> + pba_quirk_devs[nr_pba_quirk_devs++].sbdf = PCI_SBDF(seg, bus, dev, >> + func); >> + if ( *str == ',' ) >> + str++; >> + else if ( !*str ) >> + break; >> + else >> + return -EINVAL; >> + } >> + >> + return 0; >> +} >> +custom_param("pba_quirk", parse_pba_quirk); >> + >> static u16 __read_mostly command_mask; >> static u16 __read_mostly bridge_ctl_mask; >> >> @@ -300,6 +332,7 @@ static void check_pdev(const struct pci_dev *pdev) >> >> static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn) >> { >> + unsigned int i; >> struct pci_dev *pdev; >> >> list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list ) >> @@ -328,6 +361,16 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, >> u8 bus, u8 devfn) >> } >> spin_lock_init(&msix->table_lock); >> pdev->msix = msix; >> + >> + for ( i = 0; i < nr_pba_quirk_devs; i++ ) >> + { >> + if ( pba_quirk_devs[i].sbdf == PCI_SBDF3(pseg->nr, bus, devfn) ) >> + { >> + pdev->msix->pba_quirk_enabled = true; >> + printk(XENLOG_WARNING "Enable PBA quirk for >> %04x:%02x:%02x.%u\n", >> + pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn)); >> + } > >You could do this as: > >if ( pba_quirk_devs[i].sbdf != PCI_SBDF3(pseg->nr, bus, devfn) ) > continue; > >pdev->msix->pba_quirk_enabled = true; >printk(XENLOG_WARNING "Enable PBA quirk for %04x:%02x:%02x.%u\n", > pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn)); >break; > >Which removes one level of indentation (and also exits the loop as >soon as a match is found). Very good suggestion. Thanks Chao _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |