[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH] pci: a workaround for nonstandard PCI devices whose PBA shares



On Wed, Apr 04, 2018 at 11:29:39PM +0800, Chao Gao wrote:
> ... the same page with other registers which are not relevant to MSI-X. Xen
> marks pages where PBA resides as read-only. When assigning such devices to
> guest, device driver writes MSI-X irrelevant registers on those pages would
> lead to an EPT violation and the guest is destroyed because no handler is
> registered for those address range. In order to make guest capable to use such
> kind of devices, trapping very frequent write accesses is not a good idea for
> it would significantly impact the performance.
> 
> This patch provides a workaround with caveat. Specifically, an option is
> introduced to specify a list of devices. For those devices, Xen doesn't
> control the access right to pages where PBA resides. Hence, guest device
> driver is able to write those pages and functions well. Note that adding an
> untrusted device to this option may endanger security of the entire system.
> 
> Signed-off-by: Chao Gao <chao.gao@xxxxxxxxx>
> ---
>  docs/misc/xen-command-line.markdown | 10 +++++++++
>  xen/arch/x86/msi.c                  |  7 ++++--
>  xen/drivers/passthrough/pci.c       | 45 
> +++++++++++++++++++++++++++++++++++--
>  xen/include/asm-x86/msi.h           |  1 +
>  4 files changed, 59 insertions(+), 4 deletions(-)
> 
> diff --git a/docs/misc/xen-command-line.markdown 
> b/docs/misc/xen-command-line.markdown
> index b353352..e382513 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1423,6 +1423,16 @@ Defaults to booting secondary processors.
>  
>  > Default: `on`
>  
> +### pba\_quirk

pba_write_allowed would be better, pba_quirk is too generic IMO.

> +> `= List of [<seg>:]<bus>:<device>.<function>
> +
> +Specify a list of SBDF of devices. When assigning devices in this list to
> +guest, reading or writing the page where MSI-X PBA resides are allowed.
> +This option provides a workaround for nonstandard PCI devices whose
> +MSI-X PBA shares the same 4K-byte page with other registers. Note that
> +adding an untrusted device to this option would undermine security of
> +the entire system.
> +
>  ### pci
>  > `= {no-}serr | {no-}perr`
>  
> diff --git a/xen/arch/x86/msi.c b/xen/arch/x86/msi.c
> index 5567990..2abf2cf 100644
> --- a/xen/arch/x86/msi.c
> +++ b/xen/arch/x86/msi.c
> @@ -992,7 +992,9 @@ static int msix_capability_init(struct pci_dev *dev,
>          if ( rangeset_add_range(mmio_ro_ranges, msix->table.first,
>                                  msix->table.last) )
>              WARN();
> -        if ( rangeset_add_range(mmio_ro_ranges, msix->pba.first,
> +
> +        if ( !msix->pba_quirk_enabled &&
> +             rangeset_add_range(mmio_ro_ranges, msix->pba.first,
>                                  msix->pba.last) )
>              WARN();

This will work fine as long as the PBA is not in the same page as the
MSI-X table. In such case you will also need changes to QEMU (see
pci_msix_write), so that writes to the memory in the same page as the
MSI-X/PBA tables are forwarded to the underlying hardware.

You should add least add something like:

if ( msix->pba_quirk_enabled &&
     msix->table.first <= msix->pba.last &&
     msix->pba.first <= msix->table.last )
{
    printk("PBA write not allowed to dev %04x:%02x:%02x.%u due to MSI-X table 
overlap\n");
    return -ENXIO;
}

Or similar if the QEMU side is not fixed.

Note that in order to fix the QEMU side you would probably have to add
a flag to xl 'pci' config option and pass it to both QEMU and Xen.

>  
> @@ -1139,7 +1141,8 @@ static void _pci_cleanup_msix(struct arch_msix *msix)
>          if ( rangeset_remove_range(mmio_ro_ranges, msix->table.first,
>                                     msix->table.last) )
>              WARN();
> -        if ( rangeset_remove_range(mmio_ro_ranges, msix->pba.first,
> +        if ( !msix->pba_quirk_enabled &&
> +             rangeset_remove_range(mmio_ro_ranges, msix->pba.first,
>                                     msix->pba.last) )
>              WARN();
>      }
> diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c
> index 1db69d5..cd765ef 100644
> --- a/xen/drivers/passthrough/pci.c
> +++ b/xen/drivers/passthrough/pci.c
> @@ -184,6 +184,38 @@ static int __init parse_phantom_dev(const char *str)
>  }
>  custom_param("pci-phantom", parse_phantom_dev);
>  
> +static struct pba_quirk_dev {
> +    uint32_t sbdf;
> +} pba_quirk_devs[8];

We have a sbdf type now, see 514f58.

Also, I would prefer that you use a list here. I know it's not likely
to have a huge number of devices in the system that require this
quirk, but I also see no reason to limit this to 8 (or any other
arbitrary value).

> +static unsigned int nr_pba_quirk_devs;
> +
> +static int __init parse_pba_quirk(const char *str)
> +{
> +    unsigned int seg, bus, dev, func;
> +
> +    for ( ; ; )
> +    {
> +        if ( nr_pba_quirk_devs >= ARRAY_SIZE(pba_quirk_devs) )
> +            return -E2BIG;
> +
> +        str = parse_pci(str, &seg, &bus, &dev, &func);
> +        if ( !str )
> +            return -EINVAL;
> +
> +        pba_quirk_devs[nr_pba_quirk_devs++].sbdf = PCI_SBDF(seg, bus, dev,
> +                                                            func);
> +        if ( *str == ',' )
> +            str++;
> +        else if ( !*str )
> +            break;
> +        else
> +            return -EINVAL;
> +    }
> +
> +    return 0;
> +}
> +custom_param("pba_quirk", parse_pba_quirk);
> +
>  static u16 __read_mostly command_mask;
>  static u16 __read_mostly bridge_ctl_mask;
>  
> @@ -300,6 +332,7 @@ static void check_pdev(const struct pci_dev *pdev)
>  
>  static struct pci_dev *alloc_pdev(struct pci_seg *pseg, u8 bus, u8 devfn)
>  {
> +    unsigned int i;
>      struct pci_dev *pdev;
>  
>      list_for_each_entry ( pdev, &pseg->alldevs_list, alldevs_list )
> @@ -328,6 +361,16 @@ static struct pci_dev *alloc_pdev(struct pci_seg *pseg, 
> u8 bus, u8 devfn)
>          }
>          spin_lock_init(&msix->table_lock);
>          pdev->msix = msix;
> +
> +        for ( i = 0; i < nr_pba_quirk_devs; i++ )
> +        {
> +            if ( pba_quirk_devs[i].sbdf == PCI_SBDF3(pseg->nr, bus, devfn) )
> +            {
> +                pdev->msix->pba_quirk_enabled = true;
> +                printk(XENLOG_WARNING "Enable PBA quirk for 
> %04x:%02x:%02x.%u\n",
> +                       pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
> +            }

You could do this as:

if ( pba_quirk_devs[i].sbdf != PCI_SBDF3(pseg->nr, bus, devfn) )
    continue;

pdev->msix->pba_quirk_enabled = true;
printk(XENLOG_WARNING "Enable PBA quirk for %04x:%02x:%02x.%u\n",
       pseg->nr, bus, PCI_SLOT(devfn), PCI_FUNC(devfn));
break;

Which removes one level of indentation (and also exits the loop as
soon as a match is found).

Thanks, Roger.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.