[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 08/11] vpci/bars: add handlers to map the BARs



>>> On 19.09.17 at 17:29, <roger.pau@xxxxxxxxxx> wrote:
> @@ -48,6 +49,9 @@ bool_t hvm_io_pending(struct vcpu *v)
>      struct domain *d = v->domain;
>      struct hvm_ioreq_server *s;
>  
> +    if ( has_vpci(v->domain) && vpci_check_pending(v) )

has_vpci(d)

> +      return 1;

Indentation.

> +static int vpci_map_range(unsigned long s, unsigned long e, void *data,
> +                          unsigned long *c)
> +{
> +    const struct map_data *map = data;
> +    int rc;
> +
> +    for ( ; ; )
> +    {
> +        unsigned long size = e - s + 1;
> +
> +        rc = (map->map ? map_mmio_regions : unmap_mmio_regions)
> +             (map->d, _gfn(s), size, _mfn(s));
> +        if ( rc == 0 )
> +        {
> +            *c += size;
> +            break;
> +        }
> +        if ( rc < 0 )
> +        {
> +            printk(XENLOG_G_WARNING
> +                   "Failed to identity %smap [%" PRI_gfn ", %" PRI_gfn ") 
> for d%d: %d\n",
> +                   map ? "" : "un", s, e, map->d->domain_id, rc);
> +            break;
> +        }

ASSERT(rc < size) ?

> +bool vpci_check_pending(struct vcpu *v)

"check" in the function name generally suggests (to me at least) that
the parameter ought to be const. Perhaps vpci_process_pending()?

> +{
> +    if ( v->vpci.mem )
> +    {
> +        int rc = vpci_map_memory(v->domain, v->vpci.mem, v->vpci.map);
> +
> +        if ( rc == -ERESTART )
> +            return true;

There's no real need for the local variable if all other return values
are simply discarded here. However, ...

> +        rangeset_destroy(v->vpci.mem);
> +        v->vpci.mem = NULL;
> +    }
> +
> +    return false;
> +}

... I'm not convinced this is a good error handling model. I don't
recall how previous versions dealt with this, but iirc we agreed to
generally make all such Dom0 handling best effort (here: don't skip the
remaining ranges if mapping of one failed). An exception may want/need
to be -ENOMEM.

> +static int vpci_check_bar_overlap(const struct pci_dev *pdev,
> +                                  const struct vpci_bar *rom,
> +                                  struct rangeset *mem)
> +{
> +    const struct pci_dev *cmp;
> +
> +    /* Check for overlaps with other device's BARs. */
> +    list_for_each_entry(cmp, &pdev->domain->arch.pdev_list, domain_list)
> +    {
> +        unsigned int i;
> +
> +        if ( rom == NULL && pdev == cmp )
> +            continue;

This check looks rather unmotivated (or even bogus) without a comment.
The other special casing of ROM BARs further down also isn't all that
obvious (and right now I can't even convince myself it's correct).

> +static void vpci_modify_bars(const struct pci_dev *pdev, bool map)
> +{
> +    struct vpci_header *header = &pdev->vpci->header;
> +    struct rangeset *mem = rangeset_new(NULL, NULL, 0);
> +    unsigned int i;
> +    int rc;
> +
> +    if ( !mem )
> +        return;
> +
> +    /*
> +     * Create a rangeset that represents the current device BARs memory 
> region
> +     * and compare it against all the currently active BAR memory regions. If
> +     * an overlap is found, subtract it from the region to be
> +     * mapped/unmapped.
> +     *
> +     * NB: the rangeset uses inclusive frame numbers.
> +     */
> +
> +    /* First fill the rangeset with all the BARs of this device. */
> +    for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> +    {
> +        const struct vpci_bar *bar = &header->bars[i];
> +
> +        if ( !MAPPABLE_BAR(bar) ||
> +             (bar->type == VPCI_BAR_ROM && !bar->rom_enabled) )
> +            continue;
> +
> +        rc = rangeset_add_range(mem, PFN_DOWN(bar->addr),
> +                                PFN_DOWN(bar->addr + bar->size - 1));
> +        if ( rc )
> +        {
> +            rangeset_destroy(mem);
> +            return;

I'm afraid -ENOMEM here (which sadly is possible, as we don't maintain
any reserves) would produce a very hard to diagnose misbehavior. I think
you want to log a message here.

> +        }
> +    }
> +
> +    /* Check for overlaps with other device's BARs. */
> +    rc = vpci_check_bar_overlap(pdev, NULL, mem);

Why is this not symmetrical with vpci_modify_rom() (which also checks
overlaps inside the current device)?

> +    if ( rc )
> +    {
> +        rangeset_destroy(mem);
> +        return;

Same error handling comment as above, despite failure here being less
likely (hopefully at least). Perhaps worth joining the two paths.

> +    }
> +
> +    rc = vpci_maybe_defer_map(pdev->domain, mem, map);
> +    if ( !rc )
> +        for ( i = 0; i < ARRAY_SIZE(header->bars); i++ )
> +            if ( header->bars[i].type != VPCI_BAR_ROM ||
> +                 header->bars[i].rom_enabled )
> +            header->bars[i].enabled = map;

Hmm, you're updating state here regardless of possible failure in the
deferred operation (see the discarded error code in
vpci_check_pending()).

> +static uint32_t vpci_cmd_read(const struct pci_dev *pdev, unsigned int reg,
> +                              void *data)
> +{
> +    return pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                           PCI_FUNC(pdev->devfn), reg);
> +}

Wouldn't it be worthwhile having generic read functions dealing with
simple cases like this (and the BAR) one?

> +static void vpci_cmd_write(const struct pci_dev *pdev, unsigned int reg,
> +                           uint32_t cmd, void *data)
> +{
> +    uint8_t seg = pdev->seg, bus = pdev->bus;
> +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +    uint16_t current_cmd = pci_conf_read16(seg, bus, slot, func, reg);
> +
> +    /*
> +     * Let the guest play with all the bits directly except for the
> +     * memory decoding one.
> +     */

Please could you make clear it's only Dom0 we apply this lax model to?
Perhaps simply s/the guest/Dom0/.

> +static void vpci_bar_write(const struct pci_dev *pdev, unsigned int reg,
> +                           uint32_t val, void *data)
> +{
> +    struct vpci_bar *bar = data;
> +    uint8_t seg = pdev->seg, bus = pdev->bus;
> +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +    bool hi = false;
> +
> +    if ( pci_conf_read16(seg, bus, slot, func, PCI_COMMAND) &
> +         PCI_COMMAND_MEMORY )
> +    {
> +         gprintk(XENLOG_WARNING,
> +                 "%04x:%02x:%02x.%u: ignored BAR write with memory decoding 
> enabled\n",
> +                 seg, bus, slot, func);

Indentation. Also any chance to log which BAR it was?

> +static void vpci_rom_write(const struct pci_dev *pdev, unsigned int reg,
> +                           uint32_t val, void *data)
> +{
> +    struct vpci_bar *rom = data;
> +    uint8_t seg = pdev->seg, bus = pdev->bus;
> +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +    uint16_t cmd = pci_conf_read16(seg, bus, slot, func, PCI_COMMAND);
> +
> +    if ( (pci_conf_read16(seg, bus, slot, func, PCI_COMMAND) &

Please use cmd here.

> +          PCI_COMMAND_MEMORY) && rom->rom_enabled )
> +    {
> +         gprintk(XENLOG_WARNING,
> +                 "%04x:%02x:%02x.%u: ignored ROM BAR write with memory 
> decoding enabled\n",
> +                 seg, bus, slot, func);

Indentation again.

> +static int vpci_init_bars(struct pci_dev *pdev)
> +{
> +    uint8_t seg = pdev->seg, bus = pdev->bus;
> +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +    uint16_t cmd;
> +    uint64_t addr, size;
> +    unsigned int i, num_bars, rom_reg;
> +    struct vpci_header *header = &pdev->vpci->header;
> +    struct vpci_bar *bars = header->bars;
> +    pci_sbdf_t sbdf = {
> +        .seg = seg,
> +        .bus = bus,
> +        .dev = slot,
> +        .func = func,
> +    };
> +    int rc;
> +
> +    switch ( pci_conf_read8(seg, bus, slot, func, PCI_HEADER_TYPE) & 0x7f )
> +    {
> +    case PCI_HEADER_TYPE_NORMAL:
> +        num_bars = 6;
> +        rom_reg = PCI_ROM_ADDRESS;
> +        break;
> +    case PCI_HEADER_TYPE_BRIDGE:
> +        num_bars = 2;
> +        rom_reg = PCI_ROM_ADDRESS1;
> +        break;
> +    default:
> +        return -EOPNOTSUPP;
> +    }
> +
> +    /* Setup a handler for the command register. */
> +    rc = vpci_add_register(pdev, vpci_cmd_read, vpci_cmd_write, PCI_COMMAND,
> +                           2, header);
> +    if ( rc )
> +        return rc;
> +
> +    /* Disable memory decoding before sizing. */
> +    cmd = pci_conf_read16(seg, bus, slot, func, PCI_COMMAND);
> +    if ( cmd & PCI_COMMAND_MEMORY )
> +        pci_conf_write16(seg, bus, slot, func, PCI_COMMAND,
> +                         cmd & ~PCI_COMMAND_MEMORY);
> +
> +    for ( i = 0; i < num_bars; i++ )
> +    {
> +        uint8_t reg = PCI_BASE_ADDRESS_0 + i * 4;
> +        uint32_t val = pci_conf_read32(seg, bus, slot, func, reg);
> +
> +        if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO )
> +        {
> +            bars[i].type = VPCI_BAR_MEM64_HI;
> +            rc = vpci_add_register(pdev, vpci_bar_read, vpci_bar_write, reg, 
> 4,
> +                                   &bars[i]);
> +            if ( rc )
> +            {
> +                pci_conf_write16(seg, bus, slot, func, PCI_COMMAND, cmd);
> +                return rc;
> +            }
> +
> +            continue;
> +        }

You don't need val up to here - please defer the read.

> +        if ( (val & PCI_BASE_ADDRESS_SPACE) == PCI_BASE_ADDRESS_SPACE_IO )
> +        {
> +            bars[i].type = VPCI_BAR_IO;
> +            continue;
> +        }
> +        if ( (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) ==
> +             PCI_BASE_ADDRESS_MEM_TYPE_64 )
> +            bars[i].type = VPCI_BAR_MEM64_LO;
> +        else
> +            bars[i].type = VPCI_BAR_MEM32;
> +
> +        /* Size the BAR and map it. */

Isn't the map part of this comment stale now? And without it,
considering the function name called, it is perhaps no longer worth
having it.

> --- a/xen/include/xen/sched.h
> +++ b/xen/include/xen/sched.h
> @@ -20,6 +20,9 @@
>  #include <xen/smp.h>
>  #include <xen/perfc.h>
>  #include <asm/atomic.h>
> +#ifdef CONFIG_HAS_PCI
> +#include <xen/vpci.h>
> +#endif

Perhaps the conditional would better live in that header.

> @@ -264,6 +267,11 @@ struct vcpu
>  
>      struct evtchn_fifo_vcpu *evtchn_fifo;
>  
> +#ifdef CONFIG_HAS_PCI
> +    /* vPCI per-vCPU area, used to store data for long running operations. */
> +    struct vpci_vcpu vpci;
> +#endif

And perhaps the header would better provide an empty structure for the
"else" case. Another option would be to include the fields ...

>      struct arch_vcpu arch;

... in this structure.

> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -35,11 +35,52 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, 
> unsigned int size);
>  void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size,
>                  uint32_t data);
>  
> +/*
> + * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
> + * should not run.
> + */
> +bool vpci_check_pending(struct vcpu *v);
> +
>  struct vpci {
>      /* List of vPCI handlers for a device. */
>      struct list_head handlers;
>      spinlock_t lock;
> +
> +#ifdef __XEN__
> +    /* Hide the rest of the vpci struct from the user-space test harness. */
> +    struct vpci_header {
> +        /* Information about the PCI BARs of this device. */
> +        struct vpci_bar {
> +            paddr_t addr;
> +            uint64_t size;
> +            enum {
> +                VPCI_BAR_EMPTY,
> +                VPCI_BAR_IO,
> +                VPCI_BAR_MEM32,
> +                VPCI_BAR_MEM64_LO,
> +                VPCI_BAR_MEM64_HI,
> +                VPCI_BAR_ROM,
> +            } type;
> +            bool prefetchable;
> +            /* Store whether the BAR is mapped into guest p2m. */
> +            bool enabled;
> +            /*
> +             * Store whether the ROM enable bit is set (doesn't imply ROM BAR
> +             * is mapped into guest p2m). Only used for type VPCI_BAR_ROM.
> +             */
> +            bool rom_enabled;

Especially with the error handling issue in mind that I've mentioned
earlier, I wonder whether this field shouldn't be dropped, along the
lines of you also no longer caching the memory decode enable bit in the
command register.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.