Re: [Xen-devel] [PATCH v7 for-next 12/12] vpci/msix: add MSI-X handlers
>>> On 18.10.17 at 13:40, <roger.pau@xxxxxxxxxx> wrote:
> Changes since v6:
>  - Reduce the output of the debug keys.
>  - Fix comments and code to match in vpci_msix_control_write.
>  - Optimize size of the MSIX structure.
>  - Convert vpci_msix_mem to a uint32_t in order to reduce the size of
>    vpci_msix. Introduce some macros to make it easier to get the MSIX
>    tables related data.

This is misleading - there's no vpci_msix_mem in the patch. I think you
mean the tables[] member of struct vpci_msix?

> +void vpci_msix_arch_print_entry(const struct vpci_msix_entry *entry)
> +{
> +    printk("vec=%#02x%7s%6s%3sassert%5s%7s dest_id=%lu mask=%u pirq: %d\n",

Aiui %#02x is pointless, as the 0x already occupies both characters.
Please drop the #; imo vectors make no sense to be logged in decimal
anyway.

> @@ -254,6 +255,20 @@ static void modify_bars(const struct pci_dev *pdev, bool map, bool rom)
>          }
>      }
>  
> +    /* Remove any MSIX regions if present. */
> +    for ( i = 0; msix && i < ARRAY_SIZE(msix->tables); i++ )
> +    {
> +        paddr_t start = VMSIX_TABLE_ADDR(pdev->vpci, i);
> +
> +        rc = rangeset_remove_range(mem, PFN_DOWN(start), PFN_DOWN(start +
> +                                   VMSIX_TABLE_SIZE(pdev->vpci, i) - 1));

Bad indentation. Also, if the second PFN_DOWN() is really intended this
way (rather than PFN_UP()), please add a comment explaining why that is.

> +        if ( rc )
> +        {
> +            rangeset_destroy(mem);
> +            return;

Silent discarding of an error also needs an explanation in a comment.

> @@ -325,6 +329,22 @@ void vpci_dump_msi(void)
>              vpci_msi_arch_print(msi);
>          }
>  
> +        if ( msix )
> +        {
> +            unsigned int i;
> +
> +            printk("  MSI-X\n");
> +
> +            printk("  entries: %u maskall: %d enabled: %d\n",
> +                   msix->max_entries, msix->masked, msix->enabled);
> +
> +            for ( i = 0; i < msix->max_entries; i++ )
> +            {
> +                printk("  %4u ", i);
> +                vpci_msix_arch_print_entry(&msix->entries[i]);
> +            }

Printing a partial line (i.e. w/o newline) and then calling an
arch-dependent function is not ideal.
Perhaps best to also hand the index to the function, so it can decide
how best to print everything. Also, btw, "  %4u " can be had with less
.rodata consumption by using "%6u".

> +static void control_write(const struct pci_dev *pdev, unsigned int reg,
> +                          uint32_t val, void *data)
> +{
> +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +    struct vpci_msix *msix = data;
> +    bool new_masked = val & PCI_MSIX_FLAGS_MASKALL;
> +    bool new_enabled = val & PCI_MSIX_FLAGS_ENABLE;
> +    unsigned int i;
> +    int rc;
> +
> +    if ( new_masked == msix->masked && new_enabled == msix->enabled )
> +        return;
> +
> +    /*
> +     * According to the PCI 3.0 specification, switching the enable bit to 1
> +     * or the function mask bit to 0 should cause all the cached addresses
> +     * and data fields to be recalculated.
> +     *
> +     * In order to avoid the overhead of disabling and enabling all the
> +     * entries every time the guest sets the maskall bit, Xen will only
> +     * perform the disable and enable sequence when the guest has written to
> +     * the entry.
> +     */
> +    if ( new_enabled && !new_masked && (!msix->enabled || msix->masked) )
> +    {
> +        for ( i = 0; i < msix->max_entries; i++ )
> +        {
> +            if ( msix->entries[i].masked || !msix->entries[i].updated )
> +                continue;
> +
> +            rc = vpci_msix_arch_disable_entry(&msix->entries[i], pdev);
> +            /* Ignore ENOENT, it means the entry wasn't setup. */
> +            if ( rc && rc != -ENOENT )
> +            {
> +                gprintk(XENLOG_WARNING,
> +                        "%04x:%02x:%02x.%u: unable to disable entry %u: %d\n",
> +                        pdev->seg, pdev->bus, slot, func, i, rc);

In a log from e.g. a customer, how am I to tell this message from ...

> +                return;
> +            }
> +
> +            rc = vpci_msix_arch_enable_entry(&msix->entries[i], pdev,
> +                                             VMSIX_TABLE_BASE(pdev->vpci,
> +                                                              VPCI_MSIX_TABLE));
> +            if ( rc )
> +            {
> +                gprintk(XENLOG_WARNING,
> +                        "%04x:%02x:%02x.%u: unable to enable entry %u: %d\n",
> +                        pdev->seg, pdev->bus, slot, func, i, rc);
> +                /* Entry is likely not properly configured, skip it.
> +                 */
> +                continue;
> +            }
> +
> +            /*
> +             * At this point the PIRQ is still masked. Unmask it, or else the
> +             * guest won't receive interrupts. This is due to the
> +             * disable/enable sequence performed above.
> +             */
> +            vpci_msix_arch_mask_entry(&msix->entries[i], pdev, false);
> +
> +            msix->entries[i].updated = false;
> +        }
> +    }
> +    else if ( !new_enabled && msix->enabled )
> +    {
> +        /* Guest has disabled MSIX, disable all entries. */
> +        for ( i = 0; i < msix->max_entries; i++ )
> +        {
> +            /*
> +             * NB: vpci_msix_arch_disable can be called for entries that are
> +             * not setup, it will return -ENOENT in that case.
> +             */
> +            rc = vpci_msix_arch_disable_entry(&msix->entries[i], pdev);
> +            switch ( rc )
> +            {
> +            case 0:
> +                /*
> +                 * Mark the entry successfully disabled as updated, so that on
> +                 * the next enable the entry is properly setup. This is done
> +                 * so that the following flow works correctly:
> +                 *
> +                 * mask entry -> disable MSIX -> enable MSIX -> unmask entry
> +                 *
> +                 * Without setting 'updated', the 'unmask entry' step will fail
> +                 * because the entry has not been updated, so it would not be
> +                 * mapped/bound at all.
> +                 */
> +                msix->entries[i].updated = true;
> +                break;
> +            case -ENOENT:
> +                /* Ignore non-present entry. */
> +                break;
> +            default:
> +                gprintk(XENLOG_WARNING,
> +                        "%04x:%02x:%02x.%u: unable to disable entry %u: %d\n",
> +                        pdev->seg, pdev->bus, slot, func, i, rc);

... this one?

> +static bool access_allowed(const struct pci_dev *pdev, unsigned long addr,
> +                           unsigned int len)
> +{
> +    uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn);
> +
> +    /* Only allow 32/64b accesses. */
> +    if ( len != 4 && len != 8 )
> +    {
> +        gprintk(XENLOG_WARNING,
> +                "%04x:%02x:%02x.%u: invalid MSI-X table access size: %u\n",
> +                pdev->seg, pdev->bus, slot, func, len);
> +        return false;
> +    }
> +
> +    /* Only allow aligned accesses.
> +     */
> +    if ( (addr & (len - 1)) != 0 )
> +    {
> +        gprintk(XENLOG_WARNING,
> +                "%04x:%02x:%02x.%u: MSI-X only allows aligned accesses\n",
> +                pdev->seg, pdev->bus, slot, func);
> +        return false;
> +    }

Perhaps fold both error paths and messages (I'm not convinced we need
all of them [also considering others you add] anyway)?

> +static int msix_write(struct vcpu *v, unsigned long addr, unsigned int len,
> +                      unsigned long data)
> +{
> +    const struct domain *d = v->domain;
> +    struct vpci_msix *msix = msix_find(d, addr);
> +    struct vpci_msix_entry *entry;
> +    unsigned int offset;
> +
> +    if ( !msix )
> +        return X86EMUL_RETRY;
> +
> +    if ( !access_allowed(msix->pdev, addr, len) )
> +        return X86EMUL_OKAY;
> +
> +    if ( VMSIX_ADDR_IN_RANGE(addr, msix->pdev->vpci, VPCI_MSIX_PBA) )
> +    {
> +        /* Ignore writes to PBA for DomUs, it's behavior is undefined. */
> +        if ( is_hardware_domain(d) )
> +        {
> +            switch ( len )
> +            {
> +            case 4:
> +                writel(data, addr);
> +                break;
> +            case 8:
> +                writeq(data, addr);
> +                break;
> +            default:
> +                ASSERT_UNREACHABLE();
> +                break;
> +            }
> +        }
> +
> +        return X86EMUL_OKAY;
> +    }
> +
> +    spin_lock(&msix->pdev->vpci->lock);
> +    entry = get_entry(msix, addr);
> +    offset = addr & (PCI_MSIX_ENTRY_SIZE - 1);
> +
> +    /*
> +     * NB: Xen allows writes to the data/address registers with the entry
> +     * unmasked. The specification says this is undefined behavior, and Xen
> +     * implements it as storing the written value, which will be made
> +     * effective in the next mask/unmask cycle. This also mimics the
> +     * implementation in QEMU.
> +     */
> +    switch ( offset )
> +    {
> +    case PCI_MSIX_ENTRY_LOWER_ADDR_OFFSET:
> +        entry->updated = true;
> +        if ( len == 8 )
> +        {
> +            entry->addr = data;
> +            break;
> +        }
> +        entry->addr &= ~0xffffffff;
> +        entry->addr |= data;
> +        break;
> +    case PCI_MSIX_ENTRY_UPPER_ADDR_OFFSET:

This switch() is too large to not have blank lines between
non-fall-through case blocks.
> +        entry->updated = true;
> +        entry->addr &= 0xffffffff;
> +        entry->addr |= (uint64_t)data << 32;
> +        break;
> +    case PCI_MSIX_ENTRY_DATA_OFFSET:
> +        entry->updated = true;
> +        entry->data = data;
> +
> +        if ( len == 4 )
> +            break;
> +
> +        data >>= 32;
> +        /* fallthrough */
> +    case PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET:
> +    {
> +        bool new_masked = data & PCI_MSIX_VECTOR_BITMASK;
> +        const struct pci_dev *pdev = msix->pdev;
> +        int rc;
> +
> +        if ( entry->masked == new_masked )
> +            /* No change in the mask bit, nothing to do. */
> +            break;
> +
> +        if ( !new_masked && msix->enabled && !msix->masked && entry->updated )
> +        {
> +            /*
> +             * If MSI-X is enabled, the function mask is not active, the entry
> +             * is being unmasked and there have been changes to the address or
> +             * data fields Xen needs to disable and enable the entry in order
> +             * to pick up the changes.
> +             */
> +            rc = vpci_msix_arch_disable_entry(entry, pdev);
> +            if ( rc && rc != -ENOENT )
> +            {
> +                gprintk(XENLOG_WARNING,
> +                        "%04x:%02x:%02x.%u: unable to disable entry %u: %d\n",
> +                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                        PCI_FUNC(pdev->devfn), VMSIX_ENTRY_NR(msix, entry), rc);
> +                break;
> +            }
> +
> +            rc = vpci_msix_arch_enable_entry(entry, pdev,
> +                                             VMSIX_TABLE_BASE(pdev->vpci,
> +                                                              VPCI_MSIX_TABLE));
> +            if ( rc )
> +            {
> +                gprintk(XENLOG_WARNING,
> +                        "%04x:%02x:%02x.%u: unable to enable entry %u: %d\n",
> +                        pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn),
> +                        PCI_FUNC(pdev->devfn), VMSIX_ENTRY_NR(msix, entry), rc);
> +                break;
> +            }

This very much looks like it is identical to the earlier disable/enable
code sequence - use a helper function?

> --- a/xen/include/xen/vpci.h
> +++ b/xen/include/xen/vpci.h
> @@ -105,6 +105,32 @@ struct vpci {
>          /* Arch-specific data. */
>          struct vpci_arch_msi arch;
>      } *msi;
> +
> +    /* MSI-X data. */
> +    struct vpci_msix {
> +        struct pci_dev *pdev;
> +        /* List link. */
> +        struct list_head next;
> +        /* Table information.
> +         */
> +#define VPCI_MSIX_TABLE     0
> +#define VPCI_MSIX_PBA       1
> +#define VPCI_MSIX_MEM_NUM   2
> +        uint32_t tables[VPCI_MSIX_MEM_NUM];
> +        /* Maximum number of vectors supported by the device. */
> +        uint16_t max_entries : 11;

Isn't this one bit too few (if there are exactly 2k entries)?

> @@ -127,6 +153,41 @@ int __must_check vpci_msi_arch_enable(struct vpci_msi *msi,
>  void vpci_msi_arch_disable(struct vpci_msi *msi, const struct pci_dev *pdev);
>  void vpci_msi_arch_init(struct vpci_msi *msi);
>  void vpci_msi_arch_print(const struct vpci_msi *msi);
> +
> +/* Arch-specific vPCI MSI-X helpers. */
> +void vpci_msix_arch_mask_entry(struct vpci_msix_entry *entry,
> +                               const struct pci_dev *pdev, bool mask);
> +int __must_check vpci_msix_arch_enable_entry(struct vpci_msix_entry *entry,
> +                                             const struct pci_dev *pdev,
> +                                             paddr_t table_base);
> +int __must_check vpci_msix_arch_disable_entry(struct vpci_msix_entry *entry,
> +                                              const struct pci_dev *pdev);
> +void vpci_msix_arch_init_entry(struct vpci_msix_entry *entry);
> +void vpci_msix_arch_print_entry(const struct vpci_msix_entry *entry);
> +
> +/*
> + * Helper functions to fetch MSIX related data. They are used by both the
> + * emulated MSIX code and the BAR handlers.
> + */
> +#define VMSIX_TABLE_BASE(vpci, nr)                                        \
> +    ((vpci)->header.bars[(vpci)->msix->tables[nr] & PCI_MSIX_BIRMASK].addr)
> +#define VMSIX_TABLE_ADDR(vpci, nr)                                        \
> +    (VMSIX_TABLE_BASE(vpci, nr) +                                         \
> +     ((vpci)->msix->tables[nr] & ~PCI_MSIX_BIRMASK))
> +
> +/*
> + * Note regarding the size calculation of the PBA: the spec mentions "The last
> + * QWORD will not necessarily be fully populated", so it implies that the PBA
> + * size is 64-bit aligned.
> + */
> +#define VMSIX_TABLE_SIZE(vpci, nr)                                        \
> +    ((nr == VPCI_MSIX_TABLE) ? (vpci)->msix->max_entries * PCI_MSIX_ENTRY_SIZE \

(nr)

Anyway, I wonder whether some or all of the macros couldn't, as the
earlier comment says, be (inline) functions.
> +     : ROUNDUP(DIV_ROUND_UP((vpci)->msix->max_entries,                   \
> +                            8), 8))
> +
> +#define VMSIX_ENTRY_NR(msix, entry) \
> +    (unsigned int)((entry) - (msix)->entries)

Missing outer parentheses.

Jan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel