Re: [Xen-devel] [PATCH v11 07/12] vpci: add header handlers
> -----Original Message-----
> From: Roger Pau Monne [mailto:roger.pau@xxxxxxxxxx]
> Sent: 20 March 2018 15:16
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>; Konrad Rzeszutek Wilk
> <konrad.wilk@xxxxxxxxxx>; Roger Pau Monne <roger.pau@xxxxxxxxxx>; Ian
> Jackson <Ian.Jackson@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; Andrew
> Cooper <Andrew.Cooper3@xxxxxxxxxx>; George Dunlap
> <George.Dunlap@xxxxxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Julien Grall
> <julien.grall@xxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>; Tim
> (Xen.org) <tim@xxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Subject: [PATCH v11 07/12] vpci: add header handlers
>
> Introduce a set of handlers that trap accesses to the PCI BARs and the
> command register, in order to snoop BAR sizing and BAR relocation.
>
> The command handler is used to detect changes to bit 2 (response to
> memory space accesses), and maps/unmaps the BARs of the device into
> the guest p2m. A rangeset is used in order to figure out which memory
> to map/unmap. This makes it easier to keep track of the possible
> overlaps with other BARs, and will also simplify MSI-X support, where
> certain regions of a BAR might be used for the MSI-X table or PBA.
>
> The BAR register handlers are used to detect attempts by the guest to
> size or relocate the BARs.
>
> Note that the long running BAR mapping and unmapping operations are
> deferred to be performed by hvm_io_pending, so that they can be safely
> preempted.
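For readers following along, the two checks the description relies on can be sketched in isolation. This is a minimal standalone sketch, not code from the patch: the helper names `mem_decoding_toggled` and `bar_size32` are hypothetical, and the constants are redefined locally with the values the PCI specification assigns to the Memory Space enable bit of the command register and to the address mask of a memory BAR. The sizing helper follows the generic PCI sizing protocol (write all ones, read back, invert the address bits); in the patch itself Xen sizes the BARs through pci_size_mem_bar.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of standard PCI values, redefined to keep the sketch standalone. */
#define PCI_COMMAND_MEMORY         0x2u      /* Memory Space enable */
#define PCI_BASE_ADDRESS_MEM_MASK  (~0x0fu)  /* address bits of a memory BAR */

/* Hypothetical helper: true when a command register write flips memory decoding. */
static bool mem_decoding_toggled(uint16_t current_cmd, uint16_t new_cmd)
{
    return ((current_cmd ^ new_cmd) & PCI_COMMAND_MEMORY) != 0;
}

/*
 * Hypothetical helper: size of a 32-bit memory BAR given the value read
 * back after writing all ones to it. Returns 0 for an unimplemented BAR.
 */
static uint32_t bar_size32(uint32_t readback)
{
    uint32_t mask = readback & PCI_BASE_ADDRESS_MEM_MASK;
    uint32_t size;

    if ( mask == 0 )
        return 0;

    /*
     * For a well-formed read-back (contiguous ones from the top bit down)
     * the two's complement of the mask is its lowest set address bit,
     * which is the BAR size.
     */
    size = ~mask + 1u;
    assert((size & (size - 1)) == 0); /* sizes are powers of two */

    return size;
}
```

The XOR test is the same shape as the check cmd_write performs against the current hardware value: only a change of the memory decoding bit triggers the map/unmap path, any other write goes straight to the device.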
>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

ioreq part Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>

> ---
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
> ---
> Changes since v10:
> - Fix indirect function call in map_range.
> - Use rom->addr instead of fetching it from the ROM BAR register in
>   modify_decoding.
> - Remove ternary operator from modify_decoding.
> - Simplify apply_map to have a single return.
> - Constify pci_dev parameter of apply_map.
> - Remove references to maybe_defer_map.
> - Use pdev (const) or dev (non-const) consistently in modify_bars.
> - Invert part of the logic in rom_write to remove one indentation
>   level.
> - Add comments in rom_write to clarify why rom->addr is updated in
>   two different places.
> - Use lx to print frame numbers in modify_bars.
> - Add start/end local variables in the first modify_bars loop.
>
> Changes since v9:
> - Expand comments to clarify the code.
> - Rename rom to rom_only in the vpci_cpu struct.
> - Change definition style of dummy vpci_cpu.
> - Replace incorrect usage of PFN_UP.
> - Use system_state in order to check if the mapping functions are
>   being called from Dom0 builder context.
> - Split the maybe_defer_map into two functions and place the Dom0
>   builder one in the init section.
>
> Changes since v8:
> - Do not pretend to support ARM in the map_range function. Explain
>   the required changes in the comment.
> - Introduce PCI_HEADER_{NORMAL/BRIDGE}_NR_BARS defines.
> - Rename 'rom' boolean variable to 'rom_only', which is more
>   descriptive of its meaning.
> - Introduce vpci_remove_device which removes all handlers for a
>   device.
> - Simplify error handling when modifying BARs mapping. Any error will
>   cause the device to be unplugged (by calling vpci_remove_device).
> - Return an error code in modify_bars. Add comments describing why
>   the error is sometimes ignored.
>
> Changes since v7:
> - Order includes.
> - Add newline between switch cases.
> - Fix typo in comment (hopping).
> - Wrap ternary conditional in parentheses.
> - Remove CONFIG_HAS_PCI guard from sched.h vpci_vcpu usage.
> - Add comment regarding vpci_vcpu usage.
> - Move rom_enabled from BAR struct to header.
> - Do not protect vpci_vcpu with __XEN__ guards.
>
> Changes since v6:
> - s/vpci_check_pending/vpci_process_pending/.
> - Improve error handling in vpci_process_pending.
> - Add a comment that explains how vpci_check_bar_overlap works.
> - Add error messages to vpci_modify_bars and vpci_modify_rom.
> - Introduce vpci_hw_read16/32, in order to passthrough reads to
>   the underlying hw.
> - Print BAR number on error in vpci_bar_write.
> - Place the CONFIG_HAS_PCI guards inside the vpci.h header and
>   provide an empty vpci_vcpu structure for the !CONFIG_HAS_PCI case.
> - Define CONFIG_HAS_PCI in the test harness emul.h header before
>   including vpci.h
> - Add ARM TODOs and an ARM-specific bodge to vpci_map_range due to
>   the lack of preemption in {un}map_mmio_regions.
> - Make vpci_maybe_defer_map void.
> - Set rom_enabled in vpci_init_bars.
> - Defer enabling/disabling the memory decoding (or the ROM enable
>   bit) until the memory has been mapped/unmapped.
> - Remove vpci_ prefix from static functions.
> - Use the same code in order to map the general BARs and the ROM
>   BARs.
> - Remove the seg/bus local variables and use pdev->{seg,bus} instead.
> - Convert the bools in the BAR related structs into bool bitfields.
> - Add the must_check attribute to vpci_process_pending.
> - Open code check_bar_overlap inside modify_bars, which was its only
>   user.
>
> Changes since v5:
> - Switch to the new handler type.
> - Use pci_sbdf_t to size the BARs.
> - Use a single return for vpci_modify_bar.
> - Do not return an error code from vpci_modify_bars, just log the
>   failure.
> - Remove the 'sizing' parameter. Instead just let the guest write
>   directly to the BAR, and read the value back. This simplifies the
>   BAR register handlers, especially the read one.
> - Ignore ROM BAR writes with memory decoding enabled and ROM enabled.
> - Do not propagate failures to setup the ROM BAR in vpci_init_bars.
> - Add preemption support to the BAR mapping/unmapping operations.
>
> Changes since v4:
> - Expand commit message to mention the reason behind the usage of
>   rangesets.
> - Fix comment related to the inclusiveness of rangesets.
> - Fix off-by-one error in the calculation of the end of memory
>   regions.
> - Store the state of the BAR (mapped/unmapped) in the vpci_bar
>   enabled field, previously it was only used by ROMs.
> - Fix double negation of return code.
> - Modify vpci_cmd_write so it has a single call to pci_conf_write16.
> - Print a warning when trying to write to the BAR with memory
>   decoding enabled (and ignore the write).
> - Remove header_type local variable, it's used only once.
> - Move the read of the command register.
> - Restore previous command register value in the exit paths.
> - Only set address to INVALID_PADDR if the initial BAR value matches
>   ~0 & PCI_BASE_ADDRESS_MEM_MASK.
> - Don't disable the enabled bit in the expansion ROM register, memory
>   decoding is already disabled and takes precedence.
> - Don't use INVALID_PADDR, just set the initial BAR address to the
>   value found in the hardware.
> - Introduce rom_enabled to store the status of the
>   PCI_ROM_ADDRESS_ENABLE bit.
> - Reorder fields of the structure to prevent holes.
>
> Changes since v3:
> - Propagate previous changes: drop xen_ prefix and use u8/u16/u32
>   instead of the previous half_word/word/double_word.
> - Constify some of the parameters.
> - s/VPCI_BAR_MEM/VPCI_BAR_MEM32/.
> - Simplify the number of fields stored for each BAR, a single address
>   field is stored and contains the address of the BAR both on Xen and
>   in the guest.
> - Allow the guest to move the BARs around in the physical memory map.
> - Add support for expansion ROM BARs.
> - Do not cache the value of the command register.
> - Remove a label used in vpci_cmd_write.
> - Fix the calculation of the sizing mask in vpci_bar_write.
> - Check the memory decode bit in order to decide if a BAR is
>   positioned or not.
> - Disable memory decoding before sizing the BARs in Xen.
> - When mapping/unmapping BARs check if there's overlap between BARs,
>   in order to avoid unmapping memory required by another BAR.
> - Introduce a macro to check whether a BAR is mappable or not.
> - Add a comment regarding the lack of support for SR-IOV.
> - Remove the usage of the GENMASK macro.
>
> Changes since v2:
> - Detect unset BARs and allow the hardware domain to position them.
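Several of the changelog entries above concern the inclusive frame ranges fed to the rangeset and the removal of regions that overlap other BARs. A minimal standalone sketch of that arithmetic follows. It does not use Xen's rangeset API: `struct frame_range`, `bar_frames`, `ranges_overlap` and `range_subtract` are hypothetical names, and a 4 KiB page size is assumed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical type: a range of page frames, both ends inclusive. */
struct frame_range {
    uint64_t start, end;
};

/* Frames covered by a BAR at 'addr' spanning 'size' bytes (size must be > 0). */
static struct frame_range bar_frames(uint64_t addr, uint64_t size)
{
    struct frame_range r;

    assert(size != 0);
    r.start = addr >> PAGE_SHIFT;
    /* The last byte is addr + size - 1; using addr + size is off by one. */
    r.end = (addr + size - 1) >> PAGE_SHIFT;

    return r;
}

static bool ranges_overlap(struct frame_range a, struct frame_range b)
{
    return a.start <= b.end && b.start <= a.end;
}

/*
 * Remove 'cut' from 'r'. The surviving pieces (at most two) are stored in
 * 'out' in ascending order and their count is returned.
 */
static unsigned int range_subtract(struct frame_range r, struct frame_range cut,
                                   struct frame_range out[2])
{
    unsigned int n = 0;

    if ( !ranges_overlap(r, cut) )
    {
        out[n++] = r;
        return n;
    }

    if ( cut.start > r.start )
    {
        out[n].start = r.start;
        out[n].end = cut.start - 1;
        n++;
    }

    if ( cut.end < r.end )
    {
        out[n].start = cut.end + 1;
        out[n].end = r.end;
        n++;
    }

    return n;
}
```

With a 16 KiB BAR at 0xfe000000 this yields frames 0xfe000 to 0xfe003 inclusive; dropping the `- 1` would wrongly include frame 0xfe004. Subtracting the frames of another enabled BAR before unmapping is what keeps memory still needed by that BAR in the p2m.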
> --- > tools/tests/vpci/emul.h | 1 + > xen/arch/x86/hvm/ioreq.c | 4 + > xen/drivers/vpci/Makefile | 2 +- > xen/drivers/vpci/header.c | 548 > ++++++++++++++++++++++++++++++++++++++++++++++ > xen/drivers/vpci/vpci.c | 45 ++-- > xen/include/xen/sched.h | 4 + > xen/include/xen/vpci.h | 61 ++++++ > 7 files changed, 651 insertions(+), 14 deletions(-) > create mode 100644 xen/drivers/vpci/header.c > > diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h > index fd0317995a..5d47544bf7 100644 > --- a/tools/tests/vpci/emul.h > +++ b/tools/tests/vpci/emul.h > @@ -80,6 +80,7 @@ typedef union { > }; > } pci_sbdf_t; > > +#define CONFIG_HAS_VPCI > #include "vpci.h" > > #define __hwdom_init > diff --git a/xen/arch/x86/hvm/ioreq.c b/xen/arch/x86/hvm/ioreq.c > index 7e66965bcd..90c9e3cd59 100644 > --- a/xen/arch/x86/hvm/ioreq.c > +++ b/xen/arch/x86/hvm/ioreq.c > @@ -26,6 +26,7 @@ > #include <xen/domain.h> > #include <xen/event.h> > #include <xen/paging.h> > +#include <xen/vpci.h> > > #include <asm/hvm/hvm.h> > #include <asm/hvm/ioreq.h> > @@ -48,6 +49,9 @@ bool hvm_io_pending(struct vcpu *v) > struct domain *d = v->domain; > struct hvm_ioreq_server *s; > > + if ( has_vpci(d) && vpci_process_pending(v) ) > + return true; > + > list_for_each_entry ( s, > &d->arch.hvm_domain.ioreq_server.list, > list_entry ) > diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile > index 840a906470..241467212f 100644 > --- a/xen/drivers/vpci/Makefile > +++ b/xen/drivers/vpci/Makefile > @@ -1 +1 @@ > -obj-y += vpci.o > +obj-y += vpci.o header.o > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c > new file mode 100644 > index 0000000000..d7c220a452 > --- /dev/null > +++ b/xen/drivers/vpci/header.c > @@ -0,0 +1,548 @@ > +/* > + * Generic functionality for handling accesses to the PCI header from the > + * configuration space. 
> + * > + * Copyright (C) 2017 Citrix Systems R&D > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms and conditions of the GNU General Public > + * License, version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public > + * License along with this program; If not, see > <http://www.gnu.org/licenses/>. > + */ > + > +#include <xen/p2m-common.h> > +#include <xen/sched.h> > +#include <xen/softirq.h> > +#include <xen/vpci.h> > + > +#include <asm/event.h> > + > +#define MAPPABLE_BAR(x) \ > + ((x)->type == VPCI_BAR_MEM32 || (x)->type == VPCI_BAR_MEM64_LO > || \ > + (x)->type == VPCI_BAR_ROM) > + > +struct map_data { > + struct domain *d; > + bool map; > +}; > + > +static int map_range(unsigned long s, unsigned long e, void *data, > + unsigned long *c) > +{ > + const struct map_data *map = data; > + int rc; > + > + for ( ; ; ) > + { > + unsigned long size = e - s + 1; > + > + /* > + * ARM TODOs: > + * - On ARM whether the memory is prefetchable or not should be > passed > + * to map_mmio_regions in order to decide which memory attributes > + * should be used. > + * > + * - {un}map_mmio_regions doesn't support preemption. > + */ > + > + rc = map->map ? map_mmio_regions(map->d, _gfn(s), size, _mfn(s)) > + : unmap_mmio_regions(map->d, _gfn(s), size, _mfn(s)); > + if ( rc == 0 ) > + { > + *c += size; > + break; > + } > + if ( rc < 0 ) > + { > + printk(XENLOG_G_WARNING > + "Failed to identity %smap [%lx, %lx] for d%d: %d\n", > + map ? 
"" : "un", s, e, map->d->domain_id, rc); > + break; > + } > + ASSERT(rc < size); > + *c += rc; > + s += rc; > + if ( general_preempt_check() ) > + return -ERESTART; > + } > + > + return rc; > +} > + > +/* > + * The rom_only parameter is used to signal the map/unmap helpers that > the ROM > + * BAR's enable bit has changed with the memory decoding bit already > enabled. > + * If rom_only is not set then it's the memory decoding bit that changed. > + */ > +static void modify_decoding(const struct pci_dev *pdev, bool map, bool > rom_only) > +{ > + struct vpci_header *header = &pdev->vpci->header; > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn); > + uint16_t cmd; > + unsigned int i; > + > + for ( i = 0; i < ARRAY_SIZE(header->bars); i++ ) > + { > + if ( !MAPPABLE_BAR(&header->bars[i]) ) > + continue; > + > + if ( rom_only && header->bars[i].type == VPCI_BAR_ROM ) > + { > + unsigned int rom_pos = (i == PCI_HEADER_NORMAL_NR_BARS) > + ? PCI_ROM_ADDRESS : PCI_ROM_ADDRESS1; > + uint32_t val = header->bars[i].addr | > + (map ? PCI_ROM_ADDRESS_ENABLE : 0); > + > + header->bars[i].enabled = header->rom_enabled = map; > + pci_conf_write32(pdev->seg, pdev->bus, slot, func, rom_pos, val); > + return; > + } > + > + if ( !rom_only && > + (header->bars[i].type != VPCI_BAR_ROM || header->rom_enabled) > ) > + header->bars[i].enabled = map; > + } > + > + ASSERT(!rom_only); > + cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, func, > PCI_COMMAND); > + cmd &= ~PCI_COMMAND_MEMORY; > + cmd |= map ? 
PCI_COMMAND_MEMORY : 0; > + pci_conf_write16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND, > + cmd); > +} > + > +bool vpci_process_pending(struct vcpu *v) > +{ > + if ( v->vpci.mem ) > + { > + struct map_data data = { > + .d = v->domain, > + .map = v->vpci.map, > + }; > + int rc = rangeset_consume_ranges(v->vpci.mem, map_range, &data); > + > + if ( rc == -ERESTART ) > + return true; > + > + spin_lock(&v->vpci.pdev->vpci->lock); > + /* Disable memory decoding unconditionally on failure. */ > + modify_decoding(v->vpci.pdev, !rc && v->vpci.map, > + !rc && v->vpci.rom_only); > + spin_unlock(&v->vpci.pdev->vpci->lock); > + > + rangeset_destroy(v->vpci.mem); > + v->vpci.mem = NULL; > + if ( rc ) > + /* > + * FIXME: in case of failure remove the device from the domain. > + * Note that there might still be leftover mappings. While this > is > + * safe for Dom0, for DomUs the domain will likely need to be > + * killed in order to avoid leaking stale p2m mappings on > + * failure. > + */ > + vpci_remove_device(v->vpci.pdev); > + } > + > + return false; > +} > + > +static int __init apply_map(struct domain *d, const struct pci_dev *pdev, > + struct rangeset *mem) > +{ > + struct map_data data = { .d = d, .map = true }; > + int rc; > + > + while ( (rc = rangeset_consume_ranges(mem, map_range, &data)) == - > ERESTART ) > + process_pending_softirqs(); > + rangeset_destroy(mem); > + if ( !rc ) > + modify_decoding(pdev, true, false); > + > + return rc; > +} > + > +static void defer_map(struct domain *d, struct pci_dev *pdev, > + struct rangeset *mem, bool map, bool rom_only) > +{ > + struct vcpu *curr = current; > + > + /* > + * FIXME: when deferring the {un}map the state of the device should not > + * be trusted. For example the enable bit is toggled after the device > + * is mapped. This can lead to parallel mapping operations being > + * started for the same device if the domain is not well-behaved. 
> + */ > + curr->vpci.pdev = pdev; > + curr->vpci.mem = mem; > + curr->vpci.map = map; > + curr->vpci.rom_only = rom_only; > +} > + > +static int modify_bars(const struct pci_dev *pdev, bool map, bool > rom_only) > +{ > + struct vpci_header *header = &pdev->vpci->header; > + struct rangeset *mem = rangeset_new(NULL, NULL, 0); > + struct pci_dev *tmp, *dev = NULL; > + unsigned int i; > + int rc; > + > + if ( !mem ) > + return -ENOMEM; > + > + /* > + * Create a rangeset that represents the current device BARs memory > region > + * and compare it against all the currently active BAR memory regions. If > + * an overlap is found, subtract it from the region to be > mapped/unmapped. > + * > + * First fill the rangeset with all the BARs of this device or with the > ROM > + * BAR only, depending on whether the guest is toggling the memory > decode > + * bit of the command register, or the enable bit of the ROM BAR > register. > + */ > + for ( i = 0; i < ARRAY_SIZE(header->bars); i++ ) > + { > + const struct vpci_bar *bar = &header->bars[i]; > + unsigned long start = PFN_DOWN(bar->addr); > + unsigned long end = PFN_DOWN(bar->addr + bar->size - 1); > + > + if ( !MAPPABLE_BAR(bar) || > + (rom_only ? bar->type != VPCI_BAR_ROM > + : (bar->type == VPCI_BAR_ROM && > !header->rom_enabled)) ) > + continue; > + > + rc = rangeset_add_range(mem, start, end); > + if ( rc ) > + { > + printk(XENLOG_G_WARNING "Failed to add [%lx, %lx]: %d\n", > + start, end, rc); > + rangeset_destroy(mem); > + return rc; > + } > + } > + > + /* > + * Check for overlaps with other BARs. Note that only BARs that are > + * currently mapped (enabled) are checked for overlaps. > + */ > + list_for_each_entry(tmp, &pdev->domain->arch.pdev_list, domain_list) > + { > + if ( tmp == pdev ) > + { > + /* > + * Need to store the device so it's not constified and defer_map > + * can modify it in case of error. 
> + */ > + dev = tmp; > + if ( !rom_only ) > + /* > + * If memory decoding is toggled avoid checking against the > + * same device, or else all regions will be removed from the > + * memory map in the unmap case. > + */ > + continue; > + } > + > + for ( i = 0; i < ARRAY_SIZE(tmp->vpci->header.bars); i++ ) > + { > + const struct vpci_bar *bar = &tmp->vpci->header.bars[i]; > + unsigned long start = PFN_DOWN(bar->addr); > + unsigned long end = PFN_DOWN(bar->addr + bar->size - 1); > + > + if ( !bar->enabled || !rangeset_overlaps_range(mem, start, end) > || > + /* > + * If only the ROM enable bit is toggled check against other > + * BARs in the same device for overlaps, but not against the > + * same ROM BAR. > + */ > + (rom_only && tmp == pdev && bar->type == VPCI_BAR_ROM) ) > + continue; > + > + rc = rangeset_remove_range(mem, start, end); > + if ( rc ) > + { > + printk(XENLOG_G_WARNING "Failed to remove [%lx, %lx]: %d\n", > + start, end, rc); > + rangeset_destroy(mem); > + return rc; > + } > + } > + } > + > + ASSERT(dev); > + > + if ( system_state < SYS_STATE_active ) > + { > + /* > + * Mappings might be created when building Dom0 if the memory > decoding > + * bit of PCI devices is enabled. In that case it's not possible to > + * defer the operation, so call apply_map in order to create the > + * mappings right away. Note that at build time this function will > only > + * be called iff the memory decoding bit is enabled, thus the > operation > + * will always be to establish mappings and process all the BARs. 
> + */ > + ASSERT(map && !rom_only); > + return apply_map(pdev->domain, pdev, mem); > + } > + > + defer_map(dev->domain, dev, mem, map, rom_only); > + > + return 0; > +} > + > +static void cmd_write(const struct pci_dev *pdev, unsigned int reg, > + uint32_t cmd, void *data) > +{ > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn); > + uint16_t current_cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, > func, > + reg); > + > + /* > + * Let Dom0 play with all the bits directly except for the memory > + * decoding one. > + */ > + if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY ) > + /* > + * Ignore the error. No memory has been added or removed from the > p2m > + * (because the actual p2m changes are deferred in defer_map) and the > + * memory decoding bit has not been changed, so leave everything as- > is, > + * hoping the guest will realize and try again. > + */ > + modify_bars(pdev, cmd & PCI_COMMAND_MEMORY, false); > + else > + pci_conf_write16(pdev->seg, pdev->bus, slot, func, reg, cmd); > +} > + > +static void bar_write(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data) > +{ > + struct vpci_bar *bar = data; > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn); > + bool hi = false; > + > + if ( pci_conf_read16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND) > & > + PCI_COMMAND_MEMORY ) > + { > + gprintk(XENLOG_WARNING, > + "%04x:%02x:%02x.%u: ignored BAR %lu write with memory > decoding enabled\n", > + pdev->seg, pdev->bus, slot, func, > + bar - pdev->vpci->header.bars); > + return; > + } > + > + if ( bar->type == VPCI_BAR_MEM64_HI ) > + { > + ASSERT(reg > PCI_BASE_ADDRESS_0); > + bar--; > + hi = true; > + } > + else > + val &= PCI_BASE_ADDRESS_MEM_MASK; > + > + /* > + * Update the cached address, so that when memory decoding is enabled > + * Xen can map the BAR into the guest p2m. > + */ > + bar->addr &= ~(0xffffffffull << (hi ? 32 : 0)); > + bar->addr |= (uint64_t)val << (hi ? 
32 : 0); > + > + /* Make sure Xen writes back the same value for the BAR RO bits. */ > + if ( !hi ) > + { > + val |= bar->type == VPCI_BAR_MEM32 ? > PCI_BASE_ADDRESS_MEM_TYPE_32 > + : PCI_BASE_ADDRESS_MEM_TYPE_64; > + val |= bar->prefetchable ? PCI_BASE_ADDRESS_MEM_PREFETCH : 0; > + } > + > + pci_conf_write32(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), reg, val); > +} > + > +static void rom_write(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data) > +{ > + struct vpci_header *header = &pdev->vpci->header; > + struct vpci_bar *rom = data; > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn); > + uint16_t cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, func, > + PCI_COMMAND); > + bool new_enabled = val & PCI_ROM_ADDRESS_ENABLE; > + > + if ( (cmd & PCI_COMMAND_MEMORY) && header->rom_enabled && > new_enabled ) > + { > + gprintk(XENLOG_WARNING, > + "%04x:%02x:%02x.%u: ignored ROM BAR write with memory > decoding enabled\n", > + pdev->seg, pdev->bus, slot, func); > + return; > + } > + > + if ( !header->rom_enabled ) > + /* > + * If the ROM BAR is not enabled update the address field so the > + * correct address is mapped into the p2m. > + */ > + rom->addr = val & PCI_ROM_ADDRESS_MASK; > + > + if ( !(cmd & PCI_COMMAND_MEMORY) || header->rom_enabled == > new_enabled ) > + { > + /* Just update the ROM BAR field. */ > + header->rom_enabled = new_enabled; > + pci_conf_write32(pdev->seg, pdev->bus, slot, func, reg, val); > + } > + else if ( modify_bars(pdev, new_enabled, true) ) > + /* > + * No memory has been added or removed from the p2m (because the > actual > + * p2m changes are deferred in defer_map) and the ROM enable bit has > + * not been changed, so leave everything as-is, hoping the guest will > + * realize and try again. It's important to not update rom->addr in > the > + * unmap case if modify_bars has failed, or future attempts would > + * attempt to unmap the wrong address. 
> + */ > + return; > + > + if ( !new_enabled ) > + rom->addr = val & PCI_ROM_ADDRESS_MASK; > +} > + > +static int init_bars(struct pci_dev *pdev) > +{ > + uint8_t slot = PCI_SLOT(pdev->devfn), func = PCI_FUNC(pdev->devfn); > + uint16_t cmd; > + uint64_t addr, size; > + unsigned int i, num_bars, rom_reg; > + struct vpci_header *header = &pdev->vpci->header; > + struct vpci_bar *bars = header->bars; > + pci_sbdf_t sbdf = { > + .seg = pdev->seg, > + .bus = pdev->bus, > + .dev = slot, > + .func = func, > + }; > + int rc; > + > + switch ( pci_conf_read8(pdev->seg, pdev->bus, slot, func, > PCI_HEADER_TYPE) > + & 0x7f ) > + { > + case PCI_HEADER_TYPE_NORMAL: > + num_bars = PCI_HEADER_NORMAL_NR_BARS; > + rom_reg = PCI_ROM_ADDRESS; > + break; > + > + case PCI_HEADER_TYPE_BRIDGE: > + num_bars = PCI_HEADER_BRIDGE_NR_BARS; > + rom_reg = PCI_ROM_ADDRESS1; > + break; > + > + default: > + return -EOPNOTSUPP; > + } > + > + /* Setup a handler for the command register. */ > + rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, > PCI_COMMAND, > + 2, header); > + if ( rc ) > + return rc; > + > + /* Disable memory decoding before sizing. 
*/ > + cmd = pci_conf_read16(pdev->seg, pdev->bus, slot, func, > PCI_COMMAND); > + if ( cmd & PCI_COMMAND_MEMORY ) > + pci_conf_write16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND, > + cmd & ~PCI_COMMAND_MEMORY); > + > + for ( i = 0; i < num_bars; i++ ) > + { > + uint8_t reg = PCI_BASE_ADDRESS_0 + i * 4; > + uint32_t val; > + > + if ( i && bars[i - 1].type == VPCI_BAR_MEM64_LO ) > + { > + bars[i].type = VPCI_BAR_MEM64_HI; > + rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, > reg, > + 4, &bars[i]); > + if ( rc ) > + { > + pci_conf_write16(pdev->seg, pdev->bus, slot, func, > + PCI_COMMAND, cmd); > + return rc; > + } > + > + continue; > + } > + > + val = pci_conf_read32(pdev->seg, pdev->bus, slot, func, reg); > + if ( (val & PCI_BASE_ADDRESS_SPACE) == > PCI_BASE_ADDRESS_SPACE_IO ) > + { > + bars[i].type = VPCI_BAR_IO; > + continue; > + } > + if ( (val & PCI_BASE_ADDRESS_MEM_TYPE_MASK) == > + PCI_BASE_ADDRESS_MEM_TYPE_64 ) > + bars[i].type = VPCI_BAR_MEM64_LO; > + else > + bars[i].type = VPCI_BAR_MEM32; > + > + rc = pci_size_mem_bar(sbdf, reg, &addr, &size, > + (i == num_bars - 1) ? PCI_BAR_LAST : 0); > + if ( rc < 0 ) > + { > + pci_conf_write16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND, > + cmd); > + return rc; > + } > + > + if ( size == 0 ) > + { > + bars[i].type = VPCI_BAR_EMPTY; > + continue; > + } > + > + bars[i].addr = addr; > + bars[i].size = size; > + bars[i].prefetchable = val & PCI_BASE_ADDRESS_MEM_PREFETCH; > + > + rc = vpci_add_register(pdev->vpci, vpci_hw_read32, bar_write, reg, 4, > + &bars[i]); > + if ( rc ) > + { > + pci_conf_write16(pdev->seg, pdev->bus, slot, func, PCI_COMMAND, > + cmd); > + return rc; > + } > + } > + > + /* Check expansion ROM. 
*/ > + rc = pci_size_mem_bar(sbdf, rom_reg, &addr, &size, PCI_BAR_ROM); > + if ( rc > 0 && size ) > + { > + struct vpci_bar *rom = &header->bars[num_bars]; > + > + rom->type = VPCI_BAR_ROM; > + rom->size = size; > + rom->addr = addr; > + header->rom_enabled = pci_conf_read32(pdev->seg, pdev->bus, slot, > func, > + rom_reg) & > PCI_ROM_ADDRESS_ENABLE; > + > + rc = vpci_add_register(pdev->vpci, vpci_hw_read32, rom_write, > rom_reg, > + 4, rom); > + if ( rc ) > + rom->type = VPCI_BAR_EMPTY; > + } > + > + return (cmd & PCI_COMMAND_MEMORY) ? modify_bars(pdev, true, > false) : 0; > +} > +REGISTER_VPCI_INIT(init_bars); > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c > index 4740d02edf..e5b49b9d82 100644 > --- a/xen/drivers/vpci/vpci.c > +++ b/xen/drivers/vpci/vpci.c > @@ -34,6 +34,23 @@ struct vpci_register { > struct list_head node; > }; > > +void vpci_remove_device(struct pci_dev *pdev) > +{ > + spin_lock(&pdev->vpci->lock); > + while ( !list_empty(&pdev->vpci->handlers) ) > + { > + struct vpci_register *r = list_first_entry(&pdev->vpci->handlers, > + struct vpci_register, > + node); > + > + list_del(&r->node); > + xfree(r); > + } > + spin_unlock(&pdev->vpci->lock); > + xfree(pdev->vpci); > + pdev->vpci = NULL; > +} > + > int __hwdom_init vpci_add_handlers(struct pci_dev *pdev) > { > unsigned int i; > @@ -57,19 +74,7 @@ int __hwdom_init vpci_add_handlers(struct pci_dev > *pdev) > } > > if ( rc ) > - { > - while ( !list_empty(&pdev->vpci->handlers) ) > - { > - struct vpci_register *r = list_first_entry(&pdev->vpci->handlers, > - struct vpci_register, > - node); > - > - list_del(&r->node); > - xfree(r); > - } > - xfree(pdev->vpci); > - pdev->vpci = NULL; > - } > + vpci_remove_device(pdev); > > return rc; > } > @@ -102,6 +107,20 @@ static void vpci_ignored_write(const struct pci_dev > *pdev, 
unsigned int reg, > { > } > > +uint32_t vpci_hw_read16(const struct pci_dev *pdev, unsigned int reg, > + void *data) > +{ > + return pci_conf_read16(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), reg); > +} > + > +uint32_t vpci_hw_read32(const struct pci_dev *pdev, unsigned int reg, > + void *data) > +{ > + return pci_conf_read32(pdev->seg, pdev->bus, PCI_SLOT(pdev->devfn), > + PCI_FUNC(pdev->devfn), reg); > +} > + > int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > vpci_write_t *write_handler, unsigned int offset, > unsigned int size, void *data) > diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h > index f89896e59b..57bb142c02 100644 > --- a/xen/include/xen/sched.h > +++ b/xen/include/xen/sched.h > @@ -20,6 +20,7 @@ > #include <xen/smp.h> > #include <xen/perfc.h> > #include <asm/atomic.h> > +#include <xen/vpci.h> > #include <xen/wait.h> > #include <public/xen.h> > #include <public/domctl.h> > @@ -264,6 +265,9 @@ struct vcpu > > struct evtchn_fifo_vcpu *evtchn_fifo; > > + /* vPCI per-vCPU area, used to store data for long running operations. */ > + struct vpci_vcpu vpci; > + > struct arch_vcpu arch; > }; > > diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h > index 9f2864fb0c..6bf8b22b4f 100644 > --- a/xen/include/xen/vpci.h > +++ b/xen/include/xen/vpci.h > @@ -1,6 +1,8 @@ > #ifndef _XEN_VPCI_H_ > #define _XEN_VPCI_H_ > > +#ifdef CONFIG_HAS_VPCI > + > #include <xen/pci.h> > #include <xen/types.h> > #include <xen/list.h> > @@ -20,6 +22,9 @@ typedef int vpci_register_init_t(struct pci_dev *dev); > /* Add vPCI handlers to device. */ > int __must_check vpci_add_handlers(struct pci_dev *dev); > > +/* Remove all handlers and free vpci related structures. */ > +void vpci_remove_device(struct pci_dev *pdev); > + > /* Add/remove a register handler. 
*/ > int __must_check vpci_add_register(struct vpci *vpci, > vpci_read_t *read_handler, > @@ -34,12 +39,68 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, > unsigned int size); > void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size, > uint32_t data); > > +/* Passthrough handlers. */ > +uint32_t vpci_hw_read16(const struct pci_dev *pdev, unsigned int reg, > + void *data); > +uint32_t vpci_hw_read32(const struct pci_dev *pdev, unsigned int reg, > + void *data); > + > +/* > + * Check for pending vPCI operations on this vcpu. Returns true if the vcpu > + * should not run. > + */ > +bool __must_check vpci_process_pending(struct vcpu *v); > + > struct vpci { > /* List of vPCI handlers for a device. */ > struct list_head handlers; > spinlock_t lock; > + > +#ifdef __XEN__ > + /* Hide the rest of the vpci struct from the user-space test harness. */ > + struct vpci_header { > + /* Information about the PCI BARs of this device. */ > + struct vpci_bar { > + uint64_t addr; > + uint64_t size; > + enum { > + VPCI_BAR_EMPTY, > + VPCI_BAR_IO, > + VPCI_BAR_MEM32, > + VPCI_BAR_MEM64_LO, > + VPCI_BAR_MEM64_HI, > + VPCI_BAR_ROM, > + } type; > + bool prefetchable : 1; > + /* Store whether the BAR is mapped into guest p2m. */ > + bool enabled : 1; > +#define PCI_HEADER_NORMAL_NR_BARS 6 > +#define PCI_HEADER_BRIDGE_NR_BARS 2 > + } bars[PCI_HEADER_NORMAL_NR_BARS + 1]; > + /* At most 6 BARS + 1 expansion ROM BAR. */ > + > + /* > + * Store whether the ROM enable bit is set (doesn't imply ROM BAR > + * is mapped into guest p2m) if there's a ROM BAR on the device. > + */ > + bool rom_enabled : 1; > + /* FIXME: currently there's no support for SR-IOV. */ > + } header; > +#endif > +}; > + > +struct vpci_vcpu { > + /* Per-vcpu structure to store state while {un}mapping of PCI BARs. 
*/ > + struct rangeset *mem; > + struct pci_dev *pdev; > + bool map : 1; > + bool rom_only : 1; > }; > > +#else /* !CONFIG_HAS_VPCI */ > +struct vpci_vcpu {}; > +#endif > + > #endif > > /* > -- > 2.16.2