Re: [Xen-devel] [PATCH v7 for-next 03/12] vpci: introduce basic handlers to trap accesses to the PCI config space
> -----Original Message-----
> From: Roger Pau Monne [mailto:roger.pau@xxxxxxxxxx]
> Sent: 18 October 2017 12:40
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: konrad.wilk@xxxxxxxxxx; boris.ostrovsky@xxxxxxxxxx; Roger Pau Monne
> <roger.pau@xxxxxxxxxx>; Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Wei Liu
> <wei.liu2@xxxxxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Andrew Cooper
> <Andrew.Cooper3@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Subject: [PATCH v7 for-next 03/12] vpci: introduce basic handlers to trap
> accesses to the PCI config space
>
> This functionality is going to reside in vpci.c (and the corresponding
> vpci.h header), and should be arch-agnostic. The handlers introduced
> in this patch set up the basic functionality required in order to trap
> accesses to the PCI config space, and allow decoding the address and
> finding the corresponding handler that should handle the access
> (although no handlers are implemented).
>
> Note that the traps to the PCI IO ports registers (0xcf8/0xcfc) are
> set up inside an x86 HVM file, since that's not shared with other
> arches.
>
> A new XEN_X86_EMU_VPCI x86 domain flag is added in order to signal Xen
> whether a domain should use the newly introduced vPCI handlers; this
> is only enabled for PVH Dom0 at the moment.
>
> A very simple user-space test is also provided, so that the basic
> functionality of the vPCI traps can be asserted. This has proven
> quite helpful during development, since the logic to handle partial
> accesses or accesses that span multiple registers is not
> trivial.
>
> The handlers for the registers are added to a linked list that's kept
> sorted at all times. Both the read and write handlers support accesses
> that span multiple emulated registers and contain gaps not
> emulated.
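To make the "accesses that span multiple registers and contain gaps" part concrete, here is a minimal user-space sketch of how such a read could be assembled. It is not the code from vpci.c: merge_result only borrows the name mentioned in the changelog, while struct reg, read_merged and example_regs are hypothetical names made up for illustration. As in the test harness's dummy pci_conf_read helpers, bytes with no handler are assumed to read as all 1's rather than being fetched from hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical emulated register: byte offset, size in bytes, stored value. */
struct reg {
    unsigned int offset;
    unsigned int size;
    uint32_t value;
};

/* Illustrative layout, sorted by offset: r0 (4B), then r5, r6, r7 (1B each). */
const struct reg example_regs[] = {
    { 0, 4, 0x1a1b1c1d },
    { 5, 1, 0xba },
    { 6, 1, 0xba },
    { 7, 1, 0xbd },
};

/* Merge the low `size` bytes of `new` into `data` at byte position `offset`. */
uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size,
                      unsigned int offset)
{
    uint32_t mask = 0xffffffffu >> (32 - 8 * size);

    return (data & ~(mask << (offset * 8))) | ((new & mask) << (offset * 8));
}

/*
 * Read `size` bytes (1, 2 or 4) starting at byte `start`, combining every
 * register that overlaps the access. Assumption: gaps read as all 1's.
 */
uint32_t read_merged(const struct reg *regs, unsigned int nregs,
                     unsigned int start, unsigned int size)
{
    uint32_t data = ~(uint32_t)0;
    unsigned int end = start + size;
    unsigned int i;

    for ( i = 0; i < nregs; i++ )
    {
        const struct reg *r = &regs[i];
        unsigned int lo, hi;

        /* Skip registers that do not overlap [start, end). */
        if ( r->offset + r->size <= start || r->offset >= end )
            continue;

        lo = r->offset > start ? r->offset : start;
        hi = r->offset + r->size < end ? r->offset + r->size : end;

        /* Drop the register bytes below the access, then merge the rest. */
        data = merge_result(data, r->value >> ((lo - r->offset) * 8),
                            hi - lo, lo - start);
    }

    /* Truncate the result to the access size. */
    return data & (0xffffffffu >> (32 - 8 * size));
}
```

With this layout, a 4-byte read at offset 4 covers an unhandled byte plus r5, r6 and r7, and a 1-byte read at offset 2 is a partial read of r0; both cases mirror checks performed by the test in tools/tests/vpci/main.c.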
>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx>

io parts:

Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>

> ---
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
> ---
> Changes since v6:
> - Align the vpci handlers in the linker script.
> - Switch add/remove register functions to take a vpci parameter
>   instead of a pci_dev.
> - Expand comment of merge_result.
> - Return X86EMUL_UNHANDLEABLE if accessing cfc and cf8 is disabled.
>
> Changes since v5:
> - Use a spinlock per pci device.
> - Use the recently introduced pci_sbdf_t type.
> - Fix test harness to use the right handler type and the newly
>   introduced lock.
> - Move the position of the vpci sections in the linker scripts.
> - Constify domain and pci_dev in vpci_{read/write}.
> - Fix typos in comments.
> - Use _XEN_VPCI_H_ as header guard.
>
> Changes since v4:
> * User-space test harness:
>  - Do not redirect the output of the test.
>  - Add main.c and emul.h as dependencies of the Makefile target.
>  - Use the same rule to modify the vpci and list headers.
>  - Remove underscores from local macro variables.
>  - Add _check suffix to the test harness multiread function.
>  - Change the value written by every different size in the multiwrite
>    test.
>  - Use { } to initialize the r16 and r20 arrays (instead of { 0 }).
>  - Perform some of the read checks with the local variable directly.
>  - Expand some comments.
>  - Implement a dummy rwlock.
> * Hypervisor code:
>  - Guard the linker script changes with CONFIG_HAS_PCI.
>  - Rename vpci_access_check to vpci_access_allowed and make it return
>    bool.
>  - Make hvm_pci_decode_addr return the register as return value.
>  - Use ~3 instead of 0xfffc to remove the register offset when
>    checking accesses to IO ports.
>  - s/head/prev in vpci_add_register.
>  - Add parentheses around & in vpci_add_register.
>  - Fix register removal.
>  - Change the BUGs in vpci_{read/write}_hw helpers to
>    ASSERT_UNREACHABLE.
>  - Make merge_result static and change the computation of the mask to
>    avoid using a uint64_t.
>  - Modify vpci_read to only read from hardware the not-emulated gaps.
>  - Remove the vpci_val union and use a uint32_t instead.
>  - Change handler read type to return a uint32_t instead of modifying
>    a variable passed by reference.
>  - Constify the data opaque parameter of read handlers.
>  - Change the size parameter of the vpci_{read/write} functions to
>    unsigned int.
>  - Place the array of initialization handlers in init.rodata or
>    .rodata depending on whether late-hwdom is enabled.
>  - Remove the pci_devs lock, assume the Dom0 is well behaved and won't
>    remove the device while trying to access it.
>  - Change the recursive spinlock into a rw lock for performance
>    reasons.
>
> Changes since v3:
> * User-space test harness:
>  - Fix spaces in container_of macro.
>  - Implement dummy locking functions.
>  - Remove the 'current' macro; make current a pointer to the statically
>    allocated vcpu.
>  - Remove unneeded parentheses in the pci_conf_readX macros.
>  - Fix the name of the write test macro.
>  - Remove the dummy EXPORT_SYMBOL macro (this was needed by the RB
>    code only).
>  - Import the max macro.
>  - Test all possible read/write size combinations with all possible
>    emulated register sizes.
>  - Introduce a test for register removal.
> * Hypervisor code:
>  - Use a sorted list in order to store the config space handlers.
>  - Remove some unneeded 'else' branches.
>  - Make the IO port handlers always return X86EMUL_OKAY, and set the
>    data to all 1's in case of read failure (writes are simply ignored).
>  - In hvm_select_ioreq_server reuse local variables when calling
>    XEN_DMOP_PCI_SBDF.
>  - Store the pointers to the initialization functions in the .rodata
>    section.
>  - Do not ignore the return value of xen_vpci_add_handlers in
>    setup_one_hwdom_device.
>  - Remove the vpci_init macro.
>  - Do not hide the pointers inside of the vpci_{read/write}_t
>    typedefs.
>  - Rename priv_data to private in vpci_register.
>  - Simplify checking for register overlap in vpci_register_cmp.
>  - Check that the offset and the length match before removing a
>    register in xen_vpci_remove_register.
>  - Make vpci_read_hw return a value rather than storing it in a
>    pointer passed by parameter.
>  - Handler dispatcher functions vpci_{read/write} no longer return an
>    error code, errors on reads/writes should be treated like hardware
>    (writes ignored, reads return all 1's or garbage).
>  - Make sure pcidevs is locked before calling pci_get_pdev_by_domain.
>  - Use a recursive spinlock for the vpci lock, so that spin_is_locked
>    checks that the current CPU is holding the lock.
>  - Make the code less error-chatty by removing some of the printk's.
>  - Pass the slot and the function as separate parameters to the
>    handler dispatchers (instead of passing devfn).
>  - Allow handlers to be registered with either a read or write
>    function only, the missing handler will be replaced by a dummy
>    handler (writes ignored, reads return 1's).
>  - Introduce PCI_CFG_SPACE_* defines from Linux.
>  - Simplify the handler dispatchers by removing the recursion, now the
>    dispatchers iterate over the list of sorted handlers and call them
>    in order.
>  - Remove the GENMASK_BYTES, SHIFT_RIGHT_BYTES and ADD_RESULT macros,
>    and instead provide a merge_result function in order to merge a
>    register output into a partial result.
>  - Rename the fields of the vpci_val union to u8/u16/u32.
>  - Remove the return values from the read/write handlers, errors
>    should be handled internally and signaled as would be done on
>    native hardware.
>  - Remove the usage of the GENMASK macro.
>
> Changes since v2:
> - Generalize the PCI address decoding and use it for IOREQ code also.
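For readers less familiar with the 0xcf8/0xcfc mechanism, the address decoding referred to above can be sketched roughly as follows. This is a simplified illustration of the standard PCI configuration mechanism #1 layout, not Xen's hvm_pci_decode_addr: struct cfg_addr, cf8_enabled, cf8_decode and access_allowed are hypothetical names, and the sketch ignores any extended register bits that some platforms encode in the upper part of the address. The size and alignment check follows the same rule as vpci_access_allowed in the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded config space address. */
struct cfg_addr {
    uint8_t bus;
    uint8_t dev;
    uint8_t func;
    unsigned int reg;
};

/* Bit 31 of the value latched at 0xcf8 enables config space accesses. */
bool cf8_enabled(uint32_t cf8)
{
    return (cf8 & 0x80000000u) != 0;
}

/*
 * Decode the value latched at 0xcf8 together with the data port that was
 * accessed (0xcfc..0xcff). The low two bits of the port select the byte
 * inside the dword-aligned register encoded in cf8.
 */
struct cfg_addr cf8_decode(uint32_t cf8, unsigned int port)
{
    struct cfg_addr a = {
        .bus  = (cf8 >> 16) & 0xff,
        .dev  = (cf8 >> 11) & 0x1f,
        .func = (cf8 >> 8) & 0x7,
        .reg  = (cf8 & 0xfc) | (port & 3),
    };

    return a;
}

/* Only 1, 2 and 4 byte accesses are accepted, and they must be size aligned. */
bool access_allowed(unsigned int reg, unsigned int len)
{
    if ( len != 1 && len != 2 && len != 4 )
        return false;

    return !(reg & (len - 1));
}
```

For example, with 0x80031110 latched at 0xcf8, an access to port 0xcfe decodes to bus 3, device 2, function 1, register 0x12; a 2-byte access there is aligned and allowed, while a 4-byte access is not.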
> > Changes since v1: > - Allow access to cross a word-boundary. > - Add locking. > - Add cleanup to xen_vpci_add_handlers in case of failure. > --- > .gitignore | 3 + > tools/libxl/libxl_x86.c | 2 +- > tools/tests/Makefile | 1 + > tools/tests/vpci/Makefile | 37 ++++ > tools/tests/vpci/emul.h | 133 +++++++++++ > tools/tests/vpci/main.c | 309 ++++++++++++++++++++++++++ > xen/arch/arm/xen.lds.S | 14 ++ > xen/arch/x86/domain.c | 18 +- > xen/arch/x86/hvm/hvm.c | 2 + > xen/arch/x86/hvm/io.c | 103 +++++++++ > xen/arch/x86/setup.c | 3 +- > xen/arch/x86/xen.lds.S | 14 ++ > xen/drivers/Makefile | 2 +- > xen/drivers/passthrough/pci.c | 10 +- > xen/drivers/vpci/Makefile | 1 + > xen/drivers/vpci/vpci.c | 451 > ++++++++++++++++++++++++++++++++++++++ > xen/include/asm-x86/domain.h | 1 + > xen/include/asm-x86/hvm/io.h | 3 + > xen/include/public/arch-x86/xen.h | 5 +- > xen/include/xen/pci.h | 3 + > xen/include/xen/pci_regs.h | 8 + > xen/include/xen/vpci.h | 53 +++++ > 22 files changed, 1166 insertions(+), 10 deletions(-) > create mode 100644 tools/tests/vpci/Makefile > create mode 100644 tools/tests/vpci/emul.h > create mode 100644 tools/tests/vpci/main.c > create mode 100644 xen/drivers/vpci/Makefile > create mode 100644 xen/drivers/vpci/vpci.c > create mode 100644 xen/include/xen/vpci.h > > diff --git a/.gitignore b/.gitignore > index d64b03d06c..cfe54c6e8f 100644 > --- a/.gitignore > +++ b/.gitignore > @@ -245,6 +245,9 @@ tools/tests/regression/build/* > tools/tests/regression/downloads/* > tools/tests/mem-sharing/memshrtool > tools/tests/mce-test/tools/xen-mceinj > +tools/tests/vpci/list.h > +tools/tests/vpci/vpci.[hc] > +tools/tests/vpci/test_vpci > tools/xcutils/lsevtchn > tools/xcutils/readnotes > tools/xenbackendd/_paths.h > diff --git a/tools/libxl/libxl_x86.c b/tools/libxl/libxl_x86.c > index 5f91fe4f92..8f6a5bc6f2 100644 > --- a/tools/libxl/libxl_x86.c > +++ b/tools/libxl/libxl_x86.c > @@ -9,7 +9,7 @@ int libxl__arch_domain_prepare_config(libxl__gc *gc, > { > 
switch(d_config->c_info.type) { > case LIBXL_DOMAIN_TYPE_HVM: > - xc_config->emulation_flags = XEN_X86_EMU_ALL; > + xc_config->emulation_flags = (XEN_X86_EMU_ALL & > ~XEN_X86_EMU_VPCI); > break; > case LIBXL_DOMAIN_TYPE_PVH: > if (libxl_defbool_val(d_config->b_info.apic)) > diff --git a/tools/tests/Makefile b/tools/tests/Makefile > index 7162945121..f6942a93fb 100644 > --- a/tools/tests/Makefile > +++ b/tools/tests/Makefile > @@ -13,6 +13,7 @@ endif > SUBDIRS-$(CONFIG_X86) += x86_emulator > SUBDIRS-y += xen-access > SUBDIRS-y += xenstore > +SUBDIRS-$(CONFIG_HAS_PCI) += vpci > > .PHONY: all clean install distclean uninstall > all clean distclean: %: subdirs-% > diff --git a/tools/tests/vpci/Makefile b/tools/tests/vpci/Makefile > new file mode 100644 > index 0000000000..e45fcb5cd9 > --- /dev/null > +++ b/tools/tests/vpci/Makefile > @@ -0,0 +1,37 @@ > +XEN_ROOT=$(CURDIR)/../../.. > +include $(XEN_ROOT)/tools/Rules.mk > + > +TARGET := test_vpci > + > +.PHONY: all > +all: $(TARGET) > + > +.PHONY: run > +run: $(TARGET) > + ./$(TARGET) > + > +$(TARGET): vpci.c vpci.h list.h main.c emul.h > + $(HOSTCC) -g -o $@ vpci.c main.c > + > +.PHONY: clean > +clean: > + rm -rf $(TARGET) *.o *~ vpci.h vpci.c list.h > + > +.PHONY: distclean > +distclean: clean > + > +.PHONY: install > +install: > + > +vpci.c: $(XEN_ROOT)/xen/drivers/vpci/vpci.c > + # Trick the compiler so it doesn't complain about missing symbols > + sed -e '/#include/d' \ > + -e '1s;^;#include "emul.h"\ > + vpci_register_init_t *const __start_vpci_array[1]\;\ > + vpci_register_init_t *const __end_vpci_array[1]\;\ > + ;' <$< >$@ > + > +list.h: $(XEN_ROOT)/xen/include/xen/list.h > +vpci.h: $(XEN_ROOT)/xen/include/xen/vpci.h > +list.h vpci.h: > + sed -e '/#include/d' <$< >$@ > diff --git a/tools/tests/vpci/emul.h b/tools/tests/vpci/emul.h > new file mode 100644 > index 0000000000..ebd676723d > --- /dev/null > +++ b/tools/tests/vpci/emul.h > @@ -0,0 +1,133 @@ > +/* > + * Unit tests for the generic vPCI handler code. 
> + * > + * Copyright (C) 2017 Citrix Systems R&D > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms and conditions of the GNU General Public > + * License, version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public > + * License along with this program; If not, see > <http://www.gnu.org/licenses/>. > + */ > + > +#ifndef _TEST_VPCI_ > +#define _TEST_VPCI_ > + > +#include <stdlib.h> > +#include <stdio.h> > +#include <stddef.h> > +#include <stdint.h> > +#include <stdbool.h> > +#include <errno.h> > +#include <assert.h> > + > +#define container_of(ptr, type, member) ({ \ > + typeof(((type *)0)->member) *mptr = (ptr); \ > + \ > + (type *)((char *)mptr - offsetof(type, member)); \ > +}) > + > +#define smp_wmb() > +#define prefetch(x) __builtin_prefetch(x) > +#define ASSERT(x) assert(x) > +#define __must_check __attribute__((__warn_unused_result__)) > + > +#include "list.h" > + > +struct domain { > +}; > + > +struct pci_dev { > + struct vpci *vpci; > +}; > + > +struct vcpu > +{ > + const struct domain *domain; > +}; > + > +extern const struct vcpu *current; > +extern const struct pci_dev test_pdev; > + > +typedef bool spinlock_t; > +#define spin_lock_init(l) (*(l) = false) > +#define spin_lock(l) (*(l) = true) > +#define spin_unlock(l) (*(l) = false) > + > +typedef union { > + uint32_t sbdf; > + struct { > + union { > + uint16_t bdf; > + struct { > + union { > + struct { > + uint8_t func : 3, > + dev : 5; > + }; > + uint8_t extfunc; > + }; > + uint8_t bus; > + }; > + }; > + uint16_t seg; > + }; > +} pci_sbdf_t; > + > +#include "vpci.h" > + > +#define __hwdom_init > + > +#define has_vpci(d) 
true > + > +#define xzalloc(type) ((type *)calloc(1, sizeof(type))) > +#define xmalloc(type) ((type *)malloc(sizeof(type))) > +#define xfree(p) free(p) > + > +#define pci_get_pdev_by_domain(...) &test_pdev > + > +/* Dummy native helpers. Writes are ignored, reads return 1's. */ > +#define pci_conf_read8(...) 0xff > +#define pci_conf_read16(...) 0xffff > +#define pci_conf_read32(...) 0xffffffff > +#define pci_conf_write8(...) > +#define pci_conf_write16(...) > +#define pci_conf_write32(...) > + > +#define PCI_CFG_SPACE_EXP_SIZE 4096 > + > +#define BUG() assert(0) > +#define ASSERT_UNREACHABLE() assert(0) > + > +#define min(x, y) ({ \ > + const typeof(x) tx = (x); \ > + const typeof(y) ty = (y); \ > + \ > + (void) (&tx == &ty); \ > + tx < ty ? tx : ty; \ > +}) > + > +#define max(x, y) ({ \ > + const typeof(x) tx = (x); \ > + const typeof(y) ty = (y); \ > + \ > + (void) (&tx == &ty); \ > + tx > ty ? tx : ty; \ > +}) > + > +#endif > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/tools/tests/vpci/main.c b/tools/tests/vpci/main.c > new file mode 100644 > index 0000000000..b9a0a6006b > --- /dev/null > +++ b/tools/tests/vpci/main.c > @@ -0,0 +1,309 @@ > +/* > + * Unit tests for the generic vPCI handler code. > + * > + * Copyright (C) 2017 Citrix Systems R&D > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms and conditions of the GNU General Public > + * License, version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > GNU > + * General Public License for more details. 
> + * > + * You should have received a copy of the GNU General Public > + * License along with this program; If not, see > <http://www.gnu.org/licenses/>. > + */ > + > +#include "emul.h" > + > +/* Single vcpu (current), and single domain with a single PCI device. */ > +static struct vpci vpci; > + > +const static struct domain d; > + > +const struct pci_dev test_pdev = { > + .vpci = &vpci, > +}; > + > +const static struct vcpu v = { > + .domain = &d > +}; > + > +const struct vcpu *current = &v; > + > +/* Dummy hooks, write stores data, read fetches it. */ > +static uint32_t vpci_read8(const struct pci_dev *pdev, unsigned int reg, > + void *data) > +{ > + return *(uint8_t *)data; > +} > + > +static void vpci_write8(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data) > +{ > + *(uint8_t *)data = val; > +} > + > +static uint32_t vpci_read16(const struct pci_dev *pdev, unsigned int reg, > + void *data) > +{ > + return *(uint16_t *)data; > +} > + > +static void vpci_write16(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data) > +{ > + *(uint16_t *)data = val; > +} > + > +static uint32_t vpci_read32(const struct pci_dev *pdev, unsigned int reg, > + void *data) > +{ > + return *(uint32_t *)data; > +} > + > +static void vpci_write32(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data) > +{ > + *(uint32_t *)data = val; > +} > + > +#define VPCI_READ(reg, size, data) ({ \ > + data = vpci_read((pci_sbdf_t){ .sbdf = 0 }, reg, size); \ > +}) > + > +#define VPCI_READ_CHECK(reg, size, expected) ({ \ > + uint32_t rd; \ > + \ > + VPCI_READ(reg, size, rd); \ > + assert(rd == (expected)); \ > +}) > + > +#define VPCI_WRITE(reg, size, data) ({ \ > + vpci_write((pci_sbdf_t){ .sbdf = 0 }, reg, size, data); \ > +}) > + > +#define VPCI_WRITE_CHECK(reg, size, data) ({ \ > + VPCI_WRITE(reg, size, data); \ > + VPCI_READ_CHECK(reg, size, data); \ > +}) > + > +#define VPCI_ADD_REG(fread, fwrite, off, size, store) \ > + 
assert(!vpci_add_register(test_pdev.vpci, fread, fwrite, off, size, \ > + &store)) > + > +#define VPCI_ADD_INVALID_REG(fread, fwrite, off, size) \ > + assert(vpci_add_register(test_pdev.vpci, fread, fwrite, off, size, NULL)) > + > +#define VPCI_REMOVE_REG(off, size) \ > + assert(!vpci_remove_register(test_pdev.vpci, off, size)) > + > +#define VPCI_REMOVE_INVALID_REG(off, size) \ > + assert(vpci_remove_register(test_pdev.vpci, off, size)) > + > +/* Read a 32b register using all possible sizes. */ > +void multiread4_check(unsigned int reg, uint32_t val) > +{ > + unsigned int i; > + > + /* Read using bytes. */ > + for ( i = 0; i < 4; i++ ) > + VPCI_READ_CHECK(reg + i, 1, (val >> (i * 8)) & UINT8_MAX); > + > + /* Read using 2bytes. */ > + for ( i = 0; i < 2; i++ ) > + VPCI_READ_CHECK(reg + i * 2, 2, (val >> (i * 2 * 8)) & UINT16_MAX); > + > + VPCI_READ_CHECK(reg, 4, val); > +} > + > +void multiwrite4_check(unsigned int reg) > +{ > + unsigned int i; > + uint32_t val = 0xa2f51732; > + > + /* Write using bytes. */ > + for ( i = 0; i < 4; i++ ) > + VPCI_WRITE_CHECK(reg + i, 1, (val >> (i * 8)) & UINT8_MAX); > + multiread4_check(reg, val); > + > + /* Change the value each time to be sure writes work fine. */ > + val = 0x2b836fda; > + /* Write using 2bytes. */ > + for ( i = 0; i < 2; i++ ) > + VPCI_WRITE_CHECK(reg + i * 2, 2, (val >> (i * 2 * 8)) & UINT16_MAX); > + multiread4_check(reg, val); > + > + val = 0xc4693beb; > + VPCI_WRITE_CHECK(reg, 4, val); > + multiread4_check(reg, val); > +} > + > +int > +main(int argc, char **argv) > +{ > + /* Index storage by offset. 
*/ > + uint32_t r0 = 0xdeadbeef; > + uint8_t r5 = 0xef; > + uint8_t r6 = 0xbe; > + uint8_t r7 = 0xef; > + uint16_t r12 = 0x8696; > + uint8_t r16[4] = { }; > + uint16_t r20[2] = { }; > + uint32_t r24 = 0; > + uint8_t r28, r30; > + unsigned int i; > + int rc; > + > + INIT_LIST_HEAD(&vpci.handlers); > + spin_lock_init(&vpci.lock); > + > + VPCI_ADD_REG(vpci_read32, vpci_write32, 0, 4, r0); > + VPCI_READ_CHECK(0, 4, r0); > + VPCI_WRITE_CHECK(0, 4, 0xbcbcbcbc); > + > + VPCI_ADD_REG(vpci_read8, vpci_write8, 5, 1, r5); > + VPCI_READ_CHECK(5, 1, r5); > + VPCI_WRITE_CHECK(5, 1, 0xba); > + > + VPCI_ADD_REG(vpci_read8, vpci_write8, 6, 1, r6); > + VPCI_READ_CHECK(6, 1, r6); > + VPCI_WRITE_CHECK(6, 1, 0xba); > + > + VPCI_ADD_REG(vpci_read8, vpci_write8, 7, 1, r7); > + VPCI_READ_CHECK(7, 1, r7); > + VPCI_WRITE_CHECK(7, 1, 0xbd); > + > + VPCI_ADD_REG(vpci_read16, vpci_write16, 12, 2, r12); > + VPCI_READ_CHECK(12, 2, r12); > + VPCI_READ_CHECK(12, 4, 0xffff8696); > + > + /* > + * At this point we have the following layout: > + * > + * Note that this refers to the position of the variables, > + * but the value has already changed from the one given at > + * initialization time because write tests have been performed. > + * > + * 32 24 16 8 0 > + * +-----+-----+-----+-----+ > + * | r0 | 0 > + * +-----+-----+-----+-----+ > + * | r7 | r6 | r5 |/////| 32 > + * +-----+-----+-----+-----| > + * |///////////////////////| 64 > + * +-----------+-----------+ > + * |///////////| r12 | 96 > + * +-----------+-----------+ > + * ... > + * / = unhandled. > + */ > + > + /* Try to add an overlapping register handler. */ > + VPCI_ADD_INVALID_REG(vpci_read32, vpci_write32, 4, 4); > + > + /* Try to add a non-aligned register. */ > + VPCI_ADD_INVALID_REG(vpci_read16, vpci_write16, 15, 2); > + > + /* Try to add a register with wrong size. */ > + VPCI_ADD_INVALID_REG(vpci_read16, vpci_write16, 8, 3); > + > + /* Try to add a register with missing handlers. 
*/ > + VPCI_ADD_INVALID_REG(NULL, NULL, 8, 2); > + > + /* Read/write of unset register. */ > + VPCI_READ_CHECK(8, 4, 0xffffffff); > + VPCI_READ_CHECK(8, 2, 0xffff); > + VPCI_READ_CHECK(8, 1, 0xff); > + VPCI_WRITE(10, 2, 0xbeef); > + VPCI_READ_CHECK(10, 2, 0xffff); > + > + /* Read of multiple registers */ > + VPCI_WRITE_CHECK(7, 1, 0xbd); > + VPCI_READ_CHECK(4, 4, 0xbdbabaff); > + > + /* Partial read of a register. */ > + VPCI_WRITE_CHECK(0, 4, 0x1a1b1c1d); > + VPCI_READ_CHECK(2, 1, 0x1b); > + VPCI_READ_CHECK(6, 2, 0xbdba); > + > + /* Write of multiple registers. */ > + VPCI_WRITE_CHECK(4, 4, 0xaabbccff); > + > + /* Partial write of a register. */ > + VPCI_WRITE_CHECK(2, 1, 0xfe); > + VPCI_WRITE_CHECK(6, 2, 0xfebc); > + > + /* > + * Test all possible read/write size combinations. > + * > + * Place 4 1B registers at 128bits (16B), 2 2B registers at 160bits > + * (20B) and finally 1 4B register at 192bits (24B). > + * > + * Then perform all possible write and read sizes on each of them. > + * > + * ... > + * 32 24 16 8 0 > + * +------+------+------+------+ > + * |r16[3]|r16[2]|r16[1]|r16[0]| 16 > + * +------+------+------+------+ > + * | r20[1] | r20[0] | 20 > + * +-------------+-------------| > + * | r24 | 24 > + * +-------------+-------------+ > + * > + */ > + VPCI_ADD_REG(vpci_read8, vpci_write8, 16, 1, r16[0]); > + VPCI_ADD_REG(vpci_read8, vpci_write8, 17, 1, r16[1]); > + VPCI_ADD_REG(vpci_read8, vpci_write8, 18, 1, r16[2]); > + VPCI_ADD_REG(vpci_read8, vpci_write8, 19, 1, r16[3]); > + > + VPCI_ADD_REG(vpci_read16, vpci_write16, 20, 2, r20[0]); > + VPCI_ADD_REG(vpci_read16, vpci_write16, 22, 2, r20[1]); > + > + VPCI_ADD_REG(vpci_read32, vpci_write32, 24, 4, r24); > + > + /* Check the initial value is 0. 
*/ > + multiread4_check(16, 0); > + multiread4_check(20, 0); > + multiread4_check(24, 0); > + > + multiwrite4_check(16); > + multiwrite4_check(20); > + multiwrite4_check(24); > + > + /* > + * Check multiple non-consecutive gaps on the same read/write: > + * > + * 32 24 16 8 0 > + * +------+------+------+------+ > + * |//////| r30 |//////| r28 | 28 > + * +------+------+------+------+ > + * > + */ > + VPCI_ADD_REG(vpci_read8, vpci_write8, 28, 1, r28); > + VPCI_ADD_REG(vpci_read8, vpci_write8, 30, 1, r30); > + VPCI_WRITE_CHECK(28, 4, 0xffacffdc); > + > + /* Finally try to remove a couple of registers. */ > + VPCI_REMOVE_REG(28, 1); > + VPCI_REMOVE_REG(24, 4); > + VPCI_REMOVE_REG(12, 2); > + > + VPCI_REMOVE_INVALID_REG(20, 1); > + VPCI_REMOVE_INVALID_REG(16, 2); > + VPCI_REMOVE_INVALID_REG(30, 2); > + > + return 0; > +} > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/arch/arm/xen.lds.S b/xen/arch/arm/xen.lds.S > index c9b9546435..98b82680c6 100644 > --- a/xen/arch/arm/xen.lds.S > +++ b/xen/arch/arm/xen.lds.S > @@ -65,6 +65,13 @@ SECTIONS > __param_start = .; > *(.data.param) > __param_end = .; > + > +#if defined(CONFIG_HAS_PCI) && defined(CONFIG_LATE_HWDOM) > + . = ALIGN(POINTER_ALIGN); > + __start_vpci_array = .; > + *(.data.vpci) > + __end_vpci_array = .; > +#endif > } :text > > #if defined(BUILD_ID) > @@ -173,6 +180,13 @@ SECTIONS > *(.init_array) > *(SORT(.init_array.*)) > __ctors_end = .; > + > +#if defined(CONFIG_HAS_PCI) && !defined(CONFIG_LATE_HWDOM) > + . = ALIGN(POINTER_ALIGN); > + __start_vpci_array = .; > + *(.data.vpci) > + __end_vpci_array = .; > +#endif > } :text > __init_end_efi = .; > . 
= ALIGN(STACK_SIZE); > diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c > index c28ac38fbe..4c22e0952e 100644 > --- a/xen/arch/x86/domain.c > +++ b/xen/arch/x86/domain.c > @@ -397,11 +397,21 @@ static bool emulation_flags_ok(const struct domain > *d, uint32_t emflags) > if ( is_hvm_domain(d) ) > { > if ( is_hardware_domain(d) && > - emflags != (XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC) ) > - return false; > - if ( !is_hardware_domain(d) && emflags && > - emflags != XEN_X86_EMU_ALL && emflags != XEN_X86_EMU_LAPIC ) > + emflags != (XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC| > + XEN_X86_EMU_VPCI) ) > return false; > + if ( !is_hardware_domain(d) ) > + { > + switch ( emflags ) > + { > + case XEN_X86_EMU_ALL & ~XEN_X86_EMU_VPCI: > + case XEN_X86_EMU_LAPIC: > + case 0: > + break; > + default: > + return false; > + } > + } > } > else if ( emflags != 0 && emflags != XEN_X86_EMU_PIT ) > { > diff --git a/xen/arch/x86/hvm/hvm.c b/xen/arch/x86/hvm/hvm.c > index 205b4cb685..8ed6718bf6 100644 > --- a/xen/arch/x86/hvm/hvm.c > +++ b/xen/arch/x86/hvm/hvm.c > @@ -36,6 +36,7 @@ > #include <xen/rangeset.h> > #include <xen/monitor.h> > #include <xen/warning.h> > +#include <xen/vpci.h> > #include <asm/shadow.h> > #include <asm/hap.h> > #include <asm/current.h> > @@ -629,6 +630,7 @@ int hvm_domain_initialise(struct domain *d, unsigned > long domcr_flags, > d->arch.hvm_domain.io_bitmap = hvm_io_bitmap; > > register_g2m_portio_handler(d); > + register_vpci_portio_handler(d); > > hvm_ioreq_init(d); > > diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c > index 6579e119ff..6c12cf5d22 100644 > --- a/xen/arch/x86/hvm/io.c > +++ b/xen/arch/x86/hvm/io.c > @@ -25,6 +25,7 @@ > #include <xen/trace.h> > #include <xen/event.h> > #include <xen/hypercall.h> > +#include <xen/vpci.h> > #include <asm/current.h> > #include <asm/cpufeature.h> > #include <asm/processor.h> > @@ -278,6 +279,108 @@ unsigned int hvm_pci_decode_addr(unsigned int > cf8, unsigned int addr, > return CF8_ADDR_LO(cf8) | (addr 
& 3); > } > > +/* Do some sanity checks. */ > +static bool vpci_access_allowed(unsigned int reg, unsigned int len) > +{ > + /* Check access size. */ > + if ( len != 1 && len != 2 && len != 4 ) > + return false; > + > + /* Check that access is size aligned. */ > + if ( (reg & (len - 1)) ) > + return false; > + > + return true; > +} > + > +/* vPCI config space IO ports handlers (0xcf8/0xcfc). */ > +static bool vpci_portio_accept(const struct hvm_io_handler *handler, > + const ioreq_t *p) > +{ > + return (p->addr == 0xcf8 && p->size == 4) || (p->addr & ~3) == 0xcfc; > +} > + > +static int vpci_portio_read(const struct hvm_io_handler *handler, > + uint64_t addr, uint32_t size, uint64_t *data) > +{ > + struct domain *d = current->domain; > + unsigned int reg; > + pci_sbdf_t sbdf; > + uint32_t cf8; > + > + *data = ~(uint64_t)0; > + > + if ( addr == 0xcf8 ) > + { > + ASSERT(size == 4); > + *data = d->arch.hvm_domain.pci_cf8; > + return X86EMUL_OKAY; > + } > + > + cf8 = ACCESS_ONCE(d->arch.hvm_domain.pci_cf8); > + if ( !CF8_ENABLED(cf8) ) > + return X86EMUL_UNHANDLEABLE; > + > + reg = hvm_pci_decode_addr(cf8, addr, &sbdf); > + > + if ( !vpci_access_allowed(reg, size) ) > + return X86EMUL_OKAY; > + > + *data = vpci_read(sbdf, reg, size); > + > + return X86EMUL_OKAY; > +} > + > +static int vpci_portio_write(const struct hvm_io_handler *handler, > + uint64_t addr, uint32_t size, uint64_t data) > +{ > + struct domain *d = current->domain; > + unsigned int reg; > + pci_sbdf_t sbdf; > + uint32_t cf8; > + > + if ( addr == 0xcf8 ) > + { > + ASSERT(size == 4); > + d->arch.hvm_domain.pci_cf8 = data; > + return X86EMUL_OKAY; > + } > + > + cf8 = ACCESS_ONCE(d->arch.hvm_domain.pci_cf8); > + if ( !CF8_ENABLED(cf8) ) > + return X86EMUL_UNHANDLEABLE; > + > + reg = hvm_pci_decode_addr(cf8, addr, &sbdf); > + > + if ( !vpci_access_allowed(reg, size) ) > + return X86EMUL_OKAY; > + > + vpci_write(sbdf, reg, size, data); > + > + return X86EMUL_OKAY; > +} > + > +static const struct hvm_io_ops 
vpci_portio_ops = { > + .accept = vpci_portio_accept, > + .read = vpci_portio_read, > + .write = vpci_portio_write, > +}; > + > +void register_vpci_portio_handler(struct domain *d) > +{ > + struct hvm_io_handler *handler; > + > + if ( !has_vpci(d) ) > + return; > + > + handler = hvm_next_io_handler(d); > + if ( !handler ) > + return; > + > + handler->type = IOREQ_TYPE_PIO; > + handler->ops = &vpci_portio_ops; > +} > + > /* > * Local variables: > * mode: C > diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c > index 32bb02e3a5..528cc464ba 100644 > --- a/xen/arch/x86/setup.c > +++ b/xen/arch/x86/setup.c > @@ -1582,7 +1582,8 @@ void __init noreturn __start_xen(unsigned long > mbi_p) > domcr_flags |= DOMCRF_hvm | > ((hvm_funcs.hap_supported && !opt_dom0_shadow) ? > DOMCRF_hap : 0); > - config.emulation_flags = > XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC; > + config.emulation_flags = > XEN_X86_EMU_LAPIC|XEN_X86_EMU_IOAPIC| > + XEN_X86_EMU_VPCI; > } > > /* Create initial domain 0. */ > diff --git a/xen/arch/x86/xen.lds.S b/xen/arch/x86/xen.lds.S > index d5e8821d41..6c50916ed2 100644 > --- a/xen/arch/x86/xen.lds.S > +++ b/xen/arch/x86/xen.lds.S > @@ -124,6 +124,13 @@ SECTIONS > __param_start = .; > *(.data.param) > __param_end = .; > + > +#if defined(CONFIG_HAS_PCI) && defined(CONFIG_LATE_HWDOM) > + . = ALIGN(POINTER_ALIGN); > + __start_vpci_array = .; > + *(.data.vpci) > + __end_vpci_array = .; > +#endif > } :text > > #if defined(BUILD_ID) > @@ -213,6 +220,13 @@ SECTIONS > *(.init_array) > *(SORT(.init_array.*)) > __ctors_end = .; > + > +#if defined(CONFIG_HAS_PCI) && !defined(CONFIG_LATE_HWDOM) > + . 
= ALIGN(POINTER_ALIGN); > + __start_vpci_array = .; > + *(.data.vpci) > + __end_vpci_array = .; > +#endif > } :text > > #ifdef EFI > diff --git a/xen/drivers/Makefile b/xen/drivers/Makefile > index 19391802a8..d51c766453 100644 > --- a/xen/drivers/Makefile > +++ b/xen/drivers/Makefile > @@ -1,6 +1,6 @@ > subdir-y += char > subdir-$(CONFIG_HAS_CPUFREQ) += cpufreq > -subdir-$(CONFIG_HAS_PCI) += pci > +subdir-$(CONFIG_HAS_PCI) += pci vpci > subdir-$(CONFIG_HAS_PASSTHROUGH) += passthrough > subdir-$(CONFIG_ACPI) += acpi > subdir-$(CONFIG_VIDEO) += video > diff --git a/xen/drivers/passthrough/pci.c b/xen/drivers/passthrough/pci.c > index 469dfc6c3d..519993d536 100644 > --- a/xen/drivers/passthrough/pci.c > +++ b/xen/drivers/passthrough/pci.c > @@ -31,6 +31,7 @@ > #include <xen/radix-tree.h> > #include <xen/softirq.h> > #include <xen/tasklet.h> > +#include <xen/vpci.h> > #include <xsm/xsm.h> > #include <asm/msi.h> > #include "ats.h" > @@ -1052,10 +1053,10 @@ static void __hwdom_init > setup_one_hwdom_device(const struct setup_hwdom *ctxt, > struct pci_dev *pdev) > { > u8 devfn = pdev->devfn; > + int err; > > do { > - int err = ctxt->handler(devfn, pdev); > - > + err = ctxt->handler(devfn, pdev); > if ( err ) > { > printk(XENLOG_ERR "setup %04x:%02x:%02x.%u for d%d failed > (%d)\n", > @@ -1067,6 +1068,11 @@ static void __hwdom_init > setup_one_hwdom_device(const struct setup_hwdom *ctxt, > devfn += pdev->phantom_stride; > } while ( devfn != pdev->devfn && > PCI_SLOT(devfn) == PCI_SLOT(pdev->devfn) ); > + > + err = vpci_add_handlers(pdev); > + if ( err ) > + printk(XENLOG_ERR "setup of vPCI for d%d failed: %d\n", > + ctxt->d->domain_id, err); > } > > static int __hwdom_init _setup_hwdom_pci_devices(struct pci_seg *pseg, > void *arg) > diff --git a/xen/drivers/vpci/Makefile b/xen/drivers/vpci/Makefile > new file mode 100644 > index 0000000000..840a906470 > --- /dev/null > +++ b/xen/drivers/vpci/Makefile > @@ -0,0 +1 @@ > +obj-y += vpci.o > diff --git 
a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c > new file mode 100644 > index 0000000000..788825f5fd > --- /dev/null > +++ b/xen/drivers/vpci/vpci.c > @@ -0,0 +1,451 @@ > +/* > + * Generic functionality for handling accesses to the PCI configuration space > + * from guests. > + * > + * Copyright (C) 2017 Citrix Systems R&D > + * > + * This program is free software; you can redistribute it and/or > + * modify it under the terms and conditions of the GNU General Public > + * License, version 2, as published by the Free Software Foundation. > + * > + * This program is distributed in the hope that it will be useful, > + * but WITHOUT ANY WARRANTY; without even the implied warranty of > + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the > GNU > + * General Public License for more details. > + * > + * You should have received a copy of the GNU General Public > + * License along with this program; If not, see > <http://www.gnu.org/licenses/>. > + */ > + > +#include <xen/sched.h> > +#include <xen/vpci.h> > + > +extern vpci_register_init_t *const __start_vpci_array[]; > +extern vpci_register_init_t *const __end_vpci_array[]; > +#define NUM_VPCI_INIT (__end_vpci_array - __start_vpci_array) > + > +/* Internal struct to store the emulated PCI registers. 
*/ > +struct vpci_register { > + vpci_read_t *read; > + vpci_write_t *write; > + unsigned int size; > + unsigned int offset; > + void *private; > + struct list_head node; > +}; > + > +int __hwdom_init vpci_add_handlers(struct pci_dev *pdev) > +{ > + unsigned int i; > + int rc = 0; > + > + if ( !has_vpci(pdev->domain) ) > + return 0; > + > + pdev->vpci = xzalloc(struct vpci); > + if ( !pdev->vpci ) > + return -ENOMEM; > + > + INIT_LIST_HEAD(&pdev->vpci->handlers); > + spin_lock_init(&pdev->vpci->lock); > + > + for ( i = 0; i < NUM_VPCI_INIT; i++ ) > + { > + rc = __start_vpci_array[i](pdev); > + if ( rc ) > + break; > + } > + > + if ( rc ) > + { > + while ( !list_empty(&pdev->vpci->handlers) ) > + { > + struct vpci_register *r = list_first_entry(&pdev->vpci->handlers, > + struct vpci_register, > + node); > + > + list_del(&r->node); > + xfree(r); > + } > + xfree(pdev->vpci); > + pdev->vpci = NULL; > + } > + > + return rc; > +} > + > +static int vpci_register_cmp(const struct vpci_register *r1, > + const struct vpci_register *r2) > +{ > + /* Return 0 if registers overlap. */ > + if ( r1->offset < r2->offset + r2->size && > + r2->offset < r1->offset + r1->size ) > + return 0; > + if ( r1->offset < r2->offset ) > + return -1; > + if ( r1->offset > r2->offset ) > + return 1; > + > + ASSERT_UNREACHABLE(); > + return 0; > +} > + > +/* Dummy hooks, writes are ignored, reads return 1's */ > +static uint32_t vpci_ignored_read(const struct pci_dev *pdev, unsigned int > reg, > + void *data) > +{ > + return ~(uint32_t)0; > +} > + > +static void vpci_ignored_write(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data) > +{ > +} > + > +int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > + vpci_write_t *write_handler, unsigned int offset, > + unsigned int size, void *data) > +{ > + struct list_head *prev; > + struct vpci_register *r; > + > + /* Some sanity checks. 
*/ > + if ( (size != 1 && size != 2 && size != 4) || > + offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) || > + (!read_handler && !write_handler) ) > + return -EINVAL; > + > + r = xmalloc(struct vpci_register); > + if ( !r ) > + return -ENOMEM; > + > + r->read = read_handler ?: vpci_ignored_read; > + r->write = write_handler ?: vpci_ignored_write; > + r->size = size; > + r->offset = offset; > + r->private = data; > + > + spin_lock(&vpci->lock); > + > + /* The list of handlers must be kept sorted at all times. */ > + list_for_each ( prev, &vpci->handlers ) > + { > + const struct vpci_register *this = > + list_entry(prev, const struct vpci_register, node); > + int cmp = vpci_register_cmp(r, this); > + > + if ( cmp < 0 ) > + break; > + if ( cmp == 0 ) > + { > + spin_unlock(&vpci->lock); > + xfree(r); > + return -EEXIST; > + } > + } > + > + list_add_tail(&r->node, prev); > + spin_unlock(&vpci->lock); > + > + return 0; > +} > + > +int vpci_remove_register(struct vpci *vpci, unsigned int offset, > + unsigned int size) > +{ > + const struct vpci_register r = { .offset = offset, .size = size }; > + struct vpci_register *rm; > + > + spin_lock(&vpci->lock); > + list_for_each_entry ( rm, &vpci->handlers, node ) > + { > + int cmp = vpci_register_cmp(&r, rm); > + > + /* > + * NB: do not use a switch so that we can use break to > + * get out of the list loop earlier if required. > + */ > + if ( !cmp && rm->offset == offset && rm->size == size ) > + { > + list_del(&rm->node); > + spin_unlock(&vpci->lock); > + xfree(rm); > + return 0; > + } > + if ( cmp <= 0 ) > + break; > + } > + spin_unlock(&vpci->lock); > + > + return -ENOENT; > +} > + > +/* Wrappers for performing reads/writes to the underlying hardware. 
*/ > +static uint32_t vpci_read_hw(pci_sbdf_t sbdf, unsigned int reg, > + unsigned int size) > +{ > + uint32_t data; > + > + switch ( size ) > + { > + case 4: > + data = pci_conf_read32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg); > + break; > + case 3: > + /* > + * This is possible because a 4byte read can have 1byte trapped and > + * the rest passed-through. > + */ > + if ( reg & 1 ) > + { > + data = pci_conf_read8(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, > + reg); > + data |= pci_conf_read16(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, > + reg + 1) << 8; > + } > + else > + { > + data = pci_conf_read16(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, > + reg); > + data |= pci_conf_read8(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, > + reg + 2) << 16; > + } > + break; > + case 2: > + data = pci_conf_read16(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg); > + break; > + case 1: > + data = pci_conf_read8(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg); > + break; > + default: > + ASSERT_UNREACHABLE(); > + data = ~(uint32_t)0; > + break; > + } > + > + return data; > +} > + > +static void vpci_write_hw(pci_sbdf_t sbdf, unsigned int reg, unsigned int > size, > + uint32_t data) > +{ > + switch ( size ) > + { > + case 4: > + pci_conf_write32(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg, data); > + break; > + case 3: > + /* > + * This is possible because a 4byte write can have 1byte trapped and > + * the rest passed-through. 
> + */ > + if ( reg & 1 ) > + { > + pci_conf_write8(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg, > + data); > + pci_conf_write16(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg + > 1, > + data >> 8); > + } > + else > + { > + pci_conf_write16(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg, > + data); > + pci_conf_write8(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg + 2, > + data >> 16); > + } > + break; > + case 2: > + pci_conf_write16(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg, data); > + break; > + case 1: > + pci_conf_write8(sbdf.seg, sbdf.bus, sbdf.dev, sbdf.func, reg, data); > + break; > + default: > + ASSERT_UNREACHABLE(); > + break; > + } > +} > + > +/* > + * Merge new data into a partial result. > + * > + * Copy the value found in 'new' from [0, size) left shifted by > + * 'offset' into 'data'. Note that both 'size' and 'offset' are > + * in byte units. > + */ > +static uint32_t merge_result(uint32_t data, uint32_t new, unsigned int size, > + unsigned int offset) > +{ > + uint32_t mask = 0xffffffff >> (32 - 8 * size); > + > + return (data & ~(mask << (offset * 8))) | ((new & mask) << (offset * 8)); > +} > + > +uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) > +{ > + const struct domain *d = current->domain; > + const struct pci_dev *pdev; > + const struct vpci_register *r; > + unsigned int data_offset = 0; > + uint32_t data = ~(uint32_t)0; > + > + /* Find the PCI dev matching the address. */ > + pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.extfunc); > + if ( !pdev ) > + return vpci_read_hw(sbdf, reg, size); > + > + spin_lock(&pdev->vpci->lock); > + > + /* Read from the hardware or the emulated register handlers. 
*/ > + list_for_each_entry ( r, &pdev->vpci->handlers, node ) > + { > + const struct vpci_register emu = { > + .offset = reg + data_offset, > + .size = size - data_offset > + }; > + int cmp = vpci_register_cmp(&emu, r); > + uint32_t val; > + unsigned int read_size; > + > + if ( cmp < 0 ) > + break; > + if ( cmp > 0 ) > + continue; > + > + if ( emu.offset < r->offset ) > + { > + /* Heading gap, read partial content from hardware. */ > + read_size = r->offset - emu.offset; > + val = vpci_read_hw(sbdf, emu.offset, read_size); > + data = merge_result(data, val, read_size, data_offset); > + data_offset += read_size; > + } > + > + val = r->read(pdev, r->offset, r->private); > + > + /* Check if the read is in the middle of a register. */ > + if ( r->offset < emu.offset ) > + val >>= (emu.offset - r->offset) * 8; > + > + /* Find the intersection size between the two sets. */ > + read_size = min(emu.offset + emu.size, r->offset + r->size) - > + max(emu.offset, r->offset); > + /* Merge the emulated data into the native read value. */ > + data = merge_result(data, val, read_size, data_offset); > + data_offset += read_size; > + if ( data_offset == size ) > + break; > + ASSERT(data_offset < size); > + } > + > + if ( data_offset < size ) > + { > + /* Tailing gap, read the remaining. */ > + uint32_t tmp_data = vpci_read_hw(sbdf, reg + data_offset, > + size - data_offset); > + > + data = merge_result(data, tmp_data, size - data_offset, data_offset); > + } > + spin_unlock(&pdev->vpci->lock); > + > + return data & (0xffffffff >> (32 - 8 * size)); > +} > + > +/* > + * Perform a maybe partial write to a register. > + * > + * Note that this will only work for simple registers, if Xen needs to > + * trap accesses to rw1c registers (like the status PCI header register) > + * the logic in vpci_write will have to be expanded in order to correctly > + * deal with them. 
> + */ > +static void vpci_write_helper(const struct pci_dev *pdev, > + const struct vpci_register *r, unsigned int > size, > + unsigned int offset, uint32_t data) > +{ > + ASSERT(size <= r->size); > + > + if ( size != r->size ) > + { > + uint32_t val; > + > + val = r->read(pdev, r->offset, r->private); > + data = merge_result(val, data, size, offset); > + } > + > + r->write(pdev, r->offset, data & (0xffffffff >> (32 - 8 * r->size)), > + r->private); > +} > + > +void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size, > + uint32_t data) > +{ > + const struct domain *d = current->domain; > + const struct pci_dev *pdev; > + const struct vpci_register *r; > + unsigned int data_offset = 0; > + > + /* > + * Find the PCI dev matching the address. > + * Passthrough everything that's not trapped. > + */ > + pdev = pci_get_pdev_by_domain(d, sbdf.seg, sbdf.bus, sbdf.extfunc); > + if ( !pdev ) > + { > + vpci_write_hw(sbdf, reg, size, data); > + return; > + } > + > + spin_lock(&pdev->vpci->lock); > + > + /* Write the value to the hardware or emulated registers. */ > + list_for_each_entry ( r, &pdev->vpci->handlers, node ) > + { > + const struct vpci_register emu = { > + .offset = reg + data_offset, > + .size = size - data_offset > + }; > + int cmp = vpci_register_cmp(&emu, r); > + unsigned int write_size; > + > + if ( cmp < 0 ) > + break; > + if ( cmp > 0 ) > + continue; > + > + if ( emu.offset < r->offset ) > + { > + /* Heading gap, write partial content to hardware. */ > + vpci_write_hw(sbdf, emu.offset, r->offset - emu.offset, > + data >> (data_offset * 8)); > + data_offset += r->offset - emu.offset; > + } > + > + /* Find the intersection size between the two sets. 
*/ > + write_size = min(emu.offset + emu.size, r->offset + r->size) - > + max(emu.offset, r->offset); > + vpci_write_helper(pdev, r, write_size, reg + data_offset - r->offset, > + data >> (data_offset * 8)); > + data_offset += write_size; > + if ( data_offset == size ) > + break; > + ASSERT(data_offset < size); > + } > + > + if ( data_offset < size ) > + /* Tailing gap, write the remaining. */ > + vpci_write_hw(sbdf, reg + data_offset, size - data_offset, > + data >> (data_offset * 8)); > + > + spin_unlock(&pdev->vpci->lock); > +} > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */ > diff --git a/xen/include/asm-x86/domain.h b/xen/include/asm-x86/domain.h > index 4d0b77dc28..72a3dd8e89 100644 > --- a/xen/include/asm-x86/domain.h > +++ b/xen/include/asm-x86/domain.h > @@ -430,6 +430,7 @@ struct arch_domain > #define has_vpit(d) (!!((d)->arch.emulation_flags & > XEN_X86_EMU_PIT)) > #define has_pirq(d) (!!((d)->arch.emulation_flags & \ > XEN_X86_EMU_USE_PIRQ)) > +#define has_vpci(d) (!!((d)->arch.emulation_flags & > XEN_X86_EMU_VPCI)) > > #define has_arch_pdevs(d) (!list_empty(&(d)->arch.pdev_list)) > > diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h > index 707665fbba..ff0bea5d53 100644 > --- a/xen/include/asm-x86/hvm/io.h > +++ b/xen/include/asm-x86/hvm/io.h > @@ -160,6 +160,9 @@ unsigned int hvm_pci_decode_addr(unsigned int cf8, > unsigned int addr, > */ > void register_g2m_portio_handler(struct domain *d); > > +/* HVM port IO handler for vPCI accesses. 
*/ > +void register_vpci_portio_handler(struct domain *d); > + > #endif /* __ASM_X86_HVM_IO_H__ */ > > > diff --git a/xen/include/public/arch-x86/xen.h b/xen/include/public/arch- > x86/xen.h > index ff918310f6..06ef4772cd 100644 > --- a/xen/include/public/arch-x86/xen.h > +++ b/xen/include/public/arch-x86/xen.h > @@ -293,12 +293,15 @@ struct xen_arch_domainconfig { > #define XEN_X86_EMU_PIT (1U<<_XEN_X86_EMU_PIT) > #define _XEN_X86_EMU_USE_PIRQ 9 > #define XEN_X86_EMU_USE_PIRQ (1U<<_XEN_X86_EMU_USE_PIRQ) > +#define _XEN_X86_EMU_VPCI 10 > +#define XEN_X86_EMU_VPCI (1U<<_XEN_X86_EMU_VPCI) > > #define XEN_X86_EMU_ALL (XEN_X86_EMU_LAPIC | > XEN_X86_EMU_HPET | \ > XEN_X86_EMU_PM | XEN_X86_EMU_RTC | > \ > XEN_X86_EMU_IOAPIC | XEN_X86_EMU_PIC | > \ > XEN_X86_EMU_VGA | XEN_X86_EMU_IOMMU | > \ > - XEN_X86_EMU_PIT | XEN_X86_EMU_USE_PIRQ) > + XEN_X86_EMU_PIT | XEN_X86_EMU_USE_PIRQ > |\ > + XEN_X86_EMU_VPCI) > uint32_t emulation_flags; > }; > > diff --git a/xen/include/xen/pci.h b/xen/include/xen/pci.h > index dd5ec43a70..b7a6abfc53 100644 > --- a/xen/include/xen/pci.h > +++ b/xen/include/xen/pci.h > @@ -112,6 +112,9 @@ struct pci_dev { > #define PT_FAULT_THRESHOLD 10 > } fault; > u64 vf_rlen[6]; > + > + /* Data for vPCI. */ > + struct vpci *vpci; > }; > > #define for_each_pdev(domain, pdev) \ > diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h > index ecd6124d91..cc4ee3b83e 100644 > --- a/xen/include/xen/pci_regs.h > +++ b/xen/include/xen/pci_regs.h > @@ -23,6 +23,14 @@ > #define LINUX_PCI_REGS_H > > /* > + * Conventional PCI and PCI-X Mode 1 devices have 256 bytes of > + * configuration space. PCI-X Mode 2 and PCIe devices have 4096 bytes of > + * configuration space. 
> + */ > +#define PCI_CFG_SPACE_SIZE 256 > +#define PCI_CFG_SPACE_EXP_SIZE 4096 > + > +/* > * Under PCI, each device has 256 bytes of configuration address space, > * of which the first 64 bytes are standardized as follows: > */ > diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h > new file mode 100644 > index 0000000000..9f2864fb0c > --- /dev/null > +++ b/xen/include/xen/vpci.h > @@ -0,0 +1,53 @@ > +#ifndef _XEN_VPCI_H_ > +#define _XEN_VPCI_H_ > + > +#include <xen/pci.h> > +#include <xen/types.h> > +#include <xen/list.h> > + > +typedef uint32_t vpci_read_t(const struct pci_dev *pdev, unsigned int reg, > + void *data); > + > +typedef void vpci_write_t(const struct pci_dev *pdev, unsigned int reg, > + uint32_t val, void *data); > + > +typedef int vpci_register_init_t(struct pci_dev *dev); > + > +#define REGISTER_VPCI_INIT(x) \ > + static vpci_register_init_t *const x##_entry \ > + __used_section(".data.vpci") = x > + > +/* Add vPCI handlers to device. */ > +int __must_check vpci_add_handlers(struct pci_dev *dev); > + > +/* Add/remove a register handler. */ > +int __must_check vpci_add_register(struct vpci *vpci, > + vpci_read_t *read_handler, > + vpci_write_t *write_handler, > + unsigned int offset, unsigned int size, > + void *data); > +int __must_check vpci_remove_register(struct vpci *vpci, unsigned int > offset, > + unsigned int size); > + > +/* Generic read/write handlers for the PCI config space. */ > +uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size); > +void vpci_write(pci_sbdf_t sbdf, unsigned int reg, unsigned int size, > + uint32_t data); > + > +struct vpci { > + /* List of vPCI handlers for a device. 
*/ > + struct list_head handlers; > + spinlock_t lock; > +}; > + > +#endif > + > +/* > + * Local variables: > + * mode: C > + * c-file-style: "BSD" > + * c-basic-offset: 4 > + * tab-width: 4 > + * indent-tabs-mode: nil > + * End: > + */ > -- > 2.13.5 (Apple Git-94) _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
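For anyone following the partial-access logic in vpci_read, here is a minimal standalone sketch of how merge_result assembles a 4-byte value from a heading gap, one emulated register and a trailing gap. The merge_result body mirrors the one in the quoted patch. The assert, the parameter name new_val and the helper example_partial_read are assumptions added for illustration and are not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Place the low 'size' bytes of 'new_val' into 'data' at byte position
 * 'offset'. Both 'size' and 'offset' are in bytes. The assert is an
 * addition for this sketch only.
 */
static uint32_t merge_result(uint32_t data, uint32_t new_val,
                             unsigned int size, unsigned int offset)
{
    uint32_t mask;

    assert(size >= 1 && size <= 4 && offset + size <= 4);
    mask = 0xffffffffu >> (32 - 8 * size);

    return (data & ~(mask << (offset * 8))) |
           ((new_val & mask) << (offset * 8));
}

/*
 * Hypothetical scenario: a 4-byte read at offset 0 where only the byte at
 * offset 1 is emulated. 'hw' stands in for the dword the hardware would
 * return, 'emulated' for the value a read handler would return.
 */
static uint32_t example_partial_read(uint32_t hw, uint8_t emulated)
{
    uint32_t data = ~(uint32_t)0;

    /* Heading gap: byte 0 comes from hardware. */
    data = merge_result(data, hw & 0xff, 1, 0);
    /* Emulated register: byte 1 comes from the handler. */
    data = merge_result(data, emulated, 1, 1);
    /* Trailing gap: bytes 2-3 come from hardware. */
    data = merge_result(data, hw >> 16, 2, 2);

    return data;
}
```

With hw = 0x44332211 and emulated = 0xAB the result should be 0x4433AB11: the hardware bytes are kept everywhere except the byte covered by the emulated register.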