[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v9 10/16] vpci/header: emulate PCI_COMMAND register for guests
On Tue, Aug 29, 2023 at 11:19:44PM +0000, Volodymyr Babchuk wrote: > From: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx> > > Xen and/or Dom0 may have put values in PCI_COMMAND which they expect > to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the > guest's view of this will want to be zero initially, the host having set > it to 1 may not easily be overwritten with 0, or else we'd effectively > imply giving the guest control of the bit. Thus, PCI_COMMAND register needs > proper emulation in order to honor host's settings. > > According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2 > Device Control" the reset state of the command register is typically 0, > so when assigning a PCI device use 0 as the initial state for the guest's view > of the command register. > > Here is the full list of command register bits with notes about > emulation: > > PCI_COMMAND_IO - Allow guest to control it. > PCI_COMMAND_MEMORY - Already handled. > PCI_COMMAND_MASTER - Allow guest to control it. > PCI_COMMAND_SPECIAL - Guest can generate special cycles only if it has > access to host bridge that supports software generation of special > cycles. In our case guest has no access to host bridges at all. Value > after reset is 0. > PCI_COMMAND_INVALIDATE - Allows "Memory Write and Invalidate" commands > to be generated. It requires additional configuration via Cacheline > Size register. We are not emulating this register right now and we > can't expect guest to properly configure it. > PCI_COMMAND_VGA_PALETTE - Enable VGA palette snooping. This bit is set > by firmware and we want to leave it as is. > PCI_COMMAND_PARITY - Controls how device response to parity > errors. We want this bit to be set by a hardware domain. > PCI_COMMAND_WAIT - Reserved. Should be 0. > PCI_COMMAND_SERR - Controls if device can assert SERR. > The same as for COMMAND_PARITY. > PCI_COMMAND_FAST_BACK - Optional bit that allows fast back-to-back > transactions. It is configured by firmware, so we don't want guest to > control it. > PCI_COMMAND_INTX_DISABLE - Disables INTx signals. If MSI(X) is > enabled, device is prohibited from asserting INTx. Value after reset > is 0. Guest can control it freely. I'm kind of confused by the text above, does "Guest can control it freely" imply that the guest is able to modify the bit in the device command register vs the emulated view that we provide? If so INTX_DISABLE should not be allowed direct guest modification. I'm thinking that we might want to allow guest access to the first 3 bits only, while the rest of the values won't be propagated to hardware, iow: guest will get a fake view of them. Have you checked with QEMU how are those bits handled? That's our current passthrough reference implementation, and we should aim to handle those similarly in vPCI. > > Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@xxxxxxxx> > Signed-off-by: Volodymyr Babchuk <volodymyr_babchuk@xxxxxxxx> > --- > Since v9: > - Reworked guest_cmd_read > - Added handling for more bits > Since v6: > - fold guest's logic into cmd_write > - implement cmd_read, so we can report emulated INTx state to guests > - introduce header->guest_cmd to hold the emulated state of the > PCI_COMMAND register for guests > Since v5: > - add additional check for MSI-X enabled while altering INTX bit > - make sure INTx disabled while guests enable MSI/MSI-X > Since v3: > - gate more code on CONFIG_HAS_MSI > - removed logic for the case when MSI/MSI-X not enabled > --- > xen/drivers/vpci/header.c | 54 ++++++++++++++++++++++++++++++++++++--- > xen/drivers/vpci/msi.c | 10 ++++++++ > xen/drivers/vpci/msix.c | 4 +++ > xen/include/xen/vpci.h | 3 +++ > 4 files changed, 67 insertions(+), 4 deletions(-) > > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c > index 1e82217200..e351db4620 100644 > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -502,14 +502,37 @@ static int modify_bars(const struct pci_dev *pdev, > uint16_t cmd, bool rom_only) > return 0; > } > > +/* TODO: Add proper emulation for all bits of the command register. */ > static void cf_check cmd_write( > const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data) > { > struct vpci_header *header = data; > > + if ( !is_hardware_domain(pdev->domain) ) > + { > + if ( IS_ENABLED(CONFIG_HAS_PCI_MSI) ) > + { > + /* Tell guest that device does not support this */ > + cmd &= ~PCI_COMMAND_FAST_BACK; > + } > + > + header->guest_cmd = cmd; > + > + if ( IS_ENABLED(CONFIG_HAS_PCI_MSI) ) > + { > + /* Do not touch INVALIDATE, PARITY and SERR */ > + const uint16_t excluded = PCI_COMMAND_INVALIDATE | > + PCI_COMMAND_PARITY | PCI_COMMAND_SERR; > + > + cmd &= ~excluded; > + cmd |= pci_conf_read16(pdev->sbdf, reg) & excluded; > + } I'm not following why allowing guest setting of those bits is conditional on HAS_PCI_MSI being build time enabled. Isn't it equally good or bad to let the guest play with certain bits regardless of Xen build time configuration? As said above, I would look at QEMU and see how bits are handled there. > + } > + > /* > - * Let Dom0 play with all the bits directly except for the memory > - * decoding one. > + * Let guest play with all the bits directly except for the memory > + * decoding one. Bits that are not allowed for DomU are already > + * handled above. > */ > if ( header->bars_mapped != !!(cmd & PCI_COMMAND_MEMORY) ) > /* > @@ -523,6 +546,14 @@ static void cf_check cmd_write( > pci_conf_write16(pdev->sbdf, reg, cmd); > } > > +static uint32_t guest_cmd_read(const struct pci_dev *pdev, unsigned int reg, > + void *data) > +{ > + const struct vpci_header *header = data; > + > + return header->guest_cmd; > +} > + > static void cf_check bar_write( > const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data) > { > @@ -732,8 +763,12 @@ static int cf_check init_bars(struct pci_dev *pdev) > } > > /* Setup a handler for the command register. */ > - rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, > PCI_COMMAND, > - 2, header); > + if ( is_hwdom ) > + rc = vpci_add_register(pdev->vpci, vpci_hw_read16, cmd_write, > PCI_COMMAND, > + 2, header); > + else > + rc = vpci_add_register(pdev->vpci, guest_cmd_read, cmd_write, > PCI_COMMAND, > + 2, header); You have used the ternary operator in other places, I would recommend to also do it here to avoid line duplication. > if ( rc ) > return rc; > > @@ -745,6 +780,17 @@ static int cf_check init_bars(struct pci_dev *pdev) > if ( cmd & PCI_COMMAND_MEMORY ) > pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd & ~PCI_COMMAND_MEMORY); > > + header->guest_cmd = cmd & ~PCI_COMMAND_MEMORY; Memory Enable is cleared from the guest view, yet at the end of init_bars() mappings will be established if the bit is enabled in cmd. Won't this create a mismatch between the guest view and the contents of the physmap? > + > + /* > + * According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2 > + * Device Control" the reset state of the command register is > + * typically all 0's, so this is used as initial value for the guests. > + */ > + if ( header->guest_cmd != 0 ) > + gprintk(XENLOG_WARNING, "%pp: CMD is not zero: %x", &pdev->sbdf, > + header->guest_cmd); I think it's unlikely for the command register to be zeroed out, I haven't looked, but I would assume that after a device reset by pciback it won't be unlikely for some initial state to be set. > + > for ( i = 0; i < num_bars; i++ ) > { > uint8_t reg = PCI_BASE_ADDRESS_0 + i * 4; > diff --git a/xen/drivers/vpci/msi.c b/xen/drivers/vpci/msi.c > index a0733bb2cb..df0f0199b8 100644 > --- a/xen/drivers/vpci/msi.c > +++ b/xen/drivers/vpci/msi.c > @@ -70,6 +70,16 @@ static void cf_check control_write( > > if ( vpci_msi_arch_enable(msi, pdev, vectors) ) > return; > + > + /* > + * Make sure guest doesn't enable INTx while enabling MSI. > + * Opposite action (enabling INTx) will be performed in > + * vpci_msi_arch_disable call path. I'm not seeing such code in vpci_msi_arch_disable(). However the updating of the INTX field should be done after MSI(X) has been disabled, and hence can only be done after the pci_conf_write16() in control_write(). I would be fine if you want to leave forcing the setting of INTX to enabled once MSI has been disabled, any sane guest will do that itself, but the comment needs updating. Thanks, Roger.
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |