Re: [Xen-devel] Dealing with non-existent BDF devices in VT-d and in the hardware.
On Fri, Mar 14, 2014 at 02:18:52AM +0000, Zhang, Yang Z wrote:
> Konrad Rzeszutek Wilk wrote on 2014-03-12:
> > On Tue, Mar 11, 2014 at 05:36:36PM +0000, Andrew Cooper wrote:
> > > On 11/03/14 17:30, Konrad Rzeszutek Wilk wrote:
> > > > Hey,
> > > >
> > > > I am one of those lucky folks who had purchased a motherboard that
> > > > has bugs.
> > >
> > > You say this as if you expect someone has managed to find a bug-free
> > > motherboard :)
> >
> > One can dream :-)
> >
> > > >
> > > > I figured I would post this email as a starting point for some
> > > > discussion on this - and perhaps have a way, similar to
> > > > 'pci-phantom', of instructing the hypervisor what to do with them.
> > > >
> > > > The problem I am seeing is that this device:
> > > >
> > > > 08:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
> > > >
> > > > Can't be passed in to the guest. Or rather it can - but every time
> > > > the guest (or domain0) tries to access it I see:
> > > >
> > > > (XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
> > > > (XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
> > > > (XEN) [VT-D]iommu.c:865: DMAR:[DMA Write] Request device [0000:08:00.0] fault addr 0, iommu reg = ffff82c3ffd53000
> > > > (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
> > > > (XEN) print_vtd_entries: iommu ffff83043dca99b0 dev 0000:08:00.0 gmfn 0
> > > > (XEN) root_entry = ffff83043dc6b000
> > > > (XEN) root_entry[8] = 3326b5001
> > > > (XEN) context = ffff8303326b5000
> > > > (XEN) context[0] = 0_0
> > > > (XEN) ctxt_entry[0] not present
> > > >
> > > >
> > > > Of course the '08:00.0' device does not exist. It is rather this
> > > > chipset:
> > > >
> > > > 07:00.0 PCI bridge: Tundra Semiconductor Corp. Device 8113 (rev 01)
> > > >
> > > > that is buggy and using the wrong BDF when forwarding DMA requests
> > > > from devices underneath it (like this Firewire chip).
> > > >
> > > > The hack I came up with was to create, in the Xen code that deals
> > > > with PCI passthrough, a copy of the bridge (so 07:00.0) but with a
> > > > new BDF: 08:00.0. And link it to the PCI device that I am passing
> > > > to the guest (so 08:03.0).
> > > >
> > > > The end result is that when loading the driver (hack.c) one should
> > > > see:
> > > >
> > > > (XEN) 0000:08:00.0 linked with 08:03.0
> > > > (XEN) [VT-D]iommu.c:1456: d0:PCI: map 0000:08:00.0
> > > > (XEN) [VT-D]iommu.c:1476: d0:PCI: map 0000:08:03.0
> > > > (XEN) PCI add link 0000:08:00.0
> > > >
> > > > And when launching a guest with the BDF:
> > > >
> > > > pci = ["08:03.0"]
> > > >
> > > > the hypervisor will automatically also create a VT-d context for
> > > > the 08:00.0 device.
> > > >
> > > > To use this hack, apply the
> > > > 0001-xen-pci-Introduce-a-way-to-deal-with-buggy-hardware-.patch
> > > > to your hypervisor, compile and install.
> > > >
> > > > And also compile the 'hack.c' module. There is an attached
> > > > 'Makefile' that will do it for you. Make sure you edit it to set
> > > > the right BDF entries in it.
> > > >
> > > > Once done, install your new hypervisor, insmod ./hack.ko and try
> > > > passing in the device to your guest (or use it normally). The
> > > > 'DMAR:[DMA Write]' error should go away.
> > > >
> > > > This should be generic enough for most devices. It needn't be a
> > > > bridge that is spewing out these DMAR errors.
> > >
> > >
> > > Do you have an lspci -tv for the system?
> >
> > Yes of course:
> >
> > -[0000:00]-+-00.0  Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller
> >            +-01.0-[01]--+-00.0  Intel Corporation 82576 Gigabit Network Connection
> >            |            \-00.1  Intel Corporation 82576 Gigabit Network Connection
> >            +-01.1-[02]----00.0  LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
> >            +-02.0  Intel Corporation Xeon E3-1200 v3 Processor Integrated Graphics Controller
> >            +-03.0  Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller
> >            +-14.0  Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
> >            +-16.0  Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1
> >            +-19.0  Intel Corporation Ethernet Connection I217-LM
> >            +-1a.0  Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2
> >            +-1b.0  Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller
> >            +-1c.0-[03]----00.0  Intel Corporation 82574L Gigabit Network Connection
> >            +-1c.1-[04]----00.0  Intel Corporation 82574L Gigabit Network Connection
> >            +-1c.3-[05]----00.0  Intel Corporation I210 Gigabit Network Connection
> >            +-1c.4-[06]--+-00.0  Intel Corporation 82571EB Gigabit Ethernet Controller
> >            |            \-00.1  Intel Corporation 82571EB Gigabit Ethernet Controller
> >            +-1c.5-[07-09]----00.0-[08-09]--+-01.0-[09]--+-08.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            +-08.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               |            +-09.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            +-09.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               |            +-0a.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            +-0a.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               |            +-0b.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            \-0b.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               \-03.0  Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
> >            +-1c.6-[0a]----00.0  Renesas Technology Corp. uPD720202 USB 3.0 Host Controller
> >            +-1c.7-[0b]----00.0  ASMedia Technology Inc. ASM1062 Serial ATA Controller
> >            +-1d.0  Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1
> >            +-1f.0  Intel Corporation C226 Series Chipset Family Server Advanced SKU LPC Controller
> >            +-1f.2  Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]
> >            +-1f.3  Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller
> >            \-1f.6  Intel Corporation 8 Series Chipset Family Thermal Management Controller
> >
>
> What happens if you assign the devices under bus 09 to another guest?

Hadn't tried that. I think it would all blow up, as the non-existent
bridge is now assigned to one guest and the phantom DMA requests for
the 09 devices would show up under the 08 device. I think I would
corrupt the guest memory with random DMA writes.

> Is it better to add a Xen command line option to add such devices to a
> group and assign the whole group to a guest when trying to assign a
> device of the group to a guest?

Or implement the group assignment in QEMU or libxl so that nobody tries
doing it.

>
> > >
> > > Is it genuinely the case that the bridge doesn't exist, or simply
> > > that it is not correctly attributed in the DMAR table?
> >
> > It does not exist. The DMAR looks correct.
> >
> > (XEN) [VT-D]dmar.c:778: Host address width 39
> > (XEN) [VT-D]dmar.c:792: found ACPI_DMAR_DRHD:
> > (XEN) [VT-D]dmar.c:472: dmaru->address = fed90000
> > (XEN) [VT-D]iommu.c:1158: drhd->address = fed90000 iommu->reg = ffff82c3ffd54000
> > (XEN) [VT-D]iommu.c:1160: cap = c0000020660462 ecap = f0101a
> > (XEN) [VT-D]dmar.c:383: endpoint: 0000:00:02.0
> > (XEN) [VT-D]dmar.c:792: found ACPI_DMAR_DRHD:
> > (XEN) [VT-D]dmar.c:472: dmaru->address = fed91000
> > (XEN) [VT-D]iommu.c:1158: drhd->address = fed91000 iommu->reg = ffff82c3ffd53000
> > (XEN) [VT-D]iommu.c:1160: cap = d2008020660462 ecap = f010da
> > (XEN) [VT-D]dmar.c:397: IOAPIC: 0000:f0:1f.0
> > (XEN) [VT-D]dmar.c:361: MSI HPET: 0000:f0:0f.0
> > (XEN) [VT-D]dmar.c:486: flags: INCLUDE_ALL
> > (XEN) [VT-D]dmar.c:797: found ACPI_DMAR_RMRR:
> > (XEN) [VT-D]dmar.c:383: endpoint: 0000:00:1d.0
> > (XEN) [VT-D]dmar.c:383: endpoint: 0000:00:1a.0
> > (XEN) [VT-D]dmar.c:383: endpoint: 0000:00:14.0
> > (XEN) [VT-D]dmar.c:666: RMRR region: base_addr b7530000 end_address b753cfff
> > (XEN) [VT-D]dmar.c:797: found ACPI_DMAR_RMRR:
> > (XEN) [VT-D]dmar.c:383: endpoint: 0000:00:02.0
> > (XEN) [VT-D]dmar.c:666: RMRR region: base_addr bc000000 end_address be1fffff
> >
> > As it has the INCLUDE_ALL flag.
> > >
> > > If the latter, Xen can probably gain some DMAR[$FOO]=$BAR command
> > > line workarounds similar to the IVRS ones for AMD systems.
> > >
> > > ~Andrew
> >
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel
>
> Best regards,
> Yang

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
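
For readers following the fault log in this thread: VT-d looks up a DMA request by its source BDF, using the bus number as the index into the root table and the combined device/function byte (devfn) as the index into the context table. That is why a request tagged 0000:08:00.0 shows up as root_entry[8] / context[0]. The following is only a rough Python sketch of that arithmetic, plus a toy version of the "link a phantom BDF to a real device" idea described in the thread. It is not the code from hack.c or the patch; the names BDF, parse_bdf and expand_assignment are hypothetical and exist only for this illustration.

```python
from typing import Dict, List, NamedTuple


class BDF(NamedTuple):
    """A PCI address: segment, bus, device, function."""
    segment: int
    bus: int
    device: int
    function: int

    @property
    def devfn(self) -> int:
        # Device (5 bits) and function (3 bits) packed into one byte;
        # VT-d uses this byte as the index into the context table.
        return (self.device << 3) | self.function

    @property
    def source_id(self) -> int:
        # 16-bit requester ID: bus in the high byte, devfn in the low byte.
        return (self.bus << 8) | self.devfn

    def __str__(self) -> str:
        return (f"{self.segment:04x}:{self.bus:02x}:"
                f"{self.device:02x}.{self.function:x}")


def parse_bdf(text: str) -> BDF:
    """Parse 'ssss:bb:dd.f' or 'bb:dd.f' (all fields hexadecimal)."""
    parts = text.strip().split(":")
    if len(parts) == 2:          # no segment given, assume segment 0
        parts = ["0"] + parts
    if len(parts) != 3:
        raise ValueError(f"not a BDF: {text!r}")
    device, _, function = parts[2].partition(".")
    bdf = BDF(int(parts[0], 16), int(parts[1], 16),
              int(device, 16), int(function, 16))
    if bdf.bus > 0xFF or bdf.device > 0x1F or bdf.function > 0x7:
        raise ValueError(f"BDF field out of range: {text!r}")
    return bdf


def expand_assignment(requested: List[BDF],
                      links: Dict[BDF, List[BDF]]) -> List[BDF]:
    """Hypothetical helper: return the requested devices plus any phantom
    BDFs linked to them, so every source ID the hardware may emit gets a
    context entry in the same domain."""
    expanded: List[BDF] = []
    for bdf in requested:
        for candidate in [bdf, *links.get(bdf, [])]:
            if candidate not in expanded:
                expanded.append(candidate)
    return expanded


if __name__ == "__main__":
    firewire = parse_bdf("08:03.0")      # the device passed to the guest
    phantom = parse_bdf("0000:08:00.0")  # the BDF the buggy bridge uses
    for bdf in expand_assignment([firewire], {firewire: [phantom]}):
        print(f"{bdf}: root_entry[{bdf.bus}] context[{bdf.devfn}] "
              f"source-id {bdf.source_id:#06x}")
```

In this sketch, assigning 08:03.0 alone would leave context[0] on bus 8 empty, which corresponds to the "Present bit in context entry is clear" fault; expanding the assignment through the link table is the toy equivalent of the hypervisor also mapping 08:00.0.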