
Re: [Xen-devel] Dealing with non-existent BDF devices in VT-d and in the hardware.



On Fri, Mar 14, 2014 at 02:18:52AM +0000, Zhang, Yang Z wrote:
> Konrad Rzeszutek Wilk wrote on 2014-03-12:
> > On Tue, Mar 11, 2014 at 05:36:36PM +0000, Andrew Cooper wrote:
> > > On 11/03/14 17:30, Konrad Rzeszutek Wilk wrote:
> > > > Hey,
> > > >
> > > > I am one of those lucky folks who had purchased a motherboard that has bugs.
> > >
> > > You say this as if you expect someone has managed to find a bugfree
> > > motherboard :)
> > 
> > One can dream :-)
> > >
> > > >
> > > > I figured I would post this email as a starting point for
> > > > some discussion on this - and perhaps have a way, similar to
> > > > 'pci-phantom', of instructing the hypervisor what to do with them.
> > > >
> > > > The problem I am seeing is that this device:
> > > >
> > > > 08:03.0 FireWire (IEEE 1394): Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
> > > >
> > > > can't be passed through to the guest. Or rather it can - but every time
> > > > the guest (or domain0) tries to access it I see:
> > > >
> > > > (XEN) [VT-D]iommu.c:885: iommu_fault_status: Fault Overflow
> > > > (XEN) [VT-D]iommu.c:887: iommu_fault_status: Primary Pending Fault
> > > > (XEN) [VT-D]iommu.c:865: DMAR:[DMA Write] Request device [0000:08:00.0] fault addr 0, iommu reg = ffff82c3ffd53000
> > > > (XEN) DMAR:[fault reason 02h] Present bit in context entry is clear
> > > > (XEN) print_vtd_entries: iommu ffff83043dca99b0 dev 0000:08:00.0 gmfn 0
> > > > (XEN)     root_entry = ffff83043dc6b000
> > > > (XEN)     root_entry[8] = 3326b5001
> > > > (XEN)     context = ffff8303326b5000
> > > > (XEN)     context[0] = 0_0
> > > > (XEN)     ctxt_entry[0] not present
> > > >
> > > >
> > > > Of course the '08:00.0' device does not exist. It is rather this chipset:
> > > > 07:00.0 PCI bridge: Tundra Semiconductor Corp. Device 8113 (rev 01)
> > > >
> > > > that is buggy and using the wrong BDF when forwarding DMA requests
> > > > from devices underneath it (like this Firewire chip).
> > > >
> > > > The hack I came up with was to create, in the Xen code that deals
> > > > with PCI passthrough, a copy of the bridge (so 07:00.0) but with a
> > > > new BDF: 08:00.0, and to link it to the PCI device that I am passing
> > > > to the guest (so 08:03.0).
> > > >
> > > > The end result is that when loading the driver (hack.c) one should
> > > > see:
> > > >
> > > > (XEN) 0000:08:00.0 linked with 08:03.0
> > > > (XEN) [VT-D]iommu.c:1456: d0:PCI: map 0000:08:00.0
> > > > (XEN) [VT-D]iommu.c:1476: d0:PCI: map 0000:08:03.0
> > > > (XEN) PCI add link 0000:08:00.0
> > > >
> > > > And when launching a guest with the BDF:
> > > > pci = ["08:03.0"]
> > > >
> > > > the hypervisor will automatically also create a VT-d context for
> > > > the 08:00.0 device.
> > > >
> > > > To use this hack, apply the
> > > > 0001-xen-pci-Introduce-a-way-to-deal-with-buggy-hardware-.patch
> > > > to your hypervisor, compile and install.
> > > >
> > > > And also compile the 'hack.c' module. There is an attached 'Makefile'
> > > > that will do it for you. Make sure you edit it to set the right BDF
> > > > entries in it.
> > > >
> > > > Once done install your new hypervisor, and insmod ./hack.ko and try
> > > > passing in the device to your guest (or use it normally). The
> > > > 'DMAR:[DMA Write]' error should go away.
> > > >
> > > > This should be generic enough for most devices. It needn't be a
> > > > bridge that is spewing out these DMAR errors.
> > >
> > >
> > > Do you have an lspci -tv for the system?
> > 
> > Yes of course:
> > 
> > -[0000:00]-+-00.0  Intel Corporation Xeon E3-1200 v3 Processor DRAM Controller
> >            +-01.0-[01]--+-00.0  Intel Corporation 82576 Gigabit Network Connection
> >            |            \-00.1  Intel Corporation 82576 Gigabit Network Connection
> >            +-01.1-[02]----00.0  LSI Logic / Symbios Logic SAS2008 PCI-Express Fusion-MPT SAS-2 [Falcon]
> >            +-02.0  Intel Corporation Xeon E3-1200 v3 Processor Integrated Graphics Controller
> >            +-03.0  Intel Corporation Xeon E3-1200 v3/4th Gen Core Processor HD Audio Controller
> >            +-14.0  Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI
> >            +-16.0  Intel Corporation 8 Series/C220 Series Chipset Family MEI Controller #1
> >            +-19.0  Intel Corporation Ethernet Connection I217-LM
> >            +-1a.0  Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2
> >            +-1b.0  Intel Corporation 8 Series/C220 Series Chipset High Definition Audio Controller
> >            +-1c.0-[03]----00.0  Intel Corporation 82574L Gigabit Network Connection
> >            +-1c.1-[04]----00.0  Intel Corporation 82574L Gigabit Network Connection
> >            +-1c.3-[05]----00.0  Intel Corporation I210 Gigabit Network Connection
> >            +-1c.4-[06]--+-00.0  Intel Corporation 82571EB Gigabit Ethernet Controller
> >            |            \-00.1  Intel Corporation 82571EB Gigabit Ethernet Controller
> >            +-1c.5-[07-09]----00.0-[08-09]--+-01.0-[09]--+-08.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            +-08.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               |            +-09.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            +-09.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               |            +-0a.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            +-0a.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               |            +-0b.0  Brooktree Corporation Bt878 Video Capture
> >            |                               |            \-0b.1  Brooktree Corporation Bt878 Audio Capture
> >            |                               \-03.0  Texas Instruments TSB43AB22A IEEE-1394a-2000 Controller (PHY/Link) [iOHCI-Lynx]
> >            +-1c.6-[0a]----00.0  Renesas Technology Corp. uPD720202 USB 3.0 Host Controller
> >            +-1c.7-[0b]----00.0  ASMedia Technology Inc. ASM1062 Serial ATA Controller
> >            +-1d.0  Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1
> >            +-1f.0  Intel Corporation C226 Series Chipset Family Server Advanced SKU LPC Controller
> >            +-1f.2  Intel Corporation 8 Series/C220 Series Chipset Family 6-port SATA Controller 1 [AHCI mode]
> >            +-1f.3  Intel Corporation 8 Series/C220 Series Chipset Family SMBus Controller
> >            \-1f.6  Intel Corporation 8 Series Chipset Family Thermal Management Controller
> > 
> 
> What happens if you assign the devices under bus 09 to another guest?

Hadn't tried that. I think it would all blow up, as the non-existent
bridge would already be assigned to one guest and the phantom DMA requests
from the devices on bus 09 would show up under the 08:00.0 device. I think
I would corrupt that guest's memory with random DMA writes.
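
To make the aliasing concrete, here is a rough Python sketch (not Xen
code; the helper names are made up for illustration). It relies only on
the fact that VT-d picks the root entry by bus number and the context
entry by devfn, which is what the root_entry[8] / context[0] lines in the
fault dump reflect. A write tagged 08:00.0 therefore never reaches the
context that was set up for 08:03.0.

```python
def parse_bdf(text):
    """Parse 'bb:dd.f' or 'ssss:bb:dd.f' into (segment, bus, device, function)."""
    parts = text.split(":")
    if len(parts) == 2:
        parts = ["0000"] + parts          # segment defaults to 0000
    if len(parts) != 3:
        raise ValueError("not a BDF: %r" % text)
    dev_text, func_text = parts[2].split(".")
    seg = int(parts[0], 16)
    bus = int(parts[1], 16)
    dev = int(dev_text, 16)
    func = int(func_text, 16)
    if bus > 0xFF or dev > 0x1F or func > 0x7:
        raise ValueError("BDF out of range: %r" % text)
    return seg, bus, dev, func


def requester_id(text):
    """16-bit requester ID: bus in bits 15:8, device in 7:3, function in 2:0."""
    _, bus, dev, func = parse_bdf(text)
    return (bus << 8) | (dev << 3) | func


def vtd_indexes(text):
    """Return (root table index, context table index) for a BDF."""
    rid = requester_id(text)
    return rid >> 8, rid & 0xFF


for bdf in ("08:03.0", "0000:08:00.0"):
    root, ctx = vtd_indexes(bdf)
    print("%s -> root_entry[%d], context[%d]" % (bdf, root, ctx))
```

Both BDFs share root_entry[8], but the FireWire device sits at context
index 24 while the faulting requests look up context index 0, which is
empty unless something maps it.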

> Would it be better to add a Xen command-line option that puts such devices
> in a group, and to assign the whole group to a guest whenever one device of
> the group is assigned to a guest?

Or implement the group assignment in QEMU or libxl so that nobody
tries to split the group across guests.
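
A minimal sketch of what such a check could look like, in Python for
brevity. This is not libxl or QEMU code: ALIAS_GROUPS,
devices_to_assign() and the owners map are all hypothetical, and the
group contents assume (untested) that everything behind the 07:00.0
bridge ends up tagged as 08:00.0.

```python
# Hypothetical alias group: devices whose DMA is assumed to arrive with the
# same (wrong) requester ID and which therefore must share one guest.
ALIAS_GROUPS = [
    frozenset([
        "08:03.0",
        "09:08.0", "09:08.1", "09:09.0", "09:09.1",
        "09:0a.0", "09:0a.1", "09:0b.0", "09:0b.1",
    ]),
]


def group_of(bdf, groups):
    """Return the alias group containing bdf, or a group of just bdf."""
    for group in groups:
        if bdf in group:
            return group
    return frozenset([bdf])


def devices_to_assign(bdf, domid, owners, groups):
    """List every BDF that has to go to domid along with bdf.

    owners maps already-assigned BDFs to their domain ID. Raises ValueError
    if any member of the group already belongs to a different domain.
    """
    members = group_of(bdf, groups)
    conflicts = sorted(m for m in members if owners.get(m, domid) != domid)
    if conflicts:
        raise ValueError(
            "%s shares a group with %s, owned by another domain"
            % (bdf, ", ".join(conflicts)))
    return sorted(members)


owners = {"09:08.0": 1}  # one capture card function already given to domain 1
print(devices_to_assign("08:03.0", 1, owners, ALIAS_GROUPS))
```

Asking for 08:03.0 on behalf of domain 2 with the same owners map raises
ValueError, which is the "nobody tries doing it" part.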
> 
> > >
> > > Is it genuinely the case that the bridge doesn't exist, or simply that
> > > it is not correctly attributed in the DMAR table?
> > 
> > It does not exist. The DMAR looks correct.
> > 
> > (XEN) [VT-D]dmar.c:778: Host address width 39
> > (XEN) [VT-D]dmar.c:792: found ACPI_DMAR_DRHD:
> > (XEN) [VT-D]dmar.c:472:   dmaru->address = fed90000
> > (XEN) [VT-D]iommu.c:1158: drhd->address = fed90000 iommu->reg = ffff82c3ffd54000
> > (XEN) [VT-D]iommu.c:1160: cap = c0000020660462 ecap = f0101a
> > (XEN) [VT-D]dmar.c:383:  endpoint: 0000:00:02.0
> > (XEN) [VT-D]dmar.c:792: found ACPI_DMAR_DRHD:
> > (XEN) [VT-D]dmar.c:472:   dmaru->address = fed91000
> > (XEN) [VT-D]iommu.c:1158: drhd->address = fed91000 iommu->reg = ffff82c3ffd53000
> > (XEN) [VT-D]iommu.c:1160: cap = d2008020660462 ecap = f010da
> > (XEN) [VT-D]dmar.c:397:  IOAPIC: 0000:f0:1f.0
> > (XEN) [VT-D]dmar.c:361:  MSI HPET: 0000:f0:0f.0
> > (XEN) [VT-D]dmar.c:486:   flags: INCLUDE_ALL
> > (XEN) [VT-D]dmar.c:797: found ACPI_DMAR_RMRR:
> > (XEN) [VT-D]dmar.c:383:  endpoint: 0000:00:1d.0
> > (XEN) [VT-D]dmar.c:383:  endpoint: 0000:00:1a.0
> > (XEN) [VT-D]dmar.c:383:  endpoint: 0000:00:14.0
> > (XEN) [VT-D]dmar.c:666:   RMRR region: base_addr b7530000 end_address b753cfff
> > (XEN) [VT-D]dmar.c:797: found ACPI_DMAR_RMRR:
> > (XEN) [VT-D]dmar.c:383:  endpoint: 0000:00:02.0
> > (XEN) [VT-D]dmar.c:666:   RMRR region: base_addr bc000000 end_address be1fffff
> > 
> > It looks correct because the second DRHD has the INCLUDE_ALL flag.
> > >
> > > If the latter, Xen can probably gain some DMAR[$FOO]=$BAR command
> > > line workarounds similar to the IVRS ones for AMD systems.
> 
> 
> 
> > >
> > > ~Andrew
> > 
> > _______________________________________________
> > Xen-devel mailing list
> > Xen-devel@xxxxxxxxxxxxx
> > http://lists.xen.org/xen-devel
> 
> 
> Best regards,
> Yang
> 
> 
