[Xen-devel] AMD_IOV: IO_PAGE_FALT trying to pass through Mellanox ConnectX HCA (debian testing)
Hi list,

I'm having some problems trying to pass through a Mellanox ConnectX HCA to a domU. This is on Xen 4.0.1, with the latest Debian Testing packages:

  ii  xen-hypervisor-4.0-amd64        4.0.1-2
  ii  linux-image-2.6.32-5-xen-amd64  2.6.32-30

The hardware is a Supermicro H8DGT-HIBQF, BIOS revision 1.0c (date 10/29/10). It has two AMD Opteron 6128 CPUs, for a total of 16 cores. The machine has 32GiB of RAM.

The Mellanox adapter looks like this in the dom0:

  02:00.0 InfiniBand: Mellanox Technologies MT26428 [ConnectX VPI PCIe 2.0 5GT/s - IB QDR / 10GigE] (rev b0)
          Subsystem: Super Micro Computer Inc Device 0048
          Flags: fast devsel, IRQ 19
          Memory at fea00000 (64-bit, non-prefetchable) [size=1M]
          Memory at fc800000 (64-bit, prefetchable) [size=8M]
          Capabilities: [40] Power Management version 3
          Capabilities: [48] Vital Product Data
          Capabilities: [9c] MSI-X: Enable- Count=256 Masked-
          Capabilities: [60] Express Endpoint, MSI 00
          Capabilities: [100] Alternative Routing-ID Interpretation (ARI)
          Kernel driver in use: pciback

I've attached the output of xm dmesg (xm.dmesg.txt).

I have the following in the domU config files:

  pci = ['0000:02:00.0']

I've attached the boot log from trying to boot the same kernel as a HVM guest (testsqueezehvm.bootlog.txt).
Doing so generates these four lines of output in xm dmesg:

  (XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c000
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c080
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c040
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:1, device id:0x200, fault address:0x255c0c0

The mlx4_core driver in the domU is not happy:

  [ 0.411867] mlx4_core: Mellanox ConnectX core driver v0.01 (May 1, 2007)
  [ 0.411879] mlx4_core: Initializing 0000:00:00.0
  [ 0.412027] mlx4_core 0000:00:00.0: enabling device (0000 -> 0002)
  [ 0.412027] mlx4_core 0000:00:00.0: Xen PCI enabling IRQ: 19
  [ 1.417477] mlx4_core 0000:00:00.0: Installed FW has unsupported command interface revision 0.
  [ 1.417509] mlx4_core 0000:00:00.0: (Installed FW version is 0.0.000)
  [ 1.417527] mlx4_core 0000:00:00.0: This driver version supports only revisions 2 to 3.
  [ 1.417549] mlx4_core 0000:00:00.0: QUERY_FW command failed, aborting.

When trying to boot a PV domU with kernel options iommu=soft and swiotlb=force, the output is slightly different. The full bootlog is attached (testsqueeze.bootlog.txt). Here's the relevant excerpt:

  [ 0.441684] mlx4_core: Mellanox ConnectX core driver v1.0-ofed1.5.2 (August 4, 2010)
  [ 0.441696] mlx4_core: Initializing 0000:00:00.0
  [ 0.442044] mlx4_core 0000:00:00.0: enabling device (0000 -> 0002)
  [ 0.442741] mlx4_core 0000:00:00.0: Xen PCI enabling IRQ: 19
  [ 2.752125] mlx4_core 0000:00:00.0: NOP command failed to generate MSI-X interrupt IRQ 54).
  [ 2.752158] mlx4_core 0000:00:00.0: Trying again without MSI-X.
  [ 2.884105] mlx4_core 0000:00:00.0: NOP command failed to generate interrupt (IRQ 54), aborting.
  [ 2.884138] mlx4_core 0000:00:00.0: BIOS or ACPI interrupt routing problem?
  [ 2.916920] mlx4_core: probe of 0000:00:00.0 failed with error -16

And xm dmesg quickly fills up with many, many lines like this:

  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43000
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43020
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43040
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43060
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43080
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a430a0
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a430c0
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a430e0
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43100
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43120
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43140
  (XEN) AMD_IOV: IO_PAGE_FALT: domain:2, device id:0x200, fault address:0x70a4309170a43160
  ...

Booting a PV domU with only the swiotlb=force option makes the output much more like the HVM output.

Any thoughts on what could be going on here?

Thanks,
Ward.

Attachment:
xm.dmesg.txt
Attachment:
testsqueeze.bootlog.txt
Attachment:
testsqueezehvm.bootlog.txt

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel