Re: [Xen-devel] swiotlb=force in Konrad's xen-pcifront-0.8.2 pvops domU kernel with PCI passthrough
Here's another data point: iommu=1,passthrough,no-intremap,verbose on the Xen command line combined with iommu=soft on the pvops domU command line also results in an NMI (see below). Replacing iommu=soft with swiotlb=force in the pvops domU works reliably, but with the I/O performance degradation. It seems that regardless of whether the IOMMU is enabled or disabled in the hypervisor, swiotlb=force is necessary in the pvops domU.

(XEN)
(XEN) NMI - I/O ERROR
(XEN) ----[ Xen-4.1-unstable x86_64 debug=y Not tainted ]----
(XEN) CPU: 0
(XEN) RIP: e008:[<ffff82c48015c006>] do_IRQ+0x375/0x59c
(XEN) RFLAGS: 0000000000000002 CONTEXT: hypervisor
(XEN) rax: ffff83011dae4460 rbx: ffff8301616a6990 rcx: 000000000000010c
(XEN) rdx: 000000000000010c rsi: 0000000000000086 rdi: 0000000000000001
(XEN) rbp: ffff82c480287e28 rsp: ffff82c480287db8 r8: 000000000000007a
(XEN) r9: ffff8300df4d4060 r10: ffff83019fffac88 r11: 000001958595f304
(XEN) r12: ffff83011dae2000 r13: 0000000000000000 r14: 000000000000007f
(XEN) r15: ffff83019fe02200 cr0: 000000008005003b cr4: 00000000000026f0
(XEN) cr3: 00000001261ff000 cr2: 0000000000783000
(XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
(XEN) Xen stack trace from rsp=ffff82c480287db8:
(XEN) 0000000000000043 0000000000000043 ffff83019fe02234 0000000000000000
(XEN) 000000000000010c ffff830000000000 ffff82c4802c2400 0000000000000002
(XEN) ffff82c480287e10 ffff82c480287f18 ffff82c48024f6c0 ffff82c480287f18
(XEN) ffff82c4802c2300 0000000000000002 00007d3b7fd781a7 ffff82c480154ee6
(XEN) 0000000000000002 ffff82c4802c2300 ffff82c480287f18 ffff82c48024f6c0
(XEN) ffff82c480287ee0 ffff82c480287f18 000001958595f304 ffff83019fffac88
(XEN) ffff8300df4d4060 ffff83019fffa9f0 ffff82c4802c23a0 0000000000000000
(XEN) 0000000000000000 ffff82c4802c2e80 0000000000000000 0000007a00000000
(XEN) ffff82c48014e3c3 000000000000e008 0000000000000246 ffff82c480287ee0
(XEN) 000000000000e010 ffff82c480287f10 ffff82c480150664 0000000000000000
(XEN) ffff8300df2fc000 ffff8300df4d4000 00000000ffffffff ffff82c480287db8
(XEN) 0000000000000000 ffffffffffffffff ffffffff81787160 ffffffff81669fd8
(XEN) ffffffff81669ed0 ffffffff81668000 0000000000000246 ffff8800067c0200
(XEN) 0000019575abe291 0000000000000000 0000000000000000 ffffffff810093aa
(XEN) 0000000400000000 00000000deadbeef 00000000deadbeef 0000010000000000
(XEN) ffffffff810093aa 000000000000e033 0000000000000246 ffffffff81669eb8
(XEN) 000000000000e02b 0000000000000000 0000000000000000 0000000000000000
(XEN) 0000000000000000 0000000000000000 ffff8300df2fc000 0000000000000000
(XEN) 0000000000000000
(XEN) Xen call trace:
(XEN) [<ffff82c48015c006>] do_IRQ+0x375/0x59c
(XEN) [<ffff82c480154ee6>] common_interrupt+0x26/0x30
(XEN) [<ffff82c48014e3c3>] default_idle+0x82/0x87
(XEN) [<ffff82c480150664>] idle_loop+0x5a/0x68
(XEN)
(XEN)
(XEN) ****************************************
(XEN) Panic on CPU 0:
(XEN) FATAL TRAP: vector = 2 (nmi)
(XEN) [error_code=0000] , IN INTERRUPT CONTEXT
(XEN) ****************************************
(XEN)
(XEN) Reboot in five seconds...

- Dante

On Thu, Nov 11, 2010 at 2:32 PM, Dante Cinco <dantecinco@xxxxxxxxx> wrote:
> With iommu=off,verbose in the Xen commandline, pvops domU works only
> with swiotlb=force and with the same performance degradation. Without
> swiotlb=force, there's no NMI but DMA does not work (see Ray Lin's
> reply on Thu 11/11/2010 11:42 AM).
>
> The XenPCIpassthrough wiki
> (http://wiki.xensource.com/xenwiki/XenPCIpassthrough) talks about
> setting iommu=pv in order to use the hardware IOMMU (VT-d) passthru
> for PV guests but I didn't see any difference compared to my original
> setting (iommu=1,passthrough,no-intremap). Is iommu=pv still required
> for this particular pvops domU kernel (xen-pcifront-0.8.2) and if it
> is, what should I be looking for in the Xen log (xm dmesg) to verify
> its efficacy?
>
> With my original setting (iommu=1,passthrough,no-intremap), here's what I see:
>
> (XEN) [VT-D]dmar.c:702: Host address width 39
> (XEN) [VT-D]dmar.c:717: found ACPI_DMAR_DRHD:
> (XEN) [VT-D]dmar.c:413: dmaru->address = e7ffe000
> (XEN) [VT-D]iommu.c:1136: drhd->address = e7ffe000 iommu->reg = ffff82c3fff57000
> (XEN) [VT-D]iommu.c:1138: cap = c90780106f0462 ecap = f0207e
> (XEN) [VT-D]dmar.c:356: IOAPIC: 0:1e.1
> (XEN) [VT-D]dmar.c:356: IOAPIC: 0:13.0
> (XEN) [VT-D]dmar.c:427: flags: INCLUDE_ALL
> (XEN) [VT-D]dmar.c:722: found ACPI_DMAR_RMRR:
> (XEN) [VT-D]dmar.c:341: endpoint: 0:1d.7
> (XEN) [VT-D]dmar.c:594: RMRR region: base_addr df7fc000 end_address df7fdfff
> (XEN) [VT-D]dmar.c:722: found ACPI_DMAR_RMRR:
> (XEN) [VT-D]dmar.c:341: endpoint: 0:1d.0
> (XEN) [VT-D]dmar.c:341: endpoint: 0:1d.1
> (XEN) [VT-D]dmar.c:341: endpoint: 0:1d.2
> (XEN) [VT-D]dmar.c:341: endpoint: 0:1d.3
> (XEN) [VT-D]dmar.c:341: endpoint: 2:0.0
> (XEN) [VT-D]dmar.c:341: endpoint: 2:0.2
> (XEN) [VT-D]dmar.c:341: endpoint: 2:0.4
> (XEN) [VT-D]dmar.c:594: RMRR region: base_addr df7f5000 end_address df7fafff
> (XEN) [VT-D]dmar.c:722: found ACPI_DMAR_RMRR:
> (XEN) [VT-D]dmar.c:341: endpoint: 5:0.0
> (XEN) [VT-D]dmar.c:341: endpoint: 2:0.0
> (XEN) [VT-D]dmar.c:341: endpoint: 2:0.2
> (XEN) [VT-D]dmar.c:594: RMRR region: base_addr df63e000 end_address df63ffff
> (XEN) [VT-D]dmar.c:727: found ACPI_DMAR_ATSR:
> (XEN) [VT-D]dmar.c:622: atsru->all_ports: 0
> (XEN) [VT-D]dmar.c:327: bridge: 0:a.0 start = 0 sec = 7 sub = 7
> (XEN) [VT-D]dmar.c:327: bridge: 0:9.0 start = 0 sec = 8 sub = a
> (XEN) [VT-D]dmar.c:327: bridge: 0:8.0 start = 0 sec = b sub = d
> (XEN) [VT-D]dmar.c:327: bridge: 0:7.0 start = 0 sec = e sub = 10
> (XEN) [VT-D]dmar.c:327: bridge: 0:6.0 start = 0 sec = 18 sub = 1a
> (XEN) [VT-D]dmar.c:327: bridge: 0:5.0 start = 0 sec = 15 sub = 17
> (XEN) [VT-D]dmar.c:327: bridge: 0:4.0 start = 0 sec = 14 sub = 14
> (XEN) [VT-D]dmar.c:327: bridge: 0:3.0 start = 0 sec = 11 sub = 13
> (XEN) [VT-D]dmar.c:327: bridge: 0:2.0 start = 0 sec = 6 sub = 6
> (XEN) [VT-D]dmar.c:327: bridge: 0:1.0 start = 0 sec = 5 sub = 5
> (XEN) Intel VT-d Snoop Control not enabled.
> (XEN) Intel VT-d Dom0 DMA Passthrough not enabled.
> (XEN) Intel VT-d Queued Invalidation enabled.
> (XEN) Intel VT-d Interrupt Remapping not enabled.
> (XEN) I/O virtualisation enabled
> (XEN) - Dom0 mode: Relaxed
> (XEN) Enabled directed EOI with ioapic_ack_old on!
> (XEN) [VT-D]iommu.c:743: iommu_enable_translation: iommu->reg = ffff82c3fff57000
>
> domU bringup:
>
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 11:0.3
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 11:0.3
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 11:0.2
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 11:0.2
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 11:0.1
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 11:0.1
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 11:0.0
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 11:0.0
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 8:0.3
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 8:0.3
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 8:0.2
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 8:0.2
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 8:0.1
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 8:0.1
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 8:0.0
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 8:0.0
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 15:0.0
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 15:0.0
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 15:0.1
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 15:0.1
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 18:0.0
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 18:0.0
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = 18:0.1
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = 18:0.1
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = b:0.0
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = b:0.0
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = b:0.1
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = b:0.1
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = e:0.0
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = e:0.0
> (XEN) [VT-D]iommu.c:1514: d0:PCIe: unmap bdf = e:0.1
> (XEN) [VT-D]iommu.c:1387: d1:PCIe: map bdf = e:0.1
> mapping kernel into physical memory
> about to get started...
>
> - Dante
>
> On Thu, Nov 11, 2010 at 11:03 AM, Konrad Rzeszutek Wilk
> <konrad.wilk@xxxxxxxxxx> wrote:
>> On Thu, Nov 11, 2010 at 10:31:48AM -0800, Dante Cinco wrote:
>>> Konrad,
>>>
>>> Without swiotlb=force, I don't see "PCI-DMA: Using software bounce
>>> buffering for IO" in /var/log/kern.log.
>>>
>>> With iommu=soft and without swiotlb=force, I see the "software bounce
>>> buffering" in /var/log/kern.log and an NMI (see below) when I load the
>>> kernel module drivers. I made sure the NMI is reproducible and not a
>>
>> What is the kernel module doing to cause this? DMA?
>>> one-time event.
>>
>> So doing 64-bit DMA causes an NMI. Do you have the Hypervisor's IOMMU VT-d
>> enabled or disabled? (iommu=off,verbose) If you turn it off does this work?
>>>
>>> /var/log/kern.log (iommu=soft):
>>> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>>> Placing 64MB software IO TLB between ffff880005800000 - ffff880009800000
>>> software IO TLB at phys 0x5800000 - 0x9800000
>>>
>>> (XEN)
>>> (XEN)
>>> (XEN) NMI - I/O ERROR
>>> (XEN) ----[ Xen-4.1-unstable x86_64 debug=y Not tainted ]----
>>> (XEN) CPU: 0
>>> (XEN) RIP: e008:[<ffff82c4801701b2>] smp_send_event_check_mask+0x1/0x10
>>> (XEN) RFLAGS: 0000000000000012 CONTEXT: hypervisor
>>> (XEN) rax: 0000000000000080 rbx: ffff82c480287c48 rcx: 0000000000000000
>>> (XEN) rdx: 0000000000000080 rsi: 0000000000000080 rdi: ffff82c480287c48
>>> (XEN) rbp: ffff82c480287c78 rsp: ffff82c480287c38 r8: 0000000000000000
>>> (XEN) r9: 0000000000000037 r10: 0000ffff0000ffff r11: 00ff00ff00ff00ff
>>> (XEN) r12: ffff82c48029f080 r13: 0000000000000001 r14: 0000000000000008
>>> (XEN) r15: ffff82c4802b0c20 cr0: 000000008005003b cr4: 00000000000026f0
>>> (XEN) cr3: 00000001250a9000 cr2: 00007f6165ae9428
>>> (XEN) ds: 0000 es: 0000 fs: 0000 gs: 0000 ss: e010 cs: e008
>>> (XEN) Xen stack trace from rsp=ffff82c480287c38:
>>> (XEN) ffff82c480287c78 ffff82c48012001f 0000000000000100 0000000000000000
>>> (XEN) ffff82c480287ca8 ffff83011dadd8b0 ffff83019fffa9d0 ffff82c4802c2300
>>> (XEN) ffff82c480287cc8 ffff82c480117d0d ffff82c48029f080 0000000000000001
>>> (XEN) 0000000000000100 0000000000000000 0000000000000002 ffff8300df606000
>>> (XEN) 000000411de66867 ffff82c4802c2300 ffff82c480287d28 ffff82c48011f299
>>> (XEN) 0000000000000100 0000000000000086 ffff83019e3fa000 ffff83011dadd8b0
>>> (XEN) ffff83019fffa9d0 ffff8300df606000 0000000000000000 0000000000000000
>>> (XEN) 000000000000007f ffff83019fe02200 ffff82c480287d38 ffff82c48011f6ea
>>> (XEN) ffff82c480287d58 ffff82c48014e4c1 ffff83011dae2000 0000000000000066
>>> (XEN) ffff82c480287d68 ffff82c48014e54d ffff82c480287d98 ffff82c480105d59
>>> (XEN) ffff82c480287da8 ffff8301616a6990 ffff83011dae2000 0000000000000000
>>> (XEN) ffff82c480287da8 ffff82c480105f81 ffff82c480287e28 ffff82c48015c043
>>> (XEN) 0000000000000043 0000000000000043 ffff83019fe02234 0000000000000000
>>> (XEN) 000000000000010c 0000000000000000 0000000000000000 0000000000000002
>>> (XEN) ffff82c480287e10 ffff82c480287f18 ffff82c48024f6c0 ffff82c480287f18
>>> (XEN) ffff82c4802c2300 0000000000000002 00007d3b7fd781a7 ffff82c480154ee6
>>> (XEN) 0000000000000002 ffff82c4802c2300 ffff82c480287f18 ffff82c48024f6c0
>>> (XEN) ffff82c480287ee0 ffff82c480287f18 00ff00ff00ff00ff 0000ffff0000ffff
>>> (XEN) 0000000000000000 0000000000000000 ffff82c4802c23a0 0000000000000000
>>> (XEN) 0000000000000000 ffff82c4802c2e80 0000000000000000 0000007a00000000
>>> (XEN) Xen call trace:
>>> (XEN) [<ffff82c4801701b2>] smp_send_event_check_mask+0x1/0x10
>>> (XEN) [<ffff82c480117d0d>] csched_vcpu_wake+0x2e1/0x302
>>> (XEN) [<ffff82c48011f299>] vcpu_wake+0x243/0x43e
>>> (XEN) [<ffff82c48011f6ea>] vcpu_unblock+0x4a/0x4c
>>> (XEN) [<ffff82c48014e4c1>] vcpu_kick+0x21/0x7f
>>> (XEN) [<ffff82c48014e54d>] vcpu_mark_events_pending+0x2e/0x32
>>> (XEN) [<ffff82c480105d59>] evtchn_set_pending+0xbf/0x190
>>> (XEN) [<ffff82c480105f81>] send_guest_pirq+0x54/0x56
>>> (XEN) [<ffff82c48015c043>] do_IRQ+0x3b2/0x59c
>>> (XEN) [<ffff82c480154ee6>] common_interrupt+0x26/0x30
>>> (XEN) [<ffff82c48014e3c3>] default_idle+0x82/0x87
>>> (XEN) [<ffff82c480150664>] idle_loop+0x5a/0x68
>>> (XEN)
>>> (XEN)
>>> (XEN) ****************************************
>>> (XEN) Panic on CPU 0:
>>> (XEN) FATAL TRAP: vector = 2 (nmi)
>>> (XEN) [error_code=0000] , IN INTERRUPT CONTEXT
>>> (XEN) ****************************************
>>> (XEN)
>>> (XEN) Reboot in five seconds...
>>>
>>> Dante
>>>
>>>
>>> On Thu, Nov 11, 2010 at 8:04 AM, Konrad Rzeszutek Wilk
>>> <konrad.wilk@xxxxxxxxxx> wrote:
>>> > On Wed, Nov 10, 2010 at 05:16:14PM -0800, Dante Cinco wrote:
>>> >> We have Fibre Channel HBA devices that we PCI passthrough to our pvops
>>> >> domU kernel.
>>> >> Without swiotlb=force in the domU's kernel command line,
>>> >> both domU and dom0 lock up after loading the kernel module drivers for
>>> >> the HBA devices. With swiotlb=force, the domU and dom0 are stable
>>> >
>>> > Whoa. That is not good - what happens if you just pass in iommu=soft?
>>> > Does the PCI-DMA: Using.. show up if you don't pass in any of those
>>> > parameters?
>>> > (I don't think it does, but just doing 'iommu=soft' should enable it).
>>> >
>>> >
>>> >> after loading the kernel module drivers but the I/O performance is at
>>> >> least an order of magnitude worse than what we were seeing with the
>>> >> HVM kernel. I see the following in /var/log/kern.log in the pvops
>>> >> domU:
>>> >>
>>> >> PCI-DMA: Using software bounce buffering for IO (SWIOTLB)
>>> >> Placing 64MB software IO TLB between ffff880005800000 - ffff880009800000
>>> >> software IO TLB at phys 0x5800000 - 0x9800000
>>> >>
>>> >> Is swiotlb=force responsible for the I/O performance degradation? I
>>> >> don't understand what swiotlb=force does so I would appreciate an
>>> >> explanation or a pointer.
>>> >
>>> > So, you should only need to use 'iommu=soft'. It will enable the Linux
>>> > kernel IOMMU to translate the pseudo-PFNs to the real machine frame
>>> > numbers (bus addresses).
>>> >
>>> > If your card is 64-bit, then that is all it would do. If however your
>>> > card is 32-bit and you are DMA-ing data from above the 32-bit limit, it
>>> > would copy the user-space page to memory below 4GB, DMA that, and when
>>> > done, copy it back to where the user-space page is. This is called
>>> > bounce-buffering and this is why you would use a mix of pci_map_page,
>>> > pci_sync_single_for_[cpu|device] calls around your driver.
>>> >
>>> > However, I think your cards are 64-bit, so you don't need this
>>> > bounce-buffering. But if you say 'swiotlb=force' it will force _all_
>>> > DMAs to go through the bounce-buffer.
>>> >
>>> > So, try just 'iommu=soft' and see what happens.
>>> >
>>
>
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
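
For readers following the bounce-buffering explanation quoted above, here is a rough toy model in Python of the decision Konrad describes: a buffer is copied through a low-memory pool only when the device cannot address it, unless bouncing is forced for every transfer. This is not the kernel's swiotlb code. The names `needs_bounce`, `simulate` and `DmaStats`, the sample addresses, and the simplification of counting one copy per transfer are all assumptions made for illustration.

```python
from dataclasses import dataclass

MASK_32BIT = (1 << 32) - 1   # highest bus address a 32-bit-only card can reach
MASK_64BIT = (1 << 64) - 1   # a 64-bit card can reach everything in this model


@dataclass
class DmaStats:
    direct: int = 0          # transfers done straight from the original page
    bounced: int = 0         # transfers copied through the bounce pool
    bytes_copied: int = 0    # extra CPU copying caused by bouncing


def needs_bounce(bus_addr: int, length: int, dma_mask: int, force: bool) -> bool:
    """Return True if this buffer has to go through the bounce pool."""
    if force:                # models swiotlb=force: every DMA is bounced
        return True
    last_byte = bus_addr + length - 1
    return last_byte > dma_mask   # device cannot address the end of the buffer


def simulate(requests, dma_mask: int, force: bool = False) -> DmaStats:
    """Classify (bus_addr, length) requests as direct or bounced."""
    stats = DmaStats()
    for bus_addr, length in requests:
        if needs_bounce(bus_addr, length, dma_mask, force):
            stats.bounced += 1
            stats.bytes_copied += length   # simplified: one copy per transfer
        else:
            stats.direct += 1
    return stats


# Hypothetical requests: one low buffer, one above 4GB, one straddling 4GB.
REQUESTS = [(0x1000, 4096), (0x1_2000_0000, 4096), (0xFFFF_F000, 8192)]

if __name__ == "__main__":
    print("64-bit card:        ", simulate(REQUESTS, MASK_64BIT))
    print("32-bit card:        ", simulate(REQUESTS, MASK_32BIT))
    print("64-bit card, forced:", simulate(REQUESTS, MASK_64BIT, force=True))
```

In this model a 64-bit card never bounces unless bouncing is forced, in which case every byte is copied by the CPU. That matches the explanation above of why swiotlb=force would cost performance. It says nothing about the cause of the NMI.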