RE: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
>-----Original Message-----
>From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
>Sent: Wednesday, September 29, 2010 12:19 AM
>To: Lin, Ray; JBeulich@xxxxxxxxxx
>Cc: Bruce Edge; Jiang, Yunhong; xen-devel@xxxxxxxxxxxxxxxxxxx
>Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
>
>On Tue, Sep 28, 2010 at 10:08:57AM -0600, Lin, Ray wrote:
>> I just checked the "xen dmesg". Looks like DMA/iommu is the root cause of
>> this issue. In order to tell the source of interrupt, the Tachyon chip needs
>> to do a DMA write to a dword memory location to indicate the source of
>> interrupt. What iommu option do you recommend to use?
>
>Let's get Jan involved in this discussion.
>
>Jan, would some of your patches that inhibit the MSI write affect this
>in a PV guest?

As far as I can tell, this patch should not cause an issue on the VT-d side; instead, it's more about access from the CPU.
Can you try to pass an option to xen as "iommu=off", and check the result?

Thanks
--jyh

>
>>
>> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
>> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0] fault addr c00000, iommu reg = ffff82c3fff57000
>> (XEN) DMAR:[fault reason 05h] PTE Write access is not set
>> (XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = c00
>> (XEN) root_entry = ffff83019ff70000
>> (XEN) root_entry[7] = 19cf52001
>> (XEN) context = ffff83019cf52000
>> (XEN) context[0] = 102_706dc005
>> (XEN) l4 = ffff8300706dc000
>> (XEN) l4_index = 0
>> (XEN) l4[0] = 706db003
>> (XEN) l3 = ffff8300706db000
>> (XEN) l3_index = 0
>> (XEN) l3[0] = 706da003
>> (XEN) l2 = ffff8300706da000
>> (XEN) l2_index = 6
>> (XEN) l2[6] = 0
>>
>> -Ray
>>
>> ________________________________
>> From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruce Edge
>> Sent: Monday, September 27, 2010 9:46 PM
>> To: Jiang, Yunhong
>> Cc: xen-devel@xxxxxxxxxxxxxxxxxxx; Konrad Rzeszutek Wilk
>> Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
>>
>> On Mon, Sep 27, 2010 at 8:26 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:
>> "xm dmesg" should give xen's boot log, and sometimes it contains some helpful
>> information, I think, especially if loglvl and guest_loglvl are set to all.
>>
>> I looked at the xm dmesg output and there's nothing more than what I already
>> provided, aside from a bunch of commands from me poking at it.
>>
>> -Bruce
>>
>> Thanks
>> --jyh
>>
>> From: Bruce Edge [mailto:bruce.edge@xxxxxxxxx]
>> Sent: Tuesday, September 28, 2010 11:16 AM
>> To: Jiang, Yunhong
>> Cc: Konrad Rzeszutek Wilk; xen-devel@xxxxxxxxxxxxxxxxxxx
>> Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
>>
>> On Mon, Sep 27, 2010 at 6:15 PM, Jiang, Yunhong <yunhong.jiang@xxxxxxxxx> wrote:
>> Is the 07:0.0 your tachyon device? The VT-d fault is suspicious.
>>
>> Yes, there is 1 quad port card in this system:
>>
>> 07:00.0 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
>> 07:00.1 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
>> 07:00.2 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
>> 07:00.3 Fibre Channel: PMC-Sierra Inc. Device 8032 (rev 08)
>>
>> Also is it possible to share the xen output?
>>
>> I attached the dom0 boot output. Let me know if you wanted something else.
>>
>> Also, here's the dom0 console output upon starting the VM: This lockdep error
>> started with the release of 2.6.32.21. Note that I'm running the same kernel
>> for the domU and dom0.
>>
>> [ 1817.684097] ------------[ cut here ]------------
>> [ 1817.684113] WARNING: at kernel/lockdep.c:2323 trace_hardirqs_on_caller+0x12f/0x190()
>> [ 1817.684119] Hardware name: ProLiant DL380 G6
>> [ 1817.684122] Modules linked in: xt_physdev ipv6 osa_mfgdom0 xenfs xen_gntdev fbcon tileblit font bitblit softcursor xen_evtchn xen_pciback radeon ttm drm_kms_helper tun drm i2c_algo_bit ipmi_si i2c_core ipmi_msghandler joydev serio_raw hpwdt hpilo bridge stp llc usbhid hid cciss usb_storage
>> [ 1817.684190] Pid: 11, comm: xenwatch Not tainted 2.6.32.21-xenoprof-1 #1
>> [ 1817.684195] Call Trace:
>> [ 1817.684197] <IRQ> [<ffffffff810aa18f>] ? trace_hardirqs_on_caller+0x12f/0x190
>> [ 1817.684209] [<ffffffff8106bed0>] warn_slowpath_common+0x80/0xd0
>> [ 1817.684217] [<ffffffff815f2b80>] ? _spin_unlock_irq+0x30/0x40
>> [ 1817.684223] [<ffffffff8106bf34>] warn_slowpath_null+0x14/0x20
>> [ 1817.684229] [<ffffffff810aa18f>] trace_hardirqs_on_caller+0x12f/0x190
>> [ 1817.684234] [<ffffffff810aa1fd>] trace_hardirqs_on+0xd/0x10
>> [ 1817.684240] [<ffffffff815f2b80>] _spin_unlock_irq+0x30/0x40
>> [ 1817.684266] [<ffffffff813c4fc5>] add_to_net_schedule_list_tail+0x85/0xd0
>> [ 1817.684271] [<ffffffff813c6216>] netif_be_int+0x36/0x160
>> [ 1817.684278] [<ffffffff810e10d0>] handle_IRQ_event+0x70/0x180
>> [ 1817.684284] [<ffffffff810e36e9>] handle_edge_irq+0xc9/0x170
>> [ 1817.684291] [<ffffffff813b8d7f>] __xen_evtchn_do_upcall+0x1bf/0x1f0
>> [ 1817.684297] [<ffffffff813b92fd>] xen_evtchn_do_upcall+0x3d/0x60
>> [ 1817.684304] [<ffffffff8101647e>] xen_do_hypervisor_callback+0x1e/0x30
>> [ 1817.684308] <EOI> [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
>> [ 1817.684319] [<ffffffff8100940a>] ? hypercall_page+0x40a/0x1010
>> [ 1817.684325] [<ffffffff813bce54>] ? xb_write+0x1e4/0x290
>> [ 1817.684330] [<ffffffff813bd8ca>] ? xs_talkv+0x6a/0x1f0
>> [ 1817.684336] [<ffffffff813bd8d8>] ? xs_talkv+0x78/0x1f0
>> [ 1817.684341] [<ffffffff813bdbcd>] ? xs_single+0x4d/0x60
>> [ 1817.684346] [<ffffffff813be502>] ? xenbus_read+0x52/0x80
>> [ 1817.684352] [<ffffffff813c87fc>] ? frontend_changed+0x48c/0x770
>> [ 1817.684358] [<ffffffff813bf76d>] ? xenbus_otherend_changed+0xdd/0x1b0
>> [ 1817.684365] [<ffffffff8101122f>] ? xen_restore_fl_direct_end+0x0/0x1
>> [ 1817.684371] [<ffffffff810ac830>] ? lock_release+0xb0/0x230
>> [ 1817.684376] [<ffffffff813bfae0>] ? frontend_changed+0x10/0x20
>> [ 1817.684382] [<ffffffff813bd4f5>] ? xenwatch_thread+0x55/0x160
>> [ 1817.684389] [<ffffffff81093400>] ? autoremove_wake_function+0x0/0x40
>> [ 1817.684394] [<ffffffff813bd4a0>] ? xenwatch_thread+0x0/0x160
>> [ 1817.684400] [<ffffffff81093086>] ? kthread+0x96/0xb0
>> [ 1817.684405] [<ffffffff8101632a>] ? child_rip+0xa/0x20
>> [ 1817.684410] [<ffffffff81015c90>] ? restore_args+0x0/0x30
>> [ 1817.684415] [<ffffffff81016320>] ? child_rip+0x0/0x20
>>
>> -Bruce
>>
>> Thanks
>> --jyh
>>
>> >-----Original Message-----
>> >From: xen-devel-bounces@xxxxxxxxxxxxxxxxxxx [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxx] On Behalf Of Bruce Edge
>> >Sent: Tuesday, September 28, 2010 7:54 AM
>> >To: Konrad Rzeszutek Wilk
>> >Cc: xen-devel@xxxxxxxxxxxxxxxxxxx
>> >Subject: Re: [Xen-devel] pv-ops domU not working with MSI interrupts on Nehalem
>> >
>> >On Mon, Sep 27, 2010 at 12:54 PM, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
>> >> On Mon, Sep 27, 2010 at 12:16:50PM -0700, Bruce Edge wrote:
>> >>> On Mon, Sep 27, 2010 at 10:24 AM, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
>> >>> >
>> >>> > On Mon, Sep 27, 2010 at 08:52:39AM -0700, Bruce Edge wrote:
>> >>> > > One of our developers who is working on a tachyon driver is
>> >>> > > complaining that the pvops domU kernel is not working for these MSI
>> >>> > > interrupts.
>> >>> > > This is using the current head of xen/2.6.32.x on both a single
>> >>> > > Nehalem 920 and a dual E5540. This behavior is consistent with Xen
>> >>> > > 4.0.1, 4.0.2.rc1-pre and 4.1.
>> >>> > >
>> >>> > > Here are his comments:
>> >>> > >
>> >>> > > - the driver has no problem to enable msi interrupt and request the
>> >>> > > interrupt through kernel functions pci_enable_msi & request_irq
>> >>> >
>> >>> > What shows up in the Xen console when you send the 'q' key? Does it
>> >>> > show that the vector is assigned to the appropriate guest?
>> >>>
>> >>> The Xen console q key shows that the domU is assigned:
>> >>>
>> >>> (XEN) Interrupts { 32, 41-42, 47 }
>> >>
>> >> Aha!
>> >>
>> >>>
>> >>> but the domU thinks it has:
>> >>>
>> >>> 124/125/126/127
>> >>>
>> >>> Is there some mapping that's taking place, or is this plain wrong?
>> >>
>> >> That looks wrong. The IRQ numbers (even though they are MSI vectors) are
>> >> set up as IRQ numbers in the DomU guest. You should have seen
>> >>
>> >> 32:
>> >> 41:
>> >> 42:
>> >> 47:
>> >> in your /proc/interrupts on your DomU guest.
>> >>
>> >> I wonder what broke - can you use
>> >> git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
>> >> devel/xen-pcifront-0.5 (or pv/pcifront-2.6.32)?
>> >
>> >Please forgive the git ignorance.
>> >
>> >Is this the right syntax?
>> >
>> >git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad:pv/pcifront-2.6.32 linux-2.6.32-pv-pcifront
>> >
>> >Initialized empty Git repository in
>> >/import/kaan/bedge/src/xen/kernel/pv-ops/linux-2.6.32-pv-pcifront/.git/
>> >fatal: The remote end hung up unexpectedly
>> >
>> >Or:
>> >
>> >git clone git://git.kernel.org/pub/scm/linux/kernel/git/konrad/xen.git
>> >
>> >Initialized empty Git repository in
>> >/import/kaan/bedge/src/xen/kernel/pv-ops/xen/.git/
>> >remote: error: Could not read 59eab2f8f04147c5aadc99f2034ca7e5b81e890f
>> >remote: fatal: Failed to traverse parents of commit 979e121cb348add17ed8171bf447b27a3a9d1be3
>> >remote: aborting due to possible repository corruption on the remote side.
>> >fatal: early EOF
>> >fatal: index-pack failed
>> >
>> >>
>> >> It has the latest pcifront driver but without the PVonHVM enhancements
>> >> so we can try to eliminate the PVonHVM logic out of the picture.
>> >>
>> >>>
>> >>> >
>> >>> > > - the interrupt does happen. But the interrupt service routine of
>> >>> > > tachyon driver doesn't detect any interrupt status related to this
>> >>> > > interrupt, which inhibits the tachyon chip from coming on-line. And
>> >>> > > there is a high count of tachyon interrupts in /proc/interrupts
>> >>> >
>> >>> > Is it checking the PCI_STATUS_INTERRUPT or the appropriate register
>> >>> > in the MMIO BAR?
>> >>> >
>> >>>
>> >>> The driver would check the appropriate register (tachyon registers) in
>> >>> the MMIO to determine the source of interrupts.
>> >>
>> >> OK, so that isn't it. Is there anything at these vectors:
>> >> 7c, 7d, 7e, and 7f? When you use xen debug-keys 'i' or 'q' it should give
>> >> you an inkling what device this is set for.
>> >
>> >When I run a distro kernel in hvm mode, I get the expected irq mappings:
>> >
>> >'i' - Note 66 - 69
>> >(XEN) IRQ: 66 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:3a type=PCI-MSI status=00000010 in-flight=0 domain-list=10:127(----),
>> >(XEN) IRQ: 67 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:42 type=PCI-MSI status=00000010 in-flight=0 domain-list=10:126(----),
>> >(XEN) IRQ: 68 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:4a type=PCI-MSI status=00000010 in-flight=0 domain-list=10:125(----),
>> >(XEN) IRQ: 69 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:52 type=PCI-MSI status=00000010 in-flight=0 domain-list=10:124(----)
>> >
>> >'q'
>> >(XEN) Interrupts { 32, 41-42, 47, 124-127 }
>> >
>> >The same data with pv-ops kernel shows:
>> >
>> >'i'
>> >IRQ numbers stop at 65, no 66 - 69 present:
>> >
>> >(XEN) IRQ: 63 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:91 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:289(----),
>> >(XEN) IRQ: 64 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:99 type=PCI-MSI status=00000002 mapped, unbound
>> >(XEN) IRQ: 65 affinity:ffffffff,ffffffff,ffffffff,ffffffff vec:b1 type=PCI-MSI status=00000010 in-flight=0 domain-list=0:287(----),
>> >(XEN) IO-APIC interrupt information:
>> >
>> >'q'
>> >(XEN) Interrupts { 32, 41-42, 47 }
>> >
>> >>
>> >>>
>> >>> > >
>> >>> > > kaan-18-dpm:~# cat /proc/interrupts | grep TACH
>> >>> > > 124: 760415 0 0 0 0 0 0 0 0 0 0 0 0 0 xen-pirq-pcifront-msi HW_TACHYON
>> >>> > > 125: 762234 0 0 0 0 0 0 0 0 0 0 0 0 0 xen-pirq-pcifront-msi HW_TACHYON
>> >>> > > 126: 764180 0 0 0 0 0 0 0 0 0 0 0 0 0 xen-pirq-pcifront-msi HW_TACHYON
>> >>> > > 127: 764164 0 0 0 0 0 0 0 0 0 0 0 0 0 xen-pirq-pcifront-msi HW_TACHYON
>> >>> >
>> >>> > Can you provide the full dmesg output?
>> >>>
>> >>> Attached.
>> >>>
>> >>> Some possibly related messages on dom0 console:
>> >>>
>> >>> [ 1882.269778] pciback 0000:07:00.0: enabling device (0000 -> 0003)
>> >>> [ 1882.269800] xen: registering gsi 32 triggering 0 polarity 1
>> >>> [ 1882.269827] xen_allocate_pirq: returning irq 32 for gsi 32
>> >>> [ 1882.269834] xen: --> irq=32
>> >>> [ 1882.269841] Already setup the GSI :32
>> >>> [ 1882.269847] pciback 0000:07:00.0: PCI INT A -> GSI 32 (level, low) -> IRQ 32
>> >>> [ 1882.269866] pciback 0000:07:00.0: setting latency timer to 64
>> >>> [ 1882.270463] pciback 0000:07:00.0: Driver tried to write to a
>> >>> read-only configuration space field at offset 0x62, size 2. This may
>> >>> be harmless, but if you have problems with your device:
>> >>
>> >> Uhhh, for that I think you need to do 'lspci -vvv -xxx -s 07:00.00'
>> >> to find out what is at the configuration space. You could enable
>> >> it using the permissive attribute.
>> >>
>> >>> [ 1882.270465] 1) see permissive attribute in sysfs
>> >>> [ 1882.270467] 2) report problems to the xen-devel mailing list along
>> >>> with details of your device obtained from lspci.
>> >>> [ 1882.270615] alloc irq_desc for 478 on node 0
>> >>> [ 1882.270625] alloc kstat_irqs on node 0
>> >>
>> >> So for 478: what do you see? xen-pciback I presume?
>> >>> [ 1882.348411] pciback 0000:07:00.1: enabling device (0000 -> 0003)
>> >>> [ 1882.348433] xen: registering gsi 42 triggering 0 polarity 1
>> >>> [ 1882.348440] xen_allocate_pirq: returning irq 42 for gsi 42
>> >>> [ 1882.348445] xen: --> irq=42
>> >>> [ 1882.348472] Already setup the GSI :42
>> >>> [ 1882.348479] pciback 0000:07:00.1: PCI INT B -> GSI 42 (level, low) -> IRQ 42
>> >>> [ 1882.348497] pciback 0000:07:00.1: setting latency timer to 64
>> >>> [ 1882.349063] pciback 0000:07:00.1: Driver tried to write to a
>> >>> read-only configuration space field at offset 0x62, size 2. This may
>> >>> be harmless, but if you have problems with your device:
>> >>> [ 1882.349066] 1) see permissive attribute in sysfs
>> >>> [ 1882.349067] 2) report problems to the xen-devel mailing list along
>> >>> with details of your device obtained from lspci.
>> >>> [ 1882.349205] alloc irq_desc for 477 on node 0
>> >>> [ 1882.349215] alloc kstat_irqs on node 0
>> >>> [ 1882.402893] pciback 0000:07:00.2: enabling device (0000 -> 0003)
>> >>> [ 1882.402908] xen: registering gsi 47 triggering 0 polarity 1
>> >>> [ 1882.402913] xen_allocate_pirq: returning irq 47 for gsi 47
>> >>> [ 1882.402916] xen: --> irq=47
>> >>> [ 1882.402921] Already setup the GSI :47
>> >>> [ 1882.402925] pciback 0000:07:00.2: PCI INT C -> GSI 47 (level, low) -> IRQ 47
>> >>> [ 1882.402938] pciback 0000:07:00.2: setting latency timer to 64
>> >>> [ 1882.403280] pciback 0000:07:00.2: Driver tried to write to a
>> >>> read-only configuration space field at offset 0x62, size 2. This may
>> >>> be harmless, but if you have problems with your device:
>> >>> [ 1882.403282] 1) see permissive attribute in sysfs
>> >>> [ 1882.403282] 2) report problems to the xen-devel mailing list along
>> >>> with details of your device obtained from lspci.
>> >>> [ 1882.403380] alloc irq_desc for 476 on node 0
>> >>> [ 1882.403386] alloc kstat_irqs on node 0
>> >>> (XEN) [VT-D]iommu.c:824: iommu_fault_status: Primary Pending Fault
>> >>> (XEN) [VT-D]iommu.c:799: DMAR:[DMA Write] Request device [07:00.0] fault addr e6f80000, iommu reg = ffff82c3fff57000
>> >>> (XEN) DMAR:[fault reason 05h] PTE Write access is not set
>> >>> (XEN) print_vtd_entries: iommu = ffff83019fffa370 bdf = 7:0.0 gmfn = e6f80
>> >>> (XEN) root_entry = ffff83019ff70000
>> >>> (XEN) root_entry[7] = 19cf52001
>> >>> (XEN) context = ffff83019cf52000
>> >>> (XEN) context[0] = 102_706dc005
>> >>> (XEN) l4 = ffff8300706dc000
>> >>> (XEN) l4_index = 0
>> >>> (XEN) l4[0] = 706db003
>> >>> (XEN) l3 = ffff8300706db000
>> >>> (XEN) l3_index = 3
>> >>> (XEN) l3[3] = 702b6003
>> >>> (XEN) l2 = ffff8300702b6000
>> >>> (XEN) l2_index = 137
>> >>> (XEN) l2[137] = 0
>> >>> (XEN) l2[137] not present
>> >>> (XEN) traps.c:466:d0 Unhandled nmi fault/trap [#2] on VCPU 0 [ec=0000]
>> >>
>> >> That is not good. What changed from your earlier emails that this was triggered?
>> >
>> >Nothing
>> >> Or was it triggered all along?
>> >
>> >Yes, I just included it for completeness
>> >
>> >> What happens if you run the system without the iommu enabled?
>> >
>> >Haven't tried yet. Will check that next.
>> >
>> >-Bruce
>> >
>> >_______________________________________________
>> >Xen-devel mailing list
>> >Xen-devel@xxxxxxxxxxxxxxxxxxx
>> >http://lists.xensource.com/xen-devel
>>
>>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
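
For anyone reading the two VT-d fault dumps quoted in this thread, the gmfn and table indices can be cross-checked against the fault address with a little arithmetic. This is only a sketch under assumptions that are not stated in the thread: 4 KiB pages, a 4-level page table with 9 index bits per level, and indices printed by Xen in hexadecimal (the numbers in the log are consistent with that). The helper name vtd_indices is hypothetical and is not a Xen function.

```python
def vtd_indices(fault_addr, levels=4, page_shift=12, bits_per_level=9):
    """Split a DMA fault address into a frame number and per-level indices.

    Assumes 4 KiB pages and 9 index bits per table level (hypothetical
    defaults chosen to match the dumps in this thread).
    """
    gfn = fault_addr >> page_shift          # frame number, "gmfn" in the log
    mask = (1 << bits_per_level) - 1        # 0x1ff for 9 bits
    indices = {}
    for level in range(levels, 0, -1):      # l4 first, l1 last
        shift = bits_per_level * (level - 1)
        indices[f"l{level}"] = (gfn >> shift) & mask
    return gfn, indices


if __name__ == "__main__":
    # The two fault addresses reported by Xen in this thread.
    for addr in (0xc00000, 0xe6f80000):
        gfn, idx = vtd_indices(addr)
        fields = " ".join(f"{name}_index={value:x}" for name, value in idx.items())
        print(f"fault addr {addr:x}: gmfn={gfn:x} {fields}")
```

For fault addr c00000 this gives gmfn c00 with l3_index 0 and l2_index 6, and for e6f80000 it gives gmfn e6f80 with l3_index 3 and l2_index 137 (hex), which agrees with the dumps. In both cases the walk stops at an l2 entry of 0, i.e. the address the device writes to has no mapping in the IOMMU page table for that device.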