
Re: [Xen-devel] Some MSI related bugs when trying to use VT-d. Help identify software vs hardware problem?



On Wed, May 08, 2013 at 09:45:14AM +0100, George Dunlap wrote:
> On Tue, May 7, 2013 at 11:38 AM, Andrew Bobulsky <rulerof@xxxxxxxxx> wrote:
> > Hello List!
> >
> > I'm having another [rather fruitless] go at trying to get PCIe passthrough
> > to work on my Radeon 6990 card(s).  I have an i7 920 chip in a Gigabyte
> > GA-EX58-EXTREME board that I've flashed a modded BIOS into to add VT-d
> > support... I found the BIOS image on a BIOS modding forum maybe a year or
> > two ago.
> >
> >
> > I stuck an extra Highpoint RocketU 1144A USB 3 card into the board, because
> > I know it works *very well* with IOMMU and its architecture is really
> > convenient... each port on the back is essentially its own PCIe device[1].
> > I was able to "xl pci-assignable-add" the usb controllers, and attach and
> > detach them at will to a Server 2012 DomU.  The dmesg output from that event
> > looked like this:
> >
> >> [49857.921550] xhci_hcd 0000:06:00.0: remove, state 4
> >> [49857.921555] usb usb15: USB disconnect, device number 1
> >> [49857.921600] xHCI xhci_drop_endpoint called for root hub
> >> [49857.921602] xHCI xhci_check_bandwidth called for root hub
> >> [49857.921681] xhci_hcd 0000:06:00.0: USB bus 15 deregistered
> >> [49857.921686] xhci_hcd 0000:06:00.0: remove, state 1
> >> [49857.921689] usb usb13: USB disconnect, device number 1
> >> [49857.921691] usb 13-1: USB disconnect, device number 4
> >> [49857.953278] xHCI xhci_drop_endpoint called for root hub
> >> [49857.953280] xHCI xhci_check_bandwidth called for root hub
> >> [49857.963158] xhci_hcd 0000:06:00.0: USB bus 13 deregistered
> >> [49857.963455] pciback 0000:06:00.0: seizing device
> >> [49857.963500] xen: registering gsi 17 triggering 0 polarity 1
> >> [49857.963503] Already setup the GSI :17
> >> [49857.963514] pciback 0000:06:00.0: MSI-X preparation failed (-38)
> >
> >
> > Nonetheless, it works quite well!  I played audio through a USB headset from
> > the DomU to confirm it as well.
> >
> >
> >
> > However, when I try to "xl pci-assignable-add" one of my VGA controllers
> > from the Radeon, the action completes, and "xl pci-assignable-list" shows
> > the device as available, but "xl pci-attach" never completes, and attempting
> > to "xl pci-assignable-add" the HDMI audio device never returns to the CLI
> > either.

OK, and is 0e:00.0 your VGA controller?

How do you assign the VGA controller? Do you do:

echo "0000:0e:00.0" > /sys/../radeon/unbind
echo "0000:0e:00.0" > /sys/../pciback/new_slot
echo "0000:0e:00.0" > /sys/../pciback/bind
?
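
A minimal sketch of that three-step sequence, written as a dry run so the writes can be checked before anything touches sysfs. The function names `normalize_bdf` and `rebind_steps` are hypothetical, and the two driver directories are passed in by the caller rather than assumed here.

```python
import re

# Accepts "0e:00.0" or "0000:0e:00.0"; the domain defaults to 0000.
_BDF_RE = re.compile(
    r"^(?:(?P<dom>[0-9a-fA-F]{4}):)?"
    r"(?P<bus>[0-9a-fA-F]{2}):(?P<dev>[0-9a-fA-F]{2})\.(?P<fn>[0-7])$"
)


def normalize_bdf(text):
    """Return the full domain:bus:device.function form, or raise ValueError."""
    m = _BDF_RE.match(text.strip())
    if m is None or int(m.group("dev"), 16) > 0x1F:
        raise ValueError("not a PCI address: %r" % text)
    dom = m.group("dom") or "0000"
    return "%s:%s:%s.%s" % (
        dom.lower(), m.group("bus").lower(), m.group("dev").lower(), m.group("fn")
    )


def rebind_steps(bdf, old_driver_dir, pciback_dir):
    """List the (file, value) writes for the unbind / new_slot / bind sequence.

    Both directories are supplied by the caller (hypothetical parameters).
    Nothing is written here; the caller decides whether to apply the list.
    """
    addr = normalize_bdf(bdf)
    return [
        (old_driver_dir.rstrip("/") + "/unbind", addr),
        (pciback_dir.rstrip("/") + "/new_slot", addr),
        (pciback_dir.rstrip("/") + "/bind", addr),
    ]
```

The address check is the useful part: a string with a dot where the second colon belongs is rejected before it is ever written to a sysfs file.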
> >
> > There is no visible output in dmesg from the attempt on the HDMI audio
> > device, but when I run it against the VGA controller, I get this:
> >
> >> [55817.715309] pciback 0000:0e:00.0: seizing device
> >> [55817.737444] ------------[ cut here ]------------
> >> [55817.737447] kernel BUG at drivers/pci/msi.c:346!
> >> [55817.737449] invalid opcode: 0000 [#1] PREEMPT SMP
> >> [55817.737451] Modules linked in: xt_physdev iptable_filter ip_tables
> >> x_tables tun parport_pc ppdev lp parport bnep rfcomm bluetooth rfkill crc16
> >> cpufreq_stats binfmt_misc fuse bridge stp llc ext2 loop snd_hda_codec_hdmi
> >> snd_hda_codec_realtek joydev mperf coretemp crc32c_intel fglrx(PO)
> >> snd_hda_intel snd_hda_codec snd_usb_audio microcode hid_generic mxm_wmi
> >> snd_usbmidi_lib evdev psmouse snd_seq_midi snd_seq_midi_event i2c_i801
> >> pcspkr tpm_tis snd_hwdep tpm snd_rawmidi serio_raw snd_pcm i2c_core 
> >> tpm_bios
> >> snd_seq snd_timer snd_seq_device lpc_ich mfd_core snd ehci_pci soundcore
> >> snd_page_alloc wmi xhci_hcd button processor thermal_sys sg sr_mod cdrom
> >> ext3 jbd mbcache dm_mirror dm_region_hash dm_log dm_mod sd_mod crc_t10dif
> >> usb_storage usbhid hid ahci libahci uhci_hcd ehci_hcd usbcore usb_common
> >> e1000e
> >> [55817.737482] CPU 6
> >> [55817.737484] Pid: 18055, comm: xl Tainted: P           O 3.8.11 #1
> >> Gigabyte Technology Co., Ltd. EX58-EXTREME/EX58-EXTREME
> >> [55817.737485] RIP: e030:[<ffffffff811eab09>]  [<ffffffff811eab09>]
> >> free_msi_irqs+0x5d/0x11b
> >> [55817.737490] RSP: e02b:ffff88034211dd08  EFLAGS: 00010282
> >> [55817.737491] RAX: ffff880420bbd600 RBX: ffff88042096fa80 RCX:
> >> 0000000000000000
> >> [55817.737492] RDX: 0000000000000000 RSI: 0000000000000091 RDI:
> >> 0000000000000011
> >> [55817.737493] RBP: ffff880421f81000 R08: ffff88042096fa80 R09:
> >> ffff88034211dce4
> >> [55817.737494] R10: ffff88034211dd16 R11: 0000000000000000 R12:
> >> ffff880421f81858
> >> [55817.737495] R13: 0000000000000001 R14: 0000000000000000 R15:
> >> 0000000000000001
> >> [55817.737498] FS:  00007fa609475740(0000) GS:ffff88043a2c0000(0000)
> >> knlGS:0000000000000000
> >> [55817.737499] CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
> >> [55817.737500] CR2: ffffffffff600400 CR3: 000000033a673000 CR4:
> >> 0000000000002660
> >> [55817.737501] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> >> 0000000000000000
> >> [55817.737503] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> >> 0000000000000400
> >> [55817.737504] Process xl (pid: 18055, threadinfo ffff88034211c000, task
> >> ffff8804217b5ca0)
> >> [55817.737504] Stack:
> >> [55817.737505]  00000000000000a2 ffff880421f81000 0000000000000000
> >> ffff8803ca2ab6c0
> >> [55817.737507]  ffff880421f81098 ffff880421f810f8 ffff8803ca2abc00
> >> ffffffff811eb21d
> >> [55817.737509]  ffff880421f81000 ffffffff81240822 ffff880421f81098
> >> ffff880421f81000
> >> [55817.737510] Call Trace:
> >> [55817.737513]  [<ffffffff811eb21d>] ? pci_disable_msi+0x28/0x41
> >> [55817.737516]  [<ffffffff81240822>] ? xen_pcibk_reset_device+0x3a/0xa0
> >> [55817.737518]  [<ffffffff8123fd9f>] ? pcistub_init_device+0x167/0x19c
> >> [55817.737521]  [<ffffffff81108661>] ? __kmalloc+0xd6/0xe2
> >> [55817.737523]  [<ffffffff8123ff08>] ? pcistub_probe+0x134/0x1b8
> >> [55817.737525]  [<ffffffff811df0f5>] ? local_pci_probe+0x37/0x5d
> >> [55817.737527]  [<ffffffff811dff7c>] ? pci_device_probe+0xc2/0xe3
> >> [55817.737529]  [<ffffffff81277d71>] ? driver_probe_device+0xa1/0x1ac
> >> [55817.737532]  [<ffffffff81276e38>] ? driver_bind+0x7e/0xc7
> >> [55817.737534]  [<ffffffff81165859>] ? sysfs_write_file+0xd3/0x10f
> >> [55817.737537]  [<ffffffff8110fb7d>] ? vfs_write+0xa4/0xfe
> >> [55817.737539]  [<ffffffff813c22e5>] ? _raw_spin_lock+0xe/0x2a
> >> [55817.737541]  [<ffffffff8110fcc8>] ? sys_write+0x58/0x92
> >> [55817.737543]  [<ffffffff813c7f29>] ? system_call_fastpath+0x16/0x1b
> >> [55817.737544] Code: 8a 3b 45 31 f6 41 d0 ef 44 89 f9 45 89 ef 83 e1 07 41
> >> d3 e7 eb 1c 8b 7b 0c 44 01 f7 e8 59 74 eb ff 48 83 b8 90 00 00 00 00 74 04
> >> <0f> 0b eb fe 41 ff c6 45 39 fe 7c df 48 8b 5b 10 48 83 eb 10 48
> >> [55817.737560] RIP  [<ffffffff811eab09>] free_msi_irqs+0x5d/0x11b
> >> [55817.737562]  RSP <ffff88034211dd08>
> >> [55817.737563] ---[ end trace e1c5a8a903358804 ]---
> >
> >
> > Any chance anyone could help me identify the source of this problem?  Can I
> > work around it with software, or do I need a different motherboard or video
> > card to make it work?
> 
> cc'ing Konrad and a couple of other people who might be able to take a
> look at the BUG

That looks to be:

344 #ifdef CONFIG_GENERIC_HARDIRQS
345                 for (i = 0; i < nvec; i++)
346                         BUG_ON(irq_has_action(entry->irq + i));
347 #endif

I have to say I hadn't actually compiled the kernel with GENERIC_HARDIRQS
in a while.

Looking at the code, the issue seems to be that MSI was still enabled
(with a handler still attached) when the PCI device was assigned to
xen-pciback. That looks like a bug in the radeon driver.
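
To make the ordering problem concrete, here is a toy model of the check quoted above. It is not kernel code: the class `ToyMsiDevice`, its methods, and the IRQ number used in the usage note are all hypothetical, and it only imitates the rule that every vector must have its handler released before MSI is torn down.

```python
class ToyMsiDevice:
    """Toy model of one device's MSI vectors; none of this is kernel code."""

    def __init__(self, base_irq, nvec):
        self.base_irq = base_irq
        self.nvec = nvec
        self.actions = {}        # irq number -> handler name
        self.msi_enabled = True

    def request_irq(self, irq, handler):
        """Attach a handler to a vector, as a driver would on setup."""
        self.actions[irq] = handler

    def free_irq(self, irq):
        """Release the handler on a vector, as a driver should on teardown."""
        self.actions.pop(irq, None)

    def disable_msi(self):
        """Refuse to tear down MSI while any vector still has a handler."""
        busy = [
            self.base_irq + i
            for i in range(self.nvec)
            if (self.base_irq + i) in self.actions
        ]
        if busy:
            raise RuntimeError("handler still attached to irq %s" % busy)
        self.msi_enabled = False
```

With `dev = ToyMsiDevice(100, 1)` and `dev.request_irq(100, "gpu")`, calling `dev.disable_msi()` raises until `dev.free_irq(100)` has run. That mirrors the situation suspected here: the device reached pciback with a vector that the previous driver had not released.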

And based on your config (thanks!) you could also do this on your Linux
command line:

xen-pciback.hide=(0e:00.0)

If you do that and try to pass in the radeon device does it work?
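
If more than one function needs hiding (the HDMI audio function, for instance), the groups are written back to back. A small sketch of building that argument follows; the helper name `pciback_hide_arg` is hypothetical, and the second address in the usage note is only an assumed example, not something taken from this report.

```python
def pciback_hide_arg(addresses):
    """Build a xen-pciback.hide=... argument from bus:dev.fn addresses.

    Each address becomes one parenthesised group; groups are concatenated
    with no separator between them.
    """
    cleaned = [a.strip() for a in addresses if a.strip()]
    if not cleaned:
        raise ValueError("at least one PCI address is required")
    return "xen-pciback.hide=" + "".join("(%s)" % a for a in cleaned)
```

For example, `pciback_hide_arg(["0e:00.0"])` gives the argument shown above, and passing `["0e:00.0", "0e:00.1"]` would cover a second function at the assumed address `0e:00.1`.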

Thanks!
> 
> > I'm running Debian 6.0.7 x86_64 with Kernel 3.8.11, compiled with the
> > following config: http://tny.cz/e28f7351
> 
> What version of Xen are you using?
> 
> Thanks,
>  -George
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
> 
