
Re: [Xen-devel] Xen-unstable: pci-passthrough "irq 16: nobody cared" on HVM guest shutdown on irq of device not passed through.

Wednesday, October 8, 2014, 2:56:53 PM, you wrote:

> On Tue, Oct 07, 2014 at 03:50:03PM +0100, Jan Beulich wrote:
>> >>> On 07.10.14 at 15:41, <konrad.wilk@xxxxxxxxxx> wrote:
>> > Could you attach also the full dmesg under baremetal with 'debug' and all
>> > kinds of debug enabled ? That should help a bit in figuring out why
>> > they get MSIs under baremetal but legacy interrupts under Xen.
>> The messages he sent don't really suggest that. The legacy pin
>> based IRQ always gets set up when a device gets enabled, no
>> matter whether in the end it would actually get used. And afaict
>> other messages clearly hint at MSI being used for the PCIe stuff.

> Correct. I fear that in the domain0 we have set an event for this
> particular GSI (16) which is also in use in the guest (and then somehow
> we did not tear this down when the PCIe setup the MSI).

> Xen will send events to both domains - and since domain0 does not
> have an IRQ handler for it - it will activate its anti-IRQ storm
> routine and disable the IRQ line.

>> Jan

Hi Konrad / Jan,

Sorry for the late response, I have been kind of busy.
I added some debug code around the pcieport and PME code, and also around the pci
reset code (since it is triggered on guest shutdown).

I have attached to this mail:

- Booting of the machine and starting all guests, including guests which have
  pci devices passed through, which function fine and which can be restarted
  without any issue. At the end is the start and shutdown of the guest having
  the vga card passed through (09:00.0 and 09:00.1), which causes the
  "irq 16: nobody cared" when I shut it down.
  I marked the logs where this gets started.
        - xen-xl-dmesg.txt
        - xen-dmesg.txt
        - xen-lspci.txt
        - xen-lspci-tv.txt
        - xen-lspci-vvvknn.txt
        - xen-proc-interrupts-before.txt (before the guest with the vga card passed through was started)
        - xen-proc-interrupts-after.txt  (after the guest with the vga card passed through was started)

- Booting of the machine with a baremetal kernel, then booting and shutting down
  a kvm/qemu + vfio-pci guest having the same vga card passed through (it is
  primary passthrough with KVM and secondary with Xen, but that shouldn't make
  a difference).
  The kvm guest is started with this qemu command line:

        /usr/local/bin/qemu-system-x86_64 -machine type=pc,accel=kvm -cpu host
          -smp 2,sockets=1,cores=2 -hda /dev/xen_vms/xbmc_kvm -m 1024 -boot c -vnc -k en-us
          -device vfio-pci,host=09:00.0,x-vga=on,rombar=0,romfile=/root/07rom.bin -vga none
          -device e1000,netdev=net0,mac=DE:AD:BE:EF:AA:13 -netdev

        - kvm-dmesg.txt
        - kvm-lspci.txt
        - kvm-lspci-tv.txt
        - kvm-lspci-vvvknn.txt
        - kvm-proc-interrupts-before.txt (before the guest with the vga card passed through was started)
        - kvm-proc-interrupts-after.txt  (after the guest with the vga card passed through was started)

- kernel-debug.patch (patch with the extra debug code I added to a v3.17 kernel)
- config-3.17.0-20141008-vanilla-kvm-debug4+ (kernel .config used for both the
  xen dom0 and the kvm boot)
- grub.cfg
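
For anyone who wants to compare the before/after /proc/interrupts captures
without eyeballing them, here is a minimal sketch in Python. It is not part of
the attachments; the function names parse_interrupts() and diff_interrupts()
are made up for illustration, and it assumes the usual /proc/interrupts layout
(a header line of CPU columns, then one "label: counts... description" line
per interrupt).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts content into {label: (total_count, description)}.

    Assumes the first line is the CPU header and every other line has the
    form "<label>: <count per cpu>... <description>".
    """
    lines = [line for line in text.splitlines() if line.strip()]
    ncpu = len(lines[0].split())           # number of CPU columns in the header
    table = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue                       # not an interrupt line, skip it
        fields = rest.split()
        counts = []
        for field in fields[:ncpu]:
            if not field.isdigit():
                break                      # some lines (e.g. ERR) have fewer counts
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        table[label.strip()] = (sum(counts), description)
    return table


def diff_interrupts(before_text, after_text):
    """Return {label: (count_delta, old_description, new_description)} for
    every interrupt whose total count or description changed."""
    before = parse_interrupts(before_text)
    after = parse_interrupts(after_text)
    changes = {}
    for label, (total, description) in after.items():
        old_total, old_description = before.get(label, (0, ""))
        if total != old_total or description != old_description:
            changes[label] = (total - old_total, old_description, description)
    return changes
```

Feeding it the contents of xen-proc-interrupts-before.txt and
xen-proc-interrupts-after.txt (and likewise the kvm pair) should show which
irq lines gained or lost handlers and how the counts moved.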

Some oddities I noticed (I don't know their relevance, but you never know):

- It doesn't happen on *every* shutdown of the guest with the vga card passed
  through, it happens *most* of the time, so it has the character of a race ..

- The pcieports all get consecutive, relatively high irqs assigned (52 to 54,
  although these also seem to double ..), except 0000:00:15.0 which gets irq 16 ..
  Could this be due to some pcie lanes being connected to the southbridge ?
  (see for a graph of the chipset

- The soundcard that also gets irq 16 assigned is also on the southbridge,
  which could make it a bios/acpi-table issue ?

- However the device I'm passing through is NOT on that pcieport .. so that
  would undermine the foregoing ...

- The device on that pcieport is a vga card (dom0's vga console) and it also
  has a snd_hda_intel (hdmi) function.

- The irq's action handler when it trips the "irq 16: nobody cared" is
  'azx_interrupt()', which IS the interrupt handler of the snd_hda_intel driver.

- However under KVM this issue doesn't seem to be there .. so that would
  undermine the foregoing ...

- The other passed through pci devices (whose guests don't give an issue when
  restarted) all get a different reset on guest start:
  - they first get a pm_reset, then a secondary bus reset
  - the vga card only gets a pm_reset, no secondary bus reset ...

- To my surprise these resets on guest start don't seem to originate directly
  from the xen-pciback code ?
  (I introduced a pci_reset_xen() to see if I could skip the pm_reset for the
  vga card and see if that would change something,
  but since it seems to originate from some other generic code that didn't
- It all doesn't seem to connect .. *sigh*
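
For reference, the "nobody cared" report comes from the kernel's spurious
interrupt detection, the anti-IRQ storm routine Konrad mentions above. Below is
a simplified model of that accounting in Python, a sketch only: the class name
IrqLine is made up, the thresholds (a window of 100000 interrupts, more than
99900 of them unhandled) are from my reading of kernel/irq/spurious.c and
should be treated as an assumption, and the time-based handling of unhandled
interrupts is left out.

```python
WINDOW = 100_000          # interrupts per accounting window (assumed)
UNHANDLED_LIMIT = 99_900  # unhandled interrupts tolerated per window (assumed)


class IrqLine:
    """Toy model of the per-irq spurious interrupt accounting."""

    def __init__(self):
        self.irq_count = 0
        self.irqs_unhandled = 0
        self.disabled = False

    def note_interrupt(self, handled):
        """Account for one interrupt; 'handled' is False when no registered
        handler claimed it (every handler returned IRQ_NONE)."""
        if self.disabled:
            return
        if not handled:
            self.irqs_unhandled += 1
        self.irq_count += 1
        if self.irq_count < WINDOW:
            return
        # End of a window: if nearly every interrupt went unclaimed, this is
        # the point where "irq N: nobody cared" is reported and the line is
        # disabled.
        if self.irqs_unhandled > UNHANDLED_LIMIT:
            self.disabled = True
        self.irq_count = 0
        self.irqs_unhandled = 0
```

In this model a shared line such as irq 16 only gets disabled when almost none
of the interrupts in a window are claimed by a handler, which would fit events
arriving in dom0 for a GSI whose real source is being handled elsewhere.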

Hope you will be able to spot something and make sense of it :-)




Attachment: config-3.17.0-20141008-vanilla-kvm-debug4+
Description: Binary data

Attachment: grub.cfg
Description: Binary data

Attachment: kernel-debug.patch
Description: Binary data

Attachment: xen-dmesg.txt
Description: Text document

Attachment: xen-lspci.txt
Description: Text document

Attachment: xen-lspci-tv.txt
Description: Text document

Attachment: xen-lspci-vvvknn.txt
Description: Text document

Attachment: xen-proc-interrupts-after.txt
Description: Text document

Attachment: xen-proc-interrupts-before.txt
Description: Text document

Attachment: xen-xl-dmesg.txt
Description: Text document

Attachment: kvm-dmesg.txt
Description: Text document

Attachment: kvm-lspci.txt
Description: Text document

Attachment: kvm-lspci-tv.txt
Description: Text document

Attachment: kvm-lspci-vvvknn.txt
Description: Text document

Attachment: kvm-proc-interrupts-after.txt
Description: Text document

Attachment: kvm-proc-interrupts-before.txt
Description: Text document
