
Re: [Xen-devel] Re: pci passthrough xhci host controller



OK, the freeze during a kernel compile with "make -j6" was a cpu0 stall. The 
machine locks up so hard that I can't even use ctrl-a to get into the hypervisor.
Removing noirqbalance makes it possible to compile the kernel in dom0 while 
videograbbing in domU.
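
One way to see how the interrupt load is spread over the CPUs is to take two snapshots of /proc/interrupts and diff them. What follows is only a minimal sketch and not something used in this thread: the function names are made up for illustration, and it assumes the usual /proc/interrupts layout (a header row of CPU columns, then one row per IRQ).

```python
def parse_interrupts(text):
    """Parse /proc/interrupts-style text into {irq_label: [count per CPU]}."""
    lines = text.strip().splitlines()
    ncpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        per_cpu = []
        for field in rest.split()[:ncpus]:
            if not field.isdigit():
                break  # reached the controller / device name columns
            per_cpu.append(int(field))
        if per_cpu:
            counts[label.strip()] = per_cpu
    return counts


def interrupt_delta(before, after):
    """Per-CPU increase for every IRQ present in both samples."""
    return {
        irq: [b - a for a, b in zip(before[irq], after[irq])]
        for irq in after
        if irq in before
    }


def read_interrupts(path="/proc/interrupts"):
    """Read the current counters (the path is the standard Linux location)."""
    with open(path) as fh:
        return fh.read()
```

Taking one sample before and one during a video grab, then passing both through parse_interrupts() and interrupt_delta(), would show whether the xhci_hcd line grows much faster on one CPU than on the others.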

I can start a fish shop with all my red herrings :(


Saturday, October 2, 2010, 1:33:36 AM, you wrote:

> Hmmm i can get it to freeze with or without the mem=4G now.

> Letting the domU grab video while dom0 compiles a kernel with make -j6 makes 
> the machine freeze after a very short while.
> With all the debugging enabled the machine seems a bit slow anyhow for a six 
> core, but it seems to choke on the interrupts generated by the xhci 
> controller.

> With the host controller now using 32-bit instead of 64-bit DMA, it shows 
> some warnings before freezing, with or without mem=4G:

> Oct  2 00:23:07 security kernel: [  524.020717] xhci_hcd 0000:07:00.0: Spurious interrupt.
> Oct  2 00:23:10 security kernel: [  526.926654] xhci_hcd 0000:07:00.0: Spurious interrupt.
> Oct  2 00:23:11 security kernel: [  527.714567] xhci_hcd 0000:07:00.0: Spurious interrupt.
> Oct  2 00:23:42 security kernel: [  558.402659] xhci_hcd 0000:07:00.0: Spurious interrupt.
> Oct  2 00:25:00 security kernel: [  636.278406] xhci_hcd 0000:07:00.0: Spurious interrupt.

> When i do the kernel compile with the domU started, but not grabbing video, 
> the kernel compile completes without a problem.
> With the domU running cpuburn, it does complete without a problem.
> I do have the feeling the videograbbing does cause a lot of interrupts .. 
> (this is still booting xen with noirqbalance and dom0 and domU with 
> pci=nomsi).

> So the 4G is then probably a red herring ...

> --
> Sander




> Friday, October 1, 2010, 10:54:17 PM, you wrote:

>> On Thu, Sep 30, 2010 at 09:24:48PM +0200, Sander Eikelenboom wrote:
>>> Hello Konrad,
>>> 
>>> I have done some more tests, the results:
>>> 
>>> - boot xen with mem=4G, > 2 days uptime with passthrough and videograbbing
>>> - boot xen without mem=4G, < 1 day freeze with passthrough and videograbbing
>>> - on both no problems as long as you don't grab video (so the controller 
>>> doesn't do much)
>>> - on both no problems when grabbing video with usb2, so it's xhci specific
>>> 
>>> I haven't changed anything else, same number of VM's running etc. etc., 
>>> videograbbing is working on both (until the freeze or until i ended the 
>>> test)
>>> I'm reading some messages about msi(-x) interrupt problems with xen on 
>>> xen-devel, and suggestions to try noirqbalance with xen, so on both i use 
>>> noirqbalance.
>>> 
>>> So it seems to be related to the amount of mem available.
>>> I do see one difference on the domU, with mem=4G i see some occasional 
>>> warnings in syslog:
>>> Sep 28 17:55:02 security kernel: [81744.078288] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
>>> Sep 28 17:55:02 security kernel: [81744.092653] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
>>> Sep 28 17:55:02 security kernel: [81744.093647] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
>>> Sep 28 17:55:02 security kernel: [81744.093647] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
>>> Sep 28 17:55:02 security kernel: [81744.093647] xhci_hcd 0000:07:00.0: WARN: transfer error on endpoint
>>> 
>>> I don't see these warnings in the syslog when no mem=4G is used, so a hunch 
>>> would be it goes wrong there while the xhci code tries to clean something 
>>> up.
>>> It could do something "strange" that seems to work on bare metal and on xen 
>>> with mem=4G, but freezes everything with mem > 4G and gives no time to 
>>> write the warning to the syslog / disk in time.
>>> 
>>> in the syslog of dom0 i do see some occasional memleaks going by, but one 
>>> set could be related:
>>> Sep 28 17:55:19 localhost kernel: [81962.053321] kmemleak: 22 new suspected memory leaks (see /sys/kernel/debug/kmemleak)
>>> 
>>> I will add a script that cat's the content of /sys/kernel/debug/kmemleak to 
>>> syslog when kmemleak reports new suspected leaks.
>>> 
>>> Any suggestions to try to debug this further ?
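
The script mentioned above, which copies /sys/kernel/debug/kmemleak into syslog, could look roughly like the sketch below. It is not the script actually used in this thread. It assumes that each kmemleak record starts with a line beginning "unreferenced object"; the function names are invented for illustration, and reading the file typically needs root and a mounted debugfs.

```python
import syslog

KMEMLEAK_PATH = "/sys/kernel/debug/kmemleak"  # path from the kernel message


def read_report(path=KMEMLEAK_PATH):
    """Return the current kmemleak report as text."""
    with open(path, "r", errors="replace") as fh:
        return fh.read()


def split_records(report):
    """Split a report into records, one per 'unreferenced object' header."""
    records, current = [], []
    for line in report.splitlines():
        if line.startswith("unreferenced object") and current:
            records.append("\n".join(current))
            current = []
        if line.strip():
            current.append(line)
    if current:
        records.append("\n".join(current))
    return records


def log_new_leaks(previous, current, log=None):
    """Log records in `current` that were not in `previous`; return how many."""
    if log is None:
        # Default sink: the local syslog at warning priority.
        log = lambda line: syslog.syslog(syslog.LOG_WARNING, line)
    seen = set(split_records(previous))
    fresh = [rec for rec in split_records(current) if rec not in seen]
    for rec in fresh:
        for line in rec.splitlines():
            log("kmemleak: " + line)
    return len(fresh)
```

Calling read_report() in a loop with a sleep in between, and passing the previous and the current text to log_new_leaks(), would put each new suspected leak into syslog next to the xhci_hcd messages, so the timestamps can be compared.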

>> <shakes his head>
>> Do you have the name of the grabber + USB3 device? If it is not too much I 
>> might as well get it and see what happens on my boxes.






-- 
Best regards,
 Sander                            mailto:linux@xxxxxxxxxxxxxx




 

