[Xen-devel] Re: xen-swiotlb
Hi Konrad,

It seems I have a working setup now: passing through an xhci controller to a guest and continuously grabbing video. It has run for 2 days now.
But I have changed a lot of parameters, so I will try to find out which one was the problem, or whether everything is miraculously fixed :-)
(It's on other hardware (Intel platform) now, as the only VM, with a newer dom0 kernel, a newer domU kernel, no swiotlb=force anymore, the boot option Pasi suggested, no irq balance, etc.)

Will post again when I know more!

--
Sander

Thursday, August 26, 2010, 11:43:21 PM, you wrote:

>> > Looking at the output, the physical addresses that DMA-ed are:
>> >
>> > 0x1f2962dc0
>> > 0x1f24f2e68
>> >
>> > and they look to be called quite often. In fact, there looks to be a
>> > loop that does something like this:
>> >
>> > again:
>> >     p = kmalloc(..)
>> >
>> >     dma = pci_map_single(p)
>> >     pci_dma_mapping_error(dma);
>> >     /* get some data.. */
>> >     /* parse the: (pipe 0x80000280): IN: c0 00 00 00 0c 00 01 00 */
>> >     pci_unmap_sg(dma);
>> >     goto again;
>> >
>> > As the virtual address sent to pci_map_single looks to be sequentially
>> > increasing.

> I am not sure what virtual address I was thinking of. Looking again
> at the dmesg output it looks actually correct. SWIOTLB is giving out
> chunks, increasing its own index, and then unmapping those pages right
> away.

> This stack-trace in the middle of the log:
>
> Aug 15 23:14:02 security kernel: [ 103.646497] em28xx #0 em28xx_isoc_copy :Video frame 1, length=960, odd
> Aug 15 23:14:02 security kernel: [ 103.650177] xhci_hcd 0000:08:00.0: Miss service interval error, set skip flag
> Aug 15 23:14:02 security kernel: [ 103.651371] em28xx #0 em28xx_irq_callback :urb completition error -18.
> Aug 15 23:14:02 security kernel: [ 103.651406] em28xx #0 print_err_status :URB status -18 [Unknown].
> Aug 15 23:14:02 security kernel: [ 103.651432] em28xx #0 em28xx_isoc_copy :Video frame 2, length=2888, even
> Aug 15 23:14:02 security kernel: [ 103.651454] em28xx #0 get_next_buf :No active queue to serve
> Aug 15 23:14:02 security kernel: [ 103.651477] em28xx #0 print_err_status :URB packet 63, status -18 [Unknown].
> Aug 15 23:14:02 security kernel: [ 103.653143] motion: page allocation failure. order:1, mode:0x20
> Aug 15 23:14:02 security kernel: [ 103.653177] Pid: 1418, comm: motion Not tainted 2.6.352.6.36pre+pci-front-xhci-20100813-xhcidebug+ #14
> Aug 15 23:14:02 security kernel: [ 103.653205] Call Trace:
> Aug 15 23:14:02 security kernel: [ 103.653219] <IRQ> [<ffffffff810a7ebf>] __alloc_pages_nodemask+0x667/0x6c7
> Aug 15 23:14:02 security kernel: [ 103.653272] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [ 103.653300] [<ffffffff810ca32f>] cache_alloc_refill+0x3b3/0x851
> Aug 15 23:14:02 security kernel: [ 103.653326] [<ffffffff81006b4d>] ? xen_force_evtchn_callback+0xd/0xf
> Aug 15 23:14:02 security kernel: [ 103.653352] [<ffffffff81007202>] ? check_events+0x12/0x20
> Aug 15 23:14:02 security kernel: [ 103.653375] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [ 103.653401] [<ffffffff810ce9bd>] ? create_object+0x2e/0x2a1
> Aug 15 23:14:02 security kernel: [ 103.653425] [<ffffffff810c9e3f>] kmem_cache_alloc+0xf7/0x1c4
> Aug 15 23:14:02 security kernel: [ 103.653451] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [ 103.653475] [<ffffffff810ce9bd>] create_object+0x2e/0x2a1
> Aug 15 23:14:02 security kernel: [ 103.653499] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [ 103.653523] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [ 103.653548] [<ffffffff810c8f68>] ? cache_alloc_debugcheck_after+0x1bf/0x239
> Aug 15 23:14:02 security kernel: [ 103.653575] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [ 103.653602] [<ffffffff8143e7d7>] kmemleak_alloc+0x21/0x3e
> Aug 15 23:14:02 security kernel: [ 103.653625] [<ffffffff810c9edf>] kmem_cache_alloc+0x197/0x1c4
> Aug 15 23:14:02 security kernel: [ 103.653649] [<ffffffff8135e085>] xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:02 security kernel: [ 103.653675] [<ffffffff81340450>] usb_hcd_submit_urb+0xb1f/0xbf5
> Aug 15 23:14:02 security kernel: [ 103.653701] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [ 103.653726] [<ffffffff81006b4d>] ? xen_force_evtchn_callback+0xd/0xf
> Aug 15 23:14:02 security kernel: [ 103.653751] [<ffffffff81007202>] ? check_events+0x12/0x20
> Aug 15 23:14:02 security kernel: [ 103.653777] [<ffffffff812b6267>] ? vt_console_print+0x43/0x346
> Aug 15 23:14:02 security kernel: [ 103.653802] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [ 103.653830] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:02 security kernel: [ 103.653857] [<ffffffff8106d5b0>] ? lock_release+0x1e0/0x1ed
> Aug 15 23:14:03 security kernel: [ 103.653882] [<ffffffff81340d45>] usb_submit_urb+0x249/0x260
> Aug 15 23:14:03 security kernel: [ 103.653919] [<ffffffffa0060615>] em28xx_irq_callback+0xc9/0xfd [em28xx]
> Aug 15 23:14:03 security kernel: [ 103.653946] [<ffffffff8133f16a>] usb_hcd_giveback_urb+0x84/0xb8
> Aug 15 23:14:03 security kernel: [ 103.653972] [<ffffffff8136480a>] xhci_handle_event+0xced/0xdf7
> Aug 15 23:14:03 security kernel: [ 103.653998] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [ 103.654024] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [ 103.654050] [<ffffffff81364a31>] xhci_irq+0x11d/0x1af
> Aug 15 23:14:03 security kernel: [ 103.654074] [<ffffffff81364ade>] xhci_msi_irq+0x1b/0x1d
> Aug 15 23:14:03 security kernel: [ 103.654103] [<ffffffff8108e4aa>] handle_IRQ_event+0x20/0xc7
> Aug 15 23:14:03 security kernel: [ 103.654128] [<ffffffff810902a4>] handle_level_irq+0x98/0xf3
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff812997a8>] __xen_evtchn_do_upcall+0xe1/0x183
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8129a423>] xen_evtchn_do_upcall+0x2a/0x3c
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8100a9ee>] xen_do_hypervisor_callback+0x1e/0x30
> Aug 15 23:14:03 security kernel: [ 103.654151] <EOI> [<ffffffff810012eb>] ? hypercall_page+0x2eb/0x1006
> Aug 15 23:14:03 security kernel: [ 103.654151] Mem-Info:
> Aug 15 23:14:03 security kernel: [ 103.654151] DMA per-cpu:
> Aug 15 23:14:03 security kernel: [ 103.654151] CPU 0: hi: 0, btch: 1 usd: 0
> Aug 15 23:14:03 security kernel: [ 103.654151] DMA32 per-cpu:
> Aug 15 23:14:03 security kernel: [ 103.654151] CPU 0: hi: 186, btch: 31 usd: 168
> Aug 15 23:14:03 security kernel: [ 103.654151] active_anon:4503 inactive_anon:4767 isolated_anon:0
> Aug 15 23:14:03 security kernel: [ 103.654151] active_file:4071 inactive_file:9087 isolated_file:0
> Aug 15 23:14:03 security kernel: [ 103.654151] unevictable:0 dirty:1081 writeback:0 unstable:0
> Aug 15 23:14:03 security kernel: [ 103.654151] free:1584 slab_reclaimable:5991 slab_unreclaimable:153956
> Aug 15 23:14:03 security kernel: [ 103.654151] mapped:3401 shmem:48 pagetables:1452 bounce:0
> Aug 15 23:14:03 security kernel: [ 103.654151] DMA free:3992kB min:60kB low:72kB high:88kB active_anon:84kB inactive_anon:436kB active_file:24kB inactive_file:148kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15712kB mlocked:0kB dirty:20kB writeback:0kB mapped:16kB shmem:0kB slab_reclaimable:264kB slab_unreclaimable:8932kB kernel_stack:0kB pagetables:384kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> Aug 15 23:14:03 security kernel: [ 103.654151] lowmem_reserve[]: 0 994 994 994
> Aug 15 23:14:03 security kernel: [ 103.654151] DMA32 free:2344kB min:4000kB low:5000kB high:6000kB active_anon:17928kB inactive_anon:18632kB active_file:16260kB inactive_file:36200kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1018080kB mlocked:0kB dirty:4304kB writeback:0kB mapped:13588kB shmem:192kB slab_reclaimable:23700kB slab_unreclaimable:606892kB kernel_stack:720kB pagetables:5424kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no
> Aug 15 23:14:03 security kernel: [ 103.654151] lowmem_reserve[]: 0 0 0 0
> Aug 15 23:14:03 security kernel: [ 103.654151] DMA: 88*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3992kB
> Aug 15 23:14:03 security kernel: [ 103.654151] DMA32: 406*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2344kB
> Aug 15 23:14:03 security kernel: [ 103.654151] 13236 total pagecache pages
> Aug 15 23:14:03 security kernel: [ 103.654151] 21 pages in swap cache
> Aug 15 23:14:03 security kernel: [ 103.654151] Swap cache stats: add 32, delete 11, find 27/27
> Aug 15 23:14:03 security kernel: [ 103.654151] Free swap = 524200kB
> Aug 15 23:14:03 security kernel: [ 103.654151] Total swap = 524284kB
> Aug 15 23:14:03 security kernel: [ 103.654151] 262128 pages RAM
> Aug 15 23:14:03 security kernel: [ 103.654151] 73203 pages reserved
> Aug 15 23:14:03 security kernel: [ 103.654151] 21103 pages shared
> Aug 15 23:14:03 security kernel: [ 103.654151] 105944 pages non-shared
> Aug 15 23:14:03 security kernel: [ 103.654151] kmemleak: Cannot allocate a kmemleak_object structure
> Aug 15 23:14:03 security kernel: [ 103.654151] Pid: 1418, comm: motion Not tainted 2.6.352.6.36pre+pci-front-xhci-20100813-xhcidebug+ #14
> Aug 15 23:14:03 security kernel: [ 103.654151] Call Trace:
> Aug 15 23:14:03 security kernel: [ 103.654151] <IRQ> [<ffffffff810ce9d8>] create_object+0x49/0x2a1
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810c8f68>] ? cache_alloc_debugcheck_after+0x1bf/0x239
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8143e7d7>] kmemleak_alloc+0x21/0x3e
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810c9edf>] kmem_cache_alloc+0x197/0x1c4
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] xhci_urb_enqueue+0x14c/0x3f5
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81340450>] usb_hcd_submit_urb+0xb1f/0xbf5
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81006b4d>] ? xen_force_evtchn_callback+0xd/0xf
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81007202>] ? check_events+0x12/0x20
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff812b6267>] ? vt_console_print+0x43/0x346
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8106d5b0>] ? lock_release+0x1e0/0x1ed
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81340d45>] usb_submit_urb+0x249/0x260
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffffa0060615>] em28xx_irq_callback+0xc9/0xfd [em28xx]
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8133f16a>] usb_hcd_giveback_urb+0x84/0xb8
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8136480a>] xhci_handle_event+0xced/0xdf7
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81364a31>] xhci_irq+0x11d/0x1af
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81364ade>] xhci_msi_irq+0x1b/0x1d
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8108e4aa>] handle_IRQ_event+0x20/0xc7
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810902a4>] handle_level_irq+0x98/0xf3
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff812997a8>] __xen_evtchn_do_upcall+0xe1/0x183
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8129a423>] xen_evtchn_do_upcall+0x2a/0x3c
> Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8100a9ee>] xen_do_hypervisor_callback+0x1e/0x30
> Aug 15 23:14:03 security kernel: [ 103.654151] <EOI> [<ffffffff810012eb>] ? hypercall_page+0x2eb/0x1006
> Aug 15 23:14:03 security kernel: [ 103.654151] kmemleak: Kernel memory leak detector disabled
> Aug 15 23:14:03 security kernel: [ 103.654151] xhci_hcd 0000:08:00.0: Found td. Clear skip flag.
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 0, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 1, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 2, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 3, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 4, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 5, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 6, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 7, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.659299] xhci_hcd 0000:08:00.0: Miss service interval error, set skip flag
> Aug 15 23:14:03 security kernel: [ 103.659307] xhci_hcd 0000:08:00.0: Found td. Clear skip flag.
> Aug 15 23:14:03 security kernel: [ 103.661748] kmemleak: Automatic memory scanning thread ended
> Aug 15 23:14:03 security kernel: [ 103.755620] em28xx #0 print_err_status :URB packet 6, status -18 [Unknown].
> Aug 15 23:14:03 security kernel: [ 103.755630] em28xx #0 print_err_status :URB packet 7, status -18 [Unknown].

> The first fault is kmemleak not being able to allocate its structure,
> and then it throws an error too.

> Then the em28xx gets horribly confused that it could not send a URB
> anymore, and from there on things get worse.

> There is something causing this memory problem, and I think we can narrow
> it down if you post the kmemleak output. You might have to
> unload/reload the em28xx driver to get an idea of where the leak is.

>> So what I have as domU kernel is:
>> - from Linus tree 2.6.36-pre with latest commit
>> 8357422d4bf33bc2c35884d4016c3fc9efbbc1d2
>> - pulled your devel/xen-pcifront-0.5 tree, and fixed the merge conflicts due
>> to the pv on hvm patches, in the same way Jeremy did for his 2.6.36 branch.
>> Conflicts:
>> drivers/xen/events.c
>> include/xen/events.h
>>
>> - Added a patch for xhci isoc that fixes another bug. (attached
>> 0001-xHCI-update-ring-dequeue-pointer-when-process-missed.patch)
>> - Added a patch that shows extra debug info for xhci from the author of the
>> xhci-isoc patches (attached isoc_length5.patch)
>>
>> - Changed some debug lines from dbg to warn level (just enabling xhci-debug
>> in the kernel config floods the logs too fast, so I only enabled them in
>> xhci-mem.c).
>> (Attached a patch with all my changes to xhci*, including the 2 patches
>> above applied.)
>>
>> - Added some printk's to swiotlb-xen.c to see which functions were used, and
>> let some of them print the address as well, in the hope I could find some
>> debug info there. (attached)

> So the other thing I realized is that xhci can actually do 64bit
> transfers. Try booting without the 'swiotlb=force' (but still have
> iommu=soft).

>> Apart from fixing the xhci in the end, is there a way to prevent Xen from
>> freezing altogether without leaving a trace?

> I am still puzzled as to why the Xen hypervisor would freeze. It _should_ work
> just fine.

>> Even an Oops is much easier to debug than a freeze. Due to the nature of DMA
>> that could perhaps be difficult, although there is a DMA API ...
>>
>> Had a fruitful LinuxCon?

> Quite so!

>> --
>> Sander

--
Best regards,
Sander mailto:linux@xxxxxxxxxxxxxx

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel