[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] [Xen-devel] Re: xen-swiotlb
> > Looking at the output, the physical addresses that DMA-ed are: > > > 0x1f2962dc0 > > 0x1f24f2e68 > > > and they look to be called quite often. In fact, there looks to be a > > loop that does something like this: > > > again: > > p = kmalloc(..) > > > dma = pci_map_single(p) > > pci_dma_mapping_error(dma); > > /* get some data.. */ > > /* parse the: (pipe 0x80000280): IN: c0 00 00 00 0c 00 01 00 */ > > pci_unmap_sg(dma); > > goto again; > > > As the virtual address sent to pci_map_single looks to be sequentially > > increasing. I am not sure what virtual address I was thinking off. Looking again at the dmesg output it looks actually correct. SWIOTLB is giving out chunks, increasing its own index, and then unmaping those pages right away. This stack-trace in the middle of the log: Aug 15 23:14:02 security kernel: [ 103.646497] em28xx #0 em28xx_isoc_copy :Video frame 1, length=960, odd Aug 15 23:14:02 security kernel: [ 103.650177] xhci_hcd 0000:08:00.0: Miss service interval error, set skip flag Aug 15 23:14:02 security kernel: [ 103.651371] em28xx #0 em28xx_irq_callback :urb completition error -18. Aug 15 23:14:02 security kernel: [ 103.651406] em28xx #0 print_err_status :URB status -18 [Unknown]. Aug 15 23:14:02 security kernel: [ 103.651432] em28xx #0 em28xx_isoc_copy :Video frame 2, length=2888, even Aug 15 23:14:02 security kernel: [ 103.651454] em28xx #0 get_next_buf :No active queue to serve Aug 15 23:14:02 security kernel: [ 103.651477] em28xx #0 print_err_status :URB packet 63, status -18 [Unknown]. Aug 15 23:14:02 security kernel: [ 103.653143] motion: page allocation failure. order:1, mode:0x20 Aug 15 23:14:02 security kernel: [ 103.653177] Pid: 1418, comm: motion Not tainted 2.6.352.6.36pre+pci-front-xhci-20100813-xhcidebug+ #14 Aug 15 23:14:02 security kernel: [ 103.653205] Call Trace: Aug 15 23:14:02 security kernel: [ 103.653219] <IRQ> [<ffffffff810a7ebf>] __alloc_pages_nodemask+0x667/0x6c7 Aug 15 23:14:02 security kernel: [ 103.653272] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:02 security kernel: [ 103.653300] [<ffffffff810ca32f>] cache_alloc_refill+0x3b3/0x851 Aug 15 23:14:02 security kernel: [ 103.653326] [<ffffffff81006b4d>] ? xen_force_evtchn_callback+0xd/0xf Aug 15 23:14:02 security kernel: [ 103.653352] [<ffffffff81007202>] ? check_events+0x12/0x20 Aug 15 23:14:02 security kernel: [ 103.653375] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:02 security kernel: [ 103.653401] [<ffffffff810ce9bd>] ? create_object+0x2e/0x2a1 Aug 15 23:14:02 security kernel: [ 103.653425] [<ffffffff810c9e3f>] kmem_cache_alloc+0xf7/0x1c4 Aug 15 23:14:02 security kernel: [ 103.653451] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:02 security kernel: [ 103.653475] [<ffffffff810ce9bd>] create_object+0x2e/0x2a1 Aug 15 23:14:02 security kernel: [ 103.653499] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:02 security kernel: [ 103.653523] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:02 security kernel: [ 103.653548] [<ffffffff810c8f68>] ? cache_alloc_debugcheck_after+0x1bf/0x239 Aug 15 23:14:02 security kernel: [ 103.653575] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:02 security kernel: [ 103.653602] [<ffffffff8143e7d7>] kmemleak_alloc+0x21/0x3e Aug 15 23:14:02 security kernel: [ 103.653625] [<ffffffff810c9edf>] kmem_cache_alloc+0x197/0x1c4 Aug 15 23:14:02 security kernel: [ 103.653649] [<ffffffff8135e085>] xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:02 security kernel: [ 103.653675] [<ffffffff81340450>] usb_hcd_submit_urb+0xb1f/0xbf5 Aug 15 23:14:02 security kernel: [ 103.653701] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:02 security kernel: [ 103.653726] [<ffffffff81006b4d>] ? xen_force_evtchn_callback+0xd/0xf Aug 15 23:14:02 security kernel: [ 103.653751] [<ffffffff81007202>] ? check_events+0x12/0x20 Aug 15 23:14:02 security kernel: [ 103.653777] [<ffffffff812b6267>] ? vt_console_print+0x43/0x346 Aug 15 23:14:02 security kernel: [ 103.653802] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:02 security kernel: [ 103.653830] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:02 security kernel: [ 103.653857] [<ffffffff8106d5b0>] ? lock_release+0x1e0/0x1ed Aug 15 23:14:03 security kernel: [ 103.653882] [<ffffffff81340d45>] usb_submit_urb+0x249/0x260 Aug 15 23:14:03 security kernel: [ 103.653919] [<ffffffffa0060615>] em28xx_irq_callback+0xc9/0xfd [em28xx] Aug 15 23:14:03 security kernel: [ 103.653946] [<ffffffff8133f16a>] usb_hcd_giveback_urb+0x84/0xb8 Aug 15 23:14:03 security kernel: [ 103.653972] [<ffffffff8136480a>] xhci_handle_event+0xced/0xdf7 Aug 15 23:14:03 security kernel: [ 103.653998] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7 Aug 15 23:14:03 security kernel: [ 103.654024] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7 Aug 15 23:14:03 security kernel: [ 103.654050] [<ffffffff81364a31>] xhci_irq+0x11d/0x1af Aug 15 23:14:03 security kernel: [ 103.654074] [<ffffffff81364ade>] xhci_msi_irq+0x1b/0x1d Aug 15 23:14:03 security kernel: [ 103.654103] [<ffffffff8108e4aa>] handle_IRQ_event+0x20/0xc7 Aug 15 23:14:03 security kernel: [ 103.654128] [<ffffffff810902a4>] handle_level_irq+0x98/0xf3 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff812997a8>] __xen_evtchn_do_upcall+0xe1/0x183 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8129a423>] xen_evtchn_do_upcall+0x2a/0x3c Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8100a9ee>] xen_do_hypervisor_callback+0x1e/0x30 Aug 15 23:14:03 security kernel: [ 103.654151] <EOI> [<ffffffff810012eb>] ? hypercall_page+0x2eb/0x1006 Aug 15 23:14:03 security kernel: [ 103.654151] Mem-Info: Aug 15 23:14:03 security kernel: [ 103.654151] DMA per-cpu: Aug 15 23:14:03 security kernel: [ 103.654151] CPU 0: hi: 0, btch: 1 usd: 0 Aug 15 23:14:03 security kernel: [ 103.654151] DMA32 per-cpu: Aug 15 23:14:03 security kernel: [ 103.654151] CPU 0: hi: 186, btch: 31 usd: 168 Aug 15 23:14:03 security kernel: [ 103.654151] active_anon:4503 inactive_anon:4767 isolated_anon:0 Aug 15 23:14:03 security kernel: [ 103.654151] active_file:4071 inactive_file:9087 isolated_file:0 Aug 15 23:14:03 security kernel: [ 103.654151] unevictable:0 dirty:1081 writeback:0 unstable:0 Aug 15 23:14:03 security kernel: [ 103.654151] free:1584 slab_reclaimable:5991 slab_unreclaimable:153956 Aug 15 23:14:03 security kernel: [ 103.654151] mapped:3401 shmem:48 pagetables:1452 bounce:0 Aug 15 23:14:03 security kernel: [ 103.654151] DMA free:3992kB min:60kB low:72kB high:88kB active_anon:84kB inactive_anon:436kB active_file:24kB inactive_file:148kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15712kB mlocked:0kB dirty:20kB writeback:0kB mapped:16kB shmem:0kB slab_reclaimable:264kB slab_unreclaimable:8932kB kernel_stack:0kB pagetables:384kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Aug 15 23:14:03 security kernel: [ 103.654151] lowmem_reserve[]: 0 994 994 994 Aug 15 23:14:03 security kernel: [ 103.654151] DMA32 free:2344kB min:4000kB low:5000kB high:6000kB active_anon:17928kB inactive_anon:18632kB active_file:16260kB inactive_file:36200kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:1018080kB mlocked:0kB dirty:4304kB writeback:0kB mapped:13588kB shmem:192kB slab_reclaimable:23700kB slab_unreclaimable:606892kB kernel_stack:720kB pagetables:5424kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:0 all_unreclaimable? no Aug 15 23:14:03 security kernel: [ 103.654151] lowmem_reserve[]: 0 0 0 0 Aug 15 23:14:03 security kernel: [ 103.654151] DMA: 88*4kB 1*8kB 1*16kB 1*32kB 0*64kB 0*128kB 0*256kB 1*512kB 1*1024kB 1*2048kB 0*4096kB = 3992kB Aug 15 23:14:03 security kernel: [ 103.654151] DMA32: 406*4kB 0*8kB 1*16kB 0*32kB 1*64kB 1*128kB 0*256kB 1*512kB 0*1024kB 0*2048kB 0*4096kB = 2344kB Aug 15 23:14:03 security kernel: [ 103.654151] 13236 total pagecache pages Aug 15 23:14:03 security kernel: [ 103.654151] 21 pages in swap cache Aug 15 23:14:03 security kernel: [ 103.654151] Swap cache stats: add 32, delete 11, find 27/27 Aug 15 23:14:03 security kernel: [ 103.654151] Free swap = 524200kB Aug 15 23:14:03 security kernel: [ 103.654151] Total swap = 524284kB Aug 15 23:14:03 security kernel: [ 103.654151] 262128 pages RAM Aug 15 23:14:03 security kernel: [ 103.654151] 73203 pages reserved Aug 15 23:14:03 security kernel: [ 103.654151] 21103 pages shared Aug 15 23:14:03 security kernel: [ 103.654151] 105944 pages non-shared Aug 15 23:14:03 security kernel: [ 103.654151] kmemleak: Cannot allocate a kmemleak_object structure Aug 15 23:14:03 security kernel: [ 103.654151] Pid: 1418, comm: motion Not tainted 2.6.352.6.36pre+pci-front-xhci-20100813-xhcidebug+ #14 Aug 15 23:14:03 security kernel: [ 103.654151] Call Trace: Aug 15 23:14:03 security kernel: [ 103.654151] <IRQ> [<ffffffff810ce9d8>] create_object+0x49/0x2a1 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810c8f68>] ? cache_alloc_debugcheck_after+0x1bf/0x239 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] ? xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8143e7d7>] kmemleak_alloc+0x21/0x3e Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810c9edf>] kmem_cache_alloc+0x197/0x1c4 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8135e085>] xhci_urb_enqueue+0x14c/0x3f5 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81340450>] usb_hcd_submit_urb+0xb1f/0xbf5 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81006b4d>] ? xen_force_evtchn_callback+0xd/0xf Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81007202>] ? check_events+0x12/0x20 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff812b6267>] ? vt_console_print+0x43/0x346 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810071ef>] ? xen_restore_fl_direct_end+0x0/0x1 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8106d5b0>] ? lock_release+0x1e0/0x1ed Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81340d45>] usb_submit_urb+0x249/0x260 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffffa0060615>] em28xx_irq_callback+0xc9/0xfd [em28xx] Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8133f16a>] usb_hcd_giveback_urb+0x84/0xb8 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8136480a>] xhci_handle_event+0xced/0xdf7 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff813648a8>] xhci_handle_event+0xd8b/0xdf7 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81364a31>] xhci_irq+0x11d/0x1af Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff81364ade>] xhci_msi_irq+0x1b/0x1d Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8108e4aa>] handle_IRQ_event+0x20/0xc7 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff810902a4>] handle_level_irq+0x98/0xf3 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff812997a8>] __xen_evtchn_do_upcall+0xe1/0x183 Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8129a423>] xen_evtchn_do_upcall+0x2a/0x3c Aug 15 23:14:03 security kernel: [ 103.654151] [<ffffffff8100a9ee>] xen_do_hypervisor_callback+0x1e/0x30 Aug 15 23:14:03 security kernel: [ 103.654151] <EOI> [<ffffffff810012eb>] ? hypercall_page+0x2eb/0x1006 Aug 15 23:14:03 security kernel: [ 103.654151] kmemleak: Kernel memory leak detector disabled Aug 15 23:14:03 security kernel: [ 103.654151] xhci_hcd 0000:08:00.0: Found td. Clear skip flag. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 0, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 1, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 2, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 3, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 4, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 5, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 6, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.654151] em28xx #0 print_err_status :URB packet 7, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.659299] xhci_hcd 0000:08:00.0: Miss service interval error, set skip flag Aug 15 23:14:03 security kernel: [ 103.659307] xhci_hcd 0000:08:00.0: Found td. Clear skip flag. Aug 15 23:14:03 security kernel: [ 103.661748] kmemleak: Automatic memory scanning thread ended Aug 15 23:14:03 security kernel: [ 103.755620] em28xx #0 print_err_status :URB packet 6, status -18 [Unknown]. Aug 15 23:14:03 security kernel: [ 103.755630] em28xx #0 print_err_status :URB packet 7, status -18 [Unknown]. The first fault is kmemleaker not being able to allocate its structure, and then it throws an error too. Then the em28xx gets horribly confused that it could not send an URB anymore and from there one things get worst. There is something causing this memory problem,and I think we can narrow it down if you output the kmemleaker output. You might have to unload/reload the em28xx driver to get an idea of where the leak is. > So what i have as domU kernel is: > - from Linus tree 2.6.36-pre with latest commit > 8357422d4bf33bc2c35884d4016c3fc9efbbc1d2 > - pulled your devel/xen-pcifront-0.5 tree, and fixed the merge conflicts due > to the pv on hvm patches, in the same way Jeremy did for his 2.6.36 branch. > Conflicts: > drivers/xen/events.c > include/xen/events.h > > - Added a patch for xhci isoc that fixes another bug. (attached > 0001-xHCI-update-ring-dequeue-pointer-when-process-missed.patch) > - Added a patch that shows extra debug info for xhci from the author of the > xhci-isoc pathes (attached isoc_length5.patch) > > - Changed some debug lines from dbg to warn level, (just enabling xhci-debug > in kernel config floods the logs too fast, so i just enabled them in > xhci-mem.c) > (Attached a patch with all my changes to xhci*, including the 2 patches > above applied.) > > - Added some printk's to swiotlb-xen.c to see which functions were used, and > let some of them print the address as well, in the hope i could find some > debug info there.(attached) So the other thing I realized is that xhci can actually do 64bit transfers. Try booting without the 'swiotlb=force' (but still have iommu=soft). > > Apart from fixing the xhci in the end, is there a way to prevent xen from > freezing altogether without leaving a trace ? I am still puzzled as why Xen hypervisor would freeze. It _should_ work just fine. > Even an Oops is much easier to debug than a freeze. Due to the nature of DMA > that could perhaps be difficult, although there is an DMA API ... > > Had a fruitful LinuxCon ? Quite so! > > -- > Sander _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxx http://lists.xensource.com/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |