[Xen-devel] Re: [PATCH] blkfront: Move blkif_interrupt into a tasklet.
On Tue, Aug 16, 2011 at 04:26:54AM -0700, imammedo [via Xen] wrote:
>
> Jeremy Fitzhardinge wrote:
> >
> > Have you tried bisecting to see when this particular problem appeared?
> > It looks to me like something is accidentally re-enabling interrupts -
> > perhaps a stack overrun is corrupting the "flags" argument between a
> > spin_lock_irqsave()/restore pair.
> >
> > Is it only on 32-bit kernels?

Any specific reason you did not include xen-devel in this email? I am
CC-ing it here.

> ------------[ cut here ]------------
> [604001.659925] WARNING: at block/blk-core.c:239 blk_start_queue+0x70/0x80()
> [604001.659964] Modules linked in: nfs lockd fscache auth_rpcgss nfs_acl
> sunrpc ip6t_REJECT nf_conntrack_ipv6 nf_conntrack_ipv4 nf_defrag_ipv4
> nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables xen_netfront
> pcspkr [last unloaded: scsi_wait_scan]
> [604001.660147] Pid: 336, comm: udevd Tainted: G        W   3.0.0+ #50
> [604001.660181] Call Trace:
> [604001.660209]  [<c045c512>] warn_slowpath_common+0x72/0xa0
> [604001.660243]  [<c06643a0>] ? blk_start_queue+0x70/0x80
> [604001.660275]  [<c06643a0>] ? blk_start_queue+0x70/0x80
> [604001.660310]  [<c045c562>] warn_slowpath_null+0x22/0x30
> [604001.660343]  [<c06643a0>] blk_start_queue+0x70/0x80
> [604001.660379]  [<c075e231>] kick_pending_request_queues+0x21/0x30
> [604001.660417]  [<c075e42f>] blkif_interrupt+0x19f/0x2b0
> ...
> ------------[ cut here ]------------
>
> I've debugged the blk-core warning a bit and can say:
>  - Yes, it is a 32-bit PAE kernel, and so far it happens only there.
>  - It affects PV Xen guests; bare-metal and KVM configs are not affected.
>  - The upstream kernel is affected as well.
>  - It reproduces on Xen 4.1.1 and 3.1.2 hosts.
>
> The IF flag is always restored in drivers/md/dm.c:
>
>   static void clone_endio(struct bio *bio, int error)
>   ...
>           dm_endio_fn endio = tio->ti->type->end_io;
>   ...
>
> when a page fault happens while accessing the tio->ti->type field.
>
> After a successful resync with the kernel's page table in
> do_page_fault->vmalloc_fault, I/O continues happily on, however with the
> IF flag restored even if the faulted context's eflags register had no IF
> flag set. It happens with a random task every time.
>
> Here is an ftrace call graph showing the problematic place:
> ========================================================
> # tracer: function_graph
> #
> # function_graph latency trace v1.1.5 on 3.0.0+
> # --------------------------------------------------------------------
> # latency: 0 us, #42330/242738181, CPU#0 | (M:desktop VP:0, KP:0, SP:0 HP:0 #P:1)
> #    -----------------
> #    | task: -0 (uid:0 nice:0 policy:0 rt_prio:0)
> #    -----------------
> #
> #      _-----=> irqs-off
> #     / _----=> need-resched
> #    | / _---=> hardirq/softirq
> #    || / _--=> preempt-depth
> #    ||| /
> # CPU||||  DURATION                  FUNCTION CALLS
> # |  ||||   |   |                     |   |   |   |
>  0)  d...             |  xen_evtchn_do_upcall() {
>  0)  d...             |    irq_enter() {
>  0)  d.h.    2.880 us |    }
>  0)  d.h.             |    __xen_evtchn_do_upcall() {
>  0)  d.h.    0.099 us |      irq_to_desc();
>  0)  d.h.             |      handle_edge_irq() {
>  0)  d.h.    0.107 us |        _raw_spin_lock();
>  0)  d.h.             |        ack_dynirq() {
>  0)  d.h.    3.153 us |        }
>  0)  d.h.             |        handle_irq_event() {
>  0)  d.h.             |          handle_irq_event_percpu() {
>  0)  d.h.             |            blkif_interrupt() {
>  0)  d.h.    0.110 us |              _raw_spin_lock_irqsave();
>  0)  d.h.             |              __blk_end_request_all() {
>  0)  d.h.             |                blk_update_bidi_request() {
>  0)  d.h.             |                  blk_update_request() {
>  0)  d.h.             |                    req_bio_endio() {
>  0)  d.h.             |                      bio_endio() {
>  0)  d.h.             |                        endio() {
>  0)  d.h.             |                          bio_put() {
>  0)  d.h.    4.149 us |                          }
>  0)  d.h.             |                          dec_count() {
>  0)  d.h.             |                            mempool_free() {
>  0)  d.h.    1.395 us |                            }
>  0)  d.h.             |                            read_callback() {
>  0)  d.h.             |                              bio_endio() {
>  0)  d.h.             |                                clone_endio() {
>  0)  d.h.             |                                  /* ==> enter clone_endio: tio: c1e14c70 */
>  0)  d.h.    0.104 us |                                  arch_irqs_disabled_flags();
>  0)  d.h.             |                                  /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c918c040 */
>  0)  d.h.    0.100 us |                                  arch_irqs_disabled_flags();
>  0)  d.h.    0.117 us |                                  mirror_end_io();
>  0)  d.h.             |                                  free_tio() {
>  0)  d.h.    2.269 us |                                  }
>  0)  d.h.             |                                  bio_put() {
>  0)  d.h.    3.933 us |                                  }
>  0)  d.h.             |                                  dec_pending() {
>  0)  d.h.    0.100 us |                                    atomic_dec_and_test();
>  0)  d.h.             |                                    end_io_acct() {
>  0)  d.h.    5.655 us |                                    }
>  0)  d.h.             |                                    free_io() {
>  0)  d.h.    1.992 us |                                    }
>  0)  d.h.    0.098 us |                                    trace_block_bio_complete();
>  0)  d.h.             |                                    bio_endio() {
>  0)  d.h.             |                                      clone_endio() {
>  0)  d.h.             |                                        /* ==> enter clone_endio: tio: c1e14ee0 */
>  0)  d.h.    0.098 us |                                        arch_irqs_disabled_flags();
>  0)  d.h.             |                                        do_page_fault() {
>  0)  d.h.    0.103 us |                                          xen_read_cr2();
>  0)  d.h.             |                                          /* dpf: tsk: c785a6a0 mm: 0 comm: kworker/0:0 */
>  0)  d.h.             |                                          /* before vmalloc_fault (c9552044) regs: c786db1c ip: c082bb20 eflags: 10002 err: 0 irq: off */
>                                                                                                                                          ^^^ - fault error code
>  0)  d.h.             |                                          vmalloc_fault() {
>  0)  d.h.    0.104 us |                                            xen_read_cr3();
>  0)  d.h.             |                                            xen_pgd_val();
>  0)  d.h.             |                                            xen_pgd_val();
>  0)  d.h.             |                                            xen_set_pmd();
>  0)  d.h.             |                                            xen_pmd_val();
>  0)  d.h. + 14.599 us |                                          }
>  0)  d.h. + 18.019 us |                                        }
>      v -- irq enabled
>  0)  ..h.             |                                        /* ==> clone_endio: endio = tio->ti->type->end_io: tio->ti c9552040 */
>  0)  ..h.    0.102 us |                                        arch_irqs_disabled_flags();
>  0)  ..h.             |                                        /* <7>clone_endio BUG DETECTED irq */
> ========================================
>
> So the IF flag is restored right after exiting from do_page_fault().
>
> Any thoughts on why it might happen?
>
> PS:
> Full logs, an additional trace patch, the kernel config and a way to
> reproduce the bug can be found at
> https://bugzilla.redhat.com/show_bug.cgi?id=707552
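
To spell out the failure mode Jeremy describes above: "flags" in a
spin_lock_irqsave()/spin_unlock_irqrestore() pair is a stack local, so a
stack overrun can silently rewrite it between the two calls. A minimal
generic sketch, not the actual blkfront code (the lock and function names
here are stand-ins):

  #include <linux/spinlock.h>

  static DEFINE_SPINLOCK(io_lock);        /* stand-in for blkfront's lock */

  static void completion_path(void)
  {
          unsigned long flags;            /* saved IF state, on the stack */

          spin_lock_irqsave(&io_lock, flags);
          /*
           * ... completion work ...
           * If anything scribbles over "flags" here, the restore below
           * can set IF even though the interrupted context had
           * interrupts disabled.
           */
          spin_unlock_irqrestore(&io_lock, flags);
  }

That said, the call graph above does not really look like a corrupted
stack slot: the irqs-off column flips from "d.h." to "..h." exactly
across do_page_fault(), with the same tio dereference on both sides.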
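
For reference, the 32-bit vmalloc_fault() the trace walks through looks
roughly like this (paraphrased and trimmed from arch/x86/mm/fault.c of
the 3.0 era; vmalloc_sync_one() contains the set_pmd() that shows up as
xen_set_pmd() on a PV guest):

  static noinline int vmalloc_fault(unsigned long address)
  {
          unsigned long pgd_paddr;
          pmd_t *pmd_k;
          pte_t *pte_k;

          /* Only handle faults in the vmalloc area: */
          if (!(address >= VMALLOC_START && address < VMALLOC_END))
                  return -1;

          /*
           * Sync this task's top-level page table with init_mm's:
           * read_cr3() and the set_pmd() inside vmalloc_sync_one()
           * are the xen_read_cr3()/xen_set_pmd() pv-ops visible in
           * the call graph above.
           */
          pgd_paddr = read_cr3();
          pmd_k = vmalloc_sync_one(__va(pgd_paddr), address);
          if (!pmd_k)
                  return -1;

          pte_k = pte_offset_kernel(pmd_k, address);
          if (!pte_present(*pte_k))
                  return -1;

          return 0;
  }

Nothing in there (or in the C part of do_page_fault()) writes the saved
eflags, so if IF really comes back on between entering and leaving the
fault, my first suspect would be the exception-return path -- which on a
32-bit PV guest goes through Xen's event-channel mask handling instead
of a plain iret.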
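
For anyone who wants to reproduce the detection without digging through
the bugzilla attachments: the markers in the graph suggest
instrumentation along these lines. This is my reconstruction, not the
actual trace patch from the bug (trace_printk() is what shows up as the
"==> ..." comment lines in function_graph output), and it assumes it
sits inside drivers/md/dm.c where struct dm_target_io is visible:

  #include <linux/kernel.h>     /* trace_printk() */
  #include <linux/irqflags.h>   /* local_save_flags(), arch_irqs_disabled_flags() */

  /* Hypothetical helper: fetch the end_io hook and check IF around it. */
  static dm_endio_fn fetch_endio_checked(struct dm_target_io *tio)
  {
          unsigned long before, after;
          dm_endio_fn endio;

          local_save_flags(before);
          trace_printk("==> enter clone_endio: tio: %p\n", tio);

          /* This load is what vmalloc-faults on the 32-bit PV guest: */
          endio = tio->ti->type->end_io;

          trace_printk("==> clone_endio: endio = tio->ti->type->end_io: "
                       "tio->ti %p\n", tio->ti);
          local_save_flags(after);

          if (arch_irqs_disabled_flags(before) &&
              !arch_irqs_disabled_flags(after))
                  trace_printk("clone_endio BUG DETECTED irq\n");

          return endio;
  }

If that ever fires on bare metal or KVM the PV event-channel theory is
out, but per the report above it only fires on the 32-bit PV guest.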