"rcu_preempt detected stalls" with xen_free_irq involved - regression
Hi,

Since updating from 5.15.124 to 6.1.43, I observe rather often an issue
like in the subject. This happens on a domU with heavy vchan usage
(several connections established and released per second). The domain in
question is a PVH with 16 vCPUs and generally is rather busy (CPU time,
but also some noticeable network and disk I/O), but I've seen this
happening also in less intensive times (but still several vchan
connections being handled).

This is running on Xen 4.17.2, AMD EPYC (Zen3), with smt=off.

Any ideas?

Full message:

rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 9-...0: (0 ticks this GP) idle=2364/1/0x4000000000000000 softirq=20505/20505 fqs=11999
(detected by 12, t=60004 jiffies, g=79009, q=1863 ncpus=16)
Sending NMI from CPU 12 to CPUs 9:
NMI backtrace for cpu 9
CPU: 9 PID: 18266 Comm: qrexec-agent Not tainted 6.1.43-1.qubes.fc37.x86_64 #1
RIP: 0010:queued_write_lock_slowpath+0x64/0x124
Code: ff 90 0f 1f 44 00 00 5b 5d c3 cc cc cc cc f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 <8b> 03 3d 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 be eb e2 65
RSP: 0018:ffffc9000229fd30 EFLAGS: 00000006
RAX: 0000000000000500 RBX: ffffffff8468de60 RCX: 0000000000000100
RDX: 00000000000000ff RSI: ffff8881004b86d8 RDI: ffffffff8468de60
RBP: ffffffff8468de64 R08: ffff8881004b8860 R09: ffffffff82d47600
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810464e3e0
R13: ffff8881012a76a0 R14: ffff88838d1f5a90 R15: 0000000000000000
FS: 0000716dc7f6b780(0000) GS:ffff8883dc040000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000716dc7f7c060 CR3: 00000001e3d0c001 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
 <NMI>
 ? show_trace_log_lvl+0x1d3/0x2ef
 ? show_trace_log_lvl+0x1d3/0x2ef
 ? show_trace_log_lvl+0x1d3/0x2ef
 ? __raw_write_lock_irqsave+0x3d/0x50
 ? nmi_cpu_backtrace.cold+0x1b/0x76
 ? queued_write_lock_slowpath+0x64/0x124
 ? nmi_cpu_backtrace_handler+0xd/0x20
 ? nmi_handle+0x5d/0x120
 ? queued_write_lock_slowpath+0x64/0x124
 ? default_do_nmi+0x69/0x170
 ? exc_nmi+0x13c/0x170
 ? end_repeat_nmi+0x16/0x67
 ? queued_write_lock_slowpath+0x64/0x124
 ? queued_write_lock_slowpath+0x64/0x124
 ? queued_write_lock_slowpath+0x64/0x124
 </NMI>
 <TASK>
 __raw_write_lock_irqsave+0x3d/0x50
 xen_free_irq+0x43/0x170
 unbind_from_irqhandler+0x40/0x80
 evtchn_release+0x27/0x8e [xen_evtchn]
 __fput+0x91/0x250
 task_work_run+0x59/0x90
 exit_to_user_mode_loop+0x121/0x150
 exit_to_user_mode_prepare+0xaf/0xc0
 syscall_exit_to_user_mode+0x17/0x40
 do_syscall_64+0x67/0x80
 ? handle_mm_fault+0xdb/0x2d0
 ? preempt_count_add+0x47/0xa0
 ? up_read+0x37/0x70
 ? do_user_addr_fault+0x1bb/0x570
 ? exc_page_fault+0x70/0x170
 entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x716dc81077ea
Code: 48 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c 24 0c e8 d3 ce f8 ff 8b 7c 24 0c 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 33 cf f8 ff 8b
RSP: 002b:00007ffce6ce80e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 000055aae7e8c150 RCX: 0000716dc81077ea
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000014
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000716dc80086b8 R11: 0000000000000293 R12: 000055aae7e8ad70
R13: 0000000000000003 R14: 0000716dc8020bf8 R15: 00007ffce6ce81a0
 </TASK>

I've seen also few other flavors of the above, for example:
https://gist.github.com/marmarek/a8b79ef2a877443c7aa57fdca366a701

--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab

Attachment: signature.asc