"rcu_preempt detected stalls" with xen_free_irq involved - regression
Hi,
Since updating from 5.15.124 to 6.1.43, I fairly often see the issue
named in the subject. It happens on a domU with heavy vchan usage
(several connections established and released per second).
The domain in question is a PVH with 16 vCPUs and is generally rather
busy (CPU time, plus noticeable network and disk I/O), but I've also
seen it happen during less intensive periods (still with several
vchan connections being handled).
This is running on Xen 4.17.2, AMD EPYC (Zen3), with smt=off.
Any ideas?
Full message:
rcu: INFO: rcu_preempt detected stalls on CPUs/tasks:
rcu: 9-...0: (0 ticks this GP) idle=2364/1/0x4000000000000000 softirq=20505/20505 fqs=11999
(detected by 12, t=60004 jiffies, g=79009, q=1863 ncpus=16)
Sending NMI from CPU 12 to CPUs 9:
NMI backtrace for cpu 9
CPU: 9 PID: 18266 Comm: qrexec-agent Not tainted 6.1.43-1.qubes.fc37.x86_64 #1
RIP: 0010:queued_write_lock_slowpath+0x64/0x124
Code: ff 90 0f 1f 44 00 00 5b 5d c3 cc cc cc cc f0 81 0b 00 01 00 00 ba ff 00 00 00 b9 00 01 00 00 8b 03 3d 00 01 00 00 74 0b f3 90 <8b> 03 3d 00 01 00 00 75 f5 89 c8 f0 0f b1 13 74 be eb e2 65
RSP: 0018:ffffc9000229fd30 EFLAGS: 00000006
RAX: 0000000000000500 RBX: ffffffff8468de60 RCX: 0000000000000100
RDX: 00000000000000ff RSI: ffff8881004b86d8 RDI: ffffffff8468de60
RBP: ffffffff8468de64 R08: ffff8881004b8860 R09: ffffffff82d47600
R10: 0000000000000000 R11: 0000000000000000 R12: ffff88810464e3e0
R13: ffff8881012a76a0 R14: ffff88838d1f5a90 R15: 0000000000000000
FS: 0000716dc7f6b780(0000) GS:ffff8883dc040000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000716dc7f7c060 CR3: 00000001e3d0c001 CR4: 0000000000770ee0
PKRU: 55555554
Call Trace:
<NMI>
? show_trace_log_lvl+0x1d3/0x2ef
? show_trace_log_lvl+0x1d3/0x2ef
? show_trace_log_lvl+0x1d3/0x2ef
? __raw_write_lock_irqsave+0x3d/0x50
? nmi_cpu_backtrace.cold+0x1b/0x76
? queued_write_lock_slowpath+0x64/0x124
? nmi_cpu_backtrace_handler+0xd/0x20
? nmi_handle+0x5d/0x120
? queued_write_lock_slowpath+0x64/0x124
? default_do_nmi+0x69/0x170
? exc_nmi+0x13c/0x170
? end_repeat_nmi+0x16/0x67
? queued_write_lock_slowpath+0x64/0x124
? queued_write_lock_slowpath+0x64/0x124
? queued_write_lock_slowpath+0x64/0x124
</NMI>
<TASK>
__raw_write_lock_irqsave+0x3d/0x50
xen_free_irq+0x43/0x170
unbind_from_irqhandler+0x40/0x80
evtchn_release+0x27/0x8e [xen_evtchn]
__fput+0x91/0x250
task_work_run+0x59/0x90
exit_to_user_mode_loop+0x121/0x150
exit_to_user_mode_prepare+0xaf/0xc0
syscall_exit_to_user_mode+0x17/0x40
do_syscall_64+0x67/0x80
? handle_mm_fault+0xdb/0x2d0
? preempt_count_add+0x47/0xa0
? up_read+0x37/0x70
? do_user_addr_fault+0x1bb/0x570
? exc_page_fault+0x70/0x170
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x716dc81077ea
Code: 48 3d 00 f0 ff ff 77 48 c3 0f 1f 80 00 00 00 00 48 83 ec 18 89 7c 24 0c e8 d3 ce f8 ff 8b 7c 24 0c 89 c2 b8 03 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 36 89 d7 89 44 24 0c e8 33 cf f8 ff 8b
RSP: 002b:00007ffce6ce80e0 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
RAX: 0000000000000000 RBX: 000055aae7e8c150 RCX: 0000716dc81077ea
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000014
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000716dc80086b8 R11: 0000000000000293 R12: 000055aae7e8ad70
R13: 0000000000000003 R14: 0000716dc8020bf8 R15: 00007ffce6ce81a0
</TASK>
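A rough sketch of how the RAX value in the register dump could be read. It rests on two assumptions that are not stated in the log: that RAX holds the most recent read of the lock word CPU 9 is spinning on in queued_write_lock_slowpath, and that the lock word uses the generic qrwlock layout from include/asm-generic/qrwlock.h (writer byte in bits 0-7, writer-waiting flag 0x100, reader count from bit 9). The function name decode_qrwlock is made up for illustration.

```python
# Field layout assumed from include/asm-generic/qrwlock.h; not confirmed by the log.
QW_LOCKED = 0x0FF   # low byte: a writer holds the lock
QW_WAITING = 0x100  # a writer is waiting for readers to drain
QR_SHIFT = 9        # reader count starts at this bit


def decode_qrwlock(cnts: int) -> dict:
    """Split a qrwlock counter word into its writer and reader fields."""
    return {
        "writer_locked": (cnts & QW_LOCKED) == QW_LOCKED,
        "writer_waiting": bool(cnts & QW_WAITING),
        "readers": cnts >> QR_SHIFT,
    }


# RAX from the NMI backtrace for CPU 9 (assumed to be the lock word value).
state = decode_qrwlock(0x500)
print(state)
```

If both assumptions hold, 0x500 would correspond to a waiting writer and two readers still holding the lock. That is only an interpretation of one register snapshot, not a diagnosis.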
I've also seen a few other flavors of the above, for example:
https://gist.github.com/marmarek/a8b79ef2a877443c7aa57fdca366a701
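To compare such flavors, it can help to reduce each backtrace to the frames the unwinder considers reliable, since entries prefixed with "?" are addresses found on the stack that may not be part of the actual call chain. The following is a minimal sketch only: the helper name reliable_task_frames is hypothetical, and it assumes the log lines carry no timestamp prefix, as in the trace above.

```python
import re

# Matches a frame such as "xen_free_irq+0x43/0x170" or
# "evtchn_release+0x27/0x8e [xen_evtchn]", optionally prefixed with "? ".
FRAME = re.compile(r"^(\?\s+)?([A-Za-z_][\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")


def reliable_task_frames(log: str) -> list[str]:
    """Return function names between <TASK> and </TASK>, skipping '?' frames."""
    frames, in_task = [], False
    for line in log.splitlines():
        stripped = line.strip()
        if stripped == "<TASK>":
            in_task = True
        elif stripped == "</TASK>":
            in_task = False
        elif in_task:
            m = FRAME.match(stripped)
            if m and not m.group(1):  # keep only frames without the "?" marker
                frames.append(m.group(2))
    return frames


# Shortened excerpt of the trace above, used only as sample input.
sample = """\
<NMI>
? nmi_handle+0x5d/0x120
</NMI>
<TASK>
__raw_write_lock_irqsave+0x3d/0x50
xen_free_irq+0x43/0x170
unbind_from_irqhandler+0x40/0x80
evtchn_release+0x27/0x8e [xen_evtchn]
? handle_mm_fault+0xdb/0x2d0
</TASK>
"""
print(reliable_task_frames(sample))
```

Running the same helper over each trace gives a short list of function names per stall, which makes it easier to see whether they all end in xen_free_irq.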
--
Best Regards,
Marek Marczykowski-Górecki
Invisible Things Lab