Xen project Mailing List

Re: [Xen-devel] Xen Linux deadlock

To: Andre Przywara <andre.przywara@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Juergen Gross <jgross@xxxxxxxx>

Date: Wed, 7 Jun 2017 17:51:58 +0200

Cc: Julien Grall <julien.grall@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>

Delivery-date: Wed, 07 Jun 2017 15:52:11 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

On 07/06/17 17:05, Andre Przywara wrote:
> Hi,
>
> when booting Linux 4.12-rc4 as Dom0 under a recent Xen HV I saw the
> following lockdep splat after running xencommons start:
>
> root@junor1:~# bash /etc/init.d/xencommons start
> Setting domain 0 name, domid and JSON config...
> [  247.979498] ======================================================
> [  247.985688] WARNING: possible circular locking dependency detected
> [  247.991882] 4.12.0-rc4-00022-gc4b25c0 #575 Not tainted
> [  247.997040] ------------------------------------------------------
> [  248.003232] xenbus/91 is trying to acquire lock:
> [  248.007875]  (&u->msgbuffer_mutex){+.+.+.}, at: [<ffff00000863e904>]
> xenbus_dev_queue_reply+0x3c/0x230
> [  248.017163]
> [  248.017163] but task is already holding lock:
> [  248.023096]  (xb_write_mutex){+.+...}, at: [<ffff00000863a940>]
> xenbus_thread+0x5f0/0x798
> [  248.031267]
> [  248.031267] which lock already depends on the new lock.
> [  248.031267]
> [  248.039615]
> [  248.039615] the existing dependency chain (in reverse order) is:
> [  248.047176]
> [  248.047176] -> #1 (xb_write_mutex){+.+...}:
> [  248.052943]        __lock_acquire+0x1728/0x1778
> [  248.057498]        lock_acquire+0xc4/0x288
> [  248.061630]        __mutex_lock+0x84/0x868
> [  248.065755]        mutex_lock_nested+0x3c/0x50
> [  248.070227]        xs_send+0x164/0x1f8
> [  248.074015]        xenbus_dev_request_and_reply+0x6c/0x88
> [  248.079427]        xenbus_file_write+0x260/0x420
> [  248.084073]        __vfs_write+0x48/0x138
> [  248.088113]        vfs_write+0xa8/0x1b8
> [  248.091983]        SyS_write+0x54/0xb0
> [  248.095768]        el0_svc_naked+0x24/0x28
> [  248.099897]
> [  248.099897] -> #0 (&u->msgbuffer_mutex){+.+.+.}:
> [  248.106088]        print_circular_bug+0x80/0x2e0
> [  248.110730]        __lock_acquire+0x1768/0x1778
> [  248.115288]        lock_acquire+0xc4/0x288
> [  248.119417]        __mutex_lock+0x84/0x868
> [  248.123545]        mutex_lock_nested+0x3c/0x50
> [  248.128016]        xenbus_dev_queue_reply+0x3c/0x230
> [  248.133005]        xenbus_thread+0x788/0x798
> [  248.137306]        kthread+0x110/0x140
> [  248.141087]        ret_from_fork+0x10/0x40
> [  248.145214]
> [  248.145214] other info that might help us debug this:
> [  248.145214]
> [  248.153383]  Possible unsafe locking scenario:
> [  248.153383]
> [  248.159403]        CPU0                    CPU1
> [  248.163960]        ----                    ----
> [  248.168518]   lock(xb_write_mutex);
> [  248.172045]                                lock(&u->msgbuffer_mutex);
> [  248.178500]                                lock(xb_write_mutex);
> [  248.184514]   lock(&u->msgbuffer_mutex);
> [  248.188470]
> [  248.188470]  *** DEADLOCK ***
> [  248.188470]
> [  248.194578] 2 locks held by xenbus/91:
> [  248.198360]  #0:  (xs_response_mutex){+.+...}, at:
> [<ffff00000863a7b0>] xenbus_thread+0x460/0x798
> [  248.207218]  #1:  (xb_write_mutex){+.+...}, at: [<ffff00000863a940>]
> xenbus_thread+0x5f0/0x798
> [  248.215818]
> [  248.215818] stack backtrace:
> [  248.220293] CPU: 0 PID: 91 Comm: xenbus Not tainted
> 4.12.0-rc4-00022-gc4b25c0 #575
> [  248.227858] Hardware name: ARM Juno development board (r1) (DT)
> [  248.233792] Call trace:
> [  248.236289] [<ffff00000808a748>] dump_backtrace+0x0/0x270
> [  248.241707] [<ffff00000808aa94>] show_stack+0x24/0x30
> [  248.246782] [<ffff0000084caa98>] dump_stack+0xb8/0xf0
> [  248.251859] [<ffff000008139068>] print_circular_bug+0x1f8/0x2e0
> [  248.257787] [<ffff00000813c090>] __lock_acquire+0x1768/0x1778
> [  248.263548] [<ffff00000813c90c>] lock_acquire+0xc4/0x288
> [  248.268882] [<ffff000008bdb28c>] __mutex_lock+0x84/0x868
> [  248.274219] [<ffff000008bdbaac>] mutex_lock_nested+0x3c/0x50
> [  248.279889] [<ffff00000863e904>] xenbus_dev_queue_reply+0x3c/0x230
> [  248.286081] [<ffff00000863aad8>] xenbus_thread+0x788/0x798
> [  248.291585] [<ffff000008108070>] kthread+0x110/0x140
> [  248.296572] [<ffff000008083710>] ret_from_fork+0x10/0x40
>
> Apparently it's not easily reproducible, but Julien confirmed that the
> dead lock condition as reported above is indeed in the Linux code.
>
> Does anyone has an idea of how to fix this?

Shouldn't be too hard. The xb_write_mutex can be dropped earlier in the
critical path. I'll send a patch.
Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel
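
For readers following the thread, here is a minimal userspace sketch of the idea in the reply: release the first lock before taking the second, so the reply path never holds both. It is not the patch mentioned above and not kernel code. The pthread mutexes only stand in for xb_write_mutex and u->msgbuffer_mutex, and the names write_mutex, msgbuffer_mutex, pending, delivered, writer, replier, run_demo and ROUNDS are all hypothetical. The write path takes the locks in the order implied by dependency chain entry #1 (msgbuffer lock held, then write lock).

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

/* Hypothetical stand-ins for the two kernel locks named in the splat. */
static pthread_mutex_t write_mutex = PTHREAD_MUTEX_INITIALIZER;     /* plays xb_write_mutex */
static pthread_mutex_t msgbuffer_mutex = PTHREAD_MUTEX_INITIALIZER; /* plays u->msgbuffer_mutex */

static int pending;   /* requests awaiting a reply; guarded by write_mutex */
static int delivered; /* replies handed to the user; guarded by msgbuffer_mutex */

enum { ROUNDS = 10000 };

/* Write path: msgbuffer_mutex first, then write_mutex (nested). */
static void *writer(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        pthread_mutex_lock(&msgbuffer_mutex);
        pthread_mutex_lock(&write_mutex);
        pending++;
        pthread_mutex_unlock(&write_mutex);
        pthread_mutex_unlock(&msgbuffer_mutex);
    }
    return NULL;
}

/*
 * Reply path: claim a request under write_mutex, drop that lock, and only
 * then take msgbuffer_mutex. Holding write_mutex across the second lock
 * would recreate the inverted order that lockdep reported.
 */
static void *replier(void *arg)
{
    (void)arg;
    int handled = 0;
    while (handled < ROUNDS) {
        int have = 0;

        pthread_mutex_lock(&write_mutex);
        if (pending > 0) {
            pending--;
            have = 1;
        }
        pthread_mutex_unlock(&write_mutex); /* dropped before the next lock */

        if (!have) {
            sched_yield(); /* nothing to do yet; let the writer run */
            continue;
        }

        pthread_mutex_lock(&msgbuffer_mutex);
        delivered++;
        pthread_mutex_unlock(&msgbuffer_mutex);
        handled++;
    }
    return NULL;
}

/* Runs both paths concurrently; returns the number of replies delivered, or -1. */
int run_demo(void)
{
    pthread_t w, r;

    pending = 0;
    delivered = 0;
    if (pthread_create(&w, NULL, writer, NULL) != 0)
        return -1;
    if (pthread_create(&r, NULL, replier, NULL) != 0) {
        pthread_join(w, NULL);
        return -1;
    }
    pthread_join(w, NULL);
    pthread_join(r, NULL);
    assert(pending == 0); /* every request was matched by a reply */
    return delivered;
}
```

Calling run_demo() from a main() and building with -pthread should finish with delivered equal to ROUNDS. Because the reply path holds at most one lock at a time, no cycle between the two mutexes can form, whichever way the threads interleave.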
