
Re: [Xen-devel] Deadlock in /proc/xen/xenbus watch+read on 3.17+ (maybe earlier)



On Thu, 2015-03-19 at 02:19 +0100, Marek Marczykowski-Górecki wrote:
> Hi,
> 
> I've hit some deadlock in kernel xenstore client exposed via
> /proc/xen/xenbus.

Sounds similar to what Iurii also reported last night in "Userspace PV
backend hangs".

Iurii's case was all 3.14 kernels, which is in your range too.

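> Steps to reproduce are simple:
> #include <stddef.h>
> #include <xenstore.h>
>
> int main(void) {
>       struct xs_handle *xs;
>       xs = xs_open(0);                 /* xenstore client handle */
>       xs_watch(xs, "domid", "token");  /* starts the watch reader thread */
>       xs_read(xs, 0, "name", NULL);    /* hangs here on affected kernels */
>       return 0;
> }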
> 
> xs_watch internally creates a new thread, which uses read to wait for the
> watch. At the same time, the program tries to read some value, but it
> actually hangs while sending the command (before even sending the path to be
> read). Strace gives this (simplified for readability):
> [pid  2494] write(3, "\4\0\0\0\0\0\0\0\0\0\0\0\f\0\0\0", 16) = 16
> [pid  2494] write(3, "domid\0", 6)      = 6
> [pid  2494] write(3, "token\0", 6)      = 6
> [pid  2495] read(3,  <unfinished ...>
> [pid  2494] futex(0x71c0d4, FUTEX_WAIT_PRIVATE, 1, NULL <unfinished ...>
> [pid  2495] <... read resumed>
> "\17\0\0\0\377\377\377\377\220~\255\27\f\0\0\0", 16) = 16
> [pid  2495] read(3, "domid\0token\0", 12) = 12
> [pid  2495] read(3, "\4\0\0\0\0\0\0\0\0\0\0\0\3\0\0\0", 16) = 16
> [pid  2495] read(3, "OK\0", 3)          = 3
> [pid  2495] futex(0x71c0d4, FUTEX_WAKE_OP_PRIVATE, 1, 1, 0x71c0d0,
> {FUTEX_OP_SET, 0, FUTEX_OP_CMP_GT, 1} <unfinished ...>
> [pid  2494] <... futex resumed> )       = 0
> [pid  2495] <... futex resumed> )       = 1
> [pid  2494] futex(0x71c0a8, FUTEX_WAIT_PRIVATE, 2, NULL <unfinished ...>
> [pid  2495] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid  2494] <... futex resumed> )       = -1 EAGAIN (Resource
> temporarily unavailable)
> [pid  2495] <... futex resumed> )       = 0
> [pid  2494] futex(0x71c0a8, FUTEX_WAKE_PRIVATE, 1 <unfinished ...>
> [pid  2495] read(3,  <unfinished ...>
> [pid  2494] <... futex resumed> )       = 0
> [pid  2494] rt_sigaction(SIGPIPE, {SIG_DFL, [], SA_RESTORER,
> 0x7fc78c1488f0}, NULL, 8) = 0
> [pid  2494] rt_sigaction(SIGPIPE, {SIG_IGN, [], SA_RESTORER,
> 0x7fc78c1488f0}, {SIG_DFL, [], SA_RESTORER, 0x7fc78c1488f0}, 8) = 0
> [pid  2494] write(3, "\2\0\0\0\0\0\0\0\0\0\0\0\5\0\0\0", 16
> 
> And that's all - 2494 is waiting on write, 2495 is waiting on read.
> 
> On 3.12.x it is working. On 3.17.0 and 3.18.7 it is broken. I haven't
> checked versions in the middle.
> 
> Any ideas?
> 
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxx
> http://lists.xen.org/xen-devel
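
For anyone following the strace above: each 16-byte read or write is a
xenstore wire header of four 32-bit fields (type, request id, transaction
id, payload length). Below is a minimal sketch of a decoder for those
headers. The struct and function names (xs_hdr, xs_decode_hdr, ...) are made
up for illustration and are not part of libxenstore. The type numbers are
quoted from memory of Xen's public io/xs_wire.h and should be checked against
that header, and the little-endian byte order is an assumption that matches
the bytes shown in the trace.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical mirror of the 16-byte xenstore wire header
 * (assumed to match struct xsd_sockmsg in Xen's io/xs_wire.h). */
struct xs_hdr {
    uint32_t type;    /* message type */
    uint32_t req_id;  /* request identifier */
    uint32_t tx_id;   /* transaction identifier, 0 when unused */
    uint32_t len;     /* number of payload bytes that follow */
};

/* Names for the three message types seen in the trace only.
 * The numeric values are an assumption to verify against xs_wire.h. */
static const char *xs_type_name(uint32_t type)
{
    switch (type) {
    case 2:  return "XS_READ";
    case 4:  return "XS_WATCH";
    case 15: return "XS_WATCH_EVENT";
    default: return "other";
    }
}

/* Assemble a 32-bit value from four little-endian bytes. */
static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 16-byte header as it appears in a strace buffer. */
static struct xs_hdr xs_decode_hdr(const unsigned char buf[16])
{
    struct xs_hdr h;
    h.type   = le32(buf);
    h.req_id = le32(buf + 4);
    h.tx_id  = le32(buf + 8);
    h.len    = le32(buf + 12);
    return h;
}

/* Print a decoded header on one line. */
static void xs_print_hdr(const unsigned char buf[16])
{
    struct xs_hdr h = xs_decode_hdr(buf);
    printf("%s req_id=%u tx_id=%u len=%u\n", xs_type_name(h.type),
           (unsigned)h.req_id, (unsigned)h.tx_id, (unsigned)h.len);
}
```

Read this way, the first write by 2494 is a watch request with a 12-byte
payload ("domid\0" plus "token\0"), the first header read by 2495 is the
watch event, and the write that never returns is the header of a read
request announcing a 5-byte payload, which fits "name\0".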




 

