Re: [Xen-devel] Xen blktap driver for Ceph RBD : Anybody wants to test ? :p
>
> Hi,
>
> > I've had a few occasions where tapdisk has segfaulted:
> >
> > tapdisk[9180]: segfault at 7f7e3a5c8c10 ip 00007f7e387532d4 sp 00007f7e3a5c8c10 error 4 in libpthread-2.13.so[7f7e38748000+17000]
> > tapdisk:9180 blocked for more than 120 seconds.
> > tapdisk D ffff88043fc13540 0 9180 1 0x00000000
> >
> > and then like:
> >
> > end_request: I/O error, dev tdc, sector 472008
> >
> > I can't be sure but I suspect that when this happened either one OSD was
> > offline, or the cluster lost quorum briefly.
>
> Interesting. There might be an issue if a request ends in error, I'll
> have to check that.
> I'll have a look on Monday.
>
You say in tdrbd_finish_aiocb:
    while (1) {
        /* POSIX says write will be atomic or blocking */
        rv = write(prv->pipe_fds[1], (void*)&req, sizeof(req));
but from what I've read in "man 7 pipe", the statement about being atomic only
applies if the pipe is opened in non-blocking mode. You open it with a call
to pipe() (same as pipe2(,0)) and you never call fcntl to change it. This would
be consistent with the random crashes I'm seeing. I thought they were related
to transient errors, but my ceph cluster has been perfectly stable for a few
days now and the crashes are still happening.
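
To make the question concrete, here is a minimal sketch (not the driver's
code) of how the pipe's mode could be checked and how a pointer-sized record
could be written without relying on either reading of pipe(7). The helper
names fd_is_nonblocking and write_full are hypothetical, made up for this
illustration. It only assumes standard POSIX calls: fcntl(F_GETFL) to read
the file status flags, and write() possibly returning early or failing with
EINTR.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: report whether O_NONBLOCK is set on fd.
 * Returns 1 if set, 0 if not set, -1 if fcntl() fails. */
static int fd_is_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (flags & O_NONBLOCK) ? 1 : 0;
}

/* Hypothetical helper: write exactly len bytes to fd.
 * Retries when interrupted by a signal (EINTR) and continues after a
 * short write. Any other error, including EAGAIN on a non-blocking
 * descriptor, is returned to the caller as -1 with errno set.
 * Returns 0 once all bytes have been written. */
static int write_full(int fd, const void *buf, size_t len)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t rv = write(fd, p, len);
        if (rv < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before writing anything */
            return -1;          /* real error: let the caller decide */
        }
        p += rv;                /* skip what was already written */
        len -= (size_t)rv;
    }
    return 0;
}
```

With something like this, the completion callback could call
write_full(prv->pipe_fds[1], &req, sizeof(req)) and check the return value,
and fd_is_nonblocking() could be used once after pipe() to confirm which
mode the descriptors are really in.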
What do you think?
Thanks
James
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel