Re: [Xen-devel] Basic blktap2 functionality issues.
On Thu, 2012-03-29 at 18:35 +0100, Stefano Stabellini wrote:
> On Thu, 29 Mar 2012, greg@xxxxxxxxxxxx wrote:
> > Hi, hope the day is going well for everyone.
> >
> > I had posted a note about this issue two weeks ago and didn't get any
> > response. I don't know if that indicates that people are not using
> > blktap or if the question stumped everyone.
> >
> > First of all there is a documentable issue with blktap in 4.1.2 and
> > xl. Anyone trying to use it to export files to a guest will be
> > affected. Using xl to do 'block-attach' and 'block-detach' in dom0
> > also doesn't work in stock 4.1.2 so I suspect there are generic issues
> > with blktap and xl.
> >
> > In stock 4.1.2 using xl results in the tapdisk2 server process being
> > orphaned. There is a patch floating around from Ian Campbell which
> > fixes this problem and either creates or uncovers what may be a more
> > fundamental problem.
> >
> > With Ian's patch applied the tapdisk2 process terminates but the
> > tapdisk device is not released, resulting in a steady accumulation of
> > orphan minor numbers. The underlying cause of this appears to be a
> > resource deadlock between xl requesting a detach of the VBD and the
> > tapdisk2 process.
> >
> > Ian actually acknowledges the phenomenon in one of his posts saying
> > there appeared to be a 'hang' when the VBD is released with his
> > patches. The 'hang' is actually a livelock where the entire kernel
> > blocks until the select call from xl to the tapdisk2 process times out
> > and rescues things. The error return from the select call is what
> > causes the xl code to not complete the release of the tapdisk minor.
> >
> > I'm still hunting through the code maze trying to find the underlying
> > problem. The tapdisk2 process is hanging on the munmap of the ring
> > buffers. It 'feels' as if the problem may be secondary to xl still
> > holding a reference to some resource which blocks the unmap of the
> > ring buffers.
> >
> > I will continue to hunt but if anyone has any pointers send them my
> > way. It takes a bit of time to come up to speed on all the code
> > paths.
> >
> > In a broader context I think the XEN community needs to take a
> > reasoned look at how to handle the file issue. It appears as if Dan
> > Stodden has disappeared so I get the sense the entire blktap2
> > architecture is a bit rudderless (no judgement, just observation). I'm
> > trying to put my money where my mouth is and actually hunt for the
> > problems which are there.
> >
> > I've seen rumbles on LKML about discussions with regards to
> > modifications to loop in order to deal with the page cache issues
> > which hinder the reliability of delivering virtual block devices over
> > loop. I don't know how far out that is or even if it will eventuate.
> > I've also heard rumbles about a mythical 'blktap3' which runs
> > completely in userspace. If that is the direction things are going I
> > would certainly be willing to hammer on that rather than put more time
> > into blktap2 if there is 'blktap3' code someplace.
> >
> > I will look forward to any comments or suggestions people may have.
>
> A completely userspace blktap version is indeed in the works, but it is
> not clear yet when it is going to be ready.
>
> Alternatively, if you don't need VHD format support, you can use QCOW or
> QCOW2 with upstream QEMU in xen-unstable, which gives very good
> performance and still provides copy-on-write support.
>

We should still support blktap2 in 4.2 for people who have a kernel
with the necessary driver included, so I still think the bug which
Greg reports needs investigating and hopefully fixing.

Ian.
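For readers following the thread, here is a minimal sketch of the failure mode Greg describes: a caller waits on a worker with select, the wait times out, and the error return means the cleanup step is never reached. This is not xl, libxl or tapdisk2 code. Every name in it (`detach`, `request_close`, `simulate`, the `minors` set) is hypothetical, and a pipe stands in for whatever channel xl really uses to talk to tapdisk2.

```python
import os
import select


def request_close(reply_fd, timeout_s):
    """Wait for the worker's reply; True only if it answered in time."""
    readable, _, _ = select.select([reply_fd], [], [], timeout_s)
    return bool(readable)


def detach(reply_fd, minors, minor, timeout_s=0.1):
    """Hypothetical detach path: the minor is released only on a clean reply."""
    if not request_close(reply_fd, timeout_s):
        # Timeout is treated as an error, so the release below is skipped
        # and the minor number stays allocated.
        return False
    os.read(reply_fd, 1)      # consume the worker's acknowledgement
    minors.discard(minor)     # release the minor
    return True


def simulate(worker_replies):
    """Run one detach against a worker that either replies or stays blocked."""
    read_fd, write_fd = os.pipe()
    minors = {0}              # one allocated minor, number 0
    try:
        if worker_replies:
            os.write(write_fd, b"k")   # worker acknowledges the close
        ok = detach(read_fd, minors, 0)
    finally:
        os.close(read_fd)
        os.close(write_fd)
    return ok, minors
```

Calling `simulate(False)` models a worker stuck in something like the munmap: the detach reports failure and minor 0 is still in the set afterwards, which is the kind of leak that would show up as accumulating orphan minors if it happened on every detach.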