[Xen-devel] Basic blktap2 functionality issues.
Hi, hope the day is going well for everyone.

I posted a note about this issue two weeks ago and didn't get any response. I don't know if that indicates that people are not using blktap or if the question stumped everyone.

First of all, there is a demonstrable issue with blktap in 4.1.2 and xl. Anyone trying to use it to export files to a guest will be affected. Using xl to do 'block-attach' and 'block-detach' in dom0 also doesn't work in stock 4.1.2, so I suspect there are generic issues with blktap and xl.

In stock 4.1.2, using xl results in the tapdisk2 server process being orphaned. There is a patch floating around from Ian Campbell which fixes this problem and either creates or uncovers what may be a more fundamental problem. With Ian's patch applied the tapdisk2 process terminates, but the tapdisk device is not released, resulting in a steady accumulation of orphan minor numbers.

The underlying cause of this appears to be a resource deadlock between xl, which is requesting a detach of the VBD, and the tapdisk2 process. Ian actually acknowledges the phenomenon in one of his posts, saying there appeared to be a 'hang' when the VBD is released with his patches. The 'hang' is actually a livelock where the entire kernel blocks until the select call from xl to the tapdisk2 process times out and rescues things. The error return from the select call is what causes the xl code to not complete the release of the tapdisk minor.

I'm still hunting through the code maze trying to find the underlying problem. The tapdisk2 process is hanging on the munmap of the ring buffers. It 'feels' as if the problem may be secondary to xl still holding a reference to some resource, which blocks the unmap of the ring buffers. I will continue to hunt, but if anyone has any pointers, send them my way. It takes a bit of time to come up to speed on all the code paths.

In a broader context, I think the Xen community needs to take a reasoned look at how to handle the issue of exporting files to guests as block devices.
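To make the select timeout failure mode described above concrete, here is a minimal sketch in C. It is not the actual xl, libxl, or tapdisk2 code: `fake_minor`, `wait_for_reply`, and `detach_leaky` are hypothetical names invented for illustration, and a plain file descriptor stands in for the control channel to tapdisk2. It only shows how an early return on a select() timeout can skip the step that releases a resource.

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical stand-in for a tapdisk minor; not a real blktap2 structure. */
struct fake_minor {
    bool allocated;
};

/* Wait up to timeout_ms for the peer to answer on fd.
 * Returns 0 if fd became readable, -1 on timeout or error. */
static int wait_for_reply(int fd, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;

    assert(timeout_ms >= 0);
    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int n = select(fd + 1, &rfds, NULL, NULL, &tv);
    return (n > 0) ? 0 : -1;    /* n == 0 is a timeout, n < 0 an error */
}

/* Detach path that bails out as soon as the wait fails.
 * On a timeout the release step below is never reached,
 * so the minor stays allocated. */
static int detach_leaky(int reply_fd, struct fake_minor *m, int timeout_ms)
{
    if (wait_for_reply(reply_fd, timeout_ms) < 0)
        return -1;              /* early return: minor is not released */

    m->allocated = false;       /* release only happens on success */
    return 0;
}
```

Calling `detach_leaky` on the read end of a pipe that nobody writes to mimics a peer that never answers: the call returns -1 after the timeout and `allocated` is still true, which is the same shape as the orphan minors accumulating here.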
It appears as if Dan Stodden has disappeared, so I get the sense the entire blktap2 architecture is a bit rudderless (no judgement, just observation). I'm trying to put my money where my mouth is and actually hunt for the problems which are there.

I've seen rumbles on LKML about modifications to loop in order to deal with the page cache issues which hinder the reliability of delivering virtual block devices over loop. I don't know how far out that is or even if it will eventuate.

I've also heard rumbles about a mythical 'blktap3' which runs completely in userspace. If that is the direction things are going, and there is 'blktap3' code someplace, I would certainly be willing to hammer on that rather than put more time into blktap2.

I will look forward to any comments or suggestions people may have.

Have a good weekend.

As always,
Dr. G.W. Wettstein, Ph.D.   Enjellic Systems Development, LLC.
4206 N. 19th Ave.           Specializing in information infra-structure
Fargo, ND  58102            development.
PH: 701-281-1686
FAX: 701-281-3949           EMAIL: greg@xxxxxxxxxxxx
------------------------------------------------------------------------------
"My thoughts on trusting Open-Source?  A quote I once saw said it
 best: 'Remember, Amateurs built the ark.  Professionals built the
 Titanic.'  Perhaps most significantly the ark was one guy, there were
 no doubt committees involved with the Titanic project."
                                -- Dr. G.W. Wettstein
                                   Resurrection

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel