Re: [Xen-devel] Re: de-BKLing blkfront
On 07/21/2010 03:26 PM, Daniel Stodden wrote:
> On Wed, 2010-07-21 at 17:39 -0400, Jeremy Fitzhardinge wrote:
>> On 07/21/2010 02:12 PM, Daniel Stodden wrote:
>>> On Wed, 2010-07-21 at 15:49 -0400, Jeremy Fitzhardinge wrote:
>>>> When I was preparing the latest set of blkfront patches to send
>>>> upstream to Jens Axboe, he pointed out there were conflicts with
>>>> what he currently has queued.
>>>>
>>>> It turns out the conflict was from pushing the BKL
>>>> (lock/unlock_kernel) into the open and release functions. I did the
>>>> merge keeping them around all the new stuff you added to those
>>>> functions, but I wonder if it's actually necessary. Do we rely on
>>>> open/release being globally serialized in there?
>>>>
>>>> I've pushed what I have into the upstream/blkfront branch in xen.git.
>>>>
>>> Whatever it was, a BKL presumably fixed it.
>>>
>> There's an ongoing project to remove the BKL; part of that is to remove
>> implicit use of the BKL from the core kernel and push uses down to
>> drivers which need it. That basically means mechanically adding
>> lock_kernel/unlock_kernel pairs to driver functions as they're removed
>> from the core kernel. blkfront got hit with that at some point (I
>> haven't identified precisely where), so we have the option to remove
>> those lock/unlocks if they're not necessary.
>>
> Ah, understood. But for a while I used to dig through bdev open/release
> stuff quite regularly, and I was never aware of any potential big lock
> on that path to push down.
>
> Do you think it's worth looking into the now-removed lock usage in more
> detail? I'd take a hint regarding which code was affected.
>
>>> Anyway, it should not be necessary any more.
>>>
>>> Next I made the common mistake of looking into my code again, and
>>> immediately found I would send an unlucky opener transiently spinning
>>> in blkdev_get. Sometimes I wonder what I'm thinking.
>>>
>> What's the effect of that?
>>
> Uhm, false alarm. Better just ignore me.
>
> I stuck that ERESTARTSYS stuff into blkif_open. It's not strictly
> needed, but it looked like a good idea. I picked it up from the device
> mapper code.
>
> It's telling the block layer to retry if it managed to get a ref on a
> disk which just happened to be del_gendisk'd on a different thread. So
> it will retry and either succeed with a new disk, or fail right on the
> next attempt.
>
> I would have bet there would be a case where blkfront clears
> disk->private_data before removing the disk. But apparently not.
> Otherwise a thread hitting that phase would peg the CPU until the
> removal succeeded elsewhere.
>
> Sorry for the noise.
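For reference, here is a rough sketch of what that open path looks like
with both pieces in place: the mechanically pushed-down
lock_kernel/unlock_kernel pair and the -ERESTARTSYS retry. It is
illustrative only; the info->mutex and info->gd fields are assumed from
the blkfront changes being discussed and may not match the tree exactly.

    /*
     * Sketch only: assumes blkfront's struct blkfront_info carries the
     * per-device mutex and gendisk pointer discussed in this thread.
     */
    #include <linux/blkdev.h>     /* struct block_device, struct gendisk */
    #include <linux/smp_lock.h>   /* lock_kernel() / unlock_kernel() */
    #include <linux/mutex.h>

    static int blkif_open(struct block_device *bdev, fmode_t mode)
    {
            struct gendisk *disk = bdev->bd_disk;
            struct blkfront_info *info;
            int err = 0;

            /* The pushed-down BKL: formerly taken implicitly by the
             * core kernel around ->open(), now explicit in the driver. */
            lock_kernel();

            info = disk->private_data;
            if (!info) {
                    /* Device already torn down; ask blkdev_get to retry. */
                    err = -ERESTARTSYS;
                    goto out;
            }

            mutex_lock(&info->mutex);
            if (!info->gd)
                    /* Disk was del_gendisk'd on another thread; returning
                     * -ERESTARTSYS makes the opener retry and then either
                     * find a fresh disk or fail cleanly, rather than spin. */
                    err = -ERESTARTSYS;
            mutex_unlock(&info->mutex);

    out:
            unlock_kernel();
            return err;
    }

If the driver's own locking already covers everything blkif_open and
blkif_release touch, then the lock_kernel/unlock_kernel pair above would
be exactly the part that can be dropped, which is the question raised
earlier in the thread.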
>>> Hold on for a patch to both.
>>>
> So rather not. As far as I can tell, feel free to just drop the BKL
> code.
>
>> Thanks. BTW, did you get a chance to look into those barrier comments?
>>
> Nope, sorry. In fact I dropped that one. Getting more urgent?

Well, it would be nice to address it for the next merge window, if
there's something to address.

There's also a mysterious IO-related hang I've been seeing moderately
frequently. Something will end up waiting on IO while holding a lock,
and eventually the system grinds to a halt as things want to do block
IO. What appears to be happening is that an IO request is getting lost
somewhere, I think, either because blkfront is dropping it or perhaps
blkback/tap is (but this has been observed on 2.6.18-based systems too,
so I suspect it's a frontend issue). Perhaps an IO is really getting
lost, or perhaps it's just an event; I'm not sure. Or maybe there's
something screwy with barriers. Tom Kopek and Alexander McKinney seem to
be able to reproduce it reliably, and have a block trace of a hang
showing a bunch of outstanding IO.

Anyway, I'd been meaning to mention it to you before...

    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel