Xen project Mailing List

Re: [Xen-devel] [patch] barrier support for blk{front,back}

To: Ian Pratt <m+Ian.Pratt@xxxxxxxxxxxx>

From: Gerd Hoffmann <kraxel@xxxxxxx>

Date: Mon, 11 Sep 2006 15:22:07 -0700

Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxx>

Delivery-date: Mon, 11 Sep 2006 15:12:31 -0700

List-id: Xen developer discussion <xen-devel.lists.xensource.com>

Ian Pratt wrote:
>> This patch adds support for barriers to blk{back,front} drivers.
>
> It's good to see barrier support added.
>
> Out of interest, what was your motivation for adding it?

Trying to fix some problems with loop-file-backed virtual block devices.
For SLES10 we have a patch which adds a synchronous mode to the loop
driver (by opening the file with O_SYNC). It solves the problem of loop
doing too much buffering and screwing up journaling filesystems, but it
is dead slow. With barriers instead, performance should improve without
the risk of killing the filesystem by ignoring write ordering.

There is also a patch in the queue (for mainline) which adds barrier
support to loop devices, attached below for reference.

> Which file systems use it, and do you see a worthwhile performance
> gain from the extra disk scheduling flexibility?

All journaling filesystems should be able to use them. ext3 and
reiserfs do for sure, although barriers are not enabled by default
there: you need the barrier=1 (ext3) and barrier=flush (reiserfs) mount
options. I don't know what xfs and jfs do by default.

No benchmarks yet, sorry. I finished the patch just the day before the
summit on my notebook, which is way too slow for serious performance
tests. Besides that, I simply haven't had the time yet. I can run some
next week.

> We are going to have to think through what the impact of this would
> be in the live relocation block safety optimizations Andy Warfield
> described at the summit. The simple thing is just to revert to
> stalling until the backend gives the all clear if there's a barrier
> in the queue.

Hmm, yes, the frontend driver had better make sure that there isn't a
barrier request in flight. Doing that should also reduce the risk of
corrupting the filesystem in the (already unlikely) case that writes
issued on the host the machine is migrated from end up on disk after
the ones resubmitted from the host the machine is migrated to.
jetlagged greetings from Europe,

  Gerd

-- 
Gerd Hoffmann <kraxel@xxxxxxx>
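For illustration, the difference between the two approaches discussed in the message (O_SYNC on every write versus flushing only at ordering points) can be sketched in userspace C. This is a minimal sketch and not code from the patch or from the loop driver: the helper names (write_all, open_sync_backing, open_buffered_backing, write_ordered) are hypothetical, and fdatasync() merely stands in for a barrier here.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: write the whole buffer at pos, retrying on short
 * writes. Returns 0 on success, -1 on error. */
static int write_all(int fd, const void *buf, size_t len, off_t pos)
{
    const char *p = buf;

    while (len > 0) {
        ssize_t n = pwrite(fd, p, len, pos);
        if (n <= 0)
            return -1;
        p += n;
        pos += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Strategy 1: every write is synchronous. Ordering is trivially kept,
 * but each write waits for stable storage. */
int open_sync_backing(const char *path)
{
    return open(path, O_RDWR | O_CREAT | O_SYNC, 0600);
}

/* Strategy 2: writes are buffered; ordering is enforced only where the
 * caller asks for it. */
int open_buffered_backing(const char *path)
{
    return open(path, O_RDWR | O_CREAT, 0600);
}

/* Write a journal record, flush, then write the commit record and flush
 * again. The first flush plays the role of the barrier: the commit can
 * not reach the disk before the journal data it refers to. */
int write_ordered(int fd, const void *journal, size_t jlen, off_t jpos,
                  const void *commit, size_t clen, off_t cpos)
{
    if (write_all(fd, journal, jlen, jpos) < 0)
        return -1;
    if (fdatasync(fd) < 0)
        return -1;
    if (write_all(fd, commit, clen, cpos) < 0)
        return -1;
    return fdatasync(fd);
}
```

With the buffered variant, only the writes that need ordering pay for a flush, and everything in between can be scheduled freely. That is the flexibility barriers are meant to give back.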

--- linux-2.6.16/drivers/block/loop.c~	2006-06-29 13:22:37.000000000 +0200
+++ linux-2.6.16/drivers/block/loop.c	2006-06-29 13:28:17.000000000 +0200
@@ -467,16 +467,58 @@
 	return ret;
 }
 
+/*
+ * This is best effort. We really wouldn't know what to do with a returned
+ * error. This code is taken from the implementation of fsync.
+ */
+static int sync_file(struct file * file)
+{
+	struct address_space *mapping;
+	int ret;
+
+	if (!file->f_op || !file->f_op->fsync)
+		return -EOPNOTSUPP;
+
+	mapping = file->f_mapping;
+
+	ret = filemap_fdatawrite(mapping);
+	if (!ret) {
+		/*
+		 * We need to protect against concurrent writers,
+		 * which could cause livelocks in fsync_buffers_list
+		 */
+		mutex_lock(&mapping->host->i_mutex);
+		ret = file->f_op->fsync(file, file->f_dentry, 1);
+		mutex_unlock(&mapping->host->i_mutex);
+
+		filemap_fdatawait(mapping);
+	}
+
+	return ret;
+}
+
 static int do_bio_filebacked(struct loop_device *lo, struct bio *bio)
 {
 	loff_t pos;
 	int ret;
+	int sync = bio_sync(bio);
+	int barrier = bio_barrier(bio);
+
+	if (barrier) {
+		ret = sync_file(lo->lo_backing_file);
+		if (unlikely(ret))
+			return ret;
+	}
 
 	pos = ((loff_t) bio->bi_sector << 9) + lo->lo_offset;
 	if (bio_rw(bio) == WRITE)
 		ret = lo_send(lo, bio, lo->lo_blocksize, pos);
 	else
 		ret = lo_receive(lo, bio, lo->lo_blocksize, pos);
+
+	if ((barrier || sync) && !ret)
+		ret = sync_file(lo->lo_backing_file);
+
 	return ret;
 }
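The control flow the patch adds to do_bio_filebacked() (flush before a barrier request, do the transfer, flush again after a barrier or sync request) can be mirrored in a small userspace sketch. All names below (struct request, struct backing, REQ_*, backing_open, handle_request) are hypothetical and exist only for illustration; fsync() takes the place of the kernel-side sync_file(), and the flush counter is there only to make the behaviour observable.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical request flags, loosely modelled on bio_rw(), bio_sync()
 * and bio_barrier() in the patch. */
enum { REQ_WRITE = 1 << 0, REQ_SYNC = 1 << 1, REQ_BARRIER = 1 << 2 };

struct request {
    unsigned flags;
    void *buf;
    size_t len;
    off_t pos;
};

struct backing {
    int fd;                 /* backing file, like lo->lo_backing_file */
    unsigned long flushes;  /* number of flushes issued so far */
};

int backing_open(struct backing *b, const char *path)
{
    b->fd = open(path, O_RDWR | O_CREAT, 0600);
    b->flushes = 0;
    return b->fd < 0 ? -1 : 0;
}

/* Userspace stand-in for sync_file(). */
static int flush_backing(struct backing *b)
{
    b->flushes++;
    return fsync(b->fd);
}

int handle_request(struct backing *b, const struct request *rq)
{
    int barrier = (rq->flags & REQ_BARRIER) != 0;
    int sync = (rq->flags & REQ_SYNC) != 0;
    ssize_t n;

    /* A barrier must not overtake earlier writes: flush them first. */
    if (barrier && flush_backing(b) < 0)
        return -1;

    if (rq->flags & REQ_WRITE)
        n = pwrite(b->fd, rq->buf, rq->len, rq->pos);
    else
        n = pread(b->fd, rq->buf, rq->len, rq->pos);
    if (n < 0 || (size_t)n != rq->len)
        return -1;

    /* Later requests must not overtake a barrier, and a sync request
     * must be on disk before it completes: flush once more. */
    if (barrier || sync)
        return flush_backing(b);
    return 0;
}
```

In this sketch a plain write triggers no flush, a sync write triggers one, and a barrier write triggers two, which matches the ordering of sync_file() calls in the diff.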

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
