
[Xen-changelog] [qemu-xen master] block: Let write zeroes fallback work even with small max_transfer



commit b2f95feec5e4d546b932848dd421ec3361e8ef77
Author:     Eric Blake <eblake@xxxxxxxxxx>
AuthorDate: Thu Nov 17 14:13:56 2016 -0600
Commit:     Kevin Wolf <kwolf@xxxxxxxxxx>
CommitDate: Tue Nov 22 15:59:22 2016 +0100

    block: Let write zeroes fallback work even with small max_transfer
    
    Commit 443668ca rewrote the write_zeroes logic to guarantee that
    an unaligned request never crosses a cluster boundary.  But
    in the rewrite, the new code assumed that at most one iteration
    would be needed to get to an alignment boundary.
    
    However, it is easy to trigger an assertion failure: the Linux
    kernel makes loopback devices advertise a max_transfer of only
    64k.  Any operation that falls back to writes rather than more
    efficient zeroing must obey max_transfer during that fallback.
    When a format with clusters larger than 64k is layered atop the
    file protocol accessing a loopback device, an unaligned head may
    therefore need multiple iterations of the write fallback before
    reaching an aligned boundary.
    
    Test case:
    
    $ qemu-img create -f qcow2 -o cluster_size=1M file 10M
    $ losetup /dev/loop2 /path/to/file
    $ qemu-io -f qcow2 /dev/loop2
    qemu-io> w 7m 1k
    qemu-io> w -z 8003584 2093056
    
    In fairness to Denis (as the original listed author of the culprit
    commit), the faulty logic for at most one iteration is probably all
    my fault in reworking his idea.  But the solution is to restore what
    was in place prior to that commit: when dealing with an unaligned
    head or tail, iterate as many times as necessary while fragmenting
    the operation at max_transfer boundaries.
    
    Reported-by: Ed Swierk <eswierk@xxxxxxxxxxxxxxxxxx>
    CC: qemu-stable@xxxxxxxxxx
    CC: Denis V. Lunev <den@xxxxxxxxxx>
    Signed-off-by: Eric Blake <eblake@xxxxxxxxxx>
    Reviewed-by: Max Reitz <mreitz@xxxxxxxxxx>
    Signed-off-by: Kevin Wolf <kwolf@xxxxxxxxxx>
---
 block/io.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)
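
Note (not part of the commit): the head handling introduced by this
patch can be modeled outside QEMU as the loop below.  This is only a
minimal sketch.  head_iterations() is a hypothetical helper name, and
the model ignores the tail handling, the max_write_zeroes limit and the
actual I/O.  It only counts how many requests are issued before the
offset reaches an alignment boundary.

```c
#include <assert.h>
#include <stdint.h>

#define MIN(a, b) ((a) < (b) ? (a) : (b))

/* Hypothetical helper, not part of QEMU: count the requests needed to
 * bring an unaligned offset up to the next alignment boundary when each
 * request is capped at max_transfer, mirroring the patched
 * "if (head)" branch. */
static int head_iterations(int64_t offset, int64_t count,
                           int64_t alignment, int64_t max_transfer)
{
    int64_t head = offset % alignment;
    int iterations = 0;

    while (count > 0 && head) {
        /* Same clamp as the patch: never exceed the remaining count,
         * max_transfer, or the distance to the next boundary. */
        int64_t num = MIN(MIN(count, max_transfer), alignment - head);

        /* Unlike the old code, head is only cleared once the request
         * actually lands on the boundary. */
        head = (head + num) % alignment;
        offset += num;
        count -= num;
        iterations++;
    }

    /* Either the request is exhausted or the offset is now aligned. */
    assert(count == 0 || offset % alignment == 0);
    return iterations;
}
```

Assuming the alignment seen by the function equals the 1M cluster size
and max_transfer is 64k, the test case's offset of 8003584 lies 648k
past a cluster boundary.  That leaves 376k to the next boundary, so
head_iterations(8003584, 2093056, 1 << 20, 64 << 10) returns 6 (five
requests of 64k and one of 56k), where the pre-patch logic assumed a
single request would suffice.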

diff --git a/block/io.c b/block/io.c
index aa532a5..085ac34 100644
--- a/block/io.c
+++ b/block/io.c
@@ -1214,6 +1214,8 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
     int max_write_zeroes = MIN_NON_ZERO(bs->bl.max_pwrite_zeroes, INT_MAX);
     int alignment = MAX(bs->bl.pwrite_zeroes_alignment,
                         bs->bl.request_alignment);
+    int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
+                                    MAX_WRITE_ZEROES_BOUNCE_BUFFER);
 
     assert(alignment % bs->bl.request_alignment == 0);
     head = offset % alignment;
@@ -1229,9 +1231,12 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
          * boundaries.
          */
         if (head) {
-            /* Make a small request up to the first aligned sector.  */
-            num = MIN(count, alignment - head);
-            head = 0;
+            /* Make a small request up to the first aligned sector. For
+             * convenience, limit this request to max_transfer even if
+             * we don't need to fall back to writes.  */
+            num = MIN(MIN(count, max_transfer), alignment - head);
+            head = (head + num) % alignment;
+            assert(num < max_write_zeroes);
         } else if (tail && num > alignment) {
             /* Shorten the request to the last aligned sector.  */
             num -= tail;
@@ -1257,8 +1262,6 @@ static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
 
         if (ret == -ENOTSUP) {
             /* Fall back to bounce buffer if write zeroes is unsupported */
-            int max_transfer = MIN_NON_ZERO(bs->bl.max_transfer,
-                                            MAX_WRITE_ZEROES_BOUNCE_BUFFER);
             BdrvRequestFlags write_flags = flags & ~BDRV_REQ_ZERO_WRITE;
 
             if ((flags & BDRV_REQ_FUA) &&
--
generated by git-patchbot for /home/xen/git/qemu-xen.git#master