[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2] introduce a cache options for PV disks

Ian Jackson writes ("Re: [PATCH v2] introduce a cache options for PV disks"):
> To avoid these problems we intend to rename this option and change the
> documentation accordingly.  Probably, to "direct-io-safe" (both in
> Xenstore and the libxl API).  The documentation will say that this
> option just disables any workaround where we avoid using direct IO
> because of bugs.

Here's an RFC version of the libxl part of this patch.  It compiles
but has not been tested.

For clarity the patch below doesn't include the rebuild of


Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

diff --git a/docs/misc/vbd-interface.txt b/docs/misc/vbd-interface.txt
index 3952e73..f873db0 100644
--- a/docs/misc/vbd-interface.txt
+++ b/docs/misc/vbd-interface.txt
@@ -125,3 +125,9 @@ because they directly map the bottom 8 bits of the xenstore 
 directly to the Linux guest's device number and throw away the rest;
 they can crash due to minor number clashes.  With these guests, the
 workaround is not to supply problematic combinations of devices.
+Other frontend and backend options
+See xen/include/public/io/blkif.h for the full list of options.
diff --git a/docs/misc/xl-disk-configuration.txt 
index 5bd456d..9c2650b 100644
--- a/docs/misc/xl-disk-configuration.txt
+++ b/docs/misc/xl-disk-configuration.txt
@@ -178,6 +178,44 @@ information to be interpreted by the executable program 
 These scripts are normally called "block-<script>".
+Description:           Disables non-O_DIRECT workaround
+Supported values:      absent, present
+Mandatory:             No
+Default value:         absent (workaround may be enabled)
+There is a memory lifetime bug in some driver domain (dom0) kernels
+which can cause crashes when using O_DIRECT.  The bug occurs due to a
+mismatch between the backend-visible lifetime of pages used for the
+Xen PV network protocol and that expected by the backend kernel's
+networking subsystem.  This can cause crashes when using certain
+backends with certain underlying storage.
+ http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
+For this reason, (this version of) the Xen libxl toolstack disables
+O_DIRECT when using the qemu-based Xen PV backend ("qdisk").
+However, this workaround has performance and scaling implications, and
+it is only necessary if the underlying device is a network filesystem.
+If the underlying device is not, then it is good to disable it; that
+is what this option is for.
+This option simply requests that the workaround be disabled.  (However,
+not all backends versions which use the workaround understand this
+option, so this is on a best effort basis.)
+It's important to note that if you are storing the VM disk on a
+network filesystem or a network block device (NFS or ISCSI) it might
+not be safe to use this option.  Otherwise specifying it is safe and
+can give better performances.
+If in the future the bug is fixed properly this option will then be
+silently ignored.
diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
index 3236aa9..541c258 100644
--- a/tools/libxl/libxl.c
+++ b/tools/libxl/libxl.c
@@ -2168,6 +2168,10 @@ static void device_disk_add(libxl__egc *egc, uint32_t 
         flexarray_append(back, disk->readwrite ? "w" : "r");
         flexarray_append(back, "device-type");
         flexarray_append(back, disk->is_cdrom ? "cdrom" : "disk");
+        if (disk->direct_io_safe) {
+            flexarray_append(back, "direct-io-safe");
+            flexarray_append(back, "1");
+        }
         flexarray_append(front, "backend-id");
         flexarray_append(front, libxl__sprintf(gc, "%d", disk->backend_domid));
diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
index d218a2d..892ab01 100644
--- a/tools/libxl/libxl_types.idl
+++ b/tools/libxl/libxl_types.idl
@@ -375,6 +375,7 @@ libxl_device_disk = Struct("device_disk", [
     ("removable", integer),
     ("readwrite", integer),
     ("is_cdrom", integer),
+    ("direct_io_safe", bool),
 libxl_device_nic = Struct("device_nic", [
diff --git a/tools/libxl/libxlu_disk_l.l b/tools/libxl/libxlu_disk_l.l
index 7c4e7f1..ba8577c 100644
--- a/tools/libxl/libxlu_disk_l.l
+++ b/tools/libxl/libxlu_disk_l.l
@@ -173,6 +173,7 @@ backendtype=[^,]*,? { STRIP(','); 
setbackendtype(DPC,FROMEQUALS); }
 vdev=[^,]*,?   { STRIP(','); SAVESTRING("vdev", vdev, FROMEQUALS); }
 script=[^,]*,? { STRIP(','); SAVESTRING("script", script, FROMEQUALS); }
+direct-io-safe,? { DPC->disk->direct_io_safe = 1; }
  /* the target magic parameter, eats the rest of the string */
diff --git a/xen/include/public/io/blkif.h b/xen/include/public/io/blkif.h
index b9b9d98..baf0f0a 100644
--- a/xen/include/public/io/blkif.h
+++ b/xen/include/public/io/blkif.h
@@ -97,6 +97,22 @@
  *      The type of the backing device/object.
+ *
+ * direct-io-safe
+ *      Values:         1
+ *      Optional
+ *
+ *      The underlying storage is not affected by the direct IO memory
+ *      lifetime bug.  See:
+ *        http://lists.xen.org/archives/html/xen-devel/2012-12/msg01154.html
+ *
+ *      So use of O_DIRECT is safe, in circumstances
+ *      where we would normally have avoided it as a workaround for
+ *      that bug.  This option is not relevant for all backends, and
+ *      even not necessarily supported for those for which it is
+ *      relevant.  A backend which knows that it is not affected by
+ *      the bug can ignore this option.
+ *
  *--------------------------------- Features ---------------------------------
  * feature-barrier

Xen-devel mailing list



Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.