Re: [Xen-devel] [PATCH v13 3/7] remus: introduce remus device
- To: FNST-Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
- From: Shriram Rajagopalan <rshriram@xxxxxxxxx>
- Date: Thu, 26 Jun 2014 22:29:42 -0500
- Cc: ian.campbell@xxxxxxxxxx, wency@xxxxxxxxxxxxxx, stefano.stabellini@xxxxxxxxxxxxx, ian.jackson@xxxxxxxxxxxxx, Jiang Yunhong <yunhong.jiang@xxxxxxxxx>, eddie.dong@xxxxxxxxx, xen-devel@xxxxxxxxxxxxx, andrew.cooper3@xxxxxxxxxx, laijs@xxxxxxxxxxxxxx, Roger Pau Monne <roger.pau@xxxxxxxxxx>
- Delivery-date: Fri, 27 Jun 2014 03:30:43 +0000
- List-id: Xen developer discussion <xen-devel.lists.xen.org>
On Jun 27, 2014 7:29 AM, "Yang Hongyang" <yanghy@xxxxxxxxxxxxxx> wrote:
>
> introduce remus device, an abstract layer of remus devices(nic, disk,
> etc).It provides the following APIs for libxl:
>   >libxl__remus_device_setup
>     setup remus devices, like attach qdisc, enable disk buffering, etc
>   >libxl__remus_device_teardown
>     teardown devices
>   >libxl__remus_device_postsuspend
>   >libxl__remus_device_preresume
>   >libxl__remus_device_commit
>     above three are for checkpoint.
> through remus device layer, the remus execution flow will be like
> this:
>   xl remus -> remus device setup
>                  |-> remus checkpoint(postsuspend, preresume, commit)
>                        ...
>                         |-> remus device teardown, failover or abort
> the remus device layer provides an interface
>   libxl__remus_device_ops
> which a remus device must implement. the whole remus structure:
>                              |remus|
>                                 |
>                          |remus device|
>                                 |
>                  |nic| |drbd disks| |qemu disks| ...
> a device(nic, drbd disks, qemu disks, etc) must implement
> libxl__remus_device_ops to support remus.
>
> Signed-off-by: Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
> Signed-off-by: Wen Congyang <wency@xxxxxxxxxxxxxx>
> Signed-off-by: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> ---
>  tools/libxl/Makefile             |   2 +
>  tools/libxl/libxl.c              |  34 +++-
>  tools/libxl/libxl_dom.c          | 132 +++++++++++++--
>  tools/libxl/libxl_internal.h     | 182 +++++++++++++++++++++
>  tools/libxl/libxl_remus_device.c | 340 +++++++++++++++++++++++++++++++++++++++
>  tools/libxl/libxl_types.idl      |   1 +
>  6 files changed, 675 insertions(+), 16 deletions(-)
>  create mode 100644 tools/libxl/libxl_remus_device.c
>
> diff --git a/tools/libxl/Makefile b/tools/libxl/Makefile
> index fdffff3..cb2efdf 100644
> --- a/tools/libxl/Makefile
> +++ b/tools/libxl/Makefile
> @@ -56,6 +56,8 @@ else
>  LIBXL_OBJS-y += libxl_nonetbuffer.o
>  endif
>
> +LIBXL_OBJS-y += libxl_remus_device.o
> +
>  LIBXL_OBJS-$(CONFIG_X86) += libxl_cpuid.o libxl_x86.o
>  LIBXL_OBJS-$(CONFIG_ARM) += libxl_nocpuid.o libxl_arm.o
>
> diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c
> index 62e251a..f99477d 100644
> --- a/tools/libxl/libxl.c
> +++ b/tools/libxl/libxl.c
> @@ -733,6 +733,31 @@ out:
> Âstatic void remus_failover_cb(libxl__egc *egc,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__domain_suspend_state *dss, int rc);
>
> +static void libxl__remus_setup_failed(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_state *rs, int rc)
> +{
> + Â ÂSTATE_AO_GC(rs->ao);
> + Â Âlibxl__ao_complete(egc, ao, rc);
> +}
> +
> +static void libxl__remus_setup_done(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_state *rs, int rc)
> +{
> + Â Âlibxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Âif (!rc) {
> + Â Â Â Âlibxl__domain_suspend(egc, dss);
> + Â Â Â Âreturn;
> + Â Â}
> +
> + Â ÂLOG(ERROR, "Remus: failed to setup device for guest with domid %u",
> + Â Â Â Âdss->domid);
> + Â Ârs->saved_rc = rc;
> + Â Ârs->callback = libxl__remus_setup_failed;
> + Â Âlibxl__remus_device_teardown(egc, rs);
> +}
> +
> Â/* TODO: Explicit Checkpoint acknowledgements via recv_fd. */
> Âint libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â uint32_t domid, int send_fd, int recv_fd,
> @@ -761,10 +786,15 @@ int libxl_domain_remus_start(libxl_ctx *ctx, libxl_domain_remus_info *info,
>
> Â Â Âassert(info);
>
> - Â Â/* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_state *const rs = &dss->rs;
> + Â Ârs->ao = ao;
> + Â Ârs->domid = domid;
> + Â Ârs->saved_rc = 0;
> + Â Ârs->callback = libxl__remus_setup_done;
>
> Â Â Â/* Point of no return */
> - Â Âlibxl__domain_suspend(egc, dss);
> + Â Âlibxl__remus_device_setup(egc, rs);
> Â Â Âreturn AO_INPROGRESS;
>
> Â out:
> diff --git a/tools/libxl/libxl_dom.c b/tools/libxl/libxl_dom.c
> index c11993d..dde8bf6 100644
> --- a/tools/libxl/libxl_dom.c
> +++ b/tools/libxl/libxl_dom.c
> @@ -1426,6 +1426,17 @@ static void libxl__domain_suspend_callback(void *data)
> Â Â Âdomain_suspend_callback_common(egc, dss);
> Â}
>
> +static void remus_device_postsuspend_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_state *rs, int rc)
> +{
> + Â Âint ok = 0;
> + Â Âlibxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> +
> + Â Âif (!rc)
> + Â Â Â Âok = 1;
> + Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
> +}
> +
> Âstatic void domain_suspend_callback_common_done(libxl__egc *egc,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__domain_suspend_state *dss, int ok)
> Â{
> @@ -1447,32 +1458,51 @@ static void libxl__remus_domain_suspend_callback(void *data)
> Âstatic void remus_domain_suspend_callback_common_done(libxl__egc *egc,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__domain_suspend_state *dss, int ok)
> Â{
> - Â Â/* REMUS TODO: Issue disk and network checkpoint reqs. */
> + Â Âif (!ok)
> + Â Â Â Âgoto out;
> +
> + Â Âlibxl__remus_state *const rs = &dss->rs;
> + Â Ârs->callback = remus_device_postsuspend_cb;
> + Â Âlibxl__remus_device_postsuspend(egc, rs);
> + Â Âreturn;
> +
> +out:
> Â Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
> Â}
>
> -static void libxl__remus_domain_resume_callback(void *data)
> +static void remus_device_preresume_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_state *rs, int rc)
> Â{
> Â Â Âint ok = 0;
> + Â Âlibxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> + Â ÂSTATE_AO_GC(dss->ao);
> +
> + Â Âif (!rc) {
> + Â Â Â Â/* Resumes the domain and the device model */
> + Â Â Â Âif (!libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
> + Â Â Â Â Â Âok = 1;
> + Â Â}
> + Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, ok);
> +}
> +
> +static void libxl__remus_domain_resume_callback(void *data)
> +{
> Â Â Âlibxl__save_helper_state *shs = data;
> Â Â Âlibxl__egc *egc = shs->egc;
> Â Â Âlibxl__domain_suspend_state *dss = CONTAINER_OF(shs, *dss, shs);
> Â Â ÂSTATE_AO_GC(dss->ao);
>
> - Â Â/* Resumes the domain and the device model */
> - Â Âif (libxl__domain_resume(gc, dss->domid, /* Fast Suspend */1))
> - Â Â Â Âgoto out;
> -
> - Â Â/* REMUS TODO: Deal with disk. Start a new network output buffer */
> - Â Âok = 1;
> -out:
> - Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, shs, ok);
> + Â Âlibxl__remus_state *const rs = &dss->rs;
> + Â Ârs->callback = remus_device_preresume_cb;
> + Â Âlibxl__remus_device_preresume(egc, rs);
> Â}
>
> Â/*----- remus asynchronous checkpoint callback -----*/
>
> Âstatic void remus_checkpoint_dm_saved(libxl__egc *egc,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__domain_suspend_state *dss, int rc);
> +static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âconst struct timeval *requested_abs);
>
> Âstatic void libxl__remus_domain_checkpoint_callback(void *data)
> Â{
> @@ -1489,13 +1519,67 @@ static void libxl__remus_domain_checkpoint_callback(void *data)
> Â Â Â}
> Â}
>
> +static void remus_device_commit_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_state *rs, int rc)
> +{
> + Â Âlibxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> +
> + Â ÂSTATE_AO_GC(dss->ao);
> +
> + Â Âif (rc) {
> + Â Â Â ÂLOG(ERROR, "Failed to do device commit op."
> + Â Â Â Â Â Â" Terminating Remus..");
> + Â Â Â Âgoto out;
> + Â Â} else {
> + Â Â Â Â/* Set checkpoint interval timeout */
> + Â Â Â Ârc = libxl__ev_time_register_rel(gc, &rs->timeout,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â remus_next_checkpoint,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â dss->interval);
> + Â Â Â Âif (rc) {
> + Â Â Â Â Â ÂLOG(ERROR, "unable to register timeout for next epoch."
> + Â Â Â Â Â Â Â Â" Terminating Remus..");
> + Â Â Â Â Â Âgoto out;
> + Â Â Â Â}
> + Â Â}
> + Â Âreturn;
> +
> +out:
> + Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
> +}
> +
> Âstatic void remus_checkpoint_dm_saved(libxl__egc *egc,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__domain_suspend_state *dss, int rc)
> Â{
> - Â Â/* REMUS TODO: Wait for disk and memory ack, release network buffer */
> - Â Â/* REMUS TODO: make this asynchronous */
> - Â Âassert(!rc); /* REMUS TODO handle this error properly */
> - Â Âusleep(dss->interval * 1000);
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_state *const rs = &dss->rs;
> +
> + Â ÂSTATE_AO_GC(dss->ao);
> +
> + Â Âif (rc) {
> + Â Â Â ÂLOG(ERROR, "Failed to save device model. Terminating Remus..");
> + Â Â Â Âgoto out;
> + Â Â}
> +
> + Â Ârs->callback = remus_device_commit_cb;
> + Â Âlibxl__remus_device_commit(egc, rs);
> +
> + Â Âreturn;
> +
> +out:
> + Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 0);
> +}
> +
> +static void remus_next_checkpoint(libxl__egc *egc, libxl__ev_time *ev,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âconst struct timeval *requested_abs)
> +{
> + Â Âlibxl__remus_state *rs = CONTAINER_OF(ev, *rs, timeout);
> +
> + Â Â/* Convenience aliases */
> + Â Âlibxl__domain_suspend_state *const dss = CONTAINER_OF(rs, *dss, rs);
> +
> + Â ÂSTATE_AO_GC(dss->ao);
> +
> + Â Âlibxl__ev_time_deregister(gc, &rs->timeout);
> Â Â Âlibxl__xc_domain_saverestore_async_callback_done(egc, &dss->shs, 1);
> Â}
>
> @@ -1720,6 +1804,13 @@ static void save_device_model_datacopier_done(libxl__egc *egc,
> Â Â Âdss->save_dm_callback(egc, dss, our_rc);
> Â}
>
> +static void libxl__remus_teardown_done(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_state *rs, int rc)
> +{
> + Â Âlibxl__domain_suspend_state *dss = CONTAINER_OF(rs, *dss, rs);
> + Â Âdss->callback(egc, dss, rc);
> +}
> +
> Âstatic void domain_suspend_done(libxl__egc *egc,
> Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__domain_suspend_state *dss, int rc)
> Â{
> @@ -1734,6 +1825,19 @@ static void domain_suspend_done(libxl__egc *egc,
> Â Â Â Â Âxc_suspend_evtchn_release(CTX->xch, CTX->xce, domid,
> Â Â Â Â Â Â Â Â Â Â Â Â Â Â dss->guest_evtchn.port, &dss->guest_evtchn_lockfd);
>
> + Â Âif (dss->remus) {
> + Â Â Â Â/*
> + Â Â Â Â * With Remus, if we reach this point, it means either
> + Â Â Â Â * backup died or some network error occurred preventing us
> + Â Â Â Â * from sending checkpoints. Teardown the network buffers and
> + Â Â Â Â * release netlink resources. ÂThis is an async op.
> + Â Â Â Â */
> + Â Â Â Âdss->rs.saved_rc = rc;
> + Â Â Â Âdss->rs.callback = libxl__remus_teardown_done;
> + Â Â Â Âlibxl__remus_device_teardown(egc, &dss->rs);
> + Â Â Â Âreturn;
> + Â Â}
> +
> Â Â Âdss->callback(egc, dss, rc);
> Â}
>
> diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
> index 3fc90e2..5521a42 100644
> --- a/tools/libxl/libxl_internal.h
> +++ b/tools/libxl/libxl_internal.h
> @@ -2470,6 +2470,187 @@ typedef struct libxl__save_helper_state {
> Â Â Â Â Â Â Â Â Â Â Â Â* marshalling and xc callback functions */
> Â} libxl__save_helper_state;
>
> +/*----- remus device related state structure -----*/
> +/* remus device is an abstract layer of remus devices(nic, disk,
> + * etc).It provides the following APIs for libxl:
> + *   >libxl__remus_device_setup
> + *     setup remus devices, like attach qdisc, enable disk buffering, etc
> + *   >libxl__remus_device_teardown
> + *     teardown devices
> + *   >libxl__remus_device_postsuspend
> + *   >libxl__remus_device_preresume
> + *   >libxl__remus_device_commit
> + *     above three are for checkpoint.
> + * through remus device layer, the remus execution flow will be like
> + * this:
> + * xl remus -> remus device setup
> + *               |-> remus checkpoint(postsuspend, preresume, commit)
> + *                     ...
> + *                      |-> remus device teardown, failover or abort
> + * the remus device layer provides an interface
> + *   libxl__remus_device_ops
> + * which a remus device must implement. the whole remus structure:
> + *                           |remus|
> + *                              |
> + *                       |remus device|
> + *                              |
> + *               |nic| |drbd disks| |qemu disks| ...
> + * a device(nic, drbd disks, qemu disks, etc) must implement
> + * libxl__remus_device_ops to support remus.
> + */
> +
> +typedef enum libxl__remus_device_kind {
> + Â ÂLIBXL__REMUS_DEVICE_NIC,
> + Â ÂLIBXL__REMUS_DEVICE_DISK,
> +} libxl__remus_device_kind;
> +
> +typedef struct libxl__remus_state libxl__remus_state;
> +typedef struct libxl__remus_device libxl__remus_device;
> +typedef struct libxl__remus_device_state libxl__remus_device_state;
> +typedef struct libxl__remus_device_ops libxl__remus_device_ops;
> +
> +struct libxl__remus_device_ops {
> +    /*
> +     * init() and destroy() APIs are produced by a device type and
> +     * consumed by the main remus code, a device type must implement
> +     * these two APIs.
> +     */
> +    /* init device ops private data, etc. must implement */
> +    int (*init)(libxl__remus_device_ops *self,
> +                libxl__remus_state *rs);
> +    /* free device ops private data, etc. must implement */
> +    void (*destroy)(libxl__remus_device_ops *self);
> +    /*
> +     * This is device ops's private data, for different device types,
> +     * the data structs are different
> +     */
> +    void *data;
> +
> +    /*
> +     * checkpoint callbacks, these are async ops, call dev->callback
> +     * when done. These function pointers may be NULL, means the op is
> +     * not implemented, and it will do nothing when checkpoint.
> +     * The callers of these APIs must check the function pointer first.
> +     * These callbacks can be implemented synchronously, call
> +     * dev->callback at last directly.
> +     */
> +    void (*postsuspend)(libxl__remus_device *dev);
> +    void (*preresume)(libxl__remus_device *dev);
> +    void (*commit)(libxl__remus_device *dev);
> +
> +    /*
> +     * This API determines whether the ops matchs the specific device. In the
> +     * implementation, we first init all device ops, for example, NIC ops,
> +     * DRBD ops ... Then we will find out the libxl devices, and match the
> +     * device with the ops, if the device is a drbd disk, then it will be
> +     * matched with DRBD ops, and the further ops(such as checkpoint ops etc.)
> +     * of this device will using DRBD ops. This API is mainly for disks,
> +     * because we must use an external script to determine whether a
> +     * libxl_disk is a DRBD disk. a device type must implement this API.
> +     * It's an async op and must be implemented asynchronously,
> +     * call dev->callback when done.
> +     */
> +    void (*match)(libxl__remus_device_ops *self,
> +                  libxl__remus_device *dev);
> +
> +    /*
> +     * setup() and teardown() are refer to the actual remus device,
> +     * a device type must implement these two APIs. They are async
> +     * ops, and call dev->callback when done.
> +     * These callbacks can be implemented synchronously, call
> +     * dev->callback at last directly.
> +     */
> +    /* setup the remus device */
> +    void (*setup)(libxl__remus_device *dev);
> +
> +    /* teardown the remus device */
> +    void (*teardown)(libxl__remus_device *dev);
> +};
> +
> +/*
> + * This structure is for remus device layer, it records remus devices
> + * that have been setuped.
> + */
> +struct libxl__remus_device_state {
> + Â Âlibxl__ao *ao;
> + Â Âlibxl__egc *egc;
> +
> + Â Â/* devices that have been setuped */
> + Â Âlibxl__remus_device **dev;
> +
> + Â Âint num_nics;
> + Â Âint num_disks;
> +
> + Â Â/* for counting devices that have been handled */
> + Â Âint num_devices;
> + Â Â/* for counting devices that matched and setuped */
> + Â Âint num_setuped;
> +};
> +
> +typedef void libxl__remus_device_callback(libxl__egc *,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_device *,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âint rc);
> +/*
> + * This structure is init and setup by remus device abstruct layer,
> + * and pass to remus device ops
> + */
> +struct libxl__remus_device {
> + Â Â/* set by remus device abstruct layer */
> + Â Âint devid;
> + Â Â/* libxl__device_* which this remus device related to */
> + Â Âconst void *backend_dev;
> + Â Âlibxl__remus_device_kind kind;
> + Â Â/*
> + Â Â * This is for matching, we must go through all device ops until we
> + Â Â * find a matched op for the device. The ops_index record which ops
> + Â Â * we are matching.
> + Â Â */
> + Â Âint ops_index;
> + Â Âlibxl__remus_device_ops *ops;
> + Â Âlibxl__remus_device_callback *callback;
> + Â Âlibxl__remus_device_state *rds;
> +
> + Â Â/* used by remus device implementation */
> + Â Â/* *kind* of device's private data */
> + Â Âvoid *data;
> + Â Â/* for calling scripts, eg. setup|teardown|match scripts */
> + Â Âlibxl__async_exec_state aes;
> + Â Â/*
> + Â Â * for async func calls, in the implenmentation of device ops, we
> + Â Â * may use fork to do async ops. this is owned by device-specific
> + Â Â * ops methods
> + Â Â */
> + Â Âlibxl__ev_child child;
> +};
> +
> +typedef void libxl__remus_callback(libxl__egc *,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_state *, int rc);
> +
> +struct libxl__remus_state {
> + Â Â/* must set by caller of libxl__remus_device_(setup|teardown) */
> + Â Âlibxl__ao *ao;
> + Â Âuint32_t domid;
> + Â Âlibxl__remus_callback *callback;
> +
> + Â Â/* private */
> + Â Âint saved_rc;
> + Â Â/* context containing device related stuff */
> + Â Âlibxl__remus_device_state dev_state;
> +
> + Â Âlibxl__ev_time timeout; /* used for checkpoint */
> +};
> +
> +/* the following 5 APIs are async ops, call rs->callback when done */
> +_hidden void libxl__remus_device_setup(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_state *rs);
> +_hidden void libxl__remus_device_teardown(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_state *rs);
> +_hidden void libxl__remus_device_postsuspend(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_state *rs);
> +_hidden void libxl__remus_device_preresume(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_state *rs);
> +_hidden void libxl__remus_device_commit(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_state *rs);
> Â_hidden int libxl__netbuffer_enabled(libxl__gc *gc);
>
> Â/*----- Domain suspend (save) state structure -----*/
> @@ -2500,6 +2681,7 @@ struct libxl__domain_suspend_state {
> Â Â Âint live;
> Â Â Âint debug;
> Â Â Âconst libxl_domain_remus_info *remus;
> + Â Âlibxl__remus_state rs;
> Â Â Â/* private */
> Â Â Âlibxl__ev_evtchn guest_evtchn;
> Â Â Âint guest_evtchn_lockfd;
> diff --git a/tools/libxl/libxl_remus_device.c b/tools/libxl/libxl_remus_device.c
> new file mode 100644
> index 0000000..07e298b
> --- /dev/null
> +++ b/tools/libxl/libxl_remus_device.c
> @@ -0,0 +1,340 @@
> +/*
> + * Copyright (C) 2014
> + * Author: Lai Jiangshan <laijs@xxxxxxxxxxxxxx>
> + * Â Â Â Â Yang Hongyang <yanghy@xxxxxxxxxxxxxx>
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU Lesser General Public License as published
> + * by the Free Software Foundation; version 2.1 only. with the special
> + * exception on linking described in file LICENSE.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ÂSee the
> + * GNU Lesser General Public License for more details.
> + */
> +
> +#include "libxl_osdeps.h" /* must come before any other headers */
> +
> +#include "libxl_internal.h"
> +
> +static libxl__remus_device_ops *dev_ops[] = {
> +};
> +
> +static void device_common_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_device *dev,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â int rc)
> +{
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = dev->rds;
> + Â Âlibxl__remus_state *const rs = CONTAINER_OF(rds, *rs, dev_state);
> +
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Ârds->num_devices++;
> +
> + Â Âif (rc)
> + Â Â Â Ârs->saved_rc = ERROR_FAIL;
> +
> + Â Âif (rds->num_devices == rds->num_setuped)
> + Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> +}
> +
> +void libxl__remus_device_postsuspend(libxl__egc *egc, libxl__remus_state *rs)
> +{
> + Â Âint i;
> + Â Âlibxl__remus_device *dev;
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = &rs->dev_state;
> +
> + Â Ârds->num_devices = 0;
> + Â Ârs->saved_rc = 0;
> +
> + Â Âif(rds->num_setuped == 0)
> + Â Â Â Âgoto out;
> +
> + Â Âfor (i = 0; i < rds->num_setuped; i++) {
> + Â Â Â Âdev = rds->dev[i];
> + Â Â Â Âdev->callback = device_common_cb;
> + Â Â Â Âif (dev->ops->postsuspend) {
> + Â Â Â Â Â Âdev->ops->postsuspend(dev);
> + Â Â Â Â} else {
> + Â Â Â Â Â Ârds->num_devices++;
> + Â Â Â Â Â Âif (rds->num_devices == rds->num_setuped)
> + Â Â Â Â Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Â Â Â}
> + Â Â}
> +
> + Â Âreturn;
> +
> +out:
> + Â Ârs->callback(egc, rs, rs->saved_rc);
> +}
> +
> +void libxl__remus_device_preresume(libxl__egc *egc, libxl__remus_state *rs)
> +{
> + Â Âint i;
> + Â Âlibxl__remus_device *dev;
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = &rs->dev_state;
> +
> + Â Ârds->num_devices = 0;
> + Â Ârs->saved_rc = 0;
> +
> + Â Âif(rds->num_setuped == 0)
> + Â Â Â Âgoto out;
> +
> + Â Âfor (i = 0; i < rds->num_setuped; i++) {
> + Â Â Â Âdev = rds->dev[i];
> + Â Â Â Âdev->callback = device_common_cb;
> + Â Â Â Âif (dev->ops->preresume) {
> + Â Â Â Â Â Âdev->ops->preresume(dev);
> + Â Â Â Â} else {
> + Â Â Â Â Â Ârds->num_devices++;
> + Â Â Â Â Â Âif (rds->num_devices == rds->num_setuped)
> + Â Â Â Â Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Â Â Â}
> + Â Â}
> +
> + Â Âreturn;
> +
> +out:
> + Â Ârs->callback(egc, rs, rs->saved_rc);
> +}
> +
> +void libxl__remus_device_commit(libxl__egc *egc, libxl__remus_state *rs)
> +{
> + Â Âint i;
> + Â Âlibxl__remus_device *dev;
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Â/*
> + Â Â * REMUS TODO: Wait for disk and explicit memory ack (through restore
> + Â Â * callback from remote) before releasing network buffer.
> + Â Â */
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = &rs->dev_state;
> +
> + Â Ârds->num_devices = 0;
> + Â Ârs->saved_rc = 0;
> +
> + Â Âif(rds->num_setuped == 0)
> + Â Â Â Âgoto out;
> +
> + Â Âfor (i = 0; i < rds->num_setuped; i++) {
> + Â Â Â Âdev = rds->dev[i];
> + Â Â Â Âdev->callback = device_common_cb;
> + Â Â Â Âif (dev->ops->commit) {
> + Â Â Â Â Â Âdev->ops->commit(dev);
> + Â Â Â Â} else {
> + Â Â Â Â Â Ârds->num_devices++;
> + Â Â Â Â Â Âif (rds->num_devices == rds->num_setuped)
> + Â Â Â Â Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Â Â Â}
> + Â Â}
> +
> + Â Âreturn;
> +
> +out:
> + Â Ârs->callback(egc, rs, rs->saved_rc);
> +}
> +
> +static void device_setup_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_device *dev,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Âint rc)
> +{
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = dev->rds;
> + Â Âlibxl__remus_state *const rs = CONTAINER_OF(rds, *rs, dev_state);
> +
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Ârds->num_devices++;
> + Â Â/*
> + Â Â * we add devices that have been setuped to the array no matter
> + Â Â * the setup process succeed or failed because we need to ensure
> + Â Â * the device been teardown while setup failed. If any of the
> + Â Â * device setup failed, we will quit remus, but before we exit,
> + Â Â * we will teardown the devices that have been added to **dev
> + Â Â */
> + Â Ârds->dev[rds->num_setuped++] = dev;
> + Â Âif (rc) {
> + Â Â Â Â/* setup failed */
> + Â Â Â Ârs->saved_rc = ERROR_FAIL;
> + Â Â}
> +
> + Â Âif (rds->num_devices == (rds->num_nics + rds->num_disks))
> + Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> +}
> +
> +static void device_match_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Âlibxl__remus_device *dev,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Âint rc)
> +{
> + Â Âlibxl__remus_device_state *const rds = dev->rds;
> + Â Âlibxl__remus_state *rs = CONTAINER_OF(rds, *rs, dev_state);
> +
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Âif (rc) {
> + Â Â Â Âif (++dev->ops_index >= ARRAY_SIZE(dev_ops) ||
> + Â Â Â Â Â Ârc != ERROR_NOT_MATCH) {
> + Â Â Â Â Â Â/* the device can not be matched */
> + Â Â Â Â Â Ârds->num_devices++;
> + Â Â Â Â Â Ârs->saved_rc = ERROR_FAIL;
> + Â Â Â Â Â Âif (rds->num_devices == (rds->num_nics + rds->num_disks))
> + Â Â Â Â Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Â Â Â Â Âreturn;
> + Â Â Â Â}
> + Â Â Â Â/* the ops does not match, try next ops */
> + Â Â Â Âdev->ops = dev_ops[dev->ops_index];
> + Â Â Â Âdev->ops->match(dev->ops, dev);
> + Â Â} else {
> + Â Â Â Â/* the ops matched, setup the device */
> + Â Â Â Âdev->callback = device_setup_cb;
> + Â Â Â Âdev->ops->setup(dev);
> + Â Â}
> +}
> +
> +static void device_teardown_cb(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_device *dev,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â int rc)
> +{
> + Â Âint i;
> + Â Âlibxl__remus_device_ops *ops;
> + Â Âlibxl__remus_device_state *const rds = dev->rds;
> + Â Âlibxl__remus_state *rs = CONTAINER_OF(rds, *rs, dev_state);
> +
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Â/* ignore teardown errors to teardown as many devs as possible*/
> + Â Ârds->num_setuped--;
> +
> + Â Âif (rds->num_setuped == 0) {
> + Â Â Â Â/* clean device ops */
> + Â Â Â Âfor (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
> + Â Â Â Â Â Âops = dev_ops[i];
> + Â Â Â Â Â Âops->destroy(ops);
> + Â Â Â Â}
> + Â Â Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Â}
> +}
> +
> +static __attribute__((unused)) void libxl__remus_device_init(libxl__egc *egc,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_device_state *rds,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â libxl__remus_device_kind kind,
> + Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â Â void *libxl_dev)
> +{
> + Â Âlibxl__remus_device *dev = NULL;
> + Â Âlibxl_device_nic *nic = NULL;
> + Â Âlibxl_device_disk *disk = NULL;
> +
> + Â ÂSTATE_AO_GC(rds->ao);
> + Â ÂGCNEW(dev);
> + Â Âdev->ops_index = 0; /* we will match the ops later */
> + Â Âdev->backend_dev = libxl_dev;
> + Â Âdev->kind = kind;
> + Â Âdev->rds = rds;
> +
> + Â Âswitch (kind) {
> + Â Â Â Âcase LIBXL__REMUS_DEVICE_NIC:
> + Â Â Â Â Â Ânic = libxl_dev;
> + Â Â Â Â Â Âdev->devid = nic->devid;
> + Â Â Â Â Â Âbreak;
> + Â Â Â Âcase LIBXL__REMUS_DEVICE_DISK:
> + Â Â Â Â Â Âdisk = libxl_dev;
> + Â Â Â Â Â Â/* there are no dev id for disk devices */
> + Â Â Â Â Â Âdev->devid = -1;
> + Â Â Â Â Â Âbreak;
> + Â Â Â Âdefault:
> + Â Â Â Â Â Âreturn;
> + Â Â}
> +
> + Â Âlibxl__async_exec_init(&dev->aes);
> + Â Âlibxl__ev_child_init(&dev->child);
> +
> + Â Â/* match the ops begin */
> + Â Âdev->callback = device_match_cb;
> + Â Âdev->ops = dev_ops[dev->ops_index];
> + Â Âdev->ops->match(dev->ops, dev);
> +}
> +
> +void libxl__remus_device_setup(libxl__egc *egc, libxl__remus_state *rs)
> +{
> + Â Âint i;
> + Â Âlibxl__remus_device_ops *ops;
> +
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = &rs->dev_state;
> +
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Âif (ARRAY_SIZE(dev_ops) == 0)
> + Â Â Â Âgoto out;
> +
> + Â Âfor (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
> + Â Â Â Âops = dev_ops[i];
> + Â Â Â Âif (ops->init(ops, rs)) {
> + Â Â Â Â Â Ârs->saved_rc = ERROR_FAIL;
> + Â Â Â Â Â Âgoto out;
> + Â Â Â Â}
> + Â Â}
> +
> + Â Ârds->ao = rs->ao;
> + Â Ârds->egc = egc;
> + Â Ârds->num_devices = 0;
> + Â Ârds->num_nics = 0;
> + Â Ârds->num_disks = 0;
> +
> + Â Â/* TBD: Remus setup - i.e. attach qdisc, enable disk buffering, etc */
> +
> + Â Âif (rds->num_nics == 0 && rds->num_disks == 0)
> + Â Â Â Âgoto out;
> +
> + Â ÂGCNEW_ARRAY(rds->dev, rds->num_nics + rds->num_disks);
> +
> + Â Â/* TBD: CALL libxl__remus_device_init to init remus devices */
> +
> + Â Âreturn;
> +
> +out:
> + Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Âreturn;
> +}
> +
> +void libxl__remus_device_teardown(libxl__egc *egc, libxl__remus_state *rs)
> +{
> + Â Âint i;
> + Â Âlibxl__remus_device *dev;
> + Â Âlibxl__remus_device_ops *ops;
> +
> + Â ÂSTATE_AO_GC(rs->ao);
> +
> + Â Â/* Convenience aliases */
> + Â Âlibxl__remus_device_state *const rds = &rs->dev_state;
> +
> + Â Âif (rds->num_setuped == 0) {
> + Â Â Â Â/* clean device ops */
> + Â Â Â Âfor (i = 0; i < ARRAY_SIZE(dev_ops); i++) {
> + Â Â Â Â Â Âops = dev_ops[i];
> + Â Â Â Â Â Âops->destroy(ops);
> + Â Â Â Â}
> + Â Â Â Âgoto out;
> + Â Â}
> +
> + Â Âfor (i = 0; i < rds->num_setuped; i++) {
> + Â Â Â Âdev = rds->dev[i];
> + Â Â Â Âdev->callback = device_teardown_cb;
> + Â Â Â Âdev->ops->teardown(dev);
> + Â Â}
> +
> + Â Âreturn;
> +
> +out:
> + Â Ârs->callback(egc, rs, rs->saved_rc);
> + Â Âreturn;
> +}
> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> index 1018142..cc5d390 100644
> --- a/tools/libxl/libxl_types.idl
> +++ b/tools/libxl/libxl_types.idl
> @@ -43,6 +43,7 @@ libxl_error = Enumeration("error", [
> Â Â Â(-12, "OSEVENT_REG_FAIL"),
> Â Â Â(-13, "BUFFERFULL"),
> Â Â Â(-14, "UNKNOWN_CHILD"),
> + Â Â(-15, "NOT_MATCH"),
> Â Â Â], value_namespace = "")
>
> Âlibxl_domain_type = Enumeration("domain_type", [
> --
> 1.9.1
>
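For anyone following the thread, the ops-matching walk in device_match_cb() can be sketched in isolation as below. This is only a rough, synchronous illustration: in the patch the match op is asynchronous and reports through dev->callback, while here it simply returns a code. All toy_* names, the backend strings, and the numeric values are hypothetical stand-ins, not libxl definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical error codes, standing in for libxl's ERROR_* values. */
enum { TOY_OK = 0, TOY_ERROR_FAIL = -3, TOY_ERROR_NOT_MATCH = -15 };

/* Hypothetical, synchronous stand-in for libxl__remus_device_ops. */
struct toy_ops {
    const char *name;
    /* Returns TOY_OK, TOY_ERROR_NOT_MATCH, or some other error. */
    int (*match)(const char *backend);
};

static int toy_match_nic(const char *backend)
{
    return strncmp(backend, "vif", 3) == 0 ? TOY_OK : TOY_ERROR_NOT_MATCH;
}

static int toy_match_drbd(const char *backend)
{
    return strncmp(backend, "drbd:", 5) == 0 ? TOY_OK : TOY_ERROR_NOT_MATCH;
}

static const struct toy_ops toy_table[] = {
    { "nic",  toy_match_nic  },
    { "drbd", toy_match_drbd },
};

/*
 * Try each ops entry in order.  "Not matched" moves on to the next
 * entry; any other error, or running out of entries, fails the device.
 */
static const struct toy_ops *toy_find_ops(const struct toy_ops *table,
                                          size_t n, const char *backend,
                                          int *rc_out)
{
    size_t i;

    assert(rc_out != NULL);
    for (i = 0; i < n; i++) {
        int rc = table[i].match(backend);

        if (rc == TOY_OK) {
            *rc_out = TOY_OK;
            return &table[i];
        }
        if (rc != TOY_ERROR_NOT_MATCH)
            break;              /* hard error: stop trying other ops */
    }
    *rc_out = TOY_ERROR_FAIL;
    return NULL;
}
```

A caller would pass something like toy_find_ops(toy_table, 2, "drbd:r0", &rc) and only go on to setup when a non-NULL ops pointer comes back.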
As far as the Remus logic is concerned, I am fine with this patch. You can add my Acked-by if it matters here. I'll defer to IanJ for the final call on coding style, etc.
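
To make the checkpoint fan-out easier to reason about, here is a minimal standalone sketch of the counting scheme that libxl__remus_device_postsuspend() and device_common_cb() appear to use: reset a counter, kick every set-up device, and fire the state callback once the last device has reported. It is not libxl code. The toy_* types and fields are hypothetical, the ops complete synchronously (which the header comment permits), and a NULL op is routed through the same completion helper rather than counted inline as the patch does.

```c
#include <assert.h>
#include <stddef.h>

struct toy_state;
struct toy_dev;

typedef void toy_dev_op(struct toy_dev *dev);

/* Hypothetical stand-in for libxl__remus_device. */
struct toy_dev {
    struct toy_state *st;
    toy_dev_op *postsuspend;    /* may be NULL: op not implemented */
    int fail;                   /* example knob: make the op report an error */
};

/* Hypothetical stand-in for libxl__remus_state plus its device state. */
struct toy_state {
    struct toy_dev **dev;
    int num_setuped;            /* devices taking part in checkpoints */
    int num_devices;            /* devices that have reported back so far */
    int saved_rc;
    void (*callback)(struct toy_state *st, int rc);
    int callback_calls;         /* bookkeeping for this example only */
    int last_rc;
};

/* Per-device completion: count it, remember failure, finish on the last one. */
static void toy_device_done(struct toy_dev *dev, int rc)
{
    struct toy_state *st = dev->st;

    st->num_devices++;
    if (rc)
        st->saved_rc = -1;
    if (st->num_devices == st->num_setuped)
        st->callback(st, st->saved_rc);
}

/* A device op that completes immediately. */
static void toy_op_sync(struct toy_dev *dev)
{
    toy_device_done(dev, dev->fail ? -1 : 0);
}

static void toy_postsuspend(struct toy_state *st)
{
    int i;

    assert(st->callback != NULL);
    st->num_devices = 0;
    st->saved_rc = 0;

    if (st->num_setuped == 0) {
        st->callback(st, st->saved_rc);
        return;
    }
    for (i = 0; i < st->num_setuped; i++) {
        struct toy_dev *dev = st->dev[i];

        if (dev->postsuspend)
            dev->postsuspend(dev);
        else
            toy_device_done(dev, 0);    /* unimplemented op counts as done */
    }
}

static void toy_record(struct toy_state *st, int rc)
{
    st->callback_calls++;
    st->last_rc = rc;
}

/* Run one fan-out over three devices; the middle one has no op. */
static int toy_demo(int fail_one, int *calls)
{
    struct toy_state st = { 0 };
    struct toy_dev a = { &st, toy_op_sync, 0 };
    struct toy_dev b = { &st, NULL, 0 };
    struct toy_dev c = { &st, toy_op_sync, fail_one };
    struct toy_dev *devs[] = { &a, &b, &c };

    st.dev = devs;
    st.num_setuped = 3;
    st.callback = toy_record;
    toy_postsuspend(&st);
    *calls = st.callback_calls;
    return st.last_rc;
}
```

The point of the sketch is that the state callback runs exactly once per fan-out, whether or not some device lacks the op or reports an error.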
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel