[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4] xenstore: extend the xenstore ring with a 'closing' signal



On Wed, 2014-09-03 at 17:25 +0100, David Scott wrote:
> Hvmloader uses the xenstore ring and then tries to reset it back
> to its initial state before booting the guest. Occasionally xenstored
> will read the ring while it is being zeroed and conclude it has
> been corrupted. This prevents PV drivers from loading in the guest.
> 
> This patch updates the xenstore ring protocol definition, enabling
> a server to advertise additional features to the guest. One such feature
> is defined: the ability to cleanly reset the ring (but not the
> higher-level protocol, for which we already have RESET_WATCHES).

Is RESET_WATCHES complete enough to be considered a higher-level
protocol reset, as opposed to just doing the watch stuff? (maybe there
is no other state to speak of?)

> This patch implements the ring reconnection features in oxenstored
> and hvmloader, fixing the bug.

I suppose it is worth mentioning here that C xenstored is untouched but
that the hvmloader changes have been written to work with an arbitrary
xenstore by using the protocol. (at least I hope it has!)

> This patch also defines an 'invalid' xenstore packet type and uses this
> to poison the ring over a reconnect. This will make diagnosing this
> bug much easier in future.
> 
> Signed-off-by: David Scott <dave.scott@xxxxxxxxxx>

I've made some comments below but I think it might be worth waiting for
Ian Jackson's input, since he had comments last time, before rushing to
make any changes.

> 
> ---
> 
> Updates since v3
> * hvmloader: signal the event channel after requesting a reconnect
>   (thanks to Zheng Li for diagnosing this)
> * switch to using feature flags/bits instead of version numbers
> * rename the 'closing' field to 'connection state'
> * define an invalid packet type (0xffff, since 0x0 was already taken)
> * on ring connection/reset, fill the input and output buffers with
>   the invalid packet
> * ocaml xs_ring_stubs.c: collapse both {read,write} functions into one,
>   remove #define ERROR_FOO
> * doc: write from the guest's point of view, weaken guarantees made by
>   server (e.g. producer = consumer != 0 is valid after reconnect)
> * doc: relate reconnection to the RESET_WATCHES command in the higher
>   level protocol
> * doc: be more specific about who must write what, when
> 
> Updates since v2
> * hvmloader: use volatile for read of closing flag
> * style improvements
> * remove xenstore version #defines
> 
> Updates since v1
> * remove unused client version and associated dead code
> * treat 'closing' as a flag by using "!=0" rather than "==1"
> * hvmloader: move function up and remove forward decl
> * document the existing xenstore ring and the extention in misc/
> ---
>  docs/misc/xenstore-ring.txt         |  112 
> +++++++++++++++++++++++++++++++++++
>  tools/firmware/hvmloader/xenbus.c   |   48 +++++++++------
>  tools/ocaml/libs/xb/xb.ml           |   27 ++++++++-
>  tools/ocaml/libs/xb/xb.mli          |    1 +
>  tools/ocaml/libs/xb/xs_ring.ml      |   10 ++++
>  tools/ocaml/libs/xb/xs_ring_stubs.c |  109 ++++++++++++++++++++++++----------
>  xen/include/public/io/xs_wire.h     |   13 +++-
>  7 files changed, 269 insertions(+), 51 deletions(-)
>  create mode 100644 docs/misc/xenstore-ring.txt
> 
> diff --git a/docs/misc/xenstore-ring.txt b/docs/misc/xenstore-ring.txt
> new file mode 100644
> index 0000000..6d18556
> --- /dev/null
> +++ b/docs/misc/xenstore-ring.txt

You seem to use markdown like syntax, so please name it .markdown and we
will get some sort of useful (well, ish) HTML out of it at build time.

> @@ -0,0 +1,112 @@
> +The xenstore ring is a datastructure stored within a single 4KiB page
> +shared between the xenstore server and the guest. The ring contains
> +two queues of bytes -- one in each direction -- and some signalling
> +information. The [xenstore protocol](xenstore.txt) is layered on top of
> +the byte streams.
> +
> +The xenstore ring datastructure
> +===============================
> +
> +The following table describes the ring structure where
> +  - offsets and lengths are in bytes;
> +  - "Input" is used to describe the data sent to the server; and
> +  - "Output" is used to describe the data sent to the domain.
> +
> +Offset  Length  Description
> +-----------------------------------------------------------------
> +0       1024    Input data
> +1024    1024    Output data
> +2048    4       Input consumer offset
> +2052    4       Input producer offset
> +2056    4       Output consumer offset
> +2060    4       Output producer offset
> +2064    4       Server feature bitmap
> +2068    4       Connection state
> +
> +The Input data and Output data are circular buffers. Each buffer is
> +associated with a pair of free-running offsets labelled "consumer" and
> +"producer".
> +
> +A "producer" offset is the offset in the byte stream of the next byte
> +to be written modulo 2^32. A "consumer" offset is the offset in the byte
> +stream of the next byte to be read modulo 2^32. Implementations must
> +take care to handle wraparound properly when performing arithmetic with
> +these values.
> +
> +The byte at offset 'x' in the byte stream will be stored at offset
> +'x modulo 1024' in the circular buffer.
> +
> +Implementations may only overwrite previously-written data if it has
> +been marked as 'consumed' by the relevant consumer pointer.
> +
> +When the guest domain is created, there is no outstanding Input or Output
> +data. However
> +
> +  - guests must not assume that producer or consumer pointers start
> +    at zero; and
> +  - guests must not assume that unused bytes in either the Input or
> +    Output data buffers has any particular value.
> +
> +A xenstore ring is always associated with an event channel. Whenever the
> +ring structure is updated the event channel must be signalled. The
> +guest and server are free to inspect the contents of the ring at any
> +time, not only in response to an event channel event. This implies that
> +updates must be ordered carefully to ensure consistency.
> +
> +The xenstore server may decide to advertise some features via the
> +"Server feature bitmap". The server can start advertising features
> +at any time by setting bits but it will never stop advertising features
> +i.e. bits will never be cleared. The guest is not permitted to write to
> +the server feature bitmap. The server features are offered to the guest;
> +it is up to the guest whether to use them or not. The guest should ignore
> +any unknown feature bits.
> +
> +The following features are defined:
> +
> +Mask    Description
> +-----------------------------------------------------------------
> +1       Ring reconnection (see the ring reconnection feature below)
> +
> +The "Connection state" field is used to request a ring close and reconnect.
> +The "Connection state" field only contains valid data if the server has
> +advertised the ring reconnection feature. If the feature has been advertised
> +then the "Connection state" may take the following values:
> +
> +Value   Description
> +-----------------------------------------------------------------
> +0       Ring is connected
> +1       Ring close and reconnect is in progress (see the "ring
> +        reconnection feature" described below)
> +
> +The ring reconnection feature
> +=============================
> +
> +The ring reconnection feature allows the guest to ask the server to
> +reset the ring to a valid initial state i.e. where the Input and Output
> +queues contain no data. The feature operates at the ring-level only;
> +it does not affect the higher-level protocol state machine at all.
> +To reset theh higher-level protocol, please read about the 'RESET_WATCHES'

typo here: theh.

> +command in the [xenstore protocol](xenstore.txt) specification.
> +
> +The ring reconnection feature is only available if the 'Ring reconnection'
> +feature bit has been set by the server in the "Server feature bitmap".
> +
> +Assuming the server has advertised the feature, the guest can initiate
> +a reconnection by setting the the Connection state to 1 ("Ring close
> +and reconnect is in progress") and signalling the event channel.
> +The guest must now ignore all fields except the Connection state and
> +wait for it to be set to 0 ("Ring is connected")
> +
> +The server will guarantee to
> +
> +  - drop any partially read or written higher-level
> +    [xenstore protocol](xenstore.txt) packets it may have;
> +  - empty the Input and Output queues in the xenstore ring;
> +  - set the Connection state to 0 ("Ring is connected"); and
> +  - signal the event channel.
> +
> +From the point of view of the guest, the connection has been reset on a
> +packet boundary.
> +
> +Note that only the guest may set the Connection state to 1 and only the
> +server may set it back to 0.

> [...]
> -    /* We zero out the whole ring -- the backend can handle this, and it's 
> -     * not going to surprise any frontends since it's equivalent to never 
> -     * having used the rings. */
> -    memset(rings, 0, sizeof *rings);
> -
> +    if (rings->server_features & XENSTORE_SERVER_FEATURE_RECONNECTION) {
> +        rings->connection = XENSTORE_RECONNECT;
> +        send.port = event;
> +        hypercall_event_channel_op(EVTCHNOP_send, &send);
> +        while (*(volatile uint32_t*)&rings->connection == XENSTORE_RECONNECT)

I don't think you need the volatile here, certainly none of the other
ring accesses seem to use it.

Apart from that the C side of things looks good. I've not looked at the
ocaml side of things, do you have someone you can lean on to do some
review or do I need to blow the cobwebs off that part of my brain?

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.