[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH RFC] RFC: extend the xenstore ring with a 'closing' signal
Hi Paul, Thanks for the comments. On 26 Jun 2014, at 10:05, Paul Durrant <Paul.Durrant@xxxxxxxxxx> wrote: >> -----Original Message----- >> From: David Scott [mailto:dave.scott@xxxxxxxxxx] >> Sent: 25 June 2014 22:15 >> To: xen-devel@xxxxxxxxxxxxxxxxxxxx >> Cc: Paul Durrant; Ian Campbell; john.liuqiming@xxxxxxxxxx; >> yanqiangjun@xxxxxxxxxx; Dave Scott >> Subject: [PATCH RFC] RFC: extend the xenstore ring with a 'closing' signal >> >> Currently hvmloader uses the xenstore ring and then tries to >> reset it back to its initial state. This is not part of the >> ring protocol and, if xenstored reads the ring while it is >> happening, xenstored will conclude it is corrupted. A corrupted >> ring will prevent PV drivers from connecting. This seems to >> be a rare failure. >> >> Furthermore, when a VM crashes it may jump to a 'crash kernel' >> to create a diagnostic dump. Without the ability to safely >> reset the ring the PV drivers won't be able to reliably >> establish connections, to (for example) stream a memory dump to >> disk. >> >> This prototype patch contains a simple extension of the >> xenstore ring structure, enough to contain version numbers and >> a 'closing' flag. This memory is currently unused within the >> 4k page and should be zeroed as part of the domain setup >> process. The oxenstored server advertises version 1, and >> hvmloader detects this and sets the 'closing' flag. The server >> then reinitialises the ring, filling it with obviously invalid >> data to help debugging (unfortunately blocks of zeroes are >> valid xenstore packets) and signals hvmloader by the event >> channel that it's safe to boot the guest OS. >> >> This patch has only been lightly tested. I'd appreciate >> feedback on the approach before going further. >> >> Signed-off-by: David Scott <dave.scott@xxxxxxxxxx> >> --- >> tools/firmware/hvmloader/xenbus.c | 16 +++++-- >> tools/ocaml/libs/xb/xb.ml | 26 ++++++++++- >> tools/ocaml/libs/xb/xb.mli | 3 +- >> tools/ocaml/libs/xb/xs_ring.ml | 13 ++++++ >> tools/ocaml/libs/xb/xs_ring_stubs.c | 81 >> ++++++++++++++++++++++++++++++++++- >> xen/include/public/io/xs_wire.h | 8 ++++ >> 6 files changed, 138 insertions(+), 9 deletions(-) >> >> diff --git a/tools/firmware/hvmloader/xenbus.c >> b/tools/firmware/hvmloader/xenbus.c >> index fe72e97..15d961b 100644 >> --- a/tools/firmware/hvmloader/xenbus.c >> +++ b/tools/firmware/hvmloader/xenbus.c >> @@ -37,6 +37,8 @@ static struct xenstore_domain_interface *rings; /* >> Shared ring with dom0 */ >> static evtchn_port_t event; /* Event-channel to dom0 */ >> static char payload[XENSTORE_PAYLOAD_MAX + 1]; /* Unmarshalling area >> */ >> >> +static void ring_wait(void); >> + > > Personally, I'd move the definition of ring_wait() up rather than forward > declaring it. > >> /* Connect our xenbus client to the backend. >> * Call once, before any other xenbus actions. */ >> void xenbus_setup(void) >> @@ -68,10 +70,16 @@ void xenbus_shutdown(void) >> >> ASSERT(rings != NULL); >> >> - /* We zero out the whole ring -- the backend can handle this, and it's >> - * not going to surprise any frontends since it's equivalent to never >> - * having used the rings. */ >> - memset(rings, 0, sizeof *rings); >> + if (rings->server_version > XENSTORE_VERSION_0) { > > I notice in the changes to xs_wire.h below you have a client_version field. > Should that not be set here too? I added fields for a client version and a server version initially, thinking it sounded like a good idea (symmetry and all that). Looking at it now, it might be a premature generalisation? We only need the server_version in this particular case since only the client needs to request a close/reopen. I agree with you that, if we have a client_version field we should set it and test it. Or drop it :-) > >> + rings->closing = 1; >> + while (rings->closing == 1) >> + ring_wait (); >> + } else { >> + /* If the backend reads the state while we're erasing it then the >> + ring state will become corrupted, preventing guest frontends from >> + connecting. This is rare. */ >> + memset(rings, 0, sizeof *rings); >> + } >> >> /* Clear the event-channel state too. */ >> memset(shinfo->vcpu_info, 0, sizeof(shinfo->vcpu_info)); >> diff --git a/tools/ocaml/libs/xb/xb.ml b/tools/ocaml/libs/xb/xb.ml >> index 29d354d..d5cd776 100644 >> --- a/tools/ocaml/libs/xb/xb.ml >> +++ b/tools/ocaml/libs/xb/xb.ml >> @@ -84,7 +84,26 @@ let write con s len = >> | Fd backfd -> write_fd backfd con s len >> | Xenmmap backmmap -> write_mmap backmmap con s len >> >> -let output con = >> +(* If a function throws Xs_ring.Closing, then clear the ring state >> + and serve the ring again. *) >> +let rec handle_closing f con = >> + match (try Some (f con) with Xs_ring.Closing -> None) with >> + | Some x -> x >> + | None -> >> + begin match con.backend with >> + | Fd _ -> raise Xs_ring.Closing (* should never happen, but >> just in case *) >> + | Xenmmap backend -> >> + Xs_ring.close backend.mmap; >> + backend.eventchn_notify (); >> + (* Clear our old connection state *) >> + Queue.clear con.pkt_in; >> + Queue.clear con.pkt_out; >> + con.partial_in <- init_partial_in (); >> + con.partial_out <- ""; >> + handle_closing f con >> + end >> + >> +let output = handle_closing (fun con -> >> (* get the output string from a string_of(packet) or partial_out *) >> let s = if String.length con.partial_out > 0 then >> con.partial_out >> @@ -101,8 +120,9 @@ let output con = >> ); >> (* after sending one packet, partial is empty *) >> con.partial_out = "" >> +) >> >> -let input con = >> +let input = handle_closing (fun con -> >> let newpacket = ref false in >> let to_read = >> match con.partial_in with >> @@ -133,6 +153,7 @@ let input con = >> HaveHdr (Partial.of_string buf) else NoHdr (i - sz, buf) >> ); >> !newpacket >> +) >> >> let newcon backend = { >> backend = backend; >> @@ -145,6 +166,7 @@ let newcon backend = { >> let open_fd fd = newcon (Fd { fd = fd; }) >> >> let open_mmap mmap notifyfct = >> + Xs_ring.set_server_version mmap 1; (* defined in xs_wire.h *) >> newcon (Xenmmap { >> mmap = mmap; >> eventchn_notify = notifyfct; >> diff --git a/tools/ocaml/libs/xb/xb.mli b/tools/ocaml/libs/xb/xb.mli >> index 58234ae..af7ea5c 100644 >> --- a/tools/ocaml/libs/xb/xb.mli >> +++ b/tools/ocaml/libs/xb/xb.mli >> @@ -23,7 +23,7 @@ module Op : >> | Resume >> | Set_target >> | Restrict >> - | Invalid (* Not a valid wire operation *) >> + | Invalid >> val operation_c_mapping : operation array >> val size : int >> val array_search : 'a -> 'a array -> int >> @@ -80,6 +80,7 @@ val read : t -> string -> int -> int >> val write_fd : backend_fd -> 'a -> string -> int -> int >> val write_mmap : backend_mmap -> 'a -> string -> int -> int >> val write : t -> string -> int -> int >> +val handle_closing : (t -> 'a) -> t -> 'a >> val output : t -> bool >> val input : t -> bool >> val newcon : backend -> t >> diff --git a/tools/ocaml/libs/xb/xs_ring.ml b/tools/ocaml/libs/xb/xs_ring.ml >> index 9469609..dadf9e1 100644 >> --- a/tools/ocaml/libs/xb/xs_ring.ml >> +++ b/tools/ocaml/libs/xb/xs_ring.ml >> @@ -14,5 +14,18 @@ >> * GNU Lesser General Public License for more details. >> *) >> >> +exception Closing >> + >> +let _ = >> + Callback.register_exception "Xs_ring.Closing" Closing >> + >> external read: Xenmmap.mmap_interface -> string -> int -> int = >> "ml_interface_read" >> external write: Xenmmap.mmap_interface -> string -> int -> int = >> "ml_interface_write" >> + >> +external set_client_version: Xenmmap.mmap_interface -> int -> unit = >> "ml_interface_set_client_version" "noalloc" >> +external get_client_version: Xenmmap.mmap_interface -> int = >> "ml_interface_get_client_version" "noalloc" >> + >> +external set_server_version: Xenmmap.mmap_interface -> int -> unit = >> "ml_interface_set_server_version" "noalloc" >> +external get_server_version: Xenmmap.mmap_interface -> int = >> "ml_interface_get_server_version" "noalloc" >> + >> +external close: Xenmmap.mmap_interface -> unit = "ml_interface_close" >> "noalloc" >> diff --git a/tools/ocaml/libs/xb/xs_ring_stubs.c >> b/tools/ocaml/libs/xb/xs_ring_stubs.c >> index 8bd1047..4ddf5a7 100644 >> --- a/tools/ocaml/libs/xb/xs_ring_stubs.c >> +++ b/tools/ocaml/libs/xb/xs_ring_stubs.c >> @@ -35,19 +35,28 @@ >> >> #define GET_C_STRUCT(a) ((struct mmap_interface *) a) >> >> +#define ERROR_UNKNOWN (-1) >> +#define ERROR_CLOSING (-2) >> + >> static int xs_ring_read(struct mmap_interface *interface, >> char *buffer, int len) >> { >> struct xenstore_domain_interface *intf = interface->addr; >> XENSTORE_RING_IDX cons, prod; /* offsets only */ >> int to_read; >> + uint32_t closing; >> >> cons = *(volatile uint32*)&intf->req_cons; >> prod = *(volatile uint32*)&intf->req_prod; >> + closing = *(volatile uint32*)&intf->closing; >> + >> + if (closing == 1) > > Do you really mean if == 1 here? It's a flag, so != 0 would seem better. > Also, should you test the client version? != 0 would be fine by me. > >> + return ERROR_CLOSING; >> + >> xen_mb(); >> >> if ((prod - cons) > XENSTORE_RING_SIZE) >> - return -1; >> + return ERROR_UNKNOWN; >> >> if (prod == cons) >> return 0; >> @@ -71,9 +80,15 @@ static int xs_ring_write(struct mmap_interface >> *interface, >> struct xenstore_domain_interface *intf = interface->addr; >> XENSTORE_RING_IDX cons, prod; >> int can_write; >> + uint32_t closing; >> >> cons = *(volatile uint32*)&intf->rsp_cons; >> prod = *(volatile uint32*)&intf->rsp_prod; >> + closing = *(volatile uint32*)&intf->closing; >> + >> + if (closing == 1) > > Same again. > >> + return ERROR_CLOSING; >> + >> xen_mb(); >> if ( (prod - cons) >= XENSTORE_RING_SIZE ) >> return 0; >> @@ -97,8 +112,12 @@ CAMLprim value ml_interface_read(value interface, >> value buffer, value len) >> >> res = xs_ring_read(GET_C_STRUCT(interface), >> String_val(buffer), Int_val(len)); >> - if (res == -1) >> + if (res == ERROR_UNKNOWN) >> caml_failwith("bad connection"); >> + >> + if (res == ERROR_CLOSING) >> + >> caml_raise_constant(*caml_named_value("Xs_ring.Closing")); >> + >> result = Val_int(res); >> CAMLreturn(result); >> } >> @@ -111,6 +130,64 @@ CAMLprim value ml_interface_write(value interface, >> value buffer, value len) >> >> res = xs_ring_write(GET_C_STRUCT(interface), >> String_val(buffer), Int_val(len)); >> + >> + if (res == ERROR_CLOSING) >> + >> caml_raise_constant(*caml_named_value("Xs_ring.Closing")); >> + >> result = Val_int(res); >> CAMLreturn(result); >> } >> + >> +CAMLprim value ml_interface_set_client_version(value interface, value v) >> +{ >> + CAMLparam2(interface, v); >> + struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)- >>> addr; >> + >> + intf->client_version = Int_val(v); > > Why would the server ever set the client version? This is dead code which is only present for symmetry. It certainly could be removed :-) > >> + >> + CAMLreturn(Val_unit); >> +} >> + >> +CAMLprim value ml_interface_get_client_version(value interface) >> +{ >> + CAMLparam1(interface); >> + struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)- >>> addr; >> + >> + CAMLreturn(Val_int (intf->client_version)); >> +} >> + >> +CAMLprim value ml_interface_set_server_version(value interface, value v) >> +{ >> + CAMLparam2(interface, v); >> + struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)- >>> addr; >> + >> + intf->server_version = Int_val(v); >> + >> + CAMLreturn(Val_unit); >> +} >> + >> +CAMLprim value ml_interface_get_server_version(value interface) >> +{ >> + CAMLparam1(interface); >> + struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)- >>> addr; >> + >> + CAMLreturn(Val_int (intf->server_version)); >> +} >> + >> +CAMLprim value ml_interface_close(value interface) >> +{ >> + CAMLparam1(interface); >> + const char invalid_data[] = { 'd', 'e', 'a', 'd', 'b', 'e', 'e', 'f' }; >> + struct xenstore_domain_interface *intf = GET_C_STRUCT(interface)- >>> addr; >> + int i; >> + >> + intf->req_cons = intf->req_prod = intf->rsp_cons = intf->rsp_prod = >> 0; >> + /* Ensure the unused space is full of invalid xenstore packets. */ >> + for (i = 0; i < XENSTORE_RING_SIZE; i++) { >> + intf->req[i] = invalid_data[i % sizeof(invalid_data)]; >> + intf->rsp[i] = invalid_data[i % sizeof(invalid_data)]; >> + } >> + xen_mb (); >> + intf->closing = 0; >> + CAMLreturn(Val_unit); >> +} >> diff --git a/xen/include/public/io/xs_wire.h >> b/xen/include/public/io/xs_wire.h >> index 585f0c8..68460cc 100644 >> --- a/xen/include/public/io/xs_wire.h >> +++ b/xen/include/public/io/xs_wire.h >> @@ -104,6 +104,9 @@ enum xs_watch_type >> XS_WATCH_TOKEN >> }; >> >> +#define XENSTORE_VERSION_0 0 >> +#define XENSTORE_VERSION_1 1 >> + >> /* >> * `incontents 150 xenstore_struct XenStore wire protocol. >> * >> @@ -112,10 +115,15 @@ enum xs_watch_type >> typedef uint32_t XENSTORE_RING_IDX; >> #define MASK_XENSTORE_IDX(idx) ((idx) & (XENSTORE_RING_SIZE-1)) >> struct xenstore_domain_interface { >> + /* XENSTORE_VERSION_0 */ >> char req[XENSTORE_RING_SIZE]; /* Requests to xenstore daemon. */ >> char rsp[XENSTORE_RING_SIZE]; /* Replies and async watch events. */ >> XENSTORE_RING_IDX req_cons, req_prod; >> XENSTORE_RING_IDX rsp_cons, rsp_prod; >> + uint32_t client_version; >> + uint32_t server_version; > > Should any values below perhaps be part of a union so that future versions > may replace them, or do you intend that future versions should only extend > the set of fields? Good question, I don’t know what’s best here. I notice that the ring structure has been constant for a long time; maybe it would never change again? Or is that wishful thinking? We’ve also got just under 2k of space left, so we could do quite a lot of extending if necessary. > > Paul > >> + /* XENSTORE_VERSION_1 */ >> + uint32_t closing; >> }; >> >> /* Violating this is very bad. See docs/misc/xenstore.txt. */ (I thought this was an appropriate last line of the diff :-) Cheers, Dave _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |