Re: [Xen-devel] [PATCH] xenstore: document the xenstore protocol
Ian Jackson writes ("[Xen-devel] [PATCH] xenstore: document the xenstore protocol"):
> The attached patch [...]

I seem to have attached the document itself rather than the patch.
*sigh*. Let's try again.

Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

diff -r 6e9ee9b86661 docs/misc/xenstore.txt
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/docs/misc/xenstore.txt Tue Dec 04 17:03:57 2007 +0000
@@ -0,0 +1,287 @@
+Xenstore protocol specification
+-------------------------------
+
+Xenstore implements a database which maps filename-like pathnames
+(also known as `keys') to values. Clients may read and write values,
+watch for changes, and set permissions to allow or deny access. There
+is a rudimentary transaction system.
+
+While xenstore and most tools and APIs are capable of dealing with
+arbitrary binary data as values, this should generally be avoided.
+Data should generally be human-readable for ease of management and
+debugging; xenstore is not a high-performance facility and should be
+used only for small amounts of control plane data. Therefore xenstore
+values should normally be 7-bit ASCII text strings containing bytes
+0x20..0x7f only, and should not contain a trailing nul byte. (The
+APIs used for accessing xenstore generally add a nul when reading, for
+the caller's convenience.)
+
+A separate specification will detail the keys and values which are
+used in the Xen system and what their meanings are. (Sadly that
+specification currently exists only in multiple out-of-date versions.)
+
+
+Paths are /-separated and start with a /, just like Unix filenames.
+
+We can speak of two paths being <child> and <parent>, which is the
+case if they're identical, or if <parent> is /, or if <parent>/ is an
+initial substring of <child>. (This includes <path> being a child of
+itself.)
+
+If a particular path exists, all of its parents do too. Every
+existing path maps to a possibly empty value, and may also have zero
+or more immediate children. There is thus no particular distinction
+between directories and leaf nodes. However, it is conventional not
+to store nonempty values at nodes which also have children.
+
+The permitted character set for paths is ASCII alphanumerics plus
+the four punctuation characters -/_@ (hyphen slash underscore atsign).
+@ should be avoided except to specify special watches (see below).
+Doubled slashes and trailing slashes (except to specify the root) are
+forbidden. The empty path is also forbidden.
+
+
+Communication with xenstore is via either sockets, or event channel
+and shared memory, as specified in io/xs_wire.h: each message in
+either direction is a header formatted as a struct xsd_sockmsg
+followed by xsd_sockmsg.len bytes of payload.
+
+The payload syntax varies according to the type field. Generally
+requests each generate a reply with an identical type, req_id and
+tx_id. However, if an error occurs, a reply will be returned with
+type ERROR, and only req_id and tx_id copied from the request.
+
+A caller who sends several requests may receive the replies in any
+order and must use req_id (and tx_id, if applicable) to match up
+replies to requests. (The current implementation always replies to
+requests in the order received but this should not be relied on.)
+
+
+---------- Xenstore protocol details - introduction ----------
+
+The payload syntax and semantics of the requests and replies are
+described below. In the payload syntax specifications we use the
+following notations:
+
+ |              A nul (zero) byte.
+ <foo>          A string guaranteed not to contain any nul bytes.
+ <foo|>         Binary data (which may contain zero or more nul bytes)
+ <foo>|*        Zero or more strings each followed by a trailing nul
+ <foo>|+        One or more strings each followed by a trailing nul
+ ?              Reserved value (may not contain nuls)
+ ??             Reserved value (may contain nuls)
+
+Except as otherwise noted, reserved values are believed to be sent as
+empty strings by all current clients. Clients should not send
+nonempty strings for reserved values; those parts of the protocol may
+be used for extension in the future.
+
+
+Error replies are as follows:
+
+ERROR                                           E<something>|
+        Where E<something> is the name of an errno value
+        listed in io/xs_wire.h. Note that the string name
+        is transmitted, not a numeric value.
+
+
+Where no reply payload format is specified below, success responses
+have the following payload:
+                                                OK|
+
+Values commonly included in payloads include:
+
+    <path>
+        Specifies a path in the hierarchical key structure.
+        If <path> starts with a / it simply represents that path.
+
+        <path> is allowed not to start with /, in which case the
+        caller must be a domain (rather than connected via a socket)
+        and the path is taken to be relative to /local/domain/<domid>
+        (eg, `x/y' sent by domain 3 would mean `/local/domain/3/x/y').
+
+    <domid>
+        Integer domid, represented as decimal number 0..65535.
+        Parsing errors and values out of range generally go
+        undetected. The special DOMID_... values (see xen.h) are
+        represented as integers; unless otherwise specified it
+        is an error not to specify a real domain id.
+
+
+
+The following are the actual type values, including the request and
+reply payloads as applicable:
+
+
+---------- Database read, write and permissions operations ----------
+
+READ                    <path>|                 <value|>
+WRITE                   <path>|<value|>
+        Store and read the octet string <value> at <path>.
+        WRITE creates any missing parent paths, with empty values.
+
+MKDIR                   <path>|
+        Ensures that the <path> exists, if necessary by creating
+        it and any missing parents with empty values. If <path>
+        or any parent already exists, its value is left unchanged.
+
+RM                      <path>|
+        Ensures that the <path> does not exist, by deleting
+        it and all of its children. It is not an error if <path> does
+        not exist, but it _is_ an error if <path>'s immediate parent
+        does not exist either.
+
+DIRECTORY               <path>|                 <child-leaf-name>|*
+        Gives a list of the immediate children of <path>, as only the
+        leafnames. The resulting children are each named
+        <path>/<child-leaf-name>.
+
+GET_PERMS               <path>|                 <perm-as-string>|+
+SET_PERMS               <path>|<perm-as-string>|+?
+        <perm-as-string> is one of the following
+                w<domid>        write only
+                r<domid>        read only
+                b<domid>        both read and write
+                n<domid>        no access
+        See http://wiki.xensource.com/xenwiki/XenBus section
+        `Permissions' for details of the permissions system.
+
+---------- Watches ----------
+
+WATCH                   <wpath>|<token>|?
+        Adds a watch.
+
+        When a <path> is modified (including path creation, removal,
+        contents change or permissions change) this generates an event
+        on the changed <path>. Changes made in transactions cause an
+        event only if and when committed. Each occurring event is
+        matched against all the watches currently set up, and each
+        matching watch results in a WATCH_EVENT message (see below).
+
+        The event's path matches the watch's <wpath> if it is a child
+        of <wpath>.
+
+        <wpath> can be a <path> to watch or @<wspecial>. In the
+        latter case <wspecial> may have any syntax but it matches
+        (according to the rules above) only the following special
+        events which are invented by xenstored:
+            @introduceDomain    occurs on INTRODUCE
+            @releaseDomain      occurs on any domain crash or
+                                shutdown, and also on RELEASE
+                                and domain destruction
+
+        When a watch is first set up it is triggered once straight
+        away, with <path> equal to <wpath>. Watches may be triggered
+        spuriously. The tx_id in a WATCH request is ignored.
+
+WATCH_EVENT                                     <epath>|<token>|
+        Unsolicited `reply' generated for matching modification events
+        as described above. req_id and tx_id are both 0.
+
+        <epath> is the event's path, ie the actual path that was
+        modified; however if the event was the recursive removal of a
+        parent of <wpath>, <epath> is just
+        <wpath> (rather than the actual path which was removed). So
+        <epath> is a child of <wpath>, regardless.
+
+        Iff <wpath> for the watch was specified as a relative pathname,
+        the <epath> path will also be relative (with the same base,
+        obviously).
+
+UNWATCH                 <wpath>|<token>|?
+
+---------- Transactions ----------
+
+TRANSACTION_START       ??                      <transid>|
+        <transid> is an opaque uint32_t allocated by xenstored
+        represented as unsigned decimal. After this, the transaction
+        may be referenced by using <transid> (as 32-bit binary) in the
+        tx_id request header field. When a transaction is started the
+        whole db is copied; reads and writes happen on the copy.
+        It is not legal to send non-0 tx_id in TRANSACTION_START.
+        Currently xenstored has the bug that after 2^32 transactions
+        it will allocate the transid 0 for an actual transaction.
+
+        Clients using the provided xs.c bindings will send a single
+        nul byte for the argument payload. We recommend that future
+        clients continue to do the same; any future extension will not
+        use that syntax.
+
+TRANSACTION_END         T|
+TRANSACTION_END         F|
+        tx_id must refer to an existing transaction. After this
+        request the tx_id is no longer valid and may be reused by
+        xenstore. If F, the transaction is discarded. If T,
+        it is committed: if there were any other intervening writes
+        then our END gets EAGAIN.
+
+        The plan is that in the future only intervening `conflicting'
+        writes cause EAGAIN, meaning only writes or other commits
+        which changed paths which were read or written in the
+        transaction at hand.
+
+---------- Domain management and xenstored communications ----------
+
+INTRODUCE               <domid>|<mfn>|<evtchn>|?
+        Notifies xenstored to communicate with this domain.
+
+        INTRODUCE is currently only used by xend (during domain
+        startup and various forms of restore and resume), and
+        xenstored prevents its use other than by dom0.
+
+        <domid> must be a real domain id (not 0 and not a special
+        DOMID_... value). <mfn> must be a machine page in that domain
+        represented in signed decimal (!). <evtchn> must be an
+        unbound event channel in <domid> (likewise in
+        decimal), on which xenstored will call bind_interdomain.
+        Violations of these rules may result in undefined behaviour;
+        for example passing a high-bit-set 32-bit mfn as an unsigned
+        decimal will attempt to use 0x7fffffff instead (!).
+
+RELEASE                 <domid>|
+        Manually requests that xenstored disconnect from the domain.
+        The event channel is unbound at the xenstored end and the page
+        unmapped. If the domain is still running it won't be able to
+        communicate with xenstored. NB that xenstored will in any
+        case detect domain destruction and disconnect by itself.
+        xenstored prevents the use of RELEASE other than by dom0.
+
+GET_DOMAIN_PATH         <domid>|                <path>|
+        Returns the domain's base path, as is used for relative
+        transactions: ie, /local/domain/<domid> (with <domid>
+        normalised). The answer will be useless unless <domid> is a
+        real domain id.
+
+IS_DOMAIN_INTRODUCED    <domid>|                T| or F|
+        Returns T if xenstored is in communication with the domain:
+        ie, if INTRODUCE for the domain has not yet been followed by
+        domain destruction or explicit RELEASE.
+
+RESUME                  <domid>|
+
+        Arranges that @releaseDomain events will once more be
+        generated when the domain becomes shut down. This might have
+        to be used if a domain were to be shut down (generating one
+        @releaseDomain) and then subsequently restarted, since the
+        state-sensitive algorithm in xenstored will not otherwise send
+        further watch event notifications if the domain were to be
+        shut down again.
+
+        It is not clear whether this is possible since one would
+        normally expect a domain not to be restarted after being shut
+        down without being destroyed in the meantime. There are
+        currently no users of this request in xen-unstable.
+
+        xenstored prevents the use of RESUME other than by dom0.
+
+---------- Miscellaneous ----------
+
+DEBUG                   print|<string>|??       sends <string> to debug log
+DEBUG                   print|<thing-with-no-nul>       EINVAL
+DEBUG                   check|??                checks xenstored innards
+DEBUG                   <anything-else|>        no-op (future extension)
+
+        These requests should not generally be used and may be
+        withdrawn in the future.
+
+
diff -r 6e9ee9b86661 tools/xenstore/xenstored_core.c
--- a/tools/xenstore/xenstored_core.c Fri Nov 30 15:03:30 2007 +0000
+++ b/tools/xenstore/xenstored_core.c Tue Dec 04 16:36:00 2007 +0000
@@ -563,7 +563,9 @@ static struct buffered_data *new_buffer(
        return data;
 }
 
-/* Return length of string (including nul) at this offset. */
+/* Return length of string (including nul) at this offset.
+ * If there is no nul, returns 0 for failure.
+ */
 static unsigned int get_string(const struct buffered_data *data,
                               unsigned int offset)
 {
@@ -579,7 +581,12 @@ static unsigned int get_string(const str
        return nul - (data->buffer + offset) + 1;
 }
 
-/* Break input into vectors, return the number, fill in up to num of them. */
+/* Break input into vectors, return the number, fill in up to num of them.
+ * Always returns the actual number of nuls in the input. Stores the
+ * positions of the starts of the nul-terminated strings in vec.
+ * Callers who use this and then rely only on vec[] will
+ * ignore any data after the final nul.
+ */
 unsigned int get_strings(struct buffered_data *data,
                         char *vec[], unsigned int num)
 {
@@ -668,7 +675,9 @@ bool is_valid_nodename(const char *node)
        return valid_chars(node);
 }
 
-/* We expect one arg in the input: return NULL otherwise. */
+/* We expect one arg in the input: return NULL otherwise.
+ * The payload must contain exactly one nul, at the end.
+ */
 static const char *onearg(struct buffered_data *in)
 {
        if (!in->used || get_string(in, 0) != in->used)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel
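
For anyone reading the path rules in the specification text, here is a minimal
sketch of how they might be checked in C. It is not taken from xenstored: the
names xs_path_is_valid and xs_is_child are hypothetical, and the code follows
only the wording of the document (permitted characters, no doubled or trailing
slashes, no empty path, and the <child>/<parent> definition used for watch
matching). xenstored's own is_valid_nodename may well behave differently.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of xenstored: true if c is in the
 * permitted set (ASCII alphanumerics plus - / _ @). */
static bool xs_path_char_ok(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') ||
           c == '-' || c == '/' || c == '_' || c == '@';
}

/* Hypothetical: checks a path against the syntax rules in the text.
 * Relative paths (not starting with /) pass here; whether the caller
 * is allowed to use one depends on how it is connected. */
bool xs_path_is_valid(const char *path)
{
    size_t len = strlen(path);
    size_t i;

    if (len == 0)
        return false;                   /* the empty path is forbidden */
    if (len == 1 && path[0] == '/')
        return true;                    /* the root itself */
    if (path[len - 1] == '/')
        return false;                   /* trailing slash */

    for (i = 0; i < len; i++) {
        if (!xs_path_char_ok(path[i]))
            return false;
        /* path[i + 1] is at worst the terminating nul */
        if (path[i] == '/' && path[i + 1] == '/')
            return false;               /* doubled slash */
    }
    return true;
}

/* Hypothetical: the <child>/<parent> relation as worded in the text:
 * identical paths, or <parent> is /, or <parent>/ is an initial
 * substring of <child>. */
bool xs_is_child(const char *parent, const char *child)
{
    size_t plen = strlen(parent);

    if (strcmp(parent, child) == 0)
        return true;
    if (strcmp(parent, "/") == 0)
        return true;
    return strncmp(parent, child, plen) == 0 && child[plen] == '/';
}
```

Under these assumptions, a watch on /local/domain/3 would match an event on
/local/domain/3/x/y (xs_is_child returns true) but not one on
/local/domain/30, because the parent must be followed by a slash.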