[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2] docs/designs: re-work the xenstore migration document...
On 27.04.20 09:53, Paul Durrant wrote: From: Paul Durrant <pdurrant@xxxxxxxxxx> ... to specify a separate migration stream that will also be suitable for live update. The original scope of the document was to support non-cooperative migration of guests [1] but, since then, live update of xenstored has been brought into scope. Thus it makes more sense to define a separate image format for serializing xenstore state that is suitable for both purposes. The document has been limited to specifying a new image format. The mechanism for acquiring the image for live update or migration is not covered as that is more appropriately dealt with by a patch to docs/misc/xenstore.txt. It is also expected that, when the first implementation of live update or migration making use of this specification is committed, that the document is moved from docs/designs into docs/specs. [1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md Signed-off-by: Paul Durrant <pdurrant@xxxxxxxxxx> --- Juergen Gross <jgross@xxxxxxxx> Andrew Cooper <andrew.cooper3@xxxxxxxxxx> George Dunlap <george.dunlap@xxxxxxxxxx> Ian Jackson <ian.jackson@xxxxxxxxxxxxx> Jan Beulich <jbeulich@xxxxxxxx> Julien Grall <julien@xxxxxxx> Stefano Stabellini <sstabellini@xxxxxxxxxx> Wei Liu <wl@xxxxxxx> Mind adding CC: before those mail addresses in order to let git add those to the recipients list? v2: - Address comments from Juergen Not all unfortunately. :-( +### CONNECTION_DATA-Each WATCH_DATA record specifies a registered watch and is formatted as-follows: +For live update the image format will contain a `CONNECTION_DATA` record for +each connection to xenstore. For migration it will only contain a record for +the domain being migrated.```- 0 1 2 3 octet -+-------+-------+-------+-------+ -| WATCH_DATA | -+-------------------------------+ -| wpath length | -+-------------------------------+ -| wpath data | -... -| pad (0 to 3 octets) | -+-------------------------------+ + 0 1 2 3 4 5 6 7 octet ++-------+-------+-------+-------+-------+-------+-------+-------+ +| conn-id | pad | ++---------------+-----------------------------------------------+ +| conn-type | conn-spec ... I asked whether it wouldn't be better to drop the pad and move conn-type and a 2-byte (unified) flag field at its position. This together ... ++-------------------------------+-------------------------------+ +| data-len | data +-------------------------------+ -| token length | -+-------------------------------+ -| token data | ... -| pad (0 to 3 octets) | -+-------------------------------+ ```-wpath length and token length are specified in octets (excluding the NUL-terminator). The wpath should be as described for the `WATCH` operation in -[2]. The token is an arbitrary string of octets not containing any NUL -values.+| Field | Description |+|-------------|-------------------------------------------------| +| `conn-id` | A non-zero number used to identify this | +| | connection in subsequent connection-specific | +| | records | +| | | +| `conn-type` | 0x0000: shared ring | +| | 0x0001: socket | +| | | +| `conn-spec` | See below | +| | | +| `data-len` | The length (in octets) of any pending data not | +| | yet written to the connection | +| | | +| `data` | Pending data (may be empty) |-**TRANSACTION_DATA**+The format of `conn-spec` is dependent upon `conn-type`.+\pagebreak -Each TRANSACTION_DATA record specifies an open transaction and is formatted-as follows: +For `shared ring` connections it is as follows:```- 0 1 2 3 octet -+-------+-------+-------+-------+ -| TRANSACTION_DATA | -+-------------------------------+ -| tx_id | -+-------------------------------+ + 0 1 2 3 4 5 6 7 octet + +-------+-------+-------+-------+-------+-------+ + | domid | tdomid | flags | ++---------------+---------------+---------------+---------------+ +| revtchn | levtchn | ++-------------------------------+-------------------------------+ +| mfn | ++---------------------------------------------------------------+ ... with dropping levtchn (which isn't needed IMO) will make it much easier to have a union in C (which needs to be aligned to 8 bytes and have a length of a multiple of 8 bytes due to mfn). So something like: struct xs_state_connection { uint32_t conn_id; uint16_t conn_type; #define XS_STATE_CONN_TYPE_RING 0 #define XS_STATE_CONN_TYPE_SOCKET 1 uint16_t flags; #define XS_STATE_CONN_INTRODUCED 0x0001 #define XS_STATE_CONN_RELEASED 0x0002 #define XS_STATE_CONN_READONLY 0x0004 union { struct { uint16_t domid; uint16_t tdomid; #define XS_STATE_DOMID_INVALID 0xffffU uint32_t evtchn; uint64_t mfn; #define XS_STATE_MFN_INVALID 0xffffffffffffffffUL } ring; int32_t socket_fd; } spec; uint32_t data_out_len; uint8_t data[]; }; ```-where tx_id is the non-zero identifier values of an open transaction.--### Protocol Extension+| Field | Description | +|------------|--------------------------------------------------| +| `domid` | The domain-id that owns the shared page | +| | | +| `tdomid` | The domain-id that `domid` acts on behalf of if | +| | it has been subject to an SET_TARGET | +| | operation [2] or DOMID_INVALID otherwise | DOMID_INVALID needs to be defined (or we need a reference where it is coming from). +| | | +| `flags` | A bit-wise OR of: | +| | 0x0001: INTRODUCE has been issued | +| | 0x0002: RELEASE has been issued | +| | | +| `revtchn` | The port number of the interdomain channel used | +| | by `domid` to communicate with xenstored | +| | | +| `levtchn` | For a live update this will be the port number | +| | of the interdomain channel used by xenstored | +| | itself otherwise, for migration, it will be -1 | +| | | +| `mfn` | The MFN of the shared page for a live update or | +| | INVALID_MFN otherwise | INVALID_MFN is an internal detail of the hypervisor. We should have a local definition for it here (as in my example above). + +Since the ABI guarantees that entry 1 in `domid`'s grant table will always +contain the GFN of the shared page, so for a live update `mfn` can be used to +give confidence that `domid` has not been re-cycled during the update. + + +For `socket` connections it is as follows:-Before xenstore state is migrated it is necessary to wait for any pending-reads, writes, watch registrations etc. to complete, and also to make sure -that xenstored does not start processing any new requests (so that new -requests remain pending on the shared ring for subsequent processing on the -new host). Hence the following operation is needed:```-QUIESCE <domid>| - -Complete processing of any request issued by the specified domain, and -do not process any further requests from the shared ring. + 0 1 2 3 4 5 6 7 octet + +-------+-------+-------+-------+-------+-------+ + | flags | socket-fd | + +---------------+-------------------------------+ -START_DOMAIN_TRANSACTION <domid>|<transid>| + 0 1 2 3 octet ++-------+-------+-------+-------+ +| conn-id | ++-------------------------------+ +| tx-id | ++---------------+---------------+ +| access | perm-count | ++---------------+---------------+ +| perm1 | ++-------------------------------+ +... ++-------------------------------+ +| permN | ++---------------+---------------+ +| path-len | value-len | ++---------------+---------------+ +| path +... +| value +... +``` + + +| Field | Description | +|--------------|------------------------------------------------| +| `conn-id` | If this value is non-zero then this record | +| | related to a pending transaction | +| | | +| `tx-id` | This value should be ignored if `conn-id` is | +| | zero. Otherwise it specifies the id of the | +| | pending transaction | +| | | +| `access` | This value should be ignored if this record | +| | does not relate to a pending transaction, | +| | otherwise it specifies the accesses made to | +| | the node and hence is a bitwise OR of: | +| | | +| | 0x0001: read | +| | 0x0002: written | +| | | +| | The value will be zero for a deleted node | +| | | +| `perm-count` | The number (N) of node permission specifiers | +| | (which will be 0 for a node deleted in a | +| | pending transaction) | +| | | +| `perm1..N` | A list of zero or more node permission | +| | specifiers (see below) | +| | | +| `path-len` | The length (in octets) of `path` including the | +| | NUL terminator | +| | | +| `value-len` | The length (in octets) of `value` (which will | +| | be zero for a deleted node) | I asked you to put the path-len and value-len fields before the perm array. Juergen
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |