[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 5/5] tools/xenstore: add migration stream extensions for new features



Hi,

On 03/08/2022 12:59, Juergen Gross wrote:
Extend the definition of the Xenstore migration stream to cover new
features:

- per domain features
- extended watches (watch depth)
- per domain quota

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
V3:
- new patch
---
  docs/designs/xenstore-migration.md | 85 ++++++++++++++++++++++++++++--
  1 file changed, 82 insertions(+), 3 deletions(-)

diff --git a/docs/designs/xenstore-migration.md 
b/docs/designs/xenstore-migration.md
index efa526f420..b2b1d3d5c7 100644
--- a/docs/designs/xenstore-migration.md
+++ b/docs/designs/xenstore-migration.md
@@ -43,7 +43,13 @@ the setting of the endianness bit.
  |-----------|---------------------------------------------------|
  | `ident`   | 0x78656e73746f7265 ('xenstore' in ASCII)          |
  |           |                                                   |
-| `version` | 0x00000001 (the version of the specification)     |
+| `version` | The version of the specification, defined values: |
+|           | 0x00000001: all fields without any explicitly     |
+|           |             mentioned version dependency are      |
+|           |             valid.                                |
+|           | 0x00000002: all fields valid for version 1 plus   |
+|           |             fields explicitly stated to be        |
+|           |             supported in version 2 are valid.     |

I am a bit concerned with the bump of the versions. It means, it will be necessary for Xenstored to know whether the new Xenstored speaks v1 or v2. This is less an issue when Live-Migration (although there is a fleet management problem) but it will be one for Live-Update if we need to rollback.

So I am wondering if we can avoid to bump the version and use other means to detect the difference.

  |           |                                                   |
  | `flags`   | 0 (LSB): Endianness: 0 = little, 1 = big          |
  |           |                                                   |
@@ -117,7 +123,17 @@ xenstored state that needs to be restored.
  | rw-socket-fd                  |
  +-------------------------------+
  | evtchn-fd                     |
++---------------+---------------+
+| n-dom-quota   | n-glob-quota  |
++---------------+---------------+
+| quota-val 1                   |
++-------------------------------+
+...
  +-------------------------------+
+| quota-val N                   |
++-------------------------------+
+| quota-names
+...
  ```
@@ -128,6 +144,22 @@ xenstored state that needs to be restored.
  |                |                                              |
  | `evtchn-fd`    | The file descriptor used to communicate with |
  |                | the event channel driver                     |
+|                |                                              |
+| `n-dom-quota`  | Number of quota values which apply per       |
+|                | domain. Valid only for version 2 and later.  |
+|                |                                              |

I think we can avoid extending the structure here by creating a separate record for the quota.

With that, the new Xenstored don't need specific code to deal with rollback because, as AFAICT, unimplemented records are ignored by existing Xenstored.

+| `n-glob-quota` | Number of quota values which apply globally  |
+|                | only. Valid only for version 2 and later.    |
+|                |                                              |
+| `quota-val`    | Quota values, first the ones applying per    |
+|                | domain, then the ones applying globally. A   |
+|                | value of 0 has the semantics of "unlimited". |
+|                | Valid only for version 2 and later.          |
+|                |                                              |
+| `quota-names`  | 0 delimited strings of the quota names in    |
+|                | the same sequence as the `quota-val` values. | > +|         
       | Valid only for version 2 and later.          |

From my understanding, both version of Xenstored needs to agree on the quota names. So it means the name have to be defined as part of the spec. At which point, I think it would be better to use ID.

Also, can you clarify what would happen if the stream contains a quota not supported by the new Xenstored?

+
xenstored will resume in the original process context. Hence `rw-socket-fd`
  simply specifies the file descriptor of the socket. Sockets are not always
@@ -145,7 +177,7 @@ the domain being migrated.
  ```
      0       1       2       3       4       5       6       7    octet
  +-------+-------+-------+-------+-------+-------+-------+-------+
-| conn-id                       | conn-type     |               |
+| conn-id                       | conn-type     | n-quota       |
  +-------------------------------+---------------+---------------+
  | conn-spec
  ...
@@ -154,6 +186,17 @@ the domain being migrated.
  +---------------+---------------+-------------------------------+
  | data
  ...
++-------------------------------+
+| features                      |
++-------------------------------+
+| quota-val 1                   |
++-------------------------------+
+...
++-------------------------------+
+| quota-val N                   |
++-------------------------------+
+| quota-names
+...
  ```
@@ -167,6 +210,10 @@ the domain being migrated.
  |                | 0x0001: socket                               |
  |                | 0x0002 - 0xFFFF: reserved for future use     |
  |                |                                              |
+| `n-quota`      | Number of quota values.                      |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |
+|                |                                              |
  | `conn-spec`    | See below                                    |
  |                |                                              |
  | `in-data-len`  | The length (in octets) of any data read      |
@@ -182,6 +229,22 @@ the domain being migrated.
  | `data`         | Pending data: first in-data-len octets of    |
  |                | read data, then out-data-len octets of       |
  |                | written data (any of both may be empty)      |
+|                |                                              |
+| `features`     | Value of the feature field visible by the    |
+|                | guest at offset 2064 of the ring page.       |
+|                | Aligned to the next 4 octet boundary.        |
+|                | Only for `conn-type` 0 (shared ring).        |

For the purpose of the stream, I would consider to make it available for the socket connection. This could potentially be used in the future to allow each application to have a different behavior when socket is used.

I can't make my mind yet if we can avoid bumping the version for this field. What would happen if we need to rollback?

+|                | Only valid for version 2 and later.          | > +|         
       |                                              |
+| `quota-val`    | Quota values, a value of 0 has the semantics |
+|                | "unlimited".                                 |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |

I would suggest to be very obvious and say the value will override the value (if any)

+|                |                                              |
+| `quota-names`  | 0 delimited strings of the quota names in    |
+|                | the same sequence as the `quota-val` values. |
+|                | Only for `conn-type` 0 (shared ring).        |
+|                | Only valid for version 2 and later.          |

As for the "global" quotas, I would move the quotas in a separate record. In this case, this would also be useful to avoid having may dynamic length field within the same record.

In case of live update the connection record for the connection via which
  the live update command was issued will contain the response for the live
@@ -247,7 +310,7 @@ by a connection for which there is `CONNECTION_DATA` record 
previously present.
```
      0       1       2       3    octet
-+-------+-------+-------+-------+
++---------------+---------------+
  | conn-id                       |
  +---------------+---------------+
  | wpath-len     | token-len     |
@@ -256,6 +319,9 @@ by a connection for which there is `CONNECTION_DATA` record 
previously present.
  ...
  | token
  ...
++---------------+---------------+
+| depth         |               |
++---------------+---------------+
  ```
@@ -275,6 +341,13 @@ by a connection for which there is `CONNECTION_DATA` record previously present.
  |             |                                                 |
  | `token`     | The watch identifier token, as specified in the |
  |             | `WATCH` operation                               |
+|             |                                                 |
+| `depth`     | The number of directory levels below the        |
+|             | watched path to consider for a match. This      |
+|             | field is aligned to the next 4 octet boundary.  |
+|             | A value of 0xffff is used for unlimited depth.  |
+|             | This field is valid only for version 2 and      |
+|             | higher.                                         |

If we are going to bump the stream version, then I think we should move the field before token/path.

\pagebreak @@ -406,6 +479,12 @@ A node permission specifier has the following format:
  Note that perm1 defines the domain owning the node. See [4] for more
  explanation of node permissions.
+\pagebreak
+
+### DOMAIN_DATA
+
+
+

What this section is for?

Cheers,

  * * *
[1] See https://xenbits.xen.org/gitweb/?p=xen.git;a=blob;f=docs/designs/non-cooperative-migration.md

--
Julien Grall



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.