Re: [Xen-devel] [DRAFT 1] XenSock protocol design document
On Mon, 11 Jul 2016, Paul Durrant wrote:
> > -----Original Message-----
> [snip]
> >
> > # XenSocks Protocol v1
> >
> > ## Rationale
> >
> > XenSocks is a paravirtualized protocol for the POSIX socket API.
> >
> > The purpose of XenSocks is to allow the implementation of a specific
> > set of POSIX calls to be done in a domain other than your own. It
> > allows connect, accept, bind, release, listen, poll, recvmsg and
> > sendmsg to be implemented in another domain.
>
> Does the other domain have privilege over the domain issuing the POSIX
> calls?

I don't have a strong opinion on this. In my scenario the backend is in
fact always dom0, but so far nothing in the protocol would prevent
XenSock from being used with driver domains AFAICT.

Maybe writing down that the backend needs to be privileged would allow
us to take some shortcuts in the future, but as there are none at the
moment, I don't think we should make this a requirement. What do you
think?

> [snip]
> > #### State Machine
> >
> >     **Front**                               **Back**
> >     XenbusStateInitialising                 XenbusStateInitialising
> >     - Query virtual device                  - Query backend device
> >       properties.                             identification data.
> >     - Setup OS device instance.                          |
> >     - Allocate and initialize the                        |
> >       request ring.                                      V
> >     - Publish transport parameters          XenbusStateInitWait
> >       that will be in effect during
> >       this connection.
> >                  |
> >                  |
> >                  V
> >        XenbusStateInitialised
> >
> >                                             - Query frontend transport
> >                                               parameters.
> >                                             - Connect to the request
> >                                               ring and event channel.
> >                                                          |
> >                                                          |
> >                                                          V
> >                                                XenbusStateConnected
> >
> >     - Query backend device properties.
> >     - Finalize OS virtual device
> >       instance.
> >                  |
> >                  |
> >                  V
> >        XenbusStateConnected
> >
> > Once frontend and backend are connected, they have a shared page,
> > which is used to exchange messages over a ring, and an event channel,
> > which is used to send notifications.
>
> What about XenbusStateClosing and XenbusStateClosed? We're missing half
> the state model here. Specifically how do individual connections get
> terminated if either end moves to closing? Does either end have to wait
> for the other?

I admit I "took inspiration" from xen/include/public/io/blkif.h, which
is also missing the closing steps. I'll try to add them. (If you know of
any existing descriptions of a XenBus closing protocol please let me
know.)

> > ### Commands Ring
> >
> > The shared ring is used by the frontend to forward socket API calls
> > to the backend. I'll refer to this ring as **commands ring** to
> > distinguish it from other rings which will be created later in the
> > lifecycle of the protocol (data rings). The ring format is defined
> > using the familiar `DEFINE_RING_TYPES` macro
> > (`xen/include/public/io/ring.h`). Frontend requests are allocated on
> > the ring using the `RING_GET_REQUEST` macro.
> >
> > The format is defined as follows:
> >
> >     #define XENSOCK_DATARING_ORDER 6
> >     #define XENSOCK_DATARING_PAGES (1 << XENSOCK_DATARING_ORDER)
> >     #define XENSOCK_DATARING_SIZE  (XENSOCK_DATARING_PAGES << PAGE_SHIFT)
>
> Why a fixed size? Also, I assume DATARING should be CMDRING or somesuch
> here. Plus a fixed order of *six* (64 pages) seems like a lot.

This is going to be changed and significantly improved following
Juergen's suggestion.

> > Return value:
> >
> > - 0 on success
> > - less than 0 on failure, see the error codes of the socket system
> >   call
>
> The socket system call on which OS?

I'll add more info on this. I'll try to stick to POSIX as much as I can,
defining explicitly anything which is not specified by it (such as error
numbers).
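To make this more concrete, here is a rough sketch of the shape I have
in mind for the commands ring, built on the standard ring macros. To be
clear, this is not in the draft: the 16-byte header (cmd, id, sockid),
the field names and the 32-byte per-command payload are placeholders of
mine, sized to match the bind layout quoted below.

    /* Provisional sketch, not part of the draft. In-tree this would
     * use the Xen fixed-width types rather than stdint.h. */
    #include <stdint.h>
    #include "ring.h"   /* xen/include/public/io/ring.h */

    struct xensock_request {
        uint32_t cmd;       /* operation, e.g. 4 for bind */
        uint32_t id;        /* request identifier, echoed by the response */
        uint64_t sockid;    /* socket identifier, chosen by the frontend */
        union {
            struct {
                uint8_t  addr[28];  /* struct sockaddr, bytes 16-44 */
                uint32_t len;       /* address length, bytes 44-48 */
            } bind;
            struct {
                uint32_t backlog;   /* to be added, see Paul's comment
                                       on listen below */
            } listen;
            uint8_t padding[32];
        } u;
    };

    struct xensock_response {
        uint32_t id;        /* id of the corresponding request */
        uint64_t sockid;
        int32_t  ret;       /* 0 on success, negative error on failure */
    };

    DEFINE_RING_TYPES(xensock, struct xensock_request,
                      struct xensock_response);

The frontend would then allocate slots with RING_GET_REQUEST, fill them
in and notify the backend through the event channel, much as blkif and
netif do today.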
> > #### Bind
> >
> > The **bind** operation assigns the address passed as parameter to the
> > socket. It corresponds to the bind system call.
>
> Is a domain allowed to bind to a privileged port in the backend domain?

I would let the backend decide: the backend can return -EACCES if it
doesn't want to allow access to a given port.

> > **sockid** is freely chosen by the frontend and references this
> > specific socket from this point forward. **Bind**, **listen** and
> > **accept** are the three operations required to have fully working
> > passive sockets and should be issued in this order.
> >
> > Fields:
> >
> > - **cmd** value: 4
> > - additional fields:
> >   - **addr**: address to bind to, in struct sockaddr format
> >   - **len**: address length
> >
> > Binary layout:
> >
> >     16      20      24      28      32      36      40      44      48
> >     +-------+-------+-------+-------+-------+-------+-------+-------+
> >     |                           addr                        |  len  |
> >     +-------+-------+-------+-------+-------+-------+-------+-------+
> >
> > Return value:
> >
> > - 0 on success
> > - less than 0 on failure, see the error codes of the bind system call
> >
> >
> > #### Listen
> >
> > The **listen** operation marks the socket as a passive socket. It
> > corresponds to the listen system call.
>
> ...which also takes a 'backlog' parameter, which doesn't seem to be
> specified here.

Fixed, thanks!

> >     XENSOCK_RING_IDX in_cons, in_prod;
> >     XENSOCK_RING_IDX out_cons, out_prod;
> >     int32_t in_error, out_error;
> > };
> >
> > The design is flexible and can support different ring sizes (at
> > compile time). The following description is based on order 6 rings,
> > chosen because they provide excellent performance.
>
> What about datagram sockets? Raw sockets? Setting socket options? Etc.

All currently unimplemented. Probably they are not going to be part of
the initial version of the protocol, but it would be nice if the
protocol were flexible enough to allow somebody in the future to jump
in and add them without too much trouble.
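One more note, since only a fragment of the data ring interface
survived the quoting above: completed, it would look something like the
sketch below. The struct name, the XENSOCK_RING_IDX typedef and the
helper are my own additions, not the draft's; the way I currently think
of the error fields is that they hold a negative error value once a
direction has been shut down.

    #include <stdint.h>

    typedef uint32_t XENSOCK_RING_IDX;

    struct xensock_data_intf {
        XENSOCK_RING_IDX in_cons, in_prod;    /* backend -> frontend */
        XENSOCK_RING_IDX out_cons, out_prod;  /* frontend -> backend */
        int32_t in_error, out_error;          /* negative error once a
                                                 direction is closed */
    };

    /* Indexes are free-running and are masked only when accessing the
     * data pages: with XENSOCK_DATARING_ORDER 6 the ring is 64 pages,
     * so on 4K pages the mask is (1 << 18) - 1. */
    static inline uint32_t xensock_in_unconsumed(struct xensock_data_intf *intf)
    {
        return intf->in_prod - intf->in_cons;
    }

This mirrors how the PV console ring handles its free-running indexes.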
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
https://lists.xen.org/xen-devel