
[Xen-devel] libxl refactoring, call for discussion

Hi everyone,

I would like to bring the domain restart question up for discussion. It originates from DomD restart, but the solution I am about to propose can be quite generic.

The problem is that the domain specification currently holds only frontend info, which is used to generate both the frontend and the backend xenstore entries for a device. That means backend xenstore data is handled not by the domain that owns the backends, but by the domain whose frontends are linked to them. As a result, it is impossible to reboot a domain with backends without also rebooting the corresponding frontend domains.
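
To make the coupling concrete, here is a rough Python sketch of the xenstore entries that end up being derived from a single frontend-side device record. The paths follow the usual /local/domain layout; the function name and the returned structure are made up for illustration and are not libxl code.

```python
def xenstore_entries(frontend_domid, backend_domid, kind, devid):
    """Return the frontend and backend xenstore nodes for one device.

    Hypothetical helper: it only mirrors the conventional path layout,
    it does not reflect how libxl is structured internally.
    """
    fe_path = f"/local/domain/{frontend_domid}/device/{kind}/{devid}"
    be_path = (f"/local/domain/{backend_domid}/backend/"
               f"{kind}/{frontend_domid}/{devid}")
    return {
        # Frontend node points at its backend...
        fe_path: {"backend": be_path, "backend-id": str(backend_domid)},
        # ...and the backend node is generated from the same record.
        be_path: {"frontend": fe_path, "frontend-id": str(frontend_domid)},
    }


# A vif for domain 5 whose backend lives in a driver domain with domid 2.
entries = xenstore_entries(frontend_domid=5, backend_domid=2,
                           kind="vif", devid=0)
```

Both nodes come out of the frontend domain's record, so nothing in the backend domain's own specification says that the backend exists.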

This is wrong on many levels, but the main thing is: some domains don't know something they should (whether they have any backends), and some know something they should not (that the backends for their frontends live in different domains).

I propose the following change: make the domain's "require" and "provide" interfaces visible (think CORBA), and hold the connections between the two in the privileged domain (where the toolstack is; think of the controller in the MVC idiom). With this change domains (except for Dom0, which is a special case) can be rebooted in any order whatsoever, and the frontend/backend link can be set as static config or adjusted at runtime (e.g. if the hardware rendering backend hangs, switch to software rendering to avoid glitches). However, it requires a change in libxl's internal device representation (a device should no longer be a frontend/backend pair) and a config format change, which breaks backwards compatibility.

That is, I want the domain configuration to hold records of both frontends (what this domain requires) and backends (what this domain provides), and libxl to create the corresponding xenstore branches separately. Moreover, I'd like the frontend/backend connection information to be held in a separate config belonging to Dom0, so that on any domain reboot (or any exceptional situation such as a watchdog failure) the supervisor (Dom0) can use this information to initiate a reconnect.
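
A minimal Python sketch of the data model I have in mind, just to give the discussion something to point at. Every name here (DomainConfig, Link, check_link, links_to_reconnect, the domain names) is hypothetical; none of it exists in libxl today, and the real thing would of course live in the toolstack.

```python
from dataclasses import dataclass, field


@dataclass
class DomainConfig:
    """Per-domain config: what the domain provides and what it requires."""
    name: str
    provides: list = field(default_factory=list)  # backend kinds
    requires: list = field(default_factory=list)  # frontend kinds


@dataclass(frozen=True)
class Link:
    """One frontend/backend connection, held only in the Dom0 config."""
    frontend_dom: str
    backend_dom: str
    kind: str


def check_link(link, domains):
    """A link is valid if one side requires the kind and the other provides it."""
    fe = domains[link.frontend_dom]
    be = domains[link.backend_dom]
    return link.kind in fe.requires and link.kind in be.provides


def links_to_reconnect(links, rebooted):
    """Links the supervisor has to re-establish after `rebooted` comes back."""
    return [l for l in links if rebooted in (l.frontend_dom, l.backend_dom)]


domains = {
    "Dom0": DomainConfig("Dom0", provides=["vbd"]),
    "DomD": DomainConfig("DomD", provides=["vif", "vbd"]),
    "DomU1": DomainConfig("DomU1", requires=["vif", "vbd"]),
    "DomU2": DomainConfig("DomU2", requires=["vbd"]),
}
links = [
    Link("DomU1", "DomD", "vif"),
    Link("DomU1", "DomD", "vbd"),
    Link("DomU2", "Dom0", "vbd"),
]

# DomD rebooted: only the links that touch it need a reconnect.
affected = links_to_reconnect(links, "DomD")
```

The point of the split is that a domain config never names its peer; only the link table does, so switching a frontend to another backend is a change to one Link entry.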

And, since we are talking about libxl refactoring, I'd like to raise one more point: code duplication. Libxl support for the split-driver model consists of a declarative IDL device specification, xenstore read, xenstore write, config read, config write, xl args read, JSON read/write, and a device chain of responsibility with async device creation. The only thing the IDL is used for is generating the type definitions and the JSON read/write code; everything else is error-prone, hand-written, duplicated code.

Why don't we generate as much as we can? That means generating xenstore read, xenstore write, config read, config write and xl args read, all of which depend directly on the device IDL specification. If we already have an external code generation tool, why not use it to its full extent instead of writing all this serialization/deserialization code manually (and in different styles: e.g. the block device is the only one that uses lex, and the parse_config_data implementations in xl_cmdimpl.c differ quite a lot from device to device)?
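
As a rough illustration of the idea, the Python sketch below drives xenstore write, xenstore read and config parsing from one declarative field list. DEMO_SPEC is a simplified stand-in and not the real libxl IDL syntax; the device, its fields and the "key=value,key=value" config syntax are all assumptions made for the example.

```python
# Stand-in for an IDL record: (field name, Python type) pairs.
# Hypothetical device; not taken from the libxl IDL.
DEMO_SPEC = [("backend_domid", int), ("devid", int), ("mode", str)]


def to_xenstore(spec, dev):
    """Serialize a device dict to xenstore-style string nodes."""
    return {name.replace("_", "-"): str(dev[name]) for name, _ in spec}


def from_xenstore(spec, nodes):
    """Rebuild a typed device dict from xenstore-style string nodes."""
    return {name: typ(nodes[name.replace("_", "-")]) for name, typ in spec}


def parse_config(spec, text):
    """Parse 'key=value,key=value' using the types from the same spec."""
    types = dict(spec)
    dev = {}
    for item in text.split(","):
        key, _, value = item.partition("=")
        key = key.strip()
        if key not in types:
            raise ValueError(f"unknown key: {key}")
        dev[key] = types[key](value.strip())
    return dev


dev = parse_config(DEMO_SPEC, "backend_domid=2, devid=0, mode=rw")
nodes = to_xenstore(DEMO_SPEC, dev)
```

Adding a field to the spec updates all three code paths at once, which is exactly what the hand-written per-device code cannot guarantee.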

As a matter of fact, I will be doing some work in this general direction anyway, because we need DomD restart and the libxl boilerplate is kind of messy (we have ~12 devices, and the xl/libxl interface patches for them are almost copy-pasted), but I would like to hear as much criticism and as many ideas as possible. It would be nice if we could come out of this discussion with something potentially upstreamable.

Suikov Pavlo
P +x.xxx.xxx.xxxx  M +38.066.667.1296  S psujkov

Xen-devel mailing list


