Re: [Xen-devel] Re: Communicating with the domU from dom0 without Network
On Mon, 2006-08-07 at 12:18 -0400, Michael LeMay wrote:
> This is precisely the sort of problem that Keir's proposal seems to
> address. I've copied my comments on the proposal below; perhaps we can
> discuss them further now since nobody was interested when I originally
> posted them. :-)
>
> ---
>
> Here's another general comment for discussion...
>
> The bottom of page 18 in the Xen Roadmap proposal recommends considering
> how to "export byte stream (TCP) data between domains in a high
> performance fashion." For communications that occur between domains on a
> single physical machine, it would seem logical to set up a new address
> and protocol family within Linux that could be used to create and
> manipulate stream sockets via the standard interfaces (I'm focusing on
> Linux at this point, although similar adaptations could be made to other
> kernels). Then, behind the scenes, the Xen grant tables could be used to
> efficiently transfer socket buffers between the domains. This should
> involve much less overhead than directly connecting two network
> frontends or performing other optimizations at lower layers, since it
> would truncate the protocol stack and avoid unnecessary TCP-style flow
> control protocols.
>
> An enhancement such as this could help to eliminate the network
> dependence of some Xen management applications, particularly those that
> rely on XML-RPC to communicate. For example, xm currently uses a UNIX
> domain socket to communicate with xend, which introduces an artificial
> requirement that xend and xm be running in the same domain. Once XenSE
> gains traction and management utilities are scattered across multiple
> domains, UNIX domain sockets will no longer be adequate. Under this
> scheme, stream sockets to specific domains could easily be constructed,
> without regard for the network configuration on the system.
>
> One important detail that I haven't yet resolved is how to address
> inter-domain sockets. Of course, the most important component in the
> address for each socket would be the domain ID. However, some sort of
> port specification or pathname would also be necessary. I'm not sure
> which of those options would be appropriate in this case. Port numbers
> would be consistent with TCP and would probably ease the task of porting
> applications based on TCP, but pathnames are more consistent with the
> UNIX domain sockets used by xm and xend. Perhaps we could provide both,
> using two address families associated with the same protocol family?
>
> What other ideas have been floating around on how to accomplish
> byte-stream transport between domains? Are any concrete efforts to
> provide this functionality currently underway? Thanks!

hi all.

since you've explicitly asked for comments, here's mine.

from a performance point of view, it is all obviously correct: get rid of the tcp congestion/flow/reliability overhead. in a synchronous, reliable environment like the host-local inter-domain communication infrastructure you propose, it is nothing but overhead, and dropping it should speed things up a lot. plus it saves a whole bunch of memory.
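as for the addressing question raised above, purely for illustration, here is a minimal sketch of what the user-space side of such an interface might look like, assuming the domain-id-plus-port addressing option. PF_XEN/AF_XEN, struct sockaddr_xd and the family number are invented placeholders, not existing kernel interfaces; on a stock kernel the socket() call below would simply fail with EAFNOSUPPORT.

/* Hypothetical user-space view of the proposed inter-domain stream
 * sockets.  PF_XEN, AF_XEN and struct sockaddr_xd are placeholders
 * invented for this sketch, not real kernel interfaces. */
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

#define PF_XEN 126                 /* arbitrary placeholder family number */
#define AF_XEN PF_XEN

struct sockaddr_xd {               /* hypothetical address layout */
    sa_family_t    sxd_family;     /* AF_XEN */
    unsigned short sxd_port;       /* TCP-style port (one addressing option) */
    unsigned int   sxd_domid;      /* target domain ID */
};

int main(void)
{
    struct sockaddr_xd peer;
    int fd = socket(PF_XEN, SOCK_STREAM, 0);
    if (fd < 0) {
        perror("socket(PF_XEN)");  /* expected on kernels without the family */
        return 1;
    }

    memset(&peer, 0, sizeof(peer));
    peer.sxd_family = AF_XEN;
    peer.sxd_domid  = 7;           /* e.g. the domain running xend */
    peer.sxd_port   = 8000;        /* e.g. a well-known management port */

    if (connect(fd, (struct sockaddr *)&peer, sizeof(peer)) < 0)
        perror("connect");
    /* ...after that, read()/write() as on any stream socket... */
    close(fd);
    return 0;
}

the pathname variant would look much the same, with the port field replaced by a sun_path-style character array, which is probably why providing both address families on top of one protocol family is attractive.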
but there's a different point of view, which i would like to point out.

if you think about the whole 'virtualization' thing, some of the relevant literature is right to point out that a simple unix process is nothing but a virtual machine. a 'process vm', in many respects quite different from a system vm on top of a hypervisor like xen, but it already has a number of the features which make up a virtual machine; resource control and abstraction, for example, being the most prominent ones. such comparisons are especially daunting if you look at a paravirtualizing, microkernel-style hypervisor design like xen. so, if operating systems and hypervisors are already so similar, where's the merit?

one of the major features which make many system VMMs, carrying whole systems, so interesting and so different from a unix-style operating system, carrying a number of simple processes, is proper isolation. 'isolation' here means separation of an entity, here the guest os instance, from its environment. currently, there are only a few communication primitives connecting a guest to its supporting processing environment. apart from vcpu state, it's block I/O, network I/O and memory acquisition: a small number of interfaces, each of them on a sufficiently abstract level to enable one of the most distinguishing features (compared with conventional os-level processing models) a system vm has to offer: migratability. if the communication primitives remain simple and, more importantly, location-independent enough, you can just freeze the whole thing in one place, move it around, and thaw the processing state at whatever different location you see fit. for xen as of today, the complexity of implementation varies somewhere between 'trivial' and 'easy enough'.

now try the same thing with your regular unix process. let's see what we need to carry: system v ipc. shared memory. ip sockets, ok, but then unix domain sockets, netlink sockets. pipes. named pipes. device special files. for starters, just migrate open files terminating in block storage. then try maintaining the original process identifiers; your application may have inquired about them, and its computational state therefore depends on their consistency as well. save all that, migrate, now try to restore it elsewhere. the bottom line is: unix processes are anything but isolated, and for good reason: a lot of useful applications depend on inter-process communication. but that lack of isolation has its cost.

what the proposal above means is basically the addition of dedicated ipc to the domain model. good for performance, but also a good step towards breaking isolation. dom3 may call connect(socket(PF_XEN), "dom7") in the future. does that mean that if i move dom3 to a backup node, dom7 has to move as well? no, not desirable; ok, then let's write a proxy service redirecting the once-so-efficient channel back over tcp to maintain transparency. that's just a few additional lines of code. then don't forget the few additional lines telling the domain controller to automatically re-root that proxy as well, in case either domain needs to migrate again at a later point, because the machine maintainer clearly doesn't want to have to care.
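just to make concrete how 'few additional lines' that proxy really is, here is a rough sketch of the relay loop it would need; one descriptor would be the hypothetical host-local PF_XEN endpoint, the other an ordinary tcp connection to wherever the peer domain has migrated to. this is an illustration under those assumptions, not tested or complete code (no partial-write or EINTR handling, for instance).

/* Minimal bidirectional relay between two connected sockets, e.g. a
 * host-local inter-domain channel on one side and a TCP connection to
 * the migrated peer on the other.  Copies bytes until either side
 * closes.  Error handling is deliberately minimal. */
#include <sys/select.h>
#include <unistd.h>

static void relay(int a, int b)
{
    char buf[4096];

    for (;;) {
        fd_set rfds;
        FD_ZERO(&rfds);
        FD_SET(a, &rfds);
        FD_SET(b, &rfds);
        if (select((a > b ? a : b) + 1, &rfds, NULL, NULL, NULL) < 0)
            return;

        int from = FD_ISSET(a, &rfds) ? a : b;
        int to   = (from == a) ? b : a;
        ssize_t n = read(from, buf, sizeof(buf));
        if (n <= 0)
            return;                /* EOF or error: tear the relay down */
        if (write(to, buf, (size_t)n) != n)
            return;
    }
}

and that still leaves the control-plane part: something has to notice the migration, re-establish the tcp leg and restart this loop, which is exactly the kind of additional, hard-to-capture state the argument above is about.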
yes, still it's all 'virtually' possible. after all, computers are state machines, and state can always be captured. it just turns out to be a whole lot of work if the number of state machines connecting your vm to its environment keeps morphing and multiplying. operating systems are a moving target, and will probably always stay one. that's why hypervisors make sense, as long as they stay simple and reasonably nonintrusive towards the guest. that is also why os virtualization like openvz may be doomed: if they don't make it into the stock kernel, so that others help to maintain their code, they will keep maintaining it on a pretty regular basis, until infinity. xen suffers from this as well, due to paravirtualization, but not as much as the vmm/os-integrated approach does.

i even suggest a term for this class of proposal: "overparavirtualization", the point where you have modified the guest so deeply that you need someone else to maintain the patches.

apart from that, i'm all for performance. there's a compromise: add those features, but take good care to separate them from the isolated, network-transparent, preferably IP-based, regular standard guest state. never make such a thing a dependency of anything. most users of vm technology are better off rejecting it if they wish to keep the features that distinguish the result from a standard operating system environment.

make it absolutely clear to users that if they choose to configure fast, host-local inter-domain communication, they get exactly what they asked for: fast, but host-local and domain-interdependent communication. if your customer asks for lightweight, optimized inter-domain communication, ask her whether that specific application would not rather call for regular inter-process communication on a standard operating system, because that is effectively what she gets.

kind regards,
daniel

--
Daniel Stodden
LRR - Lehrstuhl für Rechnertechnik und Rechnerorganisation
Institut für Informatik der TU München
D-85748 Garching
http://www.lrr.in.tum.de/~stodden
mailto:stodden@xxxxxxxxxx
PGP Fingerprint: F5A4 1575 4C56 E26A 0B33 3D80 457E 82AE B0D8 735B