[Xen-devel] Netchannel2
After many delays and false starts, here's a preliminary netchannel2
implementation:

http://xenbits.xensource.com/ext/netchannel2/xen-unstable.hg
http://xenbits.xensource.com/ext/netchannel2/linux-2.6.18.hg

These trees are up-to-date with the mainline trees of the same name as
of a couple of days ago.

Here are the main features of the current scheme:

-- Obviously, it implements a fully functional network interface. LRO,
   TSO, and checksum offload are all supported. Hot-add and hot-remove
   work as expected.

-- The copy-to-receiver-buffers is now performed in the receiving
   domain, rather than in dom0. This helps to prevent dom0 from
   becoming a bottleneck, and also has some cache locality advantages.

-- Inter-domain traffic can be configured to bypass dom0 completely.
   Once a bypass is established, the domains communicate on their own
   private ring, without indirecting via dom0. This significantly
   increases inter-domain bandwidth, reduces latency, and reduces dom0
   load. (This is currently somewhat rough around the edges, and each
   bypass needs to be configured manually. It'll (hopefully) eventually
   be automatic, but that hasn't been implemented yet.)

-- A new, and hopefully far more extensible, ring protocol, supporting
   variable size messages, multi-page rings, and out-of-order message
   return. This is intended to make VMDQ support straightforward,
   although that hasn't been implemented yet.

-- Packet headers are sent in-line in the ring, rather than out-of-band
   in fragment descriptors. Small packets (e.g. TCP ACKs) are sent
   entirely in-line.

-- There's an asymmetry limiter, intended to protect dom0 against
   denial of service attacks by malicious domUs.

-- Sub-page grant support. The grant table interface is extended so a
   domain can grant another domain access to a range of bytes within a
   page, and Xen will then prevent the grantee domain accessing outside
   that range. For obvious reasons, it isn't possible to map these
   grant references, and domains are expected to use the grant copy
   hypercalls instead.

-- Transitive grant support. It's now possible for a domain to create a
   grant reference which indirects to another grant reference, so that
   any attempt to access the first grant reference will be redirected
   to the second one. This is used to implement receiver-side copy on
   inter-domain traffic: rather than copying the packet in dom0, dom0
   creates a transitive grant referencing the original transmit buffer,
   and passes that to the receiving domain. For implementation reasons,
   only a single level of transitive granting is supported, and
   transitive grants cannot be mapped (i.e. they can only be used in
   grant copy operations). Multi-level transitive grants could be added
   pretty much as soon as anybody needs them, but mapping transitive
   grants would be more tricky.

It does still have a few rough edges:

-- Suspend/resume and migration don't work with dom0 bypass.

-- Ignoring the bypass support, performance isn't that much better than
   netchannel1 for many tests. Dom0 CPU load is usually lower, so it
   should scale better when you have many NICs, but in terms of raw
   throughput there's not much in it either way. Earlier versions were
   marginally ahead, but there seems to have been a bit of a regression
   while I was bringing it up to date with current Xen/Linux.

-- The hotplug scripts and tool integration aren't nearly as complete
   as their netchannel1 equivalents. It's not clear to me how much of
   the netchannel1 stuff actually gets used, though, so I'm going to
   leave this as-is unless somebody complains.

-- The code quality needs some attention. It's been hacked around by a
   number of people over the course of several months, and generally
   has a bit less conceptual integrity than I'd like in new code. (It's
   not horrific, by any means, but it is a bit harder to follow than
   the old netfront/netback drivers were.)

-- There's no unmodified-drivers support, so you won't be able to use
   it in HVM domains. Adding support is unlikely to be terribly
   difficult, with the possible exception of the dom0 bypass
   functionality, but I've not looked at it at all yet.

If you want to try this out, you'll need to rebuild Xen, the dom0
kernel, and the domU kernels, in addition to building the module.
You'll also need to install xend and the userspace tools from the
netchannel2 xen-unstable repository. To create an interface, either use
the ``xm network2-attach'' command or specify a vif2= list in your xm
config file.

The current implementation is broadly functional, in that it doesn't
have any known crippling bugs, but hasn't had a great deal of testing.
It should work, for the most part, but it certainly isn't ready for
production use. If you find any problems, please report them. Patches
would be even better. :)

A couple of people have asked about using the basic ring protocol in
other PV device classes (e.g. pvSCSI, pvUSB). I'll follow up in a
second with a summary of how all that works.

Steven.
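For readers who want to experiment with the sub-page and transitive
grant rules described in this mail, here is a toy Python model of them.
It is only a sketch of the semantics as stated above, not the
hypervisor interface: the names Grant, GrantTables, copy_from and
map_page are invented for illustration, and details such as using
page-relative byte offsets are assumptions.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

PAGE_SIZE = 4096  # assumed page size for the toy model


class GrantError(Exception):
    """Raised when the model refuses a grant operation."""


@dataclass
class Grant:
    owner: str                      # domain that created the grant
    grantee: str                    # domain allowed to use it
    page: Optional[bytearray] = None            # backing page (local grants)
    start: int = 0                  # first permitted byte within the page
    length: int = PAGE_SIZE         # number of permitted bytes
    target: Optional[Tuple[str, int]] = None    # (domain, ref) if transitive


class GrantTables:
    """All domains' grant tables, keyed by (owner domain, grant ref)."""

    def __init__(self) -> None:
        self.entries: Dict[Tuple[str, int], Grant] = {}

    def add(self, ref: int, grant: Grant) -> None:
        if grant.target is not None:
            inner = self.entries.get(grant.target)
            if inner is None:
                raise GrantError("transitive target does not exist")
            if inner.target is not None:
                # Only a single level of transitive granting is modelled.
                raise GrantError("target is itself a transitive grant")
            if inner.grantee != grant.owner:
                raise GrantError("owner cannot re-grant a grant it was not given")
        self.entries[(grant.owner, ref)] = grant

    def _lookup(self, caller: str, owner: str, ref: int) -> Grant:
        grant = self.entries.get((owner, ref))
        if grant is None or grant.grantee != caller:
            raise GrantError("no such grant for this caller")
        return grant

    def map_page(self, caller: str, owner: str, ref: int) -> bytearray:
        grant = self._lookup(caller, owner, ref)
        full_page = grant.start == 0 and grant.length == PAGE_SIZE
        if grant.target is not None or not full_page:
            # Sub-page and transitive grants are copy-only.
            raise GrantError("this grant cannot be mapped")
        return grant.page

    def copy_from(self, caller: str, owner: str, ref: int,
                  offset: int, length: int) -> bytes:
        grant = self._lookup(caller, owner, ref)
        if grant.target is not None:
            # Redirect the access to the grant being indirected to.
            grant = self.entries[grant.target]
        end = offset + length
        if length < 0 or offset < grant.start or end > grant.start + grant.length:
            raise GrantError("access outside the granted byte range")
        return bytes(grant.page[offset:end])


if __name__ == "__main__":
    tables = GrantTables()
    tx_page = bytearray(PAGE_SIZE)
    tx_page[100:105] = b"hello"
    # Sender grants dom0 five bytes of its transmit buffer.
    tables.add(1, Grant(owner="domA", grantee="dom0", page=tx_page,
                        start=100, length=5))
    # dom0 passes the buffer on to the receiver without copying it.
    tables.add(7, Grant(owner="dom0", grantee="domB", target=("domA", 1)))
    print(tables.copy_from("domB", "dom0", 7, 100, 5))
```

The demo at the bottom follows the receiver-side copy path from the
mail: the sender grants part of a page to dom0, dom0 wraps that in a
transitive grant for the receiver, and the receiver performs the copy
itself.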