Re: [Xen-devel] [PATCH 20 of 29 RFC] libxl: introduce libxl hotplug public API functions
2012/2/9 Ian Campbell <Ian.Campbell@xxxxxxxxxx>:
> On Thu, 2012-02-09 at 16:18 +0000, Stefano Stabellini wrote:
>> On Thu, 9 Feb 2012, Ian Campbell wrote:
>> > On Thu, 2012-02-09 at 16:00 +0000, Stefano Stabellini wrote:
>> > > On Thu, 9 Feb 2012, Ian Campbell wrote:
>> > > > On Thu, 2012-02-09 at 15:32 +0000, Stefano Stabellini wrote:
>> > > > > On Thu, 9 Feb 2012, Ian Jackson wrote:
>> > > > > > Stefano Stabellini writes ("Re: [Xen-devel] [PATCH 20 of 29 RFC]
>> > > > > > libxl: introduce libxl hotplug public API functions"):
>> > > > > > > - we can reuse the "state" based mechanism to establish a connection:
>> > > > > > > again not a great protocol, but very well known and understood.
>> > > > > >
>> > > > > > I don't think we have, in general, a good understanding of these
>> > > > > > "state" based protocols ...
>> > > > >
>> > > > > What?! We have netback, netfront, blkback, blkfront, pciback, pcifront,
>> > > > > kbdfront, fbfront, xenconsole, and these are only the ones in Linux!!
>> > > >
>> > > > And no one I know is able to describe, accurately, exactly what the
>> > > > state diagram for even one of those actually looks like or indeed should
>> > > > look like. It became quite evident in these threads about hotplug script
>> > > > handling etc that no one really knows for sure what (is supposed to)
>> > > > happens when.
>> > >
>> > > I thought that most of the thread was about the interface with the block
>> > > scripts, that is an entirely different matter and completely obscure.
>> > > If I am mistaken, please point me at the right email.
>> >
>> > We are talking about reusing the existing xenbus state machine schema
>> > for a new purpose. Ian J pointed out that these are not generally well
>> > understood, you replied that it was and cited some examples. I pointed
>> > out why these were not examples of why this stuff was well understood at
>> > all, in fact quite the opposite.
>>
>> Sorry but I don't understand how these examples are supposed to be
>> "quite the opposite".
>> I quite like the idea of being able to read a single source file of less
>> than 400 LOC to understand how a protocol works
>> (drivers/input/misc/xen-kbdfront.c).
>
> That is not a protocol specification, merely one implementation of it.
> What does the BSD driver do? Is it exactly the same as Linux? Should BSD
> driver authors be expected to reverse engineer the protocol from the
> Linux code? What/who arbitrates when the two behave differently?
>
>> In fact I don't think that understanding the protocol has been an issue
>> for the GSoC student that had to write a new one.
>
> Being able to reverse engineer something which works is not proof that
> these things are "well understood" in the general case.
>
>> I think we are under influence of a "reinventing the wheel" virus.
>
> I think we are in danger of making the same mistakes again as have been
> made with the device protocols and this is what I want to avoid.
>
> Now, perhaps this style of state machine protocol is a reasonable design
> choice in this case, but since we are starting afresh here this specific
> new instance should be well documented _up_front_ not left in the "oh,
> just read the Linux code" state we have now for many of our devices
> which has led to multiple slightly divergent implementations of the
> same basic concept.

Yes, documentation about this protocol should go in together with the
protocol itself.

>
>> > > > Justin just posted a good description for blkif.h which included a
>> > > > state machine description. We need the same for pciif.h, netif.h etc etc.
>> > >
>> > > The state machine is the same for block and network.
>> >
>> > No, it's not. This is exactly what IanJ and I are talking about.
>>
>> Could you please elaborate?
>>
>> I am sure you know that the xenstore state machine is handled the same
>> way for all the backends in QEMU (see hw/xen_backend.c).
>> And the same thing is true for the frontends and the backends in Linux.
>
> A substantial proportion of the threads about this hotplug script stuff
> has been about the fact that no one is quite sure what really happens
> when for all implementations nor what the common semantics are.
>
> e.g. How do you ask a backend to shut down (do you set it to state 5?
> state 6? do you nuke the xenstore dir?). Neither is anyone sure when the
> correct point to call the hotplug scripts actually is, or even what
> actually happens with them right now across the different backend
> drivers or kernel types.

This is true: BSD, Linux and QEMU have slightly different
implementations of the backend protocol at least; BSD and qemu-xen
don't react when the backend "state" is set to 5 or "online" is set
to 0. This gave me some headaches that could have been avoided if this
was properly documented/implemented.

> The actual state transitions which netback and blkback go through are
> not the same: The netback protocol uses InitWait, the blkback one does
> not or is it vice-versa? I can't remember and it isn't documented. Some
> Linux frontends handled the kexec reconnect sequencing differently, by
> disconnecting or reconnecting the actual underlying devices at subtly
> different times and/or handling the transition from Closing back to Init
> or InitWait differently.

In this implementation I wait for the backend to switch to state 2
(XenbusStateInitWait) before executing hotplug scripts for both vifs
and vbds, and this seems to work OK on both Linux and NetBSD. But
again, since it's not documented anywhere, I can only guess that this
is the correct way to do it; I can't be sure.

> And this is just for Linux talking to Linux.
>
> I know for sure that the Windows frontends follow a different state
> transition path to Linux (and that it has interacted badly with the
> kexec differences in the Linux backends discussed above). I bet BSD has
> some subtle differences in behaviour too.
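
For reference, the numeric states mentioned in this thread (2, 5, 6)
correspond to the XenbusState values in Xen's public io/xenbus.h
header. A minimal sketch of the check described above (waiting for the
backend to reach InitWait before running hotplug scripts) might look
like the following. The helpers parse_backend_state() and
ready_for_hotplug() are hypothetical names used only for illustration,
not libxl functions, and whether InitWait is the right trigger point on
every backend is exactly the part that is undocumented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Values as defined in Xen's public header io/xenbus.h. */
enum xenbus_state {
    XenbusStateUnknown       = 0,
    XenbusStateInitialising  = 1,
    XenbusStateInitWait      = 2,
    XenbusStateInitialised   = 3,
    XenbusStateConnected     = 4,
    XenbusStateClosing       = 5,
    XenbusStateClosed        = 6,
    XenbusStateReconfiguring = 7,
    XenbusStateReconfigured  = 8
};

/*
 * Hypothetical helper: convert the decimal string read from a backend's
 * xenstore "state" node into an enum value. Anything that is not a
 * known state number is mapped to XenbusStateUnknown.
 */
static enum xenbus_state parse_backend_state(const char *text)
{
    char *end = NULL;
    long value;

    assert(text != NULL);
    value = strtol(text, &end, 10);
    if (end == text || *end != '\0' ||
        value < 0 || value > XenbusStateReconfigured)
        return XenbusStateUnknown;
    return (enum xenbus_state)value;
}

/*
 * Hypothetical policy check: run the hotplug scripts only once the
 * backend has reached InitWait (state 2). This is an assumption based
 * on observed behaviour, not on a written protocol specification.
 */
static bool ready_for_hotplug(enum xenbus_state backend_state)
{
    return backend_state == XenbusStateInitWait;
}
```

A caller would read the backend's "state" node from xenstore, pass the
string to parse_backend_state(), and keep watching the node until
ready_for_hotplug() returns true.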
>
> The fact is that none of our device state machine protocols are well
> documented (although blkif.h is about to be). If this stuff were well
> understood we would already have such documentation because it would be
> trivial to write -- but it is not. If you disagree then please document
> the netif state machine protocol in the form of a patch to netif.h.
>
> Ian.
>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel