Re: [PATCH for-4.18] docs/sphinx: Lifecycle of a domid
On 17.10.23 12:09, Andrew Cooper wrote:
> On 17/10/2023 6:24 am, Juergen Gross wrote:
>> On 16.10.23 18:24, Andrew Cooper wrote:
>>> +command to ``xenstored``. This instructs ``xenstored`` to connect to the
>>> +guest's xenstore ring, and fire the ``@introduceDomain`` watch. The firing of
>>> +this watch is the signal to all other components which care that a new VM has
>>> +appeared and is about to start running.
>>
>> A note should be added that the control domain is introduced implicitly
>> by xenstored, so no XS_INTRODUCE command is needed and no
>> @introduceDomain watch is being sent for the control domain.
>
> How does this work for a stub xenstored? It can't know that dom0 is
> alive, and is the control domain, and mustn't assume that this is true.

A stub xenstored gets the control domain's domid via a boot parameter.

> I admit that I've been a bit vague in the areas where I think there are
> pre-existing bugs. This is one area.
>
> I'm planning a separate document on "how to connect to xenstore" seeing
> as it is buggy in multiple ways in Linux (causing a deadlock on boot
> with a stub xenstored), and made worse by dom0less creating memory
> corruption from a 3rd entity into the xenstored<->kernel comms channel.
>
> (And as I've said multiple times already, shuffling code in one of the
> two xenstored's doesn't fix the root of the dom0less bug. It simply
> shuffles it around for someone else to trip over.)
>
>> All components interested in the @introduceDomain watch have to find
>> out for themselves which new domain has appeared, as the watch event
>> doesn't contain the domid of the new domain.
>
> Yes, but we're intending to change that, and it is diverting focus from
> the domain's lifecycle.
>
> I suppose I could put in a footnote discussing the single-bit-ness of
> the three signals.

Fine with me. I just wanted to mention this detail.

>>> +ceased to exist. It fires the ``@releaseDomain`` watch a second time to
>>> +signal to any components which care that the domain has gone away.
>>> +
>>> +E.g. The second ``@releaseDomain`` is commonly used by paravirtual driver
>>> +backends to shut themselves down.
>>
>> There is no guarantee that @releaseDomain will always be fired twice for
>> a domain ceasing to exist,
>
> Are you sure?

Yes. Identical pending watch events are allowed to be merged into one.

> Because the toolstack needs to listen to @releaseDomain in order to
> start cleanup, there will be two distinct @releaseDomain's for an
> individual domain.
>
> But an individual @releaseDomain can be relevant for a state change in
> more than one domain, so there are not necessarily 2*nr_doms worth of
> @releaseDomain's fired.

Correct.

>> and multiple domains disappearing might result in only one
>> @releaseDomain watch being fired. This means that any component
>> receiving this watch event has not only to find out the domid(s) of the
>> domains changing state, but whether they have been shutting down only,
>> or are completely gone, too.
>
> All entities holding a reference on the domain will block the second
> notification until they have performed their own unmap action.

You are aware that backends normally don't register for @releaseDomain,
but set a watch on their backend specific Xenstore node in order to
react to the tool stack removing the backend device nodes?

> But for entities which don't hold a reference on the domain, there is a
> race condition where its @releaseDomain notification is delivered
> sufficiently late that the domid has already disappeared.

Exactly.

> It's certainly good coding practice to cope with the domain disappearing
> entirely underfoot, but entities without held references don't watch
> @releaseDomain in the first place, so I don't think this case occurs in
> practice.

I could easily see use cases where this assumption isn't true, like a
daemon supervising domains in order to respawn them in case they have
died.


Juergen
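
The consequence of merged, domid-less watch events discussed above can be sketched in a few lines. This is only an illustration under stated assumptions: `DomainState` and `reconcile` are hypothetical names, not part of xenstored or any toolstack, and how a listener obtains a fresh snapshot of existing domains is deliberately left out. The idea is that a listener keeps its last known view and, on every `@releaseDomain`, compares it against a new snapshot to work out which domains have only shut down and which are gone entirely.

```python
from dataclasses import dataclass
from typing import Dict, Set, Tuple


@dataclass(frozen=True)
class DomainState:
    """Hypothetical snapshot of one domain as seen by a listener."""
    domid: int
    shutdown: bool  # True if the domain has shut down but still exists


def reconcile(
    known: Dict[int, DomainState],
    current: Dict[int, DomainState],
) -> Tuple[Set[int], Set[int], Set[int]]:
    """Compare the last known snapshot with a fresh one.

    Returns (appeared, newly_shut_down, gone) as sets of domids.
    A single watch event may correspond to changes in several domains,
    so all three sets can be non-empty at once.
    """
    appeared = set(current) - set(known)
    gone = set(known) - set(current)
    newly_shut_down = {
        domid
        for domid, state in current.items()
        if state.shutdown and not (domid in known and known[domid].shutdown)
    }
    return appeared, newly_shut_down, gone


# One merged event: domain 3 has shut down, domain 5 has gone away entirely.
known = {3: DomainState(3, False), 5: DomainState(5, True)}
current = {3: DomainState(3, True)}
appeared, newly_shut_down, gone = reconcile(known, current)
```

In this example a single notification yields `newly_shut_down == {3}` and `gone == {5}`, which is why a listener cannot treat the event count as a per-domain signal.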