[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH] libxl: trigger attach events for devices attached before xl devd startup
On Mon, Jul 11, 2016 at 11:49:19AM +0200, Marek Marczykowski-Górecki wrote: > On Mon, Jul 11, 2016 at 11:43:18AM +0200, Roger Pau Monné wrote: > > On Mon, Jul 11, 2016 at 10:56:04AM +0200, Marek Marczykowski-Górecki wrote: > > > On Mon, Jul 11, 2016 at 10:31:17AM +0200, Roger Pau Monné wrote: > > > > On Sun, Jul 10, 2016 at 07:35:47PM +0200, Marek Marczykowski-Górecki > > > > wrote: > > > > > When this daemon is started after creating backend device, that device > > > > > will not be configured. > > > > > > > > > > Racy situation: > > > > > 1. driver domain is started > > > > > 2. frontend domain is started (just after kicking driver domain off) > > > > > 3. device in frontend domain is connected to the backend (as specified > > > > > in frontend domain configuration) > > > > > 4. xl devd is started in driver domain > > > > > > > > > > End result is that backend device in driver domain is not configured > > > > > (like network interface is not enabled), so the device doesn't work. > > > > > > > > > > Fix this by artifically triggering events for devices already present > > > > > in > > > > > xenstore before xl devd is started. Do this only after xenstore watch > > > > > is > > > > > already registered, and only for devices not already initialized (in > > > > > XenbusStateInitWait state). > > > > > > > > Thanks! > > > > > > > > > Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx> > > > > > Cc: Wei Liu <wei.liu2@xxxxxxxxxx> > > > > > Signed-off-by: Marek Marczykowski-Górecki > > > > > <marmarek@xxxxxxxxxxxxxxxxxxxxxx> > > > > > --- > > > > > tools/libxl/libxl.c | 40 ++++++++++++++++++++++++++++++++++++++++ > > > > > 1 file changed, 40 insertions(+) > > > > > > > > > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c > > > > > index 1c81239..99815a7 100644 > > > > > --- a/tools/libxl/libxl.c > > > > > +++ b/tools/libxl/libxl.c > > > > > @@ -4743,8 +4743,16 @@ int libxl_device_events_handler(libxl_ctx *ctx, > > > > > uint32_t domid; > > > > > libxl__ddomain ddomain; > > > > > char *be_path; > > > > > + char **kinds = NULL, **domains = NULL, **devs = NULL; > > > > > + const char *sstate; > > > > > + char *state_path; > > > > > + int state; > > > > > + unsigned int nkinds, ndomains, ndevs; > > > > > + int i, j, k; > > > > > + xs_transaction_t t; > > > > > > > > > > ddomain.ao = ao; > > > > > + FILLZERO(ddomain.watch); > > > > > > > > Is this a different bugfix or stray change? > > > > > > To cleanly unregister watch and not do nothing if wasn't registered at > > > all. If it isn't initialized, libxl__ev_xswatch_deregister call on > > > not registered watch isn't harmless. > > > > Right, I've realized that before your changes the function only registered > > the watch and exited, this is needed now. > > > > > > > LIBXL_SLIST_INIT(&ddomain.guests); > > > > > > > > > > rc = libxl__get_domid(gc, &domid); > > > > > @@ -4762,9 +4770,41 @@ int libxl_device_events_handler(libxl_ctx *ctx, > > > > > be_path); > > > > > if (rc) goto out; > > > > > > > > > > + rc = libxl__xs_transaction_start(gc, &t); > > > > > + if (rc) goto out; > > > > > > > > Why do you need to start a transaction here if you end up aborting it > > > > when > > > > finished? > > > > > > Mostly to ease error checking. Because below code does three level > > > listing, I don't want to deal with races where some entry was removed > > > between those calls, at least not here. Like this: > > > > > > xs_directory('backend/vif') -> 3, 4, 5 > > > xs_directory('backend/vif/3') -> 0, 1 > > > xs_read('backend/vif/3/0/state') -> ... > > > xs_read('backend/vif/3/1/state') -> ... > > > toolstack removes backend/vif/4 here > > > xs_directory('backend/vif/4') -> fail > > > > > > Of course backend_watch_callback would fail anyway in such a case, which > > > is ok. But having snapshot of xenstore during this multi-level listing > > > looks like avoiding some corner cases during listing itself. > > > > AFAICT your code seems to be prepared to deal with entries disappearing in > > the middle of the tree walk, so I would just remove the transaction. > > Actually I'm considering changing error handling below to "goto out" > instead of "continue", as race condition should be eliminated by the > transaction, other errors (permission denied for example) maybe should > be considered fatal. So, IMO there two options: > - ignore failed reads and remove transaction > - error on failed reads and keep transaction > > Which one would be better? IMHO, I think the first option is better, and you already have it coded. Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx https://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |