
Re: Xen 4.18 release: Reminder about code freeze



On Thu, 12 Oct 2023, George Dunlap wrote:
> > > Stop tinkering in the hope that it hides the problem.  You're only
> > > making it harder to fix properly.
> >
> > Making it harder to fix properly would be a valid reason not to commit
> > the (maybe partial) fix. But looking at the fix again:
> >
> > diff --git a/tools/xenstored/domain.c b/tools/xenstored/domain.c
> > index a6cd199fdc..9cd6678015 100644
> > --- a/tools/xenstored/domain.c
> > +++ b/tools/xenstored/domain.c
> > @@ -989,6 +989,7 @@ static struct domain *introduce_domain(const void *ctx,
> >                 talloc_steal(domain->conn, domain);
> >
> >                 if (!restore) {
> > +                       domain_conn_reset(domain);
> >                         /* Notify the domain that xenstore is available */
> >                         interface->connection = XENSTORE_CONNECTED;
> >                         xenevtchn_notify(xce_handle, domain->port);
> > @@ -1031,8 +1032,6 @@ int do_introduce(const void *ctx, struct connection *conn,
> >         if (!domain)
> >                 return errno;
> >
> > -       domain_conn_reset(domain);
> > -
> >         send_ack(conn, XS_INTRODUCE);
> >
> > It is a 1-line movement. Textually small. Easy to understand and to
> > revert. It doesn't seem to be making things harder to fix? We could
> > revert it any time if a better fix is offered.
> >
> > Maybe we could have a XXX note in the commit message or in-code
> > comment?
> 
> It moves a line from one function (do_domain_introduce()) into a
> completely different function (introduce_domain()), nested inside two
> if() statements; with no analysis on how the change will impact
> things.

I am not the original author of the patch, and I am not the maintainer
of the code, so I don't feel I have the qualifications to give you the
answers you are seeking. Julien as author of the patch and xenstore
reviewer might be in a better position to answer. Or Juergen as xenstore
maintainer.

From what I can see the patch is correct.

We are removing the call to domain_conn_reset from do_introduce.
We are adding a call to domain_conn_reset in introduce_domain, which
do_introduce calls right before the point where the reset used to
happen. Yes, there are two if statements, but the domain_conn_reset is
added in the right location: the not-already-introduced, non-restore
code path.


> Are there any paths through do_domain_introduce() that now *won't* get
> a domain_conn_reset() call?  Is that OK?

Yes, the already-introduced and the restore code paths. The operations in
the already-introduced or the restore code paths seem simple enough not
to require a domain_conn_reset. Julien and Juergen should confirm.
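
To make the path enumeration concrete, here is a toy model of the two
nested conditions as I read them from the diff. It is only a sketch:
toy_reset_happens() is a made-up helper, not a xenstored function, and
it assumes the outer check is "not already introduced" and the inner one
is "not restore".

```c
#include <stdbool.h>

/*
 * Toy model only: returns true when the patched introduce_domain()
 * would reach the domain_conn_reset() call, assuming the call sits
 * behind a "not already introduced" check and a "not restore" check.
 * The name and signature are hypothetical.
 */
bool toy_reset_happens(bool already_introduced, bool restore)
{
    if (already_introduced)
        return false;   /* already-introduced path: no reset */
    if (restore)
        return false;   /* restore path: no reset */
    return true;        /* fresh, non-restore introduction: reset */
}
```

Under those assumptions only one of the four combinations gets the
reset, which matches the two paths listed above as no longer being
covered.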


> Is introduce_domain() called in other places?  Will those places now
> get extra domain_conn_reset() calls they weren't expecting?  Is that
> OK?

introduce_domain is also called by dom0_init, but I am guessing that
dom0 is already introduced at that point, so it wouldn't get an extra
domain_conn_reset. Julien and Juergen should confirm.


> I mean, it certainly seems strange to set the state to CONNECTED, send
> off an event channel, and then after that delete all watches /
> transactions / buffered data and so on; but we need at least a basic
> understanding of what's going on to know that this change isn't going
> to break something.
> 
> Not knowing much about the xenstore protocol: In the
> (!domain->introduced) case, will there be anything to actually delete?
>  It seems like it would only be necessary / useful on the
> (domain->introduced) case.

Doesn't it seem weird to you that we set a connection to CONNECTED,
notify the domain that it is ready to go, and only *after* that we reset
the connection to zero?

What happens if a domain starts using the connection as soon as it
receives the event channel notification and before domain_conn_reset is
called?
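
A toy model of that ordering problem, in case it helps. This is a
single-threaded sketch of the worst-case interleaving, not the xenstored
code: struct toy_conn and all toy_* functions are invented for
illustration, and the "reset" here only drops a counter of queued
requests.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a domain connection; not a xenstored type. */
struct toy_conn {
    bool connected;
    int  queued;    /* requests the guest has queued so far */
};

/* Models the reset: whatever was queued is thrown away. */
void toy_reset(struct toy_conn *c)
{
    c->queued = 0;
}

/* Models the worst case: the guest queues a request the moment it is
 * notified, provided the connection is marked as connected. */
void toy_notify(struct toy_conn *c)
{
    if (c->connected)
        c->queued++;
}

/* Ordering before the patch: connect, notify, then reset. */
int toy_old_order(void)
{
    struct toy_conn c = { false, 0 };

    c.connected = true;
    toy_notify(&c);
    toy_reset(&c);      /* the guest's first request is discarded */
    return c.queued;
}

/* Ordering after the patch: reset, then connect and notify. */
int toy_new_order(void)
{
    struct toy_conn c = { false, 0 };

    toy_reset(&c);
    c.connected = true;
    toy_notify(&c);     /* the guest's first request survives */
    return c.queued;
}
```

In this model toy_old_order() returns 0 (the request is lost) and
toy_new_order() returns 1. Whether a real guest can hit that window is
exactly the question above.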



 

