[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 15/19] libxl: suspend: Async xenstore pvcontrol wait



Ian Campbell writes ("Re: [PATCH 15/19] libxl: suspend: Async xenstore 
pvcontrol wait"):
> On Thu, 2014-03-13 at 18:26 +0000, Ian Jackson wrote:
> > No, this possibility still exists.  This is handled by the rc ==
> > ERROR_TIMEDOUT branch in domain_suspend_common_pvcontrol_suspending.
> > The comment there (now) reads:
> > 
> >         /*
> >          * Guest appears to not be responding. Cancel the suspend
> >          * request.
> >          *
> >          * We re-read the suspend node and clear it within a
> >          * transaction in order to handle the case where we race
> >          * against the guest catching up and acknowledging the request
> >          * at the last minute.
> >          */
> 
> I was querying it because the original code appears to have both of the
> above ("Guest appears to not..." and "Final check for..."). IOW even
> after the final check transaction to clear it checks state again.

That's just an artefact of the tangled control structure (and
confusing comments) in the old code.  At the "Final check" it's not
actually reading xs again - that happened earlier.  It's just coping
with one of the two possible exits from the transaction loop.  In the
old code the last-minute-success path and the timeout-failure path are
mixed during and after the loop and need to be distinguished again.

In the new code, the success case is the main path;
last-minute-success comes out in the wash, falling through into the
main success path.  The timeout-failure path goes to err, so there is
nothing corresponding to the if following "Final check".

> I think what you've done is make that "final" check within the
> transaction loop with:
>     if (domain_suspend_pvcontrol_acked(state))
>         break;
> 
> Would it make sense to include something akin to the "Final check"
> comment just before this to clarify?

In the old code the "Final check" comment is the second part of the
explanation "We re-read", earlier.

I think this is only confusing if you look at the confusing old code.
If you just look at the new code it's clearly correct.  The comment
above clearly says that the purpose is to handle last-minute-success.
But if you prefer I could add a comment

>     if (domain_suspend_pvcontrol_acked(state))
+         /* last minute ack */
>         break;

> > > Is this right/proper in the success case?
> > 
> > Yes.  Because it has then already been committed.  This is exactly the
> > pattern from the usage comment for libxl__xs_transaction_* in
> > libxl_internal.h.
> 
> That usage comment doesn't have an abort after a successful commit
> though i.e. in the "if (!rc) break;" case it drops straight through to
> the success and return something case.

Oh I see.

> I think the subtlety is that the transaction loop here differs slightly
> from the canonical example because of the "final check" break quoted
> above -- since that exits the transaction successfully without actually
> needing to complete the transaction.
> 
> Assuming aborting a committed transaction is ok then a comment might be
> worthwhile. Or do the abort just before the exceptional success break,
> which is a bit unconventional but then so is the check itself.

How about if I add a new patch introducing a comment documenting
formally the possible states of *t with the libxl__xs_transaction
functions ?

And/or perhaps rename libxl__xs_transaction_abort to
libxl__xs_transaction_cleanup ?

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.