
Re: [Xen-devel] [PATCH] tools: libxl: Take the userdata lock around maxmem changes



On Tue, Jun 23, 2015 at 05:38:17PM +0100, Ian Campbell wrote:
> On Tue, 2015-06-23 at 17:32 +0100, Wei Liu wrote:
> > On Tue, Jun 23, 2015 at 03:58:32PM +0100, Ian Campbell wrote:
> > > There is an issue in libxl_set_memory_target whereby the target and
> > > the max mem can get out of sync; this is because the call to
> > > xc_domain_setmaxmem is not tied in any way to the xenstore transaction
> > > which controls updates to the xenstore side of things.
> > > 
> > > Consider a domain with 1M of RAM (==target and maxmem for the sake of
> > > argument) and two simultaneous calls to libxl_set_memory_target, both
> > > with relative=0 and enforce=1, one with target=3 and the other with
> > > target=5.
> > > 
> > > target=5 call                   target=3 call
> > > 
> > > transaction start
> > >                                 transaction start
> > > write target=5 to xenstore
> > >                                 write target=3 to xenstore
> > > setmaxmem(5)
> > >                                 setmaxmem(3)
> > > 
> > > transaction commit = success
> > >                                 transaction commit = EAGAIN
> > > 
> > > At this point maxmem=3 while target=5.
> > > 
> > > In reality the target=3 case will then retry and eventually (hopefully)
> > > succeed with target=maxmem=3, however the bad state will persist for
> > > some window, which is undesirable. On failure other than EAGAIN all
> > > bets are off anyway, but in that case we will likely stick in the bad
> > > state until someone else sets the memory.
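
(For illustration only: the simultaneous-update scenario above corresponds
roughly to two independent toolstack processes calling
libxl_set_memory_target on the same running domain at the same time. A
minimal sketch follows; it is not part of the patch -- the domid comes
from the command line, the KiB values are arbitrary, and error handling
is mostly elided.)

    #include <stdio.h>
    #include <stdint.h>
    #include <stdlib.h>
    #include <unistd.h>
    #include <sys/wait.h>
    #include <xentoollog.h>
    #include <libxl.h>

    int main(int argc, char **argv)
    {
        uint32_t domid = argc > 1 ? atoi(argv[1]) : 1;
        pid_t pid = fork();

        if (pid < 0)
            return 1;

        /* Each process uses its own libxl context, so only the xenstore
         * transaction (and, with this patch, the userdata lock) can
         * serialise the two updates. */
        xentoollog_logger_stdiostream *lg =
            xtl_createlogger_stdiostream(stderr, XTL_ERROR, 0);
        libxl_ctx *ctx = NULL;
        if (libxl_ctx_alloc(&ctx, LIBXL_VERSION, 0,
                            (xentoollog_logger *)lg))
            return 1;

        /* relative=0, enforce=1: parent asks for 5 MiB, child for 3 MiB. */
        libxl_set_memory_target(ctx, domid, pid ? 5 * 1024 : 3 * 1024, 0, 1);

        libxl_ctx_free(ctx);
        if (pid > 0)
            waitpid(pid, NULL, 0);
        return 0;
    }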
> > > 
> > > To fix this we slightly abuse the userdata lock, which is used to
> > > protect updates to the domain's JSON configuration. This is an abuse
> > > because maxmem is not actually stored in there, but is kept by Xen.
> > > However the lock protects some semantically similar things and is
> > > convenient to use here too.
> > > 
> > > libxl_domain_setmaxmem also takes the lock; since it reads
> > > memory/target from xenstore before calling xc_domain_setmaxmem, there
> > > is a small (but perhaps not very interesting) race there too.
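
(For reference, the locking pattern being described is roughly the sketch
below. It is illustrative only: the helper names --
libxl__lock_domain_userdata, libxl__unlock_domain_userdata,
ERROR_LOCK_FAIL -- are assumed from libxl internals rather than quoted
from the patch, and the body is reduced to comments.)

    /* As if inside tools/libxl; not an actual patch hunk. */
    #include "libxl_internal.h"

    static int sketch_set_memory_target(libxl__gc *gc, uint32_t domid)
    {
        libxl__domain_userdata_lock *lock = NULL;
        int rc;

        /* Serialise against any other caller updating this domain's
         * memory settings. */
        lock = libxl__lock_domain_userdata(gc, domid);
        if (!lock) {
            rc = ERROR_LOCK_FAIL;
            goto out;
        }

        /* ... xenstore transaction: read/write memory/target ... */
        /* ... xc_domain_setmaxmem() while still holding the lock ... */

        rc = 0;
     out:
        if (lock) libxl__unlock_domain_userdata(lock);
        return rc;
    }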
> > > 
> > > There is one more use of xc_domain_setmaxmem in libxl__build_pre.
> > > However taking a lock around this would be tricky since the xenstore
> > > parts are not done until libxl__build_post. I think this one could be
> > > argued to be OK since the domid is not "public" yet, that is, it has
> > > not been returned to the application yet (as the result of the create
> > > operation). Toolstacks which go round fiddling with random domids
> > > which they find lying on the floor should be taught to do better.
> > > 
> > > Add a doc note that taking the userdata lock requires the CTX_LOCK to
> > > be held.
> > > 
> > > Signed-off-by: Ian Campbell <ian.campbell@xxxxxxxxxx>
> > > ---
> > > This applies on top of Wei's revert of "libxl_set_memory_target:
> > > retain the same maxmem offset on top of the current target"
> > > 
> > > I couldn't quite rule out some race (for transaction=EAGAIN; for
> > > !EAGAIN there are obvious ones) which would result in the incorrect
> > > state being in place after both entities exit, but I couldn't
> > > construct an actual case.
> > 
> > Not sure I follow. With this patch you lock out other contenders
> > before they can even start a transaction, so the EAGAIN vs !EAGAIN
> > distinction should be moot. You can safely roll back in the !EAGAIN
> > case, right?
> 
> Sorry, I meant a pre-existing race which was fixed by this patch, rather
> than one that continues to exist with this fix.
> 

Is there any inconsistency after this fix? I think not, because you can
roll back maxmem and the PoD target to their previous values -- though
that rollback is not implemented here.
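
(Roughly, such a rollback would look like the sketch below; it is not
part of the patch, and the previous maxmem value would have to be
captured before the update is attempted.)

    #include <stdint.h>
    #include <xenctrl.h>

    /* Best-effort restore of the pre-update maxmem if the xenstore commit
     * fails for a reason other than EAGAIN.  Illustrative only; the PoD
     * target would need restoring in a similar way. */
    static void rollback_maxmem(xc_interface *xch, uint32_t domid,
                                uint64_t prev_max_kb)
    {
        if (xc_domain_setmaxmem(xch, domid, prev_max_kb) < 0) {
            /* Nothing more can be done here; the caller should log the
             * failure, since target and maxmem may still be out of sync. */
        }
    }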

Wei.

> Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 

