[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v10 4/5] remus: implement remus network buffering for nic devices



Hongyang Yang writes ("Re: [PATCH v10 4/5] remus: implement remus network 
buffering for nic devices"):
>    Sorry for the late reply, just notice this comment...

And here I am being even later replying...

> On 06/06/2014 01:24 AM, Ian Jackson wrote:
> > As far as I can see this attempts to search for an ifb which is not in
> > use.
> >
> > I see you claim a lock to ensure that you don't fail due to races with
> > other copies of this script.
> >
> > But are there potentially other things (not Xen related, parhaps) in
> > the system which might try to allocate an ifb using a similar
> > approach ?  How do we deal with the potential race with them ?

Have I missed an answer to this question ?

> > Also: I think you should:
> >   - write the IFB name to xenstore _before_ starting to configure it
> >   - in the loop I quote above, check in xenstore that the ifb is not
> >     in use by another domain
> >
> > Otherwise there seems to be the following risk:
> >   1. You pick ifbX using the loop above
> >   2. You start to configure ifbX, eventually resulting in a
> >      configuration which makes it not show up as free
> >   3. Something bad happens and you fail, before writing the
> >      ifb name to xenstore
> >
> > In this case, the ifb would be leaked.  (I see you do try to avoid
> > this with xs_write_failed, but scripts can fail for other reasons.)
> 
> If the setup failed for any reason, we will call teardown in the remus
> device framework, so the ifb won't be leaked.

I'm afraid that you can't assume that your script will necessarily
execute the teardown.  Perhaps the script got killed by the OOM
killer, or something else horrible went wrong.

So you need to make sure that all the states the system goes through,
however briefly and no matter what cleanup will be attempted on
failure.

> >> diff --git a/tools/libxl/libxl_types.idl b/tools/libxl/libxl_types.idl
> >> index 4278a6b..50bf1ef 100644
> >> --- a/tools/libxl/libxl_types.idl
> >> +++ b/tools/libxl/libxl_types.idl
> >> @@ -566,6 +566,8 @@ libxl_domain_remus_info = Struct("domain_remus_info",[
> >>       ("interval",     integer),
> >>       ("blackhole",    bool),
> >>       ("compression",  bool),
> >> +    ("netbuf",       bool),
> >> +    ("netbufscript", string),
> >
> > I think netbuf should be a defbool, not a bool.  Indeed, perhaps this
> > is true of the other options too.  Is there some reason it shouldn't
> > default to enabled ?
> 
> The netbuffering is enabled by default, this option is used to disable
> the netbuffering support.

Maybe I haven't followed the code here correctly, but I think that has
to be done with a defbool.  Where is the default set ?  I don't see it
in this patch.

Thanks,
Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.