[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v5 4/7] libxl: add infrastructure to track and query 'recent' domids



Durrant, Paul writes ("RE: [PATCH v5 4/7] libxl: add infrastructure to track 
and query 'recent' domids"):
> Ian Jackson <ian.jackson@xxxxxxxxxx>:
> > Paul Durrant writes ("[PATCH v5 4/7] libxl: add infrastructure to track
> > > +int libxl_clear_domid_history(libxl_ctx *ctx);
> > 
> > I think this needs a clear doc comment saying it is for use in host
> > initialisation only.  If it is run with any domains running, or
> > concurrent libxl processes, things may malfunction.
> 
> Ok. Not sure precisely what you mean by 'doc comment'... Do mean a
> comment in the header just above this declaration [...] ?

Yes, precisely that.  Thanks.

> > > +static bool libxl__read_recent(FILE *f, unsigned long *sec,
> > > +                               unsigned int *domid)
> > > +{
> > > +    int n;
> > > +
> > > +    assert(f);
> > > +
> > > +    n = fscanf(f, "%lu %u", sec, domid);
> > > +    if (n == EOF)
> > > +        return false;
> > 
> > Missing error handling in case of read error.
> 
> 'man fscanf' tells me:
> 
> "The value EOF is returned if the end of input is reached before
> either the first suc‐ cessful conversion or a matching failure
> occurs.  EOF is also returned if a read error occurs, in which case
> the error indicator for the stream (see ferror(3)) is set, and errno
> is set to indicate the error."
> 
> So EOF is set in all error cases. What am I missing?

I thought it treats read error the same as EOF.  But of course
actually I discovered a ferror() (duplicated) later...

> > > +    else if (n != 2) /* malformed entry */
> > > +        *domid = INVALID_DOMID;
> > 
> > Both call sites for this function have open-coded checks for this
> > return case, where they just go round again.  I think
> > libxl__read_recent should handle this itself, factoring the common
> > code into this function and avoiding that special case.
> 
> Ok. I thought it was more intuitive to have the function only ever
> read a single entry from the file, but I can easily add the retry
> loop if you prefer.

I think the purpose of this function is to contain all the code that
can be shared between the two call sites.

> > > +    return true;
> > 
> > I think this function should return an rc.  It could signal EOF by
> > setting *domid to INVALID_DOMID maybe, and errors by returning
> > ERROR_FAIL.
> 
> Ok. I thought it was slightly pointless to do that.

I don't have a 100% fixed opinion about the precise calling
convention.  But this function needs to be able to report three
distinct conditions, not two:
  - here is the entry you asked for
  - EOF, we have established that there are no more entries
  - failure to read the file, abandon all hope

Elsewhere in libxl the convention is usually to use an rc return value
to signal errors, and signal "no error, but no such thing" by writing
a sentinel rather than a value to an out parameter.

Returning an rc means that in the future if we want better control of
errors (i) this internal api is more like other internal apis (ii) the
exact error code is specified at the point in the code where the error
is recognised.

> > I doubt this is really needed but I don't mind it if you must.
> > 
> > > +    return fprintf(f, "%lu %u\n", sec, domid) > 0;
> > 
> > Wrong error handling.  This function should return rc.  fprintf
> > doesn't return a boolean.
> 
> And nor does this code expect it to (since it tests for '> 0').

Oh.  I didn't spot that.  This is contrary to libxl/CODING_STYLE.

  * Function calls which might fail (ie most function calls) are
    handled by putting the return/status value into a variable, and
    then checking it in a separate statement:
            char *dompath = libxl__xs_get_dompath(gc, bl->domid);
            if (!dompath) { rc = ERROR_FAIL; goto out; }

For precisely this kind of reason.

> >  Something should log errno (with LOGE
> > probably) if fprintf fails.
> 
> I can see you dislike boolean functions; I'll return an error as you desire.

See above about error handling.  Certainly a boolean cannot be used
for a function which might return "yes" or "no" or "argh, can't say".
For a function which might return "ok" or "argh", rc and ERROR_* is
clearly better since you get to invent the error code.

> > > +static int libxl__mark_domid_recent(libxl__gc *gc, uint32_t domid)
> > > +{
> > > +    long timeout = libxl__get_domid_reuse_timeout();
> > > +    libxl__flock *lock;
> > 
> > Please initialise lock = NULL so that it is easy to see that the out
> > block is correct.
> > 
> > (See tools/libxl/CODING_STYLE where this is discussed.)
> 
> Ok. Xen style generally avoids initializers where not strictly necessary.

libxl does not use "Xen style".

If you want to challenge the contents of libxl/CODING_STYLE, that's
fair enough of course, but maybe in the middle of this patch review is
not ideal ?

> > > +    lock = libxl__lock_domid_history(gc);
> > > +    if (!lock) {
> > > +        LOGED(ERROR, domid, "failed to acquire lock");
> > > +        goto out;
> > > +    }
> > > +
> > > +    old = libxl__domid_history_path(gc, NULL);
> > > +    of = fopen(old, "r");
> > > +    if (!of && errno != ENOENT)
> > > +        LOGED(WARN, domid, "failed to open '%s'", old);
> > 
> > This fopen code and its error handling is still duplicated between
> > libxl__mark_domid_recent and libxl__is_domid_recent. 
> 
> That's not quite true. The error semantics are different; the former does not 
> tolerate a failure to open the file, the latter does.

What is the reason for this difference in semantics ?  It seems to me
that either:
 (i) absence of the file means there are no recent domids (eg,
     after boot) and therefore both functions should tolerate it; or
 (ii) absence of the file means a system configuration error
     and therefore neither function should tolerate it.

> > Also failure to open the file should be an error, resulting failure of
> > this function and the whole surrounding operation, not simply produce
> > a warning in some logfile where it will be ignored.
> 
> But that will cause a failure when trying to create the first domain
> after boot, since the file won't exist.

I meant that failure to open *other than ENOENT*.

ISTM that of the two options above, (i) is to be preferred and
therefore that ENOENT should always be tolerated.  But maybe you can
explain to me why that isn't right.

> > > +    if (of && fclose(of) == EOF) {
> > > +        LOGED(ERROR, domid, "failed to close '%s'", old);
> > 
> > I don't see how of would be NULL here.
> 
> It will be NULL if the file did not exist, which will be the case until the 
> first domain destruction occurs.

Oh yes.  I am confused because I keep reading `of' as `output file'.

In which case, please see CODING_STYLE about putting the return value
in a separate statement.  This will also avoid duplicating the
`of=NULL' since it can go right after fclose.

Maybe the closing could be done by libxl__read_recent, if it took a
FILE** ?  That would remove some duplication and leave only an
error-check-free   if (of) fclose(of);   in each out block.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.