[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 3/6] libxl: add infrastructure to track and query 'retired' domids



Thanks.  I think this is the algorithm as we discussed, thanks.
I have some comments about the implementation...

Paul Durrant writes ("[PATCH v3 3/6] libxl: add infrastructure to track and 
query 'retired' domids"):
> A domid is considered retired if the domain it represents was destroyed
> less than a specified number of seconds ago. The number can be set using
> the environment variable LIBXL_DOMID_MAX_RETIREMENT. If the variable does
> not exist then a default value of 60s is used.
...

I'm afraid I think your update protocol for this file is wrong.  In
general it is a bad idea to try to write over a file in-place.  Doing
so is full of gotchas.  (In this particular case for example I think
an interrupted attempt at cleaning the file can produce a corrupted
file containing nonsense.)

Can we please use the standard write-to-new-file-and-rename ?
Ie, to launder
    flock(open("domid-history.lock"))
    fopen("domid-history","r")
    fopen("domid-history.new","w")
    fgets/fputs
    fclose
    rename
    close

(And no uses of ftell, fopen(,"r+"), etc.)

Reading can be done without taking the lock, if you so fancy.

I think there are a lot of missing error checks in this patch, but
since I'm asking for a different approach I won't point them out
individually.

> +    fd = open(name, O_RDWR|O_CREAT, 0644);
> +    if (fd < 0) {
> +        LOGE(ERROR, "unexpected error while trying open %s, errno=%d",
> +             name, errno);
> +        goto fail;
> +    }
> +
> +    for (;;) {
> +        ret = flock(fd, LOCK_EX);

I looked for a utility function to do this but didn't find one.
I think this is complicated because it needs to be a `carefd' in libxl
terms because of concurrent forking by other threads in the
application.

I suggest generalising libxl__lock_domain_userdata, which has all the
necessary code (and which also would permit removing the file in the
future).

I feel responsible for this inconvenience.  If this is too tiresome
for you, I could do that part for you...

> +/* Write a domid retirement record */
> +static void libxl__retire_domid(libxl__gc *gc, uint32_t domid)
> +{
...
> +    while (fgets(line, sizeof(line), f)) {
> +        unsigned long sec;
> +        unsigned int ignored;
> +
> +        roff = ftell(f);
> +
> +        if (sscanf(line, "%lu %u", &sec, &ignored) != 2)
> +            continue; /* Purge malformed lines */

I'm not sure why you bother with fgets into a buffer, when you could
just use fscanf rather than sscanf.  Your code doesn't need to take
much care about weird syntax which might occur (and indeed your code
here doesn't take such care).

> @@ -1331,6 +1462,7 @@ static void devices_destroy_cb(libxl__egc *egc,
>          if (!ctx->xch) goto badchild;
>  
>          if (!dis->soft_reset) {
> +            libxl__retire_domid(gc, domid);

I wonder if a better term than "retired" would be possible.  I
initially found this patch a bit confusing because I thought a retired
domid would be one which had *not* been used recently.  Maybe
"recent", "mark recent", etc. ?  Ultimately this is a bikeshed issue
which I will leave this up to you, though.


I don't much like the environment variable to configure this.  I don't
object to keeping it but can we have a comment saying this is not
intended for use in production ?  Personally I would rather it was
hardcoded, or failing that, written to some config file.


Finally, I think this patch needs an addition to xen-init-dom0 to
remove or empty the record file.  This is because while /run is
usually a tmpfs, this is not *necessarily* true.

Ian.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.