
Re: [Xen-devel] [PATCH 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream



On Mon, 2015-06-15 at 14:44 +0100, Andrew Cooper wrote:
> From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
> 
> Signed-off-by: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> CC: Ian Campbell <Ian.Campbell@xxxxxxxxxx>
> CC: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>
> CC: Wei Liu <wei.liu2@xxxxxxxxxx>

Overall this looks good. I've got some comments below, and I think it
almost certainly wants eyes from Ian, who knows more about the
datacopier (dc) infrastructure etc.

> +void libxl__stream_read_start(libxl__egc *egc,
> +                              libxl__stream_read_state *stream)
> +{
> +    libxl__datacopier_state *dc = &stream->dc;
> +    int ret = 0;
> +
> +    /* State initialisation. */
> +    assert(!stream->running);
> +
> +    memset(dc, 0, sizeof(*dc));

Please use libxl__datacopier_init here instead of the open-coded memset.

> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Start reading the stream header. */
> +    dc->readwhat = "stream header";
> +    dc->readbuf = &stream->hdr;
> +    stream->expected_len = dc->bytes_to_read = sizeof(stream->hdr);
> +    dc->used = 0;
> +    dc->callback = stream_header_done;

This pattern of resetting and reinitialising the dc occurs in multiple
places. I think a helper would be in order, perhaps some sort of
stream_next_record_init?
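
To illustrate the kind of helper meant here, a minimal sketch follows. The struct types are cut-down stand-ins for libxl__datacopier_state and libxl__stream_read_state (only the fields touched in the hunk above, with the ao omitted), and the helper's signature is an assumption, not something taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct dc_state;
typedef void dc_callback(struct dc_state *dc, int onwrite, int errnoval);

/* Stand-in for libxl__datacopier_state (illustrative subset of fields). */
struct dc_state {
    int readfd, writefd;
    const char *readwhat;
    void *readbuf;
    size_t bytes_to_read;
    size_t used;
    dc_callback *callback;
};

/* Stand-in for libxl__stream_read_state. */
struct stream_state {
    int fd;
    size_t expected_len;
    struct dc_state dc;
};

/* Hypothetical helper: prepare the dc to read exactly len bytes into buf. */
void stream_next_record_init(struct stream_state *stream, const char *what,
                             void *buf, size_t len, dc_callback *cb)
{
    struct dc_state *dc = &stream->dc;

    assert(buf != NULL && len > 0);

    /* The real code would call libxl__datacopier_init() here. */
    memset(dc, 0, sizeof(*dc));
    dc->readfd = stream->fd;
    dc->writefd = -1;

    dc->readwhat = what;
    dc->readbuf = buf;
    dc->bytes_to_read = len;
    dc->used = 0;
    dc->callback = cb;
    stream->expected_len = len;
}
```

With something along these lines, the header setup above would collapse to a single call such as stream_next_record_init(stream, "stream header", &stream->hdr, sizeof(stream->hdr), stream_header_done).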

> +void libxl__stream_read_abort(libxl__egc *egc,
> +                              libxl__stream_read_state *stream, int rc)
> +{
> +    stream_failed(egc, stream, rc);
> +}
> +
> +static void stream_success(libxl__egc *egc, libxl__stream_read_state *stream)
> +{
> +    stream->rc = 0;
> +    stream->running = false;
> +
> +    stream_done(egc, stream);

Push the running = false into stream_done and flip the assert there?
Logically the stream is still running until it is done, so having
stream_done assert that it isn't running seems counter-intuitive.
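
Roughly what I have in mind, as a sketch only: the struct is a simplified stand-in for libxl__stream_read_state, the egc/dcs arguments are dropped, and record_completion is a made-up callback that just records the result.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for libxl__stream_read_state. */
struct stream_state {
    bool running;
    int rc;
    void (*completion_callback)(struct stream_state *stream, int rc);
};

/* Illustrative completion callback: remembers the last rc it was given. */
static int last_completion_rc = -1;

void record_completion(struct stream_state *stream, int rc)
{
    (void)stream;
    last_completion_rc = rc;
}

/* The stream counts as running right up until it is done. */
void stream_done(struct stream_state *stream)
{
    assert(stream->running);
    stream->running = false;

    stream->completion_callback(stream, stream->rc);
}

void stream_success(struct stream_state *stream)
{
    stream->rc = 0;
    stream_done(stream);
}
```

The failure path would then also just set rc and call stream_done, so there is a single place where running is cleared.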

> +static void stream_done(libxl__egc *egc,
> +                        libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +
> +    assert(!stream->running);
> +
> +    stream->completion_callback(egc, dcs, stream->rc);
> +}
> +
> +static void stream_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_hdr *hdr = &stream->hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }

I think you need to check errnoval == 0 in the !onwrite case, otherwise
you may miss a read error?

Also it looks like onwrite can be -1, which is a separate error case.

> +
> +static void record_header_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }

Same comments wrt the arguments as the previous one.

Maybe a common helper to check (and log) the status at the head of each
callback? So you can effectively do if (!everything_ok(stream, dc)) goto
err?
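
A sketch of what such a check could look like. The name read_status_ok and its flat argument list are hypothetical, and the interpretation of onwrite/errnoval (-1 for an internal failure, nonzero onwrite for a write failure, onwrite == 0 with nonzero errnoval for a read error) is my reading of the datacopier callback contract and should be checked against its documentation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: returns true only if the read completed cleanly
 * and delivered exactly the expected number of bytes. Each failure case
 * is logged separately (to stderr here; the real code would use LOG).
 */
bool read_status_ok(const char *what, int onwrite, int errnoval,
                    size_t expected, size_t got)
{
    assert(what != NULL);

    if (onwrite == -1) {
        /* Assumed: internal failure, errnoval not meaningful. */
        fprintf(stderr, "%s: internal datacopier failure\n", what);
        return false;
    }
    if (onwrite) {
        fprintf(stderr, "%s: unexpected write failure: %s\n",
                what, strerror(errnoval));
        return false;
    }
    if (errnoval) {
        fprintf(stderr, "%s: read error: %s\n", what, strerror(errnoval));
        return false;
    }
    if (got != expected) {
        fprintf(stderr, "%s: expected %zu bytes, got %zu\n",
                what, expected, got);
        return false;
    }
    return true;
}
```

Each callback would then start with something like if (!read_status_ok(dc->readwhat, onwrite, errnoval, stream->expected_len, dc->used)) followed by setting ret and goto err.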

> +    assert(!ret);
> +    if (rec_hdr->length) {
> +        free(stream->rec_body);
> +        stream->rec_body = NULL;

reset length too?

> +static void read_emulator_body(libxl__egc *egc,
> +                               libxl__stream_read_state *stream)
> +{
> +    libxl__domain_create_state *dcs = CONTAINER_OF(stream, *dcs, srs);
> +    libxl__datacopier_state *dc = &stream->dc;
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    libxl_sr_emulator_hdr *emu_hdr = stream->rec_body;
> +    STATE_AO_GC(stream->ao);
> +    char path[256];
> +    int ret = 0;
> +
> +    sprintf(path, XC_DEVICE_MODEL_RESTORE_FILE".%u", dcs->guest_domid);
> +
> +    dc->readwhat = "save/migration stream";
> +    dc->copywhat = "emulator context";
> +    dc->writewhat = "qemu save file";
> +    dc->readbuf = NULL;
> +    dc->writefd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0666);

Since this is all done in the same process (or children of it) with
no setuid etc, I think 0600 would be better to avoid accidentally
leaving the save state world readable (just in case it matters).

Also, we should consider whether this fd needs to be subject to the
carefd machinery.
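
On the file mode point, a minimal sketch of the open call with owner-only permissions. The function name is made up for illustration and the carefd question is not addressed here.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical wrapper: create (or truncate) the save file so that a
 * newly created file is readable and writable by the owner only.
 */
int open_private_save_file(const char *path)
{
    assert(path != NULL);

    /* The umask can only clear bits, so group/other never get access. */
    return open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
}
```

Note that the mode argument only takes effect when open() actually creates the file; a pre-existing file keeps whatever permissions it already had.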

Sharing the dc between all these differing usages is starting to rankle a
little, but I think it is necessary because it may have queued data from
a previous read which was larger than the current record, correct?

Hrm, isn't setting dc->used = 0 on each reset potentially throwing some
stuff away?

> +    if (dc->writefd == -1) {
> +        ret = ERROR_FAIL;
> +        LOGE(ERROR, "Unable to open '%s'", path);
> +        goto err;
> +    }
> +    dc->maxsz = dc->bytes_to_read = rec_hdr->length - sizeof(*emu_hdr);
> +    stream->expected_len = dc->used = 0;

expecting 0? This differs from the pattern common everywhere else and
I'm not sure why.

> +    dc->callback = emulator_body_done;
> +
> +    ret = libxl__datacopier_start(dc);
> +    if (ret)
> +        goto err;
> +    return;
> +
> + err:
> +    assert(ret);
> +    stream_failed(egc, stream, ret);
> +}
> +
> +static void emulator_body_done(libxl__egc *egc,
> +                               libxl__datacopier_state *dc,
> +                               int onwrite, int errnoval)
> +{
> +    /* Safe to be static, as it is a write-only discard buffer. */
> +    static char padding[1U << REC_ALIGN_ORDER];
> +
> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
> +    libxl_sr_rec_hdr *rec_hdr = &stream->rec_hdr;
> +    STATE_AO_GC(dc->ao);
> +    unsigned int nr_padding_bytes = (1U << REC_ALIGN_ORDER);
> +    int ret = 0;
> +
> +    if (onwrite || dc->used != stream->expected_len) {
> +        ret = ERROR_FAIL;
> +        LOG(ERROR, "write %d, err %d, expected %zu, got %zu",
> +            onwrite, errnoval, stream->expected_len, dc->used);
> +        goto err;
> +    }
> +
> +    /* Undo modifications for splicing the emulator context. */

Hrm, not so much undo as nuke and rebuild. Is that really necessary,
can't you just reset what you need to in the inverse of the other thing?

If there isn't a problem with buffered stuff on callback, then perhaps
it would be clearer to use a separate dc, at least for the qemu side. Or
to _always_ teardown and restart the dc from scratch instead of doing it
partially in some places and fully in others.


> +    memset(dc, 0, sizeof(*dc));
> +    dc->ao = stream->ao;
> +    dc->readfd = stream->fd;
> +    dc->writefd = -1;
> +
> +    /* Do we need to eat some padding out of the stream? */

Why only now and not for e.g. the xenstore stuff (which doesn't appear
to be explicitly padded)?

And given that, why not handle this in some central place rather than
only in the emulator-specific path?
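
For instance, a central helper could compute the padding for any record body. This is only a sketch: the function name is hypothetical, and the value of REC_ALIGN_ORDER used below (3, i.e. 8-byte alignment) is an assumption for the example, so take the real value from the stream format definition.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed value for illustration only. */
#define REC_ALIGN_ORDER 3U

/*
 * Hypothetical helper: number of padding bytes that follow a record
 * body of body_len bytes so that the next record header is aligned.
 */
size_t record_padding(size_t body_len)
{
    size_t align = (size_t)1 << REC_ALIGN_ORDER;
    size_t pad = (align - (body_len & (align - 1))) & (align - 1);

    /* Body plus padding always ends on an alignment boundary. */
    assert((body_len + pad) % align == 0);
    return pad;
}
```

The generic record-reading path could then discard record_padding(rec_hdr->length) bytes after every body, rather than only after the emulator context.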

Ian.


