
Re: [Xen-devel] [PATCH v2 16/27] tools/libxl: Infrastructure for reading a libxl migration v2 stream



On 10/07/15 11:23, Ian Campbell wrote:
> On Thu, 2015-07-09 at 19:26 +0100, Andrew Cooper wrote:
>> From: Ross Lagerwall <ross.lagerwall@xxxxxxxxxx>
>>
>> This contains the event machinary and state machines to read an act on a
> "machinery"
>
> [...]
>
>
>> Large quantities of the logic here are completely overhauled since v1, mostly
>> as part of fixing the checkpoint buffering bug which was the cause of the
>> broken Remus failover.  The result is actually more simple overall;
> I agree, it looks much nicer, thanks!
>
>> +struct libxl__stream_read_state {
>> +    /* filled by the user */
>> +    libxl__ao *ao;
>> +    int fd;
>> +    void (*completion_callback)(libxl__egc *egc,
>> +                                libxl__stream_read_state *srs,
>> +                                int rc);
>> +    /* Private */
>> +    int rc;
>> +    bool running;
> [...]
>> +void libxl__stream_read_start(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream)
>> +{
>> +    libxl__datacopier_state *dc = &stream->dc;
>> +    int ret = 0;
>> +
>> +    /* State initialisation. */
>> +    assert(!stream->running);
> Since running is declared private and there is no _init function (I
> think _start is effectively filling that role) I'm not sure that the
> caller can necessarily be expected to have initialised anything other
> than the ao, fd and callback fields.

It was a sanity check that _start() doesn't get called twice (guess what
I managed to do while developing).  It can probably be dropped.

>
> You might choose to handle this as a request for a doc comment ("must
> call LIBXL_FILLZERO on it to init"), or to add a separate init function
> containing the memset or to do away with this check. I've not gotten to
> the caller yet so I don't know which you will prefer.

It is all zeroed because of the way dcs is constructed.  I suppose I can
also drop the zeroing of the dc.
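
For illustration, the "separate init function" option could look roughly
like the sketch below. It uses a cut-down stand-in struct: srs_sketch,
srs_sketch_init() and srs_sketch_start() are hypothetical names, not libxl
API, and the double-start check returns an error instead of asserting so
that it can be exercised.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for libxl__stream_read_state; not the real layout. */
struct srs_sketch {
    int fd;        /* filled by the user */
    int rc;        /* private */
    bool running;  /* private */
};

/* Zero the whole struct so the private fields have defined values. */
static void srs_sketch_init(struct srs_sketch *s)
{
    memset(s, 0, sizeof(*s));
    s->fd = -1;    /* no descriptor until the caller supplies one */
}

/* Returns 0 on success, -1 if the stream was already started. */
static int srs_sketch_start(struct srs_sketch *s)
{
    if (s->running)
        return -1;
    s->running = true;
    return 0;
}
```

With an init step like this, both the double-start check and any error path
that reads running see a defined value.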

>
>> +
>> +    memset(dc, 0, sizeof(*dc));
>> +    dc->ao = stream->ao;
>> +    dc->readfd = stream->fd;
>> +    dc->writefd = -1;
>> +
>> +    /* Start reading the stream header. */
>> +    ret = setup_read(stream, "stream header",
>> +                     &stream->hdr, sizeof(stream->hdr),
>> +                     stream_header_done);
>> +    if (ret)
>> +        goto err;
>> +
>> +    stream->running = true;
>> +    stream->phase = SRS_PHASE_NORMAL;
>> +    LIBXL_STAILQ_INIT(&stream->record_queue);
>> +    stream->recursion_guard = 0;
>> +
>> +    assert(!ret);
>> +    return;
>> +
>> + err:
>> +    assert(ret);
>> +    stream_failed(egc, stream, ret);
> stream failed looks at stream->running, which due to the above might
> also be uninitialised here.

Oops yes. I fixed this in the write() side but forgot to propagate the
bugfix back to the read side.

>
>> +static void stream_done(libxl__egc *egc,
>> +                        libxl__stream_read_state *stream)
>> +{
>> +    libxl__sr_record_buf *rec, *trec;
>> +
>> +    assert(stream->running);
>> +    stream->running = false;
>> +
>> +    if (stream->emu_carefd)
>> +        libxl__carefd_close(stream->emu_carefd);
>> +
>> +    LIBXL_STAILQ_FOREACH_SAFE(rec, &stream->record_queue, entry, trec) {
>> +        free(rec->body);
>> +        free(rec);
>> +    }
> Am I right in thinking that we should only get here with a non-empty
> queue on failure? If so then perhaps:
>         assert(LIBXL_STAILQ_EMPTY(...) || stream->rc);
>         
> ?

I believe so.  There is no way the stream should succeed if there are
outstanding buffered records.
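
A minimal sketch of that invariant, assuming a hand-rolled singly linked
list in place of LIBXL_STAILQ (every name here is hypothetical): leftover
records are only legal when rc is non-zero, and they are freed either way.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical buffered record; stands in for libxl__sr_record_buf. */
struct rec_sketch {
    struct rec_sketch *next;
    void *body;
};

/* Hypothetical stream state: just the result code and the record queue. */
struct done_sketch {
    int rc;
    struct rec_sketch *queue;
};

/* Prepend a record with a small heap body; returns 0 on success. */
static int done_sketch_push(struct done_sketch *s)
{
    struct rec_sketch *rec = calloc(1, sizeof(*rec));
    if (!rec)
        return -1;
    rec->body = malloc(16);
    if (!rec->body) {
        free(rec);
        return -1;
    }
    rec->next = s->queue;
    s->queue = rec;
    return 0;
}

/* Tear down the stream; returns the number of records freed. */
static int done_sketch_finish(struct done_sketch *s)
{
    int freed = 0;

    /* A successful stream must not leave buffered records behind. */
    assert(s->queue == NULL || s->rc != 0);

    while (s->queue) {
        struct rec_sketch *rec = s->queue;
        s->queue = rec->next;
        free(rec->body);
        free(rec);
        freed++;
    }
    return freed;
}
```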

>
>> +
>> +    stream->completion_callback(egc, stream, stream->rc);
>> +}
>> +
>> +static void stream_continue(libxl__egc *egc,
>> +                            libxl__stream_read_state *stream)
>> +{
>> +    STATE_AO_GC(stream->ao);
>> +
>> +    /* Must not mutually recurse with process_record() */
>> +    assert(stream->recursion_guard == false);
>> +    stream->recursion_guard = true;
> This smells a bit like it ought to be a SRS_PHASE_PROCESSING or some
> such, but let's leave that alone...

This check is there to pre-emptively avoid the naive bug which would occur
if process_record() called back into stream_continue() and there were many
TOOLSTACK records back to back in the processing queue.

In that case (and potentially future records as well), the two functions
would mutually recurse based on the contents of the stream.
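
As a rough sketch of the pattern (hypothetical names, with a counter
standing in for the record queue): the continue function drains the queue
in a loop, the per-record handler never calls back into it, and the guard
turns any accidental re-entry into an assertion failure instead of a stack
depth that depends on the stream contents.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical state: counters stand in for the real record queue. */
struct guard_sketch {
    bool recursion_guard;
    int pending;     /* records waiting to be processed */
    int processed;   /* records handled so far */
};

/* Handle one record.  Must NOT call guard_sketch_continue(). */
static void guard_sketch_process(struct guard_sketch *s)
{
    s->pending--;
    s->processed++;
}

/* Drain the queue iteratively; stack depth is constant. */
static void guard_sketch_continue(struct guard_sketch *s)
{
    /* Trips if a record handler ever re-enters this function. */
    assert(!s->recursion_guard);
    s->recursion_guard = true;

    while (s->pending > 0)
        guard_sketch_process(s);

    s->recursion_guard = false;
}
```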

>
>> +
>> +    switch (stream->phase) {
>> +    case SRS_PHASE_NORMAL:
>> +        /*
>> +         * Normal phase of the stream.  We arrive here in several senarios.
> "scenarios"
>
>> +static void stream_header_done(libxl__egc *egc,
>> +                               libxl__datacopier_state *dc,
>> +                               int rc, int onwrite, int errnoval)
>> +{
>> +    libxl__stream_read_state *stream = CONTAINER_OF(dc, *stream, dc);
>> +    libxl__sr_hdr *hdr = &stream->hdr;
>> +    STATE_AO_GC(dc->ao);
>> +    int ret = 0;
>> +
>> +    if (rc || onwrite || errnoval) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "rc %d, onwrite %d, errnoval %d", rc, onwrite, errnoval);
> Could use LOGEV(ERROR, errnoval, "rc %d, onwrite %d", rc, onwrite);
> (for all cases I think).
>
> Actually, doesn't dc guarantee to always have already logged on fail?
> Comments in the libxl_internal.h suggest so, apart from the abort case,
> so I think maybe you can avoid logging explicitly here.

So it does.  That is handy.

>
>> +        goto err;
>> +    }
>> +
>> +    hdr->ident   = be64toh(hdr->ident);
>> +    hdr->version = be32toh(hdr->version);
>> +    hdr->options = be32toh(hdr->options);
>> +
>> +    if (hdr->ident != RESTORE_STREAM_IDENT) {
>> +        ret = ERROR_FAIL;
> Eventually I suspect the xapi people would like to see something more
> specific at least for the general "SRS header fail" if not the
> individual reasons.

If you don't object too strongly, I would prefer to leave that
bikeshedding to the error value improvements work.

>
>> +        LOG(ERROR,
>> +            "Invalid ident: expected 0x%016"PRIx64", got 0x%016"PRIx64,
>> +            RESTORE_STREAM_IDENT, hdr->ident);
>> +        goto err;
>> +    }
>> +    if (hdr->version != RESTORE_STREAM_VERSION) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "Unexpected Version: expected %u, got %u",
> hdr->version is a uint32_t, so PRIu32 would be more appropriate.

In both 32-bit and 64-bit builds they are equivalent: uint32_t is unsigned
int in each, so %u already matches the argument.
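
For reference, the PRIu32 form of such a message would look something like
this sketch. format_version_error() is a hypothetical helper using
snprintf() rather than libxl's LOG(); PRIu32 comes from <inttypes.h> and
expands to the correct conversion for uint32_t on whatever platform is
being built.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Format a version mismatch; returns the length snprintf() reports. */
static int format_version_error(char *buf, size_t len,
                                uint32_t expected, uint32_t got)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len,
                    "Unexpected version: expected %" PRIu32 ", got %" PRIu32,
                    expected, got);
}
```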

>
>> +            RESTORE_STREAM_VERSION, hdr->version);
>> +        goto err;
>> +    }
>> +    if (hdr->options & RESTORE_OPT_BIG_ENDIAN) {
>> +        ret = ERROR_FAIL;
>> +        LOG(ERROR, "Unable to handle big endian streams");
>> +        goto err;
>> +    }
>> +
>> +    LOG(DEBUG, "Stream v%u%s", hdr->version,
> and again.
>
> Actually looking around since you've used uintXX_t throughout the format
> structs, I think you need a lot more PRI[ux]FOO around the place.

Will do

>
> _If_ you've compile tested this for both 32- and 64-bit and it works we
> could perhaps leave that audit until later.
>
>> +static void setup_read_record(libxl__egc *egc,
>> +                              libxl__stream_read_state *stream)
>> +{
>> +    STATE_AO_GC(stream->ao);
>> +    libxl__sr_record_buf *rec = NULL;
>> +    int ret;
>> +
>> +    assert(stream->incoming_record == NULL);
>> +
>> +    stream->incoming_record = rec = libxl__zalloc(NOGC, sizeof(*rec));
> I recall Ian J and you discussing NOGC allocations on IRC. Was the
> conclusion that it was OK, or that it could be fixed later, or that it
> should be fixed now via a nested ao or something similar?
>
> Unless the answer is "fixed now" I think the reason for the NOGC should
> be in either the commit log or a comment (in the header, around about
> the definition of the allocated data structure).

I will add a note about it in the commit message.  We agreed on IRC that
NOGC was OK.  It might be possible to switch to some nested ao later,
but that depends entirely on the COLO work.

~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel