[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

[Xen-devel] [PATCH v2 06/10] libxl: event: Fix hang when mixing blocking and eventy calls



If the application calls libxl with ao_how==0 and also makes calls
like _occurred, libxl will sometimes get stuck.

The bug happens as follows (for example):

  Thread A
       libxl_do_thing(,ao_how==0)
       libxl_do_thing starts, sets up some callbacks
       libxl_do_thing exit path calls AO_INPROGRESS
       libxl__ao_inprogress goes into event loop
       eventloop_iteration sleeps on:
          - do_thing's current fd set
          - sigchld pipe if applicable
          - its poller

  Thread B
       libxl_something_occurred
       the something is to do with do_thing, above
       do_thing_next_callback does some more work
       do_thing_next_callback becomes interested in fd N
       thread B returns to application

Note that nothing wakes up thread A.  A is not listening on fd N.  So
do_thing_* will not spot when fd N signals.  do_thing will not make
further timely progress.  If there is no timeout thread A will never
wake up.

The problem here occurs because thread A is waiting on an out of date
osevent set.

There is also the possibility that a thread might block waiting for
libxl osevents but outside libxl, eg if the application used
libxl_osevent_beforepoll.  We will deal with that in a moment.

See the big comment in libxl_event.c for a fairly formal correctness
argument.

This depends on libxl__egc_ao_cleanup_1_baton being called everywhere
an egc or ao is disposed of.  Firstly egcs: in this patch we rename
libxl__egc_cleanup, which means we catch all the disposal sites.
Secondly aos: these are disposed of by (i) AO_CREATE_FAIL
(ii) ao__inprogress and (iii) an event which completes the ao later.
(i) and (ii) we handle by adding the call to _baton.  In the case of
(iii) any such function must be an event-generating function so it has
an egc too, so it will pass on the baton when the egc is disposed.

Reported-by: George Dunlap <george.dunlap@xxxxxxxxxx>
Signed-off-by: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
---
v2: Call libxl__egc_ao_cleanup_1_baton (renamed from __egc_cleanup) on
    all exits from ao_inprogress, even requests for async processing.
    Fixes a remaining instance of this bug (!)
    This involves disposing of ao->poller somewhat earlier.

v2: New correctness arguments in libxl_event.c comment and
    in commit message.
---
 tools/libxl/libxl_event.c    | 178 ++++++++++++++++++++++++++++++++++++++++---
 tools/libxl/libxl_internal.h |  33 ++++++--
 2 files changed, 194 insertions(+), 17 deletions(-)

diff --git a/tools/libxl/libxl_event.c b/tools/libxl/libxl_event.c
index 268a5da120..b50d4e5074 100644
--- a/tools/libxl/libxl_event.c
+++ b/tools/libxl/libxl_event.c
@@ -37,6 +37,140 @@ static void ao__check_destroy(libxl_ctx *ctx, libxl__ao 
*ao);
 
 
 /*
+ * osevent update baton handling
+ *
+ * We need the following property (the "unstale liveness property"):
+ *
+ * Whenever any thread is blocking in the libxl event loop[1], at
+ * least one thread must be using an up to date osevent set.  It is OK
+ * for all but one threads to have stale event sets, because so long
+ * as one waiting thread has the right event set, any actually
+ * interesting event will, if nothing else, wake that "right" thread
+ * up.  It will then make some progress and/or, if it exits, ensure
+ * that some other thread becomes the "right" thread.
+ *
+ * [1] TODO: Right now we are considering only the libxl event loop.
+ * We need to consider application event loop outside libxl too.
+ *
+ * Argument that our approach is sound:
+ *
+ * The issue we are concerned about is libxl sleeping on an out of
+ * date fd set, or too long a timeout, so that it doesn't make
+ * progress.  If the property above is satisfied, then if any thread
+ * is waiting in libxl at least one such thread will be waiting on a
+ * sufficient osevent set, so any relevant osevent will wake up a
+ * libxl thread which will either handle the event, or arrange that at
+ * least one other libxl thread has the right set.
+ *
+ * There are two calls to poll in libxl: one is the fd recheck, which
+ * is not blocking.  There is only the one blocking call, in
+ * eventloop_iteration.  poll runs with the ctx unlocked, so osevents
+ * might be added after it unlocks the ctx - that is what we are
+ * worried about.
+ *
+ * To demonstrate that the unstale liveness property is satisfied:
+ *
+ * We define a baton holder as follows: a libxl thread is a baton
+ * holder if
+ *   (a) it has an egc or an ao and holds the ctx lock, or
+ *   (b) it has an active non-app poller and no osevents have been
+ *       added since it released the lock, or
+ *   (c) it has an active non-app poller which has been woken
+ *       (by writing to its pipe), so it will not sleep
+ * We will maintain the invariant (the "baton invariant") that
+ * whenever there is any active poller, there is at least
+ * one baton holder.  ("non-app" means simply "not poller_app".)
+ *
+ * No thread outside libxl can have an active non-app poller: pollers
+ * are put on the active list by poller_get which is called in three
+ * places: libxl_event_wait, which puts it before returning;
+ * libxl__ao_create but only in the synchronous case, in which case
+ * the poller is put before returning; and the poller_app, during
+ * initialisation.
+ *
+ * So any time when all libxl threads are blocking (and therefore do
+ * not have the ctx lock), the non-app active pollers belong to those
+ * threads.  If at least one is a baton holder (the invariant), that
+ * thread has a good enough event set.
+ *
+ * Now we will demonstrate that the "baton invariant" is maintained:
+ *
+ * The rule is that any thread which might be the baton holder is
+ * responsible for checking that there continues to be a baton holder
+ * as needed.
+ *
+ * Firstly, consider the case when the baton holders (b) cease to be
+ * baton holders because osevents are added.
+ *
+ * There are only two kinds of osevents: timeouts and fds.  Every
+ * other internal event source reduces to one of these eventually.
+ * Both of these cases are handled (in the case of fd events, add and
+ * modify, separately), calling pollers_note_osevent_added.
+ *
+ * This walks the poller_active list, marking the active pollers
+ * osevents_added=1.  Such a poller cannot be the baton holder.  But
+ * pollers_note_osevent_added is called only from ev_* functions,
+ * which are only called from event-chain libxl code: ie, code with an
+ * ao or an egc.  So at this point we are a baton holder, and there is
+ * still a baton holder.
+ *
+ * Secondly, consider the case where baton holders (a) cease to be
+ * batton holders because they dispose of their egc or ao.  We call
+ * libxl__egc_ao_cleanup_1_baton on every exit path.  We arrange that
+ * everything that disposes of an egc or an ao checks that there is a
+ * new baton holder by calling libxl__egc_ao_cleanup_1_baton.
+ *
+ * This function handles the invariant explicitly: if we have any
+ * non-app active pollers it looks for one which is up to date (baton
+ * holder category (b)), and failing that it picks a victim to turn
+ * into the baton holder category (c) by waking it up.  (Correctness
+ * depends on this function not spotting its own thread as the
+ * baton-holder, since it is on its way to not being the baton-holder,
+ * so it must be called after the poller has been put back.)
+ *
+ * Thirdly, we must consider the case (c).  A thread in category (c)
+ * will reenter libxl when it gains the lock and necessarily then
+ * becomes a baton holder in category (a).
+ *
+ * So the "baton invariant" is maintained.  QED.
+ */
+static void pollers_note_osevent_added(libxl_ctx *ctx) {
+    libxl__poller *poller;
+    LIBXL_LIST_FOREACH(poller, &ctx->pollers_active, active_entry)
+        poller->osevents_added = 1;
+}
+
+void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc)
+    /* Any poller we had must have been `put' already. */
+{
+    libxl__poller *search, *wake=0;
+
+    LIBXL_LIST_FOREACH(search, &CTX->pollers_active, active_entry) {
+        if (search == CTX->poller_app)
+            /* This one is special.  We can't give it the baton. */
+            continue;
+        if (!search->osevents_added)
+            /* This poller is up to date and will wake up as needed. */
+            return;
+        if (!wake)
+            wake = search;
+    }
+
+    if (!wake)
+        /* no-one in libxl waiting for any events */
+        return;
+
+    libxl__poller_wakeup(gc, wake);
+
+    wake->osevents_added = 0;
+    /* This serves to make _1_baton idempotent.  It is OK even though
+     * that poller may currently be sleeping on only old osevents,
+     * because it is going to wake up because we've just prodded it,
+     * and it pick up new osevents on its next iteration (or pass
+     * on the baton). */
+}
+
+/*
  * The counter osevent_in_hook is used to ensure that the application
  * honours the reentrancy restriction documented in libxl_event.h.
  *
@@ -194,6 +328,7 @@ int libxl__ev_fd_register(libxl__gc *gc, libxl__ev_fd *ev,
     ev->func = func;
 
     LIBXL_LIST_INSERT_HEAD(&CTX->efds, ev, entry);
+    pollers_note_osevent_added(CTX);
 
     rc = 0;
 
@@ -214,6 +349,8 @@ int libxl__ev_fd_modify(libxl__gc *gc, libxl__ev_fd *ev, 
short events)
     rc = OSEVENT_HOOK(fd,modify, noop, ev->fd, &ev->nexus->for_app_reg, 
events);
     if (rc) goto out;
 
+    if ((events & ~ev->events))
+        pollers_note_osevent_added(CTX);
     ev->events = events;
 
     rc = 0;
@@ -315,6 +452,7 @@ static int time_register_finite(libxl__gc *gc, 
libxl__ev_time *ev,
     LIBXL_TAILQ_INSERT_SORTED(&CTX->etimes, entry, ev, evsearch, /*empty*/,
                               timercmp(&ev->abs, &evsearch->abs, >));
 
+    pollers_note_osevent_added(CTX);
     return 0;
 }
 
@@ -1121,6 +1259,7 @@ static int beforepoll_internal(libxl__gc *gc, 
libxl__poller *poller,
     *nfds_io = used;
 
     poller->fds_deregistered = 0;
+    poller->osevents_added = 0;
 
     libxl__ev_time *etime = LIBXL_TAILQ_FIRST(&CTX->etimes);
     if (etime) {
@@ -1442,7 +1581,7 @@ static void egc_run_callbacks(libxl__egc *egc)
     }
 }
 
-void libxl__egc_cleanup(libxl__egc *egc)
+void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc)
 {
     EGC_GC;
     egc_run_callbacks(egc);
@@ -1752,13 +1891,15 @@ int libxl_event_wait(libxl_ctx *ctx, libxl_event 
**event_r,
         rc = eventloop_iteration(egc, poller);
         if (rc) goto out;
 
-        /* we unlock and cleanup the egc each time we go through this loop,
-         * so that (a) we don't accumulate garbage and (b) any events
-         * which are to be dispatched by callback are actually delivered
-         * in a timely fashion.
+        /* we unlock and cleanup the egc each time we go through this
+         * loop, so that (a) we don't accumulate garbage and (b) any
+         * events which are to be dispatched by callback are actually
+         * delivered in a timely fashion.  _1_baton will be
+         * called to pass the baton iff we actually leave; otherwise
+         * we are still carrying it.
          */
         CTX_UNLOCK;
-        libxl__egc_cleanup(egc);
+        libxl__egc_cleanup_2_ul_cb_gc(egc);
         CTX_LOCK;
     }
 
@@ -2031,14 +2172,24 @@ int libxl__ao_inprogress(libxl__ao *ao,
                  * synchronous cancellation ability. */
             }
 
+            /* The call to egc..1_baton is below, only if we are leaving. */
             CTX_UNLOCK;
-            libxl__egc_cleanup(&egc);
+            libxl__egc_cleanup_2_ul_cb_gc(&egc);
             CTX_LOCK;
         }
+
+        /* Dispose of this early so libxl__egc_ao_cleanup_1_baton
+         * doesn't mistake us for a baton-holder.  No-one much is
+         * going to look at this ao now so setting this to 0 is fine.
+         * We can't call _baton below _leave because _leave destroys
+         * our gc, which _baton needs. */
+        libxl__poller_put(CTX, ao->poller);
+        ao->poller = 0;
     } else {
         rc = 0;
     }
 
+    libxl__egc_ao_cleanup_1_baton(gc);
     ao->in_initiator = 0;
     ao__manip_leave(CTX, ao);
 
@@ -2051,6 +2202,9 @@ int libxl__ao_inprogress(libxl__ao *ao,
 static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
 /* Temporarily unlocks ctx, which must be locked exactly once on entry. */
 {
+    libxl__egc egc;
+    LIBXL_INIT_EGC(egc,ctx);
+
     int rc;
     ao__manip_enter(parent);
 
@@ -2071,9 +2225,6 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
 
     /* We keep calling abort hooks until there are none left */
     while (!LIBXL_LIST_EMPTY(&parent->abortables)) {
-        libxl__egc egc;
-        LIBXL_INIT_EGC(egc,ctx);
-
         assert(!parent->complete);
 
         libxl__ao_abortable *abrt = LIBXL_LIST_FIRST(&parent->abortables);
@@ -2086,15 +2237,20 @@ static int ao__abort(libxl_ctx *ctx, libxl__ao *parent)
                    "ao %p: abrt=%p: aborting", parent, abrt->ao);
         abrt->callback(&egc, abrt, ERROR_ABORTED);
 
+        /* The call to egc..1_baton is in the out block below. */
         libxl__ctx_unlock(ctx);
-        libxl__egc_cleanup(&egc);
+        libxl__egc_cleanup_2_ul_cb_gc(&egc);
         libxl__ctx_lock(ctx);
     }
 
     rc = 0;
 
  out:
+    libxl__egc_ao_cleanup_1_baton(&egc.gc);
     ao__manip_leave(ctx, parent);
+    /* The call to egc..2_ul_cb_gc is above.  This is sufficient
+     * because only code inside the loop adds anything to the egc, and
+     * we ensures that the egc is clean when we leave the loop. */
     return rc;
 }
 
diff --git a/tools/libxl/libxl_internal.h b/tools/libxl/libxl_internal.h
index b68ab218b6..eec4bf767d 100644
--- a/tools/libxl/libxl_internal.h
+++ b/tools/libxl/libxl_internal.h
@@ -634,9 +634,23 @@ struct libxl__poller {
      * event is deregistered, we set the fds_deregistered of all non-idle
      * pollers.  So afterpoll can tell whether any POLLNVAL is
      * plausibly due to an fd being closed and reopened.
+     *
+     * Additionally, we record whether any fd or time event sources
+     * have been registered.  This is necessary because sometimes we
+     * need to wake up the only libxl thread stuck in
+     * eventloop_iteration so that it will pick up new fds or earlier
+     * timeouts.  osevents_added is cleared by beforepoll, and set by
+     * fd or timeout event registration.  When we are about to leave
+     * libxl (strictly, when we are about to give up an egc), we check
+     * whether there are any pollers.  If there are, then at least one
+     * of them must have osevents_added clear.  If not, we wake up the
+     * first one on the list.  Any entry on pollers_active constitutes
+     * a promise to also make this check, so the baton will never be
+     * dropped.
      */
     LIBXL_LIST_ENTRY(libxl__poller) active_entry;
     bool fds_deregistered;
+    bool osevents_added;
 };
 
 struct libxl__gc {
@@ -2350,7 +2364,10 @@ _hidden libxl_device_model_version 
libxl__default_device_model(libxl__gc *gc);
         LIBXL_STAILQ_INIT(&(egc).ev_immediates);        \
     } while(0)
 
-_hidden void libxl__egc_cleanup(libxl__egc *egc);
+_hidden void libxl__egc_ao_cleanup_1_baton(libxl__gc *gc);
+  /* Passes the baton for added osevents.  See comment for
+   * osevents_added in struct libxl__poller. */
+_hidden void libxl__egc_cleanup_2_ul_cb_gc(libxl__egc *egc);
   /* Frees memory allocated within this egc's gc, and and report all
    * occurred events via callback, if applicable.  May reenter the
    * application; see restrictions above.  The ctx must be UNLOCKED. */
@@ -2361,9 +2378,11 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
     libxl__egc egc[1]; LIBXL_INIT_EGC(egc[0],ctx);      \
     EGC_GC
 
-#define EGC_FREE           libxl__egc_cleanup(egc)
-
-#define CTX_UNLOCK_EGC_FREE  do{ CTX_UNLOCK; EGC_FREE; }while(0)
+#define CTX_UNLOCK_EGC_FREE  do{                        \
+        libxl__egc_ao_cleanup_1_baton(&egc->gc);        \
+        CTX_UNLOCK;                                     \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);             \
+    }while(0)
 
 
 /*
@@ -2468,8 +2487,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
 
 #define AO_INPROGRESS ({                                        \
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
+        /* __ao_inprogress will do egc..1_baton if needed */   \
         CTX_UNLOCK;                                             \
-        EGC_FREE;                                               \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);                     \
         CTX_LOCK;                                               \
         int ao__rc = libxl__ao_inprogress(ao,                   \
                                __FILE__, __LINE__, __func__);   \
@@ -2481,8 +2501,9 @@ _hidden void libxl__egc_cleanup(libxl__egc *egc);
         libxl_ctx *ao__ctx = libxl__gc_owner(&ao->gc);          \
         assert(rc);                                             \
         libxl__ao_create_fail(ao);                              \
+        libxl__egc_ao_cleanup_1_baton(&egc->gc);                \
         libxl__ctx_unlock(ao__ctx); /* gc is now invalid */     \
-        EGC_FREE;                                               \
+        libxl__egc_cleanup_2_ul_cb_gc(egc);                     \
         (rc);                                                   \
     })
 
-- 
2.11.0


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.