Xen project Mailing List

[Xen-devel] [PATCH 4/4] libxl: More probably detect reentry of destroyed ctx

From: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>

Date: Mon, 9 Feb 2015 15:51:12 +0000

Cc: Wei Liu <wei.liu2@xxxxxxxxxx>, Jim Fehlig <jfehlig@xxxxxxxx>, Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>

Delivery-date: Mon, 09 Feb 2015 15:51:39 +0000

List-id: Xen developer discussion <xen-devel.lists.xen.org>

In libxl_ctx_free: 1. Move the GC_INIT earlier, so that we can: 2. Take the lock around most of the work. This is technically unnecessary since calling any other libxl entrypoint on a ctx being passed to libxl_ctx_free risks undefined behaviour. But, taking the lock allows us to much more usually spot this. 3. Increment osevent_in_hook by 1000. If we are reentered after destroy, this will trip some of our entrypoints' asserts. It also means that if we crash for some other reason relating to reentering a destroyed ctx, the cause will be more obviously visible by examining ctx->osevent_in_hook (assuming that the memory previously used for the ctx hasn't been reused and overwritten). 4. Free the lock again. (pthread_mutex_destroy requires that the mutex be unlocked.) With this patch, I find that an occasional race previously reported as: libvirtd: libxl_internal.h:3265: libxl__ctx_unlock: Assertion `!r' failed. is now reported as: libvirtd: libxl_event.c:1236: libxl_osevent_occurred_fd: Assertion `!libxl__gc_owner(gc)->osevent_in_hook' failed. Examining the call trace with gdb shows this: (gdb) bt #0 0xb773f424 in __kernel_vsyscall () #1 0xb7022871 in raise () from /lib/i386-linux-gnu/i686/nosegneg/libc.so.6 #2 0xb7025d62 in abort () from /lib/i386-linux-gnu/i686/nosegneg/libc.so.6 #3 0xb701b958 in __assert_fail () from /lib/i386-linux-gnu/i686/nosegneg/libc.so.6 #4 0xb6f00390 in libxl_osevent_occurred_fd (ctx=0xb84813a8, for_libxl=0xb84d6848, fd=31, events_ign=0, revents_ign=1) at libxl_event.c:1236 #5 0xb1b70464 in libxlDomainObjFDEventCallback () from /usr/local/lib/libvirt/connection-driver/libvirt_driver_libxl.so #6 0xb72163b1 in virEventPollDispatchHandles () from /usr/local/lib/libvirt.so.0 #7 0xb7216be5 in virEventPollRunOnce () from /usr/local/lib/libvirt.so.0 #8 0xb7214a7e in virEventRunDefaultImpl () from /usr/local/lib/libvirt.so.0 #9 0xb77c7b98 in virNetServerRun () #10 0xb7771c63 in main () (gdb) print ctx->osevent_in_hook $2 = 1000 (gdb) which IMO demonstrates that libxl_osevent_occurred_fd is being called on a destroyed ctx. This is probably a symptom of the bug in libvirt fixed by these patches: https://www.redhat.com/archives/libvir-list/2015-February/msg00024.html Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> CC: Ian Campbell <ian.campbell@xxxxxxxxxx> CC: Wei Liu <wei.liu2@xxxxxxxxxx> CC: Jim Fehlig <jfehlig@xxxxxxxx> --- tools/libxl/libxl.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c index cd6f42c..de75ac4 100644 --- a/tools/libxl/libxl.c +++ b/tools/libxl/libxl.c @@ -146,11 +146,13 @@ int libxl_ctx_free(libxl_ctx *ctx) { if (!ctx) return 0; - assert(!ctx->osevent_in_hook); - int i; GC_INIT(ctx); + CTX_LOCK; + assert(!ctx->osevent_in_hook); + CTX->osevent_in_hook += 1000; + /* Deregister all libxl__ev_KINDs: */ free_disable_deaths(gc, &CTX->death_list); @@ -196,6 +198,7 @@ int libxl_ctx_free(libxl_ctx *ctx) libxl__sigchld_notneeded(gc); libxl__pipe_close(ctx->sigchld_selfpipe); + CTX_UNLOCK; pthread_mutex_destroy(&ctx->lock); GC_FREE; -- 1.7.10.4 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.