[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH 4/4] libxl: More probably detect reentry of destroyed ctx
On Mon, Feb 09, 2015 at 03:51:12PM +0000, Ian Jackson wrote: > In libxl_ctx_free: > > 1. Move the GC_INIT earlier, so that we can: > > 2. Take the lock around most of the work. This is technically > unnecessary since calling any other libxl entrypoint on a ctx being > passed to libxl_ctx_free risks undefined behaviour. But, taking > the lock allows us to much more usually spot this. > > 3. Increment osevent_in_hook by 1000. If we are reentered after > destroy, this will trip some of our entrypoints' asserts. It also > means that if we crash for some other reason relating to reentering > a destroyed ctx, the cause will be more obviously visible by > examining ctx->osevent_in_hook (assuming that the memory previously > used for the ctx hasn't been reused and overwritten). > > 4. Free the lock again. (pthread_mutex_destroy requires that the > mutex be unlocked.) > > With this patch, I find that an occasional race previously reported > as: > libvirtd: libxl_internal.h:3265: libxl__ctx_unlock: Assertion `!r' failed. > is now reported as: > libvirtd: libxl_event.c:1236: libxl_osevent_occurred_fd: Assertion > `!libxl__gc_owner(gc)->osevent_in_hook' failed. > > Examining the call trace with gdb shows this: > (gdb) bt > #0 0xb773f424 in __kernel_vsyscall () > #1 0xb7022871 in raise () from /lib/i386-linux-gnu/i686/nosegneg/libc.so.6 > #2 0xb7025d62 in abort () from /lib/i386-linux-gnu/i686/nosegneg/libc.so.6 > #3 0xb701b958 in __assert_fail () from > /lib/i386-linux-gnu/i686/nosegneg/libc.so.6 > #4 0xb6f00390 in libxl_osevent_occurred_fd (ctx=0xb84813a8, > for_libxl=0xb84d6848, fd=31, events_ign=0, revents_ign=1) at > libxl_event.c:1236 > #5 0xb1b70464 in libxlDomainObjFDEventCallback () from > /usr/local/lib/libvirt/connection-driver/libvirt_driver_libxl.so > #6 0xb72163b1 in virEventPollDispatchHandles () from > /usr/local/lib/libvirt.so.0 > #7 0xb7216be5 in virEventPollRunOnce () from /usr/local/lib/libvirt.so.0 > #8 0xb7214a7e in virEventRunDefaultImpl () from /usr/local/lib/libvirt.so.0 > #9 0xb77c7b98 in virNetServerRun () > #10 0xb7771c63 in main () > (gdb) print ctx->osevent_in_hook > $2 = 1000 > (gdb) > which IMO demonstrates that libxl_osevent_occurred_fd is being called > on a destroyed ctx. > > This is probably a symptom of the bug in libvirt fixed by these > patches: > https://www.redhat.com/archives/libvir-list/2015-February/msg00024.html > > Signed-off-by: Ian Jackson <Ian.Jackson@xxxxxxxxxxxxx> > CC: Ian Campbell <ian.campbell@xxxxxxxxxx> > CC: Wei Liu <wei.liu2@xxxxxxxxxx> > CC: Jim Fehlig <jfehlig@xxxxxxxx> Acked-by: Wei Liu <wei.liu2@xxxxxxxxxx> > --- > tools/libxl/libxl.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/tools/libxl/libxl.c b/tools/libxl/libxl.c > index cd6f42c..de75ac4 100644 > --- a/tools/libxl/libxl.c > +++ b/tools/libxl/libxl.c > @@ -146,11 +146,13 @@ int libxl_ctx_free(libxl_ctx *ctx) > { > if (!ctx) return 0; > > - assert(!ctx->osevent_in_hook); > - > int i; > GC_INIT(ctx); > > + CTX_LOCK; > + assert(!ctx->osevent_in_hook); > + CTX->osevent_in_hook += 1000; > + It would be good if you add comment here to say this is used to help debugging. Wei. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |