
[Xen-API] Fwd: xenbusb_nop_confighook_cb timeout



This needs further and more careful investigation, but it seems to be going in the right direction.


Egoitz Aurrekoetxea
Systems Department
egoitz@xxxxxxxxxx

Begin forwarded message:

From: Egoitz Aurrekoetxea Aurre <egoitz@xxxxxxxxxxxxx>
Subject: Re: xenbusb_nop_confighook_cb timeout
Date: October 10, 2012 22:16:44 GMT+02:00
To: Mark Felder <feld@xxxxxxx>
Bcc: Borja Marcos <BORJAMAR@xxxxxxxxxx>, Santiago Mercado <santi@xxxxxxxxxx>

Hmm...

One thing, guys: I need to check this more slowly and carefully, but...

I think that in subr_autoconf.c, the function boot_run_interrupt_driven_config_hooks enters a while loop and stays there forever, because TAILQ_EMPTY never sees the NULL that the while is waiting for. On top of that, nothing seems to change in the structures involved across the msleep call, perhaps because nothing is supposed to change, so the NULL is still never seen. So it loops six times and ends up stuck there, as if panicked. Look:

root@pruebas:/root # diff -u /usr/src/sys/kern/subr_autoconf.c-defecto /usr/src/sys/kern/subr_autoconf.c
--- /usr/src/sys/kern/subr_autoconf.c-defecto 2012-10-10 13:51:27.000000000 +0200
+++ /usr/src/sys/kern/subr_autoconf.c 2012-10-10 18:21:51.000000000 +0200
@@ -133,16 +133,17 @@
/* Block boot processing until all hooks are disestablished. */
mtx_lock(&intr_config_hook_lock);
warned = 0;
- while (!TAILQ_EMPTY(&intr_config_hook_list)) {
+ /* while (!TAILQ_EMPTY(&intr_config_hook_list)) { */
if (msleep(&intr_config_hook_list, &intr_config_hook_lock,
   0, "conifhk", WARNING_INTERVAL_SECS * hz) ==
   EWOULDBLOCK) {
+ printf("\n\n SARENET Individual lock name antes de unlock es : %s", intr_config_hook_lock.lock_object.lo_name);
mtx_unlock(&intr_config_hook_lock);
warned++;
run_interrupt_driven_config_hooks_warning(warned);
mtx_lock(&intr_config_hook_lock);
}
- }
+ /* } */
mtx_unlock(&intr_config_hook_lock);
}

TAILQ_EMPTY is defined in queue.h:

#define TAILQ_EMPTY(head) ((head)->tqh_first == NULL)
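
As a minimal user-space sketch of what that emptiness test reports, the following uses the TAILQ macros from <sys/queue.h> on a stand-in list. The names demo_hook, demo_hook_list and demo_tailq_empty_transitions are hypothetical and are not taken from the kernel source; the point is only that the list tests as empty until an entry is inserted, and tests as empty again only once that entry is removed.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for the kernel's config hook structure. */
struct demo_hook {
	TAILQ_ENTRY(demo_hook) links;
};

TAILQ_HEAD(demo_hook_list, demo_hook);

/*
 * Returns 1 if the list is empty before the insert, non-empty while
 * the hook is queued, and empty again after the remove; 0 otherwise.
 */
int
demo_tailq_empty_transitions(void)
{
	struct demo_hook_list list;
	struct demo_hook hook;
	int before, during, after;

	TAILQ_INIT(&list);
	before = TAILQ_EMPTY(&list);	/* no hook registered yet */

	TAILQ_INSERT_TAIL(&list, &hook, links);
	during = TAILQ_EMPTY(&list);	/* one hook still pending */

	TAILQ_REMOVE(&list, &hook, links);
	after = TAILQ_EMPTY(&list);	/* hook removed again */

	return (before && !during && after);
}
```

If the sketch is right, a wait loop guarded by !TAILQ_EMPTY(...) can only end once every queued hook has been removed from the list.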

With the printf line I added, I did not see any text in the intr_config_hook_lock.lock_object.lo_name struct member. So I commented out the while loop, as shown in the patch...

and the system is booting :)

root@pruebas:/root #
root@pruebas:/root #
root@pruebas:/root #
root@pruebas:/root # uptime
7:08PM  up 23 secs, 1 user, load averages: 0.96, 0.25, 0.09
root@pruebas:/root # uname -ar
FreeBSD pruebas.sare.net 9.1-RC2 FreeBSD 9.1-RC2 #0: Wed Oct 10 18:33:54 CEST 2012     root@xxxxxxxxxxxxxxxx:/usr/obj/usr/src/sys/XENHVM11  amd64
root@pruebas:/root #

So I'm guessing that it enters the loop because the value is not exactly NULL, stays there until it has tried six times, and then hangs there indefinitely, as if panicked.
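
The guess above can be modelled in user space. This is only a sketch under stated assumptions: each loop iteration stands for one timed-out msleep, the function sim_wait_for_hooks and its parameters finish_after and max_intervals are hypothetical, and the cap passed as max_intervals simply mirrors the six attempts mentioned above rather than anything confirmed from the kernel source.

```c
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-in for a pending config hook. */
struct sim_hook {
	TAILQ_ENTRY(sim_hook) links;
};

TAILQ_HEAD(sim_hook_list, sim_hook);

/*
 * Simulate waiting for one pending hook to be removed.
 *   finish_after:  interval at which the hook removes itself;
 *                  0 means it never does (the suspected stuck case).
 *   max_intervals: cap on simulated timeouts so the model terminates.
 * Returns the number of intervals waited, or -1 if the list was
 * still non-empty when the cap was reached.
 */
int
sim_wait_for_hooks(int finish_after, int max_intervals)
{
	struct sim_hook_list list;
	struct sim_hook hook;
	int warned = 0;

	TAILQ_INIT(&list);
	TAILQ_INSERT_TAIL(&list, &hook, links);

	while (!TAILQ_EMPTY(&list) && warned < max_intervals) {
		warned++;	/* one timed-out sleep, one warning */
		if (finish_after > 0 && warned >= finish_after)
			TAILQ_REMOVE(&list, &hook, links);
	}

	return (TAILQ_EMPTY(&list) ? warned : -1);
}
```

In this model, a hook that removes itself lets the wait finish, while a hook that never does keeps the list non-empty for every interval. That would suggest the hook is never being removed, rather than the NULL comparison itself misbehaving, though this has not been verified against the kernel.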

I have also tried it with FreeBSD 9.0 (RELENG_9_0).

As I said at the beginning, I need to investigate further, but it seems we're going in the right direction.

Cheers,



On 10/10/2012, at 20:56, Mark Felder <feld@xxxxxxx> wrote:

This is also preventing my XCP 1.5beta to 1.6 testing :-(


Any suggestions are appreciated!


_______________________________________________
Xen-api mailing list
Xen-api@xxxxxxxxxxxxx
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api

 

