
Re: [Xen-API] Debugging XAPI daemon crash



Hi Ranjeet,

> It seems to be crashing at the same point you had mentioned. Please
> find the SEGV backtrace below.
> 
> (gdb) c
> Program received signal SIGSEGV, Segmentation fault.
> 0x085bc2d6 in stub_if_getaddr ()
>  (gdb) bt
> #0  0x085cca90 in segv_handler ()
> #1  <signal handler called>
> #2  0x085bc2d6 in stub_if_getaddr ()
> #3  0x0850ef8c in camlNetdev__get_all_ipv4_1325 ()
> 
> You had mentioned that this could be because of a bad C function binding.
> I wrote a small C stub to see whether it works for the xenbr0 interface
> and it seems to be working fine. How should I verify the binding?

The function that is failing seems to be this one: 
https://github.com/xapi-project/xen-api-libs/blob/clearwater/netdev/addr_stubs.c#L74

It has:

    int ret;
    struct ifaddrs *ifaddrs, *tmp;
    [...]
    ret = getifaddrs(&ifaddrs);
    [...]
    for (tmp = ifaddrs; tmp; tmp = tmp->ifa_next) {
        sock = tmp->ifa_addr;
        netmask = tmp->ifa_netmask;
        [...]

Could it be that the getifaddrs function does not set ifaddrs correctly? You 
should be able to test this with a small C program. Or is this what you have 
already done?
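
One thing worth checking in such a test: the getifaddrs(3) man page notes
that ifa_addr may be a null pointer for an entry, which would fit the
"segfault at 0" in your dmesg output if the stub dereferences it unchecked.
I have not confirmed that this is what addr_stubs.c trips over, so treat the
following as a minimal sketch only. The names entry_family and
report_ifaddrs are made up for illustration and are not part of
xen-api-libs.

```c
#include <ifaddrs.h>
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: address family of an ifaddrs entry, or AF_UNSPEC
 * when the entry carries no address (ifa_addr == NULL). It never
 * dereferences a NULL pointer. */
int entry_family(const struct ifaddrs *ifa)
{
    if (ifa == NULL || ifa->ifa_addr == NULL)
        return AF_UNSPEC;
    return ifa->ifa_addr->sa_family;
}

/* Hypothetical helper: walk the list returned by getifaddrs(), print one
 * line per entry to `out`, and return the number of entries whose
 * ifa_addr is NULL, or -1 if getifaddrs() itself fails. */
int report_ifaddrs(FILE *out)
{
    struct ifaddrs *list = NULL, *tmp;
    int null_addrs = 0;

    if (getifaddrs(&list) != 0) {
        perror("getifaddrs");
        return -1;
    }
    for (tmp = list; tmp != NULL; tmp = tmp->ifa_next) {
        if (tmp->ifa_addr == NULL) {
            /* An unchecked tmp->ifa_addr->sa_family would fault here. */
            null_addrs++;
            fprintf(out, "%-12s ifa_addr=NULL\n", tmp->ifa_name);
            continue;
        }
        fprintf(out, "%-12s family=%d netmask=%s\n", tmp->ifa_name,
                entry_family(tmp), tmp->ifa_netmask ? "set" : "NULL");
    }
    freeifaddrs(list);
    return null_addrs;
}
```

Calling report_ifaddrs(stdout) from main on the affected host and looking
for "ifa_addr=NULL" lines would show whether any interface comes back
without an address.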

Cheers,
Rob

> Appreciate your help.
> 
> -Ranjeet
> 
> -----Original Message-----
> From: David Scott [mailto:dave.scott@xxxxxxxxxxxxx]
> Sent: Monday, March 24, 2014 3:46 AM
> To: Ranjeet R
> Cc: xen-api@xxxxxxxxxxxxx
> Subject: Re: [Xen-API] Debugging XAPI daemon crash
> 
> On 24/03/14 10:30, Ranjeet R wrote:
> > Hello Dave
> >
> > The binaries did not have debug symbols but I managed to rebuild the
> > binaries with debug enabled.
> 
> Great.
> 
> > I tried starting the xapi process under gdb the same way the init.d
> > scripts start it. However, in gdb, the xapi process forks another process
> > and I am not able to debug it further (I tried setting detach-on-fork to
> > off in gdb, but the primary process just runs to the end of execution).
> >
> > I am using the following gdb command to debug
> >
> > gdb --args /usr/sbin/xapi -daemon -writeinitcomplete
> > /var/run/xapi_init_complete.cookie -writereadyfile
> > /var/run/xapi_startup.cookie -onsystemboot
> >
> > Can you please help me with the steps that you use to debug the XAPI
> > process?
> 
> Ah, I think xapi forks a "watchdog" process near the start -- this is
> probably what you're seeing.
> 
> Try adding a "-nowatchdog" option to the command-line.
> 
> Dave
> 
> >
> > Thanks for your help,
> >
> > -Ranjeet
> >
> >
> >
> > -----Original Message-----
> > From: Dave Scott [mailto:Dave.Scott@xxxxxxxxxx]
> > Sent: Saturday, March 22, 2014 12:36 PM
> > To: Ranjeet R
> > Cc: xen-api@xxxxxxxxxxxxx
> > Subject: Re: [Xen-API] Debugging XAPI daemon crash
> >
> > Hi,
> >
> > I suspect the segfault is being caused by a bad C function binding. I've
> > seen a similar crash before when querying an interface IP via getifaddrs
> > (I think that was the function name). Could you run xapi in gdb and
> > reproduce the crash? Printing the call stack would help to confirm this
> > hypothesis. Provided the xapi binary still has debug symbols (i.e. hasn't
> > been stripped), the OCaml functions (with fairly obvious mangled names)
> > should be on the stack too.
> >
> > Cheers,
> > Dave
> >
> >> On Mar 22, 2014, at 3:47 AM, "Ranjeet R" <rranjeet@xxxxxxxxxxx> wrote:
> >>
> >> Hello all
> >>
> >> I am trying to bring up a DevCloud setup which has an XCP Kronos based
> >> XAPI daemon. I had changed the underlying network implementation (it is
> >> not a bridge, but an openvswitch-like network implementation) and the XAPI
> >> daemon crashes during bootup. Please find the XAPI logs below.
> >>
> >>
> >> starting up database engine D:72969b3eaf8e|redo_log] Flushing database to all active redo-logs
> >> starting up database engine D:72969b3eaf8e|xapi] About to flush database: /var/lib/xcp/state.db
> >> starting up database engine D:72969b3eaf8e|redo_log] Flushing database to all active redo-logs
> >> starting up database engine D:72969b3eaf8e|xapi] Performing initial DB GC
> >> thread_zero|dbsync (update_env) D:fd0aec7399c9|dbsync] Sync: sync_create_localhost
> >> dbsync (update_env) D:fd0aec7399c9|dbsync] creating localhost
> >>
> >> dmesg logs seem to suggest that xapi is crashing during startup.
> >>
> >> [    9.092377] xapi[2813]: segfault at 0 ip 085bc286 sp bf80ae30 error 4 in xapi[8048000+59f000]
> >> [    9.869971] xapi[2943]: segfault at 0 ip 085bc286 sp bf8ec450 error 4 in xapi[8048000+59f000]
> >>
> >> I looked at the XAPI code to see where it fails and I don't see any logs
> >> after the following point in ocaml/xapi/dbsync_slave.ml:
> >>
> >> let create_localhost ~__context info =
> >>    let ip = get_my_ip_addr ~__context in
> >>
> >> I confirmed that "ifconfig xenbr0" shows a valid management IP
> >> address, so this call should not fail.
> >>
> >> How do I debug this crash further? Is there any way to look at the
> >> stack trace where XAPI crashed? Any pointers to debug this further will be
> >> very helpful.
> >>
> >> -Ranjeet
> >>
> >>
> >> _______________________________________________
> >> Xen-api mailing list
> >> Xen-api@xxxxxxxxxxxxx
> >> http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api
> >
> >
> >
> 
> 
> 
> 
> 
