
Re: [Xen-users] 1000 Domains: Not able to access Domu via xm console from Dom0



On Fri, 2012-12-14 at 13:06 +0000, Paul Harvey wrote:
> Program received signal SIGABRT, Aborted.
> 0x00007fe588ca8425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> (gdb) bt
> #0  0x00007fe588ca8425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> #1  0x00007fe588cabb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
> #2  0x00007fe588ce639e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> #3  0x00007fe588d7c807 in __fortify_fail () from /lib/x86_64-linux-gnu/libc.so.6
> #4  0x00007fe588d7b700 in __chk_fail () from /lib/x86_64-linux-gnu/libc.so.6
> #5  0x00007fe588d7c7be in __fdelt_warn () from /lib/x86_64-linux-gnu/libc.so.6
> #6  0x0000000000403ca8 in handle_io () at daemon/io.c:1059
> #7  0x00000000004021c5 in main (argc=2, argv=0x7fff58691d48) at daemon/main.c:166

daemon/io.c:1059 in 4.1.2 is:
                                    FD_ISSET(xc_evtchn_fd(d->xce_handle),
                                             &readfds))
                                        handle_ring_read(d);

I rather suspect this is overrunning the readfds array: the fd being
tested is at or beyond the size of the fd_set, and the fortified
FD_ISSET check (the __fdelt_warn frame in the backtrace) aborts.
http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_select.h.html
suggests the set is sized by FD_SETSIZE. On my system that appears to be
statically 1024: strace on a simple test app doesn't show a syscall to
determine it, although grepping /usr/include suggests it might be
configurable on some systems.
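
To illustrate the limit, here is a rough sketch of the kind of guard
that would turn the abort into a reportable error. The helper name
fd_set_add_checked is made up for this mail and is not something in the
xenconsoled source; it only shows that an fd at or above FD_SETSIZE
cannot safely be put in an fd_set.

```c
#include <stdbool.h>
#include <sys/select.h>

/*
 * Hypothetical helper (not from the Xen tree): add fd to an fd_set only
 * if it fits. FD_SET/FD_ISSET on an fd outside [0, FD_SETSIZE) is
 * undefined, and with glibc fortification it aborts the process.
 */
static bool fd_set_add_checked(int fd, fd_set *set, int *max_fd)
{
    if (fd < 0 || fd >= FD_SETSIZE)
        return false;       /* caller must report or skip this fd */

    FD_SET(fd, set);
    if (fd > *max_fd)
        *max_fd = fd;       /* track the highest fd for select()'s nfds */
    return true;
}
```

This only makes the failure visible; it does not let the daemon serve
more domains than the fd_set can hold.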

It doesn't seem likely that there will be a simple solution to this. We
probably need to switch to something other than select(2). poll(2) seems
to handle arbitrary numbers of file descriptors. epoll(7) would be nice
(it supposedly scales better than poll) but is Linux specific. Another
option might be to fork multiple worker processes (might be a good idea
if xenconsole becomes a bottleneck).
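
For comparison, a minimal sketch of the poll(2) approach. The function
name wait_readable is an assumption made up for this mail, not existing
xenconsoled code. The point is that poll takes an array of struct pollfd
holding plain fd numbers, so nothing is bounded by FD_SETSIZE; the limit
becomes the process's open file limit instead.

```c
#include <poll.h>

/*
 * Hypothetical helper (not from the Xen tree): wait up to timeout_ms for
 * any of the nfds descriptors in pfds to become readable. The caller
 * fills in pfds[i].fd and inspects pfds[i].revents afterwards.
 * Returns the number of descriptors with events, 0 on timeout, or -1 on
 * error with errno set, exactly as poll(2) does.
 */
static int wait_readable(struct pollfd *pfds, nfds_t nfds, int timeout_ms)
{
    for (nfds_t i = 0; i < nfds; i++) {
        pfds[i].events = POLLIN;    /* only interested in input here */
        pfds[i].revents = 0;
    }
    return poll(pfds, nfds, timeout_ms);
}
```

A caller would keep one struct pollfd per evtchn fd and per pty fd, call
wait_readable in the main loop, and then walk the array checking
revents & POLLIN where the current code calls FD_ISSET.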

It seems likely (based on a quick grep) that xenstore (both the C and
OCaml variants) will suffer from the same issue.

I'm not sure why we have an evtchn handle per guest, other than this
comment which suggests it was simply expedient rather than a good
design:
        /* Opening evtchn independently for each console is a bit
         * wasteful, but that's how the code is structured... */
        dom->xce_handle = xc_evtchn_open(NULL, 0);
        if (dom->xce_handle == NULL) {
                err = errno;
                goto out;
        }
However, this is just one of the open fds which scale with the number of
domains (the others are the pty-related ones), so fixing only this would
buy a bit more time but not fix the underlying issue.

Ian.





 

