
Re: [Xen-devel] [Xen-users] 1000 Domains: Not able to access Domu via xm console from Dom0



On Mon, 2012-12-17 at 11:56 +0000, Ian Campbell wrote:
> On Fri, 2012-12-14 at 13:06 +0000, Paul Harvey wrote:
> > Program received signal SIGABRT, Aborted.
> > 0x00007fe588ca8425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> > (gdb) bt
> > #0  0x00007fe588ca8425 in raise () from /lib/x86_64-linux-gnu/libc.so.6
> > #1  0x00007fe588cabb8b in abort () from /lib/x86_64-linux-gnu/libc.so.6
> > #2  0x00007fe588ce639e in ?? () from /lib/x86_64-linux-gnu/libc.so.6
> > #3  0x00007fe588d7c807 in __fortify_fail () from /lib/x86_64-linux-gnu/libc.so.6
> > #4  0x00007fe588d7b700 in __chk_fail () from /lib/x86_64-linux-gnu/libc.so.6
> > #5  0x00007fe588d7c7be in __fdelt_warn () from /lib/x86_64-linux-gnu/libc.so.6
> > #6  0x0000000000403ca8 in handle_io () at daemon/io.c:1059
> > #7  0x00000000004021c5 in main (argc=2, argv=0x7fff58691d48) at daemon/main.c:166
> 
> daemon/io.c:1059 in 4.1.2 is:
>                                     FD_ISSET(xc_evtchn_fd(d->xce_handle),
>                                              &readfds))
>                                         handle_ring_read(d);
> 
> I rather suspect this is overrunning the readfds array.
> http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/sys_select.h.html 
> suggests this is sized by FD_SETSIZE. On my system that appears to be 
> statically 1024 (at least strace doesn't show a syscall to determine it in a 
> simple test app although grep /usr/include suggests that might be an option 
> on some systems).
> 
> It doesn't seem likely that there will be a simple solution to this. We
> probably need to switch to something other than select(2). poll(2) seems
> to handle arbitrary numbers of file descriptors. epoll(7) would be nice
> (it supposedly scales better than poll) but is Linux specific. Another
> option might be to fork multiple worker processes (might be a good idea
> if xenconsole becomes a bottleneck).

libevent wraps the different event APIs and provides a consistent
interface across OSes. But I don't know whether adding libevent as a
dependency of the Xen tools is a good idea.
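
For reference, the poll(2) option mentioned above might look roughly like
the sketch below. It is not taken from the Xen tree: count_readable() and
its parameters are hypothetical names used only for illustration. The point
is that poll(2) takes a caller-sized array of struct pollfd, so descriptor
values at or above FD_SETSIZE do not overrun a fixed-size bitmap the way
FD_SET/FD_ISSET do.

```c
#include <poll.h>
#include <stdlib.h>

/* Hypothetical helper, not from xenconsoled: report how many of the
 * given descriptors are readable within timeout_ms milliseconds.
 * The pollfd array is sized by the caller's count, so there is no
 * FD_SETSIZE ceiling on descriptor values. Returns -1 on error. */
static int count_readable(const int *fds, size_t nfds, int timeout_ms)
{
    struct pollfd *pfds;
    size_t i;
    int ready = 0;

    /* Allocate at least one entry so nfds == 0 is well defined. */
    pfds = calloc(nfds ? nfds : 1, sizeof(*pfds));
    if (pfds == NULL)
        return -1;

    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
    }

    if (poll(pfds, nfds, timeout_ms) < 0) {
        free(pfds);
        return -1;
    }

    /* The equivalent of the FD_ISSET() test, one entry per descriptor. */
    for (i = 0; i < nfds; i++)
        if (pfds[i].revents & POLLIN)
            ready++;

    free(pfds);
    return ready;
}
```

A real daemon would presumably keep the pollfd array around between
iterations instead of rebuilding it on every call, and would also need the
process fd limit (RLIMIT_NOFILE) raised to actually open that many
descriptors.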

> It seems likely (based on a quick grep) that xenstore (both the C
> and OCaml variants) will suffer from the same issue.
> 

Yes, I ran a test and hit this limit in both Xenstored and Xenconsoled.

> I'm not sure why we have an evtchn handle per guest, other than this
> comment which suggests it was simply expedient rather than a good
> design:
>       /* Opening evtchn independently for each console is a bit
>        * wasteful, but that's how the code is structured... */
>       dom->xce_handle = xc_evtchn_open(NULL, 0);
>       if (dom->xce_handle == NULL) {
>               err = errno;
>               goto out;
>       }
> However this is just one open fd which scales with number of domains
> (the others are the pty related ones) so just fixing this would just buy
> a bit more time but not fix the underlying issue.
> 

Even if you work around this problem, you will still hit the same limit
in Xenstore. So the underlying issue has to be fixed.


Wei.



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
