
Re: [Xen-devel] Xen at scale



> 
> I'm using Xen for our cluster management project.  At the moment 219
> xen-1.2 nodes  are up (my interest in multicast is to udpcast virtual disk
> images to all those hosts).  At this time I'm just running two domains on
> each host but that will be going up soon (as well as the total number of 
> hosts).
> 
> For the most part all is well.  Sometimes a domain will spontaneously stop and
> I've yet to figure out why.  The leading suspect is memory pressure.

That's worrying. Please try to find a reliable way to reproduce it,
then try a debug build.

We've seen some weird hangs under extreme conditions with NFS
root, but we can reproduce these on stock Linux :-(

> My wish list based on my experience so far:
>        HIGHMEM4G - This is my top wish, and I will work on it myself
>             when I get a chance, though I can't say when that
>             will be.

Agreed. It should just be a case of putting the CONFIG_HIGHMEM4G
stuff back into the following files:

./include/asm-xeno/fixmap.h
./include/asm-xeno/highmem.h
./include/asm-xeno/pgtable.h
./include/asm-xeno/page.h
./arch/xeno/config.in
./arch/xeno/mm/init.c
./arch/xeno/kernel/setup.c
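
A quick way to see which of those files currently lack the symbol is a
small script along these lines. This is only a sketch: the function
name and the tree-root argument are made up for illustration and are
not part of any Xen tool.

```python
import os

# The files named above, relative to the root of the xenolinux tree.
HIGHMEM_FILES = [
    "include/asm-xeno/fixmap.h",
    "include/asm-xeno/highmem.h",
    "include/asm-xeno/pgtable.h",
    "include/asm-xeno/page.h",
    "arch/xeno/config.in",
    "arch/xeno/mm/init.c",
    "arch/xeno/kernel/setup.c",
]


def files_lacking(root, symbol="CONFIG_HIGHMEM4G", paths=HIGHMEM_FILES):
    """Return the listed paths that are missing or never mention symbol."""
    lacking = []
    for rel in paths:
        full = os.path.join(root, rel)
        try:
            with open(full, errors="replace") as fh:
                text = fh.read()
        except OSError:
            # A file that cannot be read counts as lacking the symbol.
            lacking.append(rel)
            continue
        if symbol not in text:
            lacking.append(rel)
    return lacking
```

Running files_lacking() against the tree root before and after the
change gives a rough checklist; it only does a plain substring search,
so it says nothing about whether the code is actually right.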


>        halt - Xen is unstoppable!!!!   :-)
>              We had an A/C outage the other night, and while trying to
>              remotely shut down sections of the cluster I discovered that
>              halting DOM0 just caused Xen to reboot.

I guess we need an exit code from domains...

>        system info/logging -
>            It would be very handy to have total phys memory reported
>            somehow.   The cluster is heterogeneous, and memory upgrades
>            happen, and the net result is I rely on /proc/meminfo to know
>            what is in each host.  The xenolinux meminfo just says what
>            that domain was allocated.   Is total RAM reported somewhere
>            I don't know about?

The xc_physinfo library call returns this (via the DOM0_PHYSINFO
hypercall). It should be easy to knock up a python wrapper for
this. Would you mind having a go?
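
The binding to the C call is the real work and is not sketched here.
For the reporting side, something like the following would do,
assuming the wrapper hands back a dict holding a count of physical
pages. The key name 'total_pages' and the 4 KiB page size are
assumptions for illustration, not a statement of what the library
returns.

```python
PAGE_SIZE = 4096  # bytes; assumes 4 KiB pages, as on 32-bit x86


def total_ram_mib(physinfo, page_size=PAGE_SIZE):
    """Convert a page count into whole MiB.

    'physinfo' is a hypothetical dict from the wrapper; the
    'total_pages' key is an assumed name.
    """
    return physinfo["total_pages"] * page_size // (1024 * 1024)
```

For example, total_ram_mib({"total_pages": 262144}) gives 1024, i.e. a
host with 1 GiB of RAM.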


>            Viewing the vfr rules without resorting to xen_read_console
>            would be handy as well.

This should be much better under the new IO world...
 
>            Getting xen console output to a log file would be useful.
>            Running hundreds of xen_read_console is not so practical.
>            This would help in identifying those mysterious reboots.
>            Most hosts in this cluster do not have serial lines or heads
>            attached.  I think I saw console output was changing in 1.3
>            so maybe this has been worked on already.

xen_dmesg.py any good?

>        multicast - as mentioned
> 
>        FreeBSD - I'd love a freebsd domain.  I know y'all said this was
>            in the works.  Is there any current status?

NetBSD not good enough for you? ;-)


> Perhaps some of this is already there, and I just don't know.  More docs
> are always welcome.  My own documentation track record is not so great so I
> can sympathize on this one.

We'd *love* help on user docs.

Best,
Ian


