
Re: [Xen-users] Please help estimate number of the domUs

Thank you very much for the clear and detailed explanations. Many things are clear to me now. I really appreciate it.

As for the number of LUNs, I've read the article at http://blogs.citrix.com/2011/06/01/sizing-luns-a-citrix-perspective/
where it's recommended to create a separate LUN for every 20-30 VMs.

So, for example, 100 VMs could be spread over 4 LUNs storing 25 VMs each. Initially we'll have one XCP host, so with 3.6TB and an average VM size of 30GB there would be 5 LUNs/SRs of 720GB each, though it complicates managing VMs a little. Does that make sense for performance?
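
The arithmetic behind that layout can be checked with a short sketch. It assumes decimal units (1 TB = 1000 GB) and ignores snapshot and metadata overhead; the function name and the 24-VMs-per-LUN figure are illustrative choices within the 20-30 range from the article, not something XCP prescribes.

```python
def lun_layout(total_gb, avg_vm_gb, vms_per_lun):
    """Return (VMs that fit, number of LUNs, size of each LUN in GB)."""
    total_vms = total_gb // avg_vm_gb
    # Ceiling division: a partly filled LUN still counts as a LUN.
    lun_count = -(-total_vms // vms_per_lun)
    lun_size_gb = total_gb / lun_count
    return total_vms, lun_count, lun_size_gb


# 3.6TB of storage, 30GB average VM, 24 VMs per LUN (assumed).
print(lun_layout(3600, 30, 24))
```

With these inputs the storage holds 120 VMs, split over 5 LUNs of 720GB each.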

17.01.2013 01:07, admin@xxxxxxxxxxx wrote:
For web server VPS instances, I usually see real world performance track
most closely with the 4k random 67% write 33% read tests.  The reason
those VPS instances tend to skew toward vastly more writes than reads is
the http log files. The most heavily accessed web pages are cached in
memory in the VPS, so there are fewer read operations hitting the SAN.
The logs still need to be written, though.  The log files are being
written in random bursts when there are lots of different sites, even
though the individual log files are sequential.  These results for your
SAN were 4915 to 6317 (depending on queue depth).  This is the upper end
of my initial guess for your SAN (my guess was 2000 to 5000).  Based on
your benchmarks, your SAN can deliver about 5000 IOPS in the test that I
personally think most closely resembles the real world usage pattern for
a web server running in a VPS.

Note, the "4k random 67%read33%write" is actually mislabeled.  It should
say "4k random 67% write 33% read".

The wild card is database access, especially if you are hosting databases
for people with a variety of skill levels.  If the tables are well
designed and properly indexed, then there will be very little disk
access.  If the tables are poorly designed and not indexed properly,
there will be a lot of disk access.  I have seen some customer sites
that need hundreds of IOPS just to service a tiny amount of traffic due
to poor database design.  On the other hand, I have seen a well designed
(and very well indexed) DB that averages 40 IOPS while servicing
millions of queries per day.

The sequential IO tests are an excellent test for how fast you will be
able to copy large files, which is important when you are migrating a
VPS between multiple SAN targets.  Generally speaking, sequential access
performance is usually far less important than random access
performance.  Random IO is far more common than sequential.  And when
you run a bunch of VPS instances, even sequential IO becomes random IO
simply because of all of the VPS instances accessing different areas of
the storage volume.  So I tend to look more at the random performance.
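
A toy model can illustrate why sequential IO from many VPS instances looks random to the shared volume. This is only a sketch under stated assumptions: each instance reads its own region one block at a time, and the storage serves the instances strictly round-robin. The function name and the region size are hypothetical.

```python
def contiguous_fraction(num_streams, requests_per_stream, region_blocks=1_000_000):
    """Fraction of requests that start right after the previous request's block."""
    arrivals = []
    for step in range(requests_per_stream):
        for stream in range(num_streams):
            # Each stream is perfectly sequential inside its own region.
            arrivals.append(stream * region_blocks + step)
    contiguous = sum(
        1 for prev, cur in zip(arrivals, arrivals[1:]) if cur == prev + 1
    )
    return contiguous / (len(arrivals) - 1)


print(contiguous_fraction(1, 1000))   # a single VPS on the volume
print(contiguous_fraction(50, 1000))  # fifty VPS instances interleaved
```

In this model a single stream is fully contiguous, while fifty interleaved streams leave no two consecutive requests adjacent on the volume. Real schedulers and caches soften this, so treat it as an illustration of the trend rather than a measurement.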

If you have multiple pools, create a separate LUN for each pool. If you
have only one pool, just create one LUN.  This is true regardless of how
many physical XCP/XenServer nodes exist within that pool.  XCP and
XenServer are smart enough to make sure each VPS can only access its own
data blocks even when many VPS share the same LUN.  Once the LUN
exists, add that iSCSI target as a storage repository on the
XCP/XenServer side.  Then create your VPS instances on
the storage repository.  XCP/XenServer will handle everything else under
the hood.  You don't need to manually install a cluster aware file
system or use a separate LUN per VPS.

On 1/16/2013 6:55 AM, Andrey wrote:
Ok. I performed tests with the icf from the Iometer-config-file.zip file (8
workers and 120 GB max file size) on the RAID1+0 LUN, please see attached.
In these tests the IOPS are much lower. What is the real world performance
then? I'm a little confused. Also, is it right that I should not create
one big LUN for VMs, but instead create a few LUNs with LUN size = (20-30
VMs)*(average size of VM) for better performance?

16.01.2013 02:07, admin@xxxxxxxxxxx wrote:
Those numbers are higher than I would have expected given the hardware
you listed.  For mixed random access, I expected your hardware would
have delivered 2000 to 5000, not 49747.  Of course, I test with 100%
random and 67% writes.  You were testing with 60% random and 35%
writes.  There could be considerable caching involved (especially with
read tests), but it is hard to say without more data points.

If you want to run more benchmarks with IOmeter, I would suggest trying
the ones that ZFSBuild uses from
http://www.zfsbuild.com/pics/Graphs/Iometer-config-file.zip . That zip
file contains an IOMeter.icf file.  More details about those benchmarks
are at

Anyway, I am a lot more familiar with the benchmarks from ZFSBuild.  If
you run those benchmarks and post those results, then I could give you a
very good idea what level of real world performance to expect.

Here are some InfiniBand based benchmarks using the ZFSBuild
IOmeter file:

Here are some graphs of single ethernet port benchmarks: (comparing some
hardware from 2010 with hardware from 2012)

On 1/15/2013 2:33 PM, Andrey wrote:
Just finished measuring SAN performance with IOmeter
(http://vmktree.org/iometer/OpenPerformanceTest.icf and 5 minutes each
test) on RAID10 (data, 16GB maximum test file) and RAID50 (backup, 8GB
maximum test file), both 3.6TB with one ext4 partition. The SAN is
configured in dual-path configuration and the server has multipath
configured with 2 HBA adapters. Here are the results:

RAID 5+0:
| Test name                 | Avg IOPS | Avg MBps |
| Max Throughput-100%Read   |    47528 |     1485 |
| RealLife-60%Rand-65%Read  |    24760 |      193 |
| Max Throughput-50%Read    |     6959 |      217 |
| Random-8k-70%Read         |    26612 |      207 |

RAID 1+0:
| Test name                 | Avg IOPS | Avg MBps |
| Max Throughput-100%Read   |    44031 |     1375 |
| RealLife-60%Rand-65%Read  |    49474 |      386 |
| Max Throughput-50%Read    |    43002 |     1343 |
| Random-8k-70%Read         |    49930 |      390 |

Is caching in action, or is it something else?
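
One quick sanity check on results like these is that throughput should equal IOPS times the transfer size. The sketch below derives the implied transfer size for each row of the two tables. It assumes the MBps column is in MiB/s (an assumption; IOmeter output can be configured either way), and the function name is illustrative.

```python
def implied_block_kib(iops, mib_per_s):
    """Average transfer size in KiB implied by an IOPS / throughput pair."""
    return mib_per_s * 1024 / iops


# (Avg IOPS, Avg MBps) copied from the RAID 5+0 and RAID 1+0 results.
results = {
    "RAID 5+0 Max Throughput-100%Read": (47528, 1485),
    "RAID 5+0 RealLife-60%Rand-65%Read": (24760, 193),
    "RAID 5+0 Max Throughput-50%Read": (6959, 217),
    "RAID 5+0 Random-8k-70%Read": (26612, 207),
    "RAID 1+0 Max Throughput-100%Read": (44031, 1375),
    "RAID 1+0 RealLife-60%Rand-65%Read": (49474, 386),
    "RAID 1+0 Max Throughput-50%Read": (43002, 1343),
    "RAID 1+0 Random-8k-70%Read": (49930, 390),
}

for name, (iops, mbps) in results.items():
    print(f"{name}: {implied_block_kib(iops, mbps):.1f} KiB per IO")
```

Every row comes out close to either 32 KiB (the Max Throughput tests) or 8 KiB (the RealLife and Random tests), so the two columns agree with each other. This only checks internal consistency; it says nothing about whether caching inflated the numbers.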

13.01.2013 23:52, admin@xxxxxxxxxxx wrote:
You should measure the performance of the SAN using something like
IOmeter (running IOmeter on the hardware you plan to run XenServer or
XCP on).  Assuming you configure those drives in RAID10, I would guess
that SAN would deliver about 2,000 to 5,000 IOPS.  If you use RAID5
(please don't), then you will see far fewer IOPS during mixed read and
write tests.

If you want to deploy 100 VMs onto that SAN, then each VM is only going
to have 20-50 IOPS (assuming RAID10).  The performance in each VM will
be less than fantastic.  If the VMs need to do any IO intensive tasks,
the owners of the VMs are probably going to complain about sluggish
performance.  I don't think the SAN you listed can deliver enough IOPS
to satisfy 100 VMs.
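
The per-VM budget above is just the SAN's IOPS divided evenly across the VMs. A minimal sketch of that division, and of the inverse question (how many VMs fit a given per-VM target), follows. The even split and the 100 IOPS per-VM target are assumptions for illustration; real workloads are bursty and uneven.

```python
def iops_per_vm(san_iops, vm_count):
    """Even share of the SAN's IOPS available to each VM."""
    return san_iops / vm_count


def max_vms(san_iops, iops_needed_per_vm):
    """How many VMs fit if each one needs a fixed IOPS budget."""
    return san_iops // iops_needed_per_vm


# Estimated SAN range of 2,000 to 5,000 IOPS shared by 100 VMs.
print(iops_per_vm(2000, 100), iops_per_vm(5000, 100))

# Hypothetical target of 100 IOPS per VM (not a figure from this thread).
print(max_vms(2000, 100), max_vms(5000, 100))
```

The first line reproduces the 20-50 IOPS per VM range. With the hypothetical 100 IOPS target, the same SAN would fit only 20 to 50 VMs.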

On 1/13/2013 12:17 PM, Andrey wrote:
Well, storage is the direct-connect HP P2000 G3 FC dual-controller
array with 600GBx24 disks in dual-path configuration (two HBA ports ->
two controller ports). I guess that is quite enough.

13.01.2013 20:45, admin@xxxxxxxxxxx wrote:
You will probably run out of disk IO before you run into any hard
limits in XenServer or XCP.

What type of SAN are you going to use?  What type of network
interconnect will you use to link your XenServer/XCP nodes to your
SAN?  How many IOPS does your SAN deliver over your chosen network
interconnect?

On 1/13/2013 9:03 AM, Andrey wrote:
Sure, will try. I see in the XenServer 6.1 FAQ that the maximum supported
number of guests is 150 and that it requires increasing dom0_mem to the
max of 4096. It's obvious that the internal limits are not quite realistic,
so it will be a good result for me if we are able to run at least 100
guests. That seems a more realistic number, although some resources give
the maximum number of VMs as 4-10 per CPU core (so 32-80 in my case). But
in all these cases 192 GB RAM would be redundant, I think.

With regards, Andrey

11.01.2013 16:43, Wei Liu wrote:
On Fri, 2013-01-11 at 12:24 +0000, Andrey wrote:
Thank you for the answer

I'm really considering the case of creating as many DomUs as possible
with typical load to get practical info.

What about network capacity? Does this math implies to the
resources? Should we shape the DomUs bandwidth to prevent network
overload? Can CPU be bottleneck in this configuration?

The math I did was to show you some internal infrastructure limits
I know.

CPU / network overloading is another topic. TBH I haven't done
tests on CPUs and network.

And whether you will hit any bottlenecks in CPU / network or not
relates closely to your use case. Booting up DomUs and running some
typical workload is a good idea.


Xen-users mailing list


