
Re: [Xen-users] Please help estimate number of the domUs

To: xen-users@xxxxxxxxxxxxx
From: "admin@xxxxxxxxxxx" <admin@xxxxxxxxxxx>
Date: Thu, 17 Jan 2013 09:11:12 -0600
Delivery-date: Thu, 17 Jan 2013 15:12:25 +0000
List-id: Xen user discussion <xen-users.lists.xen.org>

I have not personally seen a performance issue related to the number of VMs per LUN, but I have never tried 100 VMs per LUN. Certain SANs may implement the access queues in such a way as to effectively make 20 VMs a sweet spot like that article describes, but I suspect it would vary from one SAN to the next. Even though I have not run into that issue, it is very possible that article is correct. Especially in your case with 100 VMs, you may want to follow the advice in that article and split your VMs across several LUNs.

Whatever you do, always leave extra room in each LUN for special operations (such as snapshots, migrations, etc.). For example, any time you need to migrate storage with XCP/XenServer, you will need a lot of disk space available on both the target and source SRs to complete the task. The extra disk space will be allocated and used behind the scenes, but the operation will fail if there is not enough disk space available. If your VMs are each 30GB, plan to leave 30-60GB of additional free space in each LUN so that you have room for those operations.
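As a rough sketch of that sizing rule, the snippet below computes how many LUNs are needed and how large each one should be once headroom is added. The function name and the figure of 25 VMs per LUN are assumptions chosen for illustration (25 falls inside the 20-30 range discussed in this thread); the 30GB VM size and 60GB headroom come from the paragraph above.

```python
import math


def plan_luns(total_vms, vms_per_lun, vm_size_gb, headroom_gb):
    """Return (number of LUNs, size of each LUN in GB).

    Hypothetical helper: each LUN holds vms_per_lun VMs plus a fixed
    amount of free space reserved for snapshots and migrations.
    """
    lun_count = math.ceil(total_vms / vms_per_lun)
    lun_size_gb = vms_per_lun * vm_size_gb + headroom_gb
    return lun_count, lun_size_gb


# 100 VMs of 30GB each, 25 VMs per LUN (assumed), 60GB spare per LUN
print(plan_luns(100, 25, 30, 60))  # (4, 810)
```

With these inputs each LUN needs 810GB rather than 750GB, so four such LUNs fit within 3.6TB only if the headroom is budgeted up front.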


On 1/16/2013 11:31 PM, Andrey wrote:

Thank you very much for the clear and detailed explanations. Many things became clear to me now. I really appreciate that.

As for the number of LUNs, I've read the article http://blogs.citrix.com/2011/06/01/sizing-luns-a-citrix-perspective/ where it's recommended to create a separate LUN for each 20-30 VMs.

So for example, for 100 VMs there can be 4 LUNs, each storing 25 VMs. Initially we'll have one XCP host, so with 3.6TB and an average VM size of 30GB there will be 5 LUNs/SRs of 720GB each, though it complicates managing VMs a little. Does it make sense for performance?


17.01.2013 01:07, admin@xxxxxxxxxxx wrote:

For web server VPS instances, I usually see real world performance trend
most closely with the 4k random 67% write 33% read tests.  The reason
those VPS instances tend to skew toward vastly more writes than reads is
the HTTP log files. The most heavily accessed web pages are cached in
memory in the VPS, so there are fewer read operations hitting the SAN.
The logs still need to be written, though.  The log files are being
written in random bursts when there are lots of different sites, even
though the individual log files are sequential.  These results for your
SAN were 4915 to 6317 (depending on queue depth).  This is the upper end
of my initial guess for your SAN (my guess was 2000 to 5000). Based on
your benchmarks, your SAN can deliver about 5000 IOPS in the test that I
personally think most closely resembles the real world usage pattern for
a web server running in a VPS.

Note, the "4k random 67%read33%write" is actually mislabeled. It should
say "4k random 67% write 33% read".

The wild card is database access, especially if you are hosting databases
for people with a variety of skill levels.  If the tables are well
designed and properly indexed, then there will be very little disk
access.  If the tables are poorly designed and not indexed properly,
there will be a lot of disk access.  I have seen some customer sites
that need hundreds of IOPS just to service a tiny amount of traffic due
to poor database design.  On the other hand, I have seen a well designed
(and very well indexed) DB that averages 40 IOPS while servicing
millions of queries per day.

The sequential IO tests are an excellent test for how fast you will be
able to copy large files, which is important when you are migrating a
VPS between multiple SAN targets.  Generally speaking, sequential access
performance is usually far less important than random access
performance.  Random IO is far more common than sequential.  And when
you run a bunch of VPS instances, even sequential IO becomes random IO
simply because of all of the VPS instances accessing different areas of
the storage volume.  So I tend to look more at the random performance.

If you have multiple pools, create a separate LUN for each pool. If you
have only one pool, just create one LUN.  This is true regardless of how
many physical XCP/XenServer nodes exist within that pool.  XCP and
XenServer are smart enough to make sure each VPS can only access its own
data blocks even when many VPS share the same LUN.  If you have only one
pool, simply create one LUN. Then on the XCP/XenServer side, add that
iSCSI target as a storage repository.  Then create your VPS instances on
the storage repository.  XCP/XenServer will handle everything else under
the hood.  You don't need to manually install a cluster aware file
system or use a separate LUN per VPS.


On 1/16/2013 6:55 AM, Andrey wrote:

Ok. I performed tests with the icf in the Iometer-config-file.zip file (8
workers and 120 GB max file size) on the RAID1+0 LUN, please see attached.
In these tests the IOPS are much lower. What is the real world performance
then? I'm a little confused. Also, is it right that I should not create
one big LUN for VMs, but instead create a few LUNs with LUN size =
(20-30 VMs)*(average size of VM), for better performance?

16.01.2013 02:07, admin@xxxxxxxxxxx wrote:

Those numbers are higher than I would have expected given the hardware
you listed.  For mixed random access, I expected your hardware would
have delivered 2000 to 5000, not 49474.  Of course, I test with 100%
random and 67% writes.  You were testing with 60% random and 35%
writes.  There could be considerable caching involved (especially with
read tests), but it is hard to say without more data points.

If you want to run more benchmarks with IOmeter, I would suggest trying
the ones that ZFSBuild uses from
http://www.zfsbuild.com/pics/Graphs/Iometer-config-file.zip . That zip
file contains an IOMeter.icf file. More details about those benchmarks
are at
http://www.zfsbuild.com/2012/12/14/zfsbuild2012-benchmark-methods/

Anyway, I am a lot more familiar with the benchmarks from ZFSBuild. If
you run those benchmarks and post those results, then I could give you a
very good idea what level of real world performance to expect.

Here are some InfiniBand based benchmarks using that ZFSBuild
IOmeter file:

http://www.zfsbuild.com/2012/12/15/zfsbuild2012-infiniband-performance/

Here are some graphs of single ethernet port benchmarks (comparing some
hardware from 2010 with hardware from 2012):

http://www.zfsbuild.com/2012/12/14/zfsbuild2012-performance-compared-to-zfsbuild2010/





On 1/15/2013 2:33 PM, Andrey wrote:

Just finished measuring SAN performance with IOmeter
(http://vmktree.org/iometer/OpenPerformanceTest.icf, 5 minutes for each
test) on RAID10 (data, 16GB maximum test file) and RAID50 (backup, 8GB
maximum test file), both 3.6TB with one ext4 partition. The SAN is
configured in a dual-path configuration and the server has multipath
configured with 2 HBA adapters. Here are the results:

RAID 5+0:
--------------------------------------------------
| Test name                | Avg IOPS | Avg MBps |
--------------------------------------------------
| Max Throughput-100%Read  |    47528 |     1485 |
| RealLife-60%Rand-65%Read |    24760 |      193 |
| Max Throughput-50%Read   |     6959 |      217 |
| Random-8k-70%Read        |    26612 |      207 |
--------------------------------------------------

RAID 1+0:
--------------------------------------------------
| Test name                | Avg IOPS | Avg MBps |
--------------------------------------------------
| Max Throughput-100%Read  |    44031 |     1375 |
| RealLife-60%Rand-65%Read |    49474 |      386 |
| Max Throughput-50%Read   |    43002 |     1343 |
| Random-8k-70%Read        |    49930 |      390 |
--------------------------------------------------

Is caching in action, or is it something else?
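One quick sanity check on tables like these is to divide throughput by IOPS to recover the average transfer size of each test. The sketch below does that for the figures above; it assumes the MBps column counts 1 MB as 1024 KiB, which is an assumption about the tool's units rather than something stated in this thread, and the function name is made up for illustration.

```python
def avg_transfer_kib(iops, mbps):
    """Average I/O size in KiB implied by an IOPS figure and a MBps figure.

    Assumes 1 MB = 1024 KiB (an assumption about how the tool reports MBps).
    """
    return mbps * 1024 / iops


# (Avg IOPS, Avg MBps) pairs copied from the two result tables
results = {
    "RAID 5+0 Max Throughput-100%Read": (47528, 1485),
    "RAID 5+0 RealLife-60%Rand-65%Read": (24760, 193),
    "RAID 1+0 Max Throughput-100%Read": (44031, 1375),
    "RAID 1+0 Random-8k-70%Read": (49930, 390),
}

for name, (iops, mbps) in results.items():
    print(f"{name}: about {round(avg_transfer_kib(iops, mbps))} KiB per I/O")
```

Under that assumption the throughput tests work out to roughly 32 KiB per operation and the RealLife and Random-8k tests to roughly 8 KiB, so the IOPS and MBps columns are at least consistent with each other. That does not show whether cache is serving the requests.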

13.01.2013 23:52, admin@xxxxxxxxxxx wrote:

You should measure the performance of the SAN using something like
IOmeter (running IOmeter on the hardware you plan to run XenServer or
XCP on). Assuming you configure those drives in RAID10, I would guess
that SAN would deliver about 2,000 to 5,000 IOPS.  If you use RAID5
(please don't), then you will see far fewer IOPS during mixed read and
write tests.

If you want to deploy 100 VMs onto that SAN, then each VM is only going
to have 20-50 IOPS (assuming RAID10). The performance in each VM will
be less than fantastic. If the VMs need to do any IO intensive tasks,
the owners of the VMs are probably going to complain about sluggish
performance. I don't think the SAN you listed can deliver enough IOPS
to satisfy 100 VMs.
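The per-VM figure is simply the SAN's total IOPS divided evenly across the guests. A minimal sketch of that arithmetic, with a hypothetical function name and the simplifying assumption that every VM gets an equal share:

```python
def iops_per_vm(san_iops, vm_count):
    """Even share of the SAN's IOPS per VM (assumes uniform load across VMs)."""
    return san_iops / vm_count


# Low and high ends of the guessed RAID10 range, spread over 100 VMs
for san_iops in (2000, 5000):
    print(f"{san_iops} IOPS / 100 VMs = {iops_per_vm(san_iops, 100):.0f} IOPS per VM")
```

Real workloads are bursty rather than uniform, so this is a planning average and not a guarantee for any single VM.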

On 1/13/2013 12:17 PM, Andrey wrote:

Well, storage is the direct-connect HP P2000 G3 FC dual-controller
array with 600GBx24 disks in dual-path configuration (two HBA ports ->
two controller ports). I guess it is quite enough.

13.01.2013 20:45, admin@xxxxxxxxxxx wrote:

You will probably run out of disk IO before you run into any hard limits
in XenServer or XCP.

What type of SAN are you going to use?  What type of network
interconnect will you use to link your XenServer/XCP nodes to your SAN?
How many IOPS does your SAN deliver over your chosen network
interconnect?

On 1/13/2013 9:03 AM, Andrey wrote:

Sure, will try. I see in the XenServer 6.1 FAQ that the maximum supported
number of guests is 150 and it requires increasing dom0_mem to max
4096. It's obvious that internal limits are not quite realistic, so it
will be a good result for me if we are able to run at least 100 guests. It
seems that it is a more realistic number, although some resources note
the maximum number of VMs as 4-10 per CPU core (so 32-80 in my case). But
in all these cases 192 GB RAM would be redundant, I think.
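The rule-of-thumb numbers in this message can be checked with a short calculation. The sketch below assumes 8 CPU cores (inferred from "32-80 in my case" at 4-10 VMs per core, not stated directly), treats the dom0_mem value of 4096 as MiB, and ignores hypervisor and per-guest overhead. The function names are hypothetical.

```python
def guests_by_cpu(cores, per_core_low=4, per_core_high=10):
    """Range of guest counts from a VMs-per-core rule of thumb."""
    return cores * per_core_low, cores * per_core_high


def ram_per_guest_mib(total_ram_gib, dom0_mib, guests):
    """RAM left per guest after reserving dom0 memory (overheads ignored)."""
    return (total_ram_gib * 1024 - dom0_mib) / guests


print(guests_by_cpu(8))                    # (32, 80), assuming 8 cores
print(ram_per_guest_mib(192, 4096, 100))   # MiB per guest at 100 guests
print(ram_per_guest_mib(192, 4096, 32))    # MiB per guest at 32 guests
```

At 100 guests this leaves a little under 2 GiB per guest, and at 32 guests close to 6 GiB, which gives a sense of whether 192 GB is more than the planned VMs would use.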

With regards, Andrey

11.01.2013 16:43, Wei Liu wrote:

On Fri, 2013-01-11 at 12:24 +0000, Andrey wrote:

Thank you for the answer.

I'm really considering the case of creating as many DomUs as possible
with typical load to get practical info.

What about network capacity? Does this math apply to the network
resources? Should we shape the DomUs' bandwidth to prevent network
overload? Can the CPU be a bottleneck in this configuration?

The math I did was to show you some internal infrastructure limits that
I know.

CPU / network overloading is another topic. TBH I haven't done stress
tests on CPUs and network.

And whether you will hit any bottlenecks in CPU / network or not relates
closely to your use case. Booting up DomUs and running some typical
workload is a good idea.


Wei.


_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users




