
Re: [Xen-API] XCP 1.1 - poor disk performance, high load and wa


  • To: xen-api@xxxxxxxxxxxxx
  • From: SpamMePlease PleasePlease <spankthespam@xxxxxxxxx>
  • Date: Sat, 22 Sep 2012 19:36:23 +0100
  • Delivery-date: Sat, 22 Sep 2012 18:36:41 +0000
  • List-id: User and development list for XCP and XAPI <xen-api.lists.xen.org>

Hi all,

Since I am still looking for a solution to this mystery, I've posted
some additional data about the hardware at http://pastebin.com/Z4BVShvF
- there you can find the output of lspci, dmidecode and dmesg.

As a side note, I've reinstalled the OS with 4K sectors in mind and
with GPT, and performance is still crawling compared to the
installation on the older hardware (the same VM imports/exports in
~50m here vs ~3m there).
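
In case it is useful to anyone checking the same thing, this is roughly how I
verify the sector size and partition alignment from dom0 (assuming the disk
shows up as sda; the parted align-check subcommand needs a reasonably recent
parted):

  # logical vs physical sector size as reported by the kernel
  cat /sys/block/sda/queue/logical_block_size
  cat /sys/block/sda/queue/physical_block_size

  # partition start offsets are reported in 512-byte sectors;
  # on a 4K-sector drive each start should be a multiple of 8
  for p in /sys/block/sda/sda*; do
      start=$(cat "$p/start")
      echo "$(basename "$p"): start=$start aligned=$(( start % 8 == 0 ))"
  done

  # parted can run the same check per partition (here for partition 1)
  parted /dev/sda align-check optimal 1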

I really hope someone can shed some light on what's happening and how to fix it.

Cheers, S.

On Fri, Sep 21, 2012 at 3:18 PM, SpamMePlease PleasePlease
<spankthespam@xxxxxxxxx> wrote:
> Actually, I've reinstalled the OS without md RAID (despite the fact that I
> have this configuration working perfectly fine on another server) and
> I still get extremely poor performance when importing VMs:
>
> * the process takes over an hour for a VM that was exported in ~3 minutes
> * top looks like this when importing the VM into a fresh, empty (no
> running VMs) system:
>
> top - 16:09:40 up  1:49,  3 users,  load average: 1.36, 1.44, 1.38
> Tasks: 134 total,   1 running, 133 sleeping,   0 stopped,   0 zombie
> Cpu(s):  0.3%us,  0.0%sy,  0.0%ni, 60.7%id, 39.0%wa,  0.0%hi,  0.0%si,  0.0%st
> Mem:    771328k total,   763052k used,     8276k free,   269656k buffers
> Swap:   524280k total,        0k used,   524280k free,   305472k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>  6433 root      20   0  217m  25m 8348 S  0.7  3.4   0:35.63 xapi
>    92 root      20   0     0    0    0 S  0.3  0.0   0:00.02 bdi-default
> 11775 root      20   0  2036 1072  584 S  0.3  0.1   0:01.63 xe
> 15058 root      20   0  2424 1120  832 S  0.3  0.1   0:06.77 top
> 17127 root      20   0  2424 1108  828 R  0.3  0.1   0:00.13 top
>
> * iostat looks like this:
>
> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>            0.15    0.00    0.20   44.85    0.05   54.75
>
> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await  svctm  %util
> sda               0.00     0.40    2.40    2.40  1864.00   575.60   508.25     2.97   649.58 208.33 100.00
> sda1              0.00     0.00    2.40    0.40  1864.00    11.20   669.71     0.82   346.43 281.43  78.80
> sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00
> sda3              0.00     0.40    0.00    2.00     0.00   564.40   282.20     2.14  1074.00 500.00 100.00
> sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00
> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00
> dm-1              0.00     0.00    0.00    2.20     0.00   759.20   345.09     2.57   976.36 454.55 100.00
> tda               0.00     0.00    1.00    8.60     8.00   756.80    79.67    16.00  1546.04 104.17 100.00
> xvda              0.00   130.00    1.00    8.60     8.00   756.80    79.67   148.83 10160.42 104.17 100.00
>
> * smartctl shows both disks to be in perfect health
> * hdparm reports decent speeds on the raw sda/sdb devices (~160 MB/s; rough
> commands below)
> * it was pointed out to me that the drives are new 4K-sector ones, so
> I've modified the install.img .py files to accommodate that change in a few
> places, and will try to reinstall the machine afterwards
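>
> For completeness, the health and speed checks above were along these lines
> (a rough sketch from memory; device names assumed):
>
>   # SMART health summary and full attribute dump
>   smartctl -H /dev/sda
>   smartctl -a /dev/sdb
>
>   # raw sequential read speed, cached (-T) and buffered device reads (-t)
>   hdparm -tT /dev/sda
>   hdparm -tT /dev/sdb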
>
> Any clue?
> S.
>
> On Fri, Sep 21, 2012 at 3:17 PM, SpamMePlease PleasePlease
> <spankthespam@xxxxxxxxx> wrote:
>> Actually, I've reinstalled the OS without md RAID (despite the fact that I
>> have this configuration working perfectly fine on another server) and
>> I still get extremely poor performance when importing VMs:
>>
>> * the process takes over an hour for a VM that was exported in ~3 minutes
>> * top looks like this when importing the VM into a fresh, empty (no
>> running VMs) system:
>>
>> top - 16:09:40 up  1:49,  3 users,  load average: 1.36, 1.44, 1.38
>> Tasks: 134 total,   1 running, 133 sleeping,   0 stopped,   0 zombie
>> Cpu(s):  0.3%us,  0.0%sy,  0.0%ni, 60.7%id, 39.0%wa,  0.0%hi,  0.0%si,  0.0%st
>> Mem:    771328k total,   763052k used,     8276k free,   269656k buffers
>> Swap:   524280k total,        0k used,   524280k free,   305472k cached
>>
>>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
>>  6433 root      20   0  217m  25m 8348 S  0.7  3.4   0:35.63 xapi
>>    92 root      20   0     0    0    0 S  0.3  0.0   0:00.02 bdi-default
>> 11775 root      20   0  2036 1072  584 S  0.3  0.1   0:01.63 xe
>> 15058 root      20   0  2424 1120  832 S  0.3  0.1   0:06.77 top
>> 17127 root      20   0  2424 1108  828 R  0.3  0.1   0:00.13 top
>>
>> * iostat looks like this:
>>
>> avg-cpu:  %user   %nice %system %iowait  %steal   %idle
>>            0.15    0.00    0.20   44.85    0.05   54.75
>>
>> Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz    await  svctm  %util
>> sda               0.00     0.40    2.40    2.40  1864.00   575.60   508.25     2.97   649.58 208.33 100.00
>> sda1              0.00     0.00    2.40    0.40  1864.00    11.20   669.71     0.82   346.43 281.43  78.80
>> sda2              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00
>> sda3              0.00     0.40    0.00    2.00     0.00   564.40   282.20     2.14  1074.00 500.00 100.00
>> sdb               0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00
>> dm-0              0.00     0.00    0.00    0.00     0.00     0.00     0.00     0.00     0.00   0.00   0.00
>> dm-1              0.00     0.00    0.00    2.20     0.00   759.20   345.09     2.57   976.36 454.55 100.00
>> tda               0.00     0.00    1.00    8.60     8.00   756.80    79.67    16.00  1546.04 104.17 100.00
>> xvda              0.00   130.00    1.00    8.60     8.00   756.80    79.67   148.83 10160.42 104.17 100.00
>>
>> * smartctl shows both disks to be in perfect health
>> * hdparm reports decent speeds on the raw sda/sdb devices (~160 MB/s)
>> * it was pointed out to me that the drives are new 4K-sector ones, so
>> I've modified the install.img .py files to accommodate that change in a few
>> places, and will try to reinstall the machine afterwards
>>
>> Any clue?
>> S.
>>
>> On Mon, Sep 17, 2012 at 2:44 PM, Denis Cardon
>> <denis.cardon@xxxxxxxxxxxxxxxxxxxxxx> wrote:
>>> Hi George,
>>>
>>>> By default XCP 1.1 does not support software RAID.
>>>>
>>>> Under certain conditions you can use it, but you need to know that you are
>>>> going into deep water. And it's better to know how to swim... I mean,
>>>> understand the internals of XCP.
>>>>
>>>> If you are 'just a user' - do not use software RAID.
>>>>
>>>> If you want some help here - say which device (virtual or real) is
>>>> the bottleneck.
>>>
>>> First, thank you for the time you dedicate to the XCP mailing list.
>>> It is definitely a big advantage for the project to have an active mailing
>>> list.
>>>
>>> I came across the slow md RAID 5 issue with XCP 1.1 a month ago, and
>>> didn't have much time to look into it, since it is a small non-production
>>> dev/test server and I'm using hardware RAID on the production servers.
>>>
>>> As in the initial poster's mail, on my dev server I/O write access goes to
>>> hell and the load average skyrockets even with very light disk writes. However,
>>> the behaviour is not consistent with standard I/O saturation, since parallel
>>> I/O accesses are not much affected... That is to say, I can launch an "iozone
>>> -a -n512m -g512m" on a 256MB VM and at the same time a "find /" still goes
>>> through quite smoothly... Using vmstat on dom0 I sometimes saw up to 60k
>>> blocks per second (4k blocks, I guess), so throughput seems to be acceptable
>>> at times. Note: I have activated the cache on the SATA disks (I know, a bad
>>> idea for production, but OK for me for dev). I have not experienced such
>>> behaviour with installations using hardware RAID.
>>>
>>> If I have some time this week, I'll try to convert the setup to an ext3
>>> partition (thin provisioning) to be able to run iozone directly on the SR.
>>> I know md soft RAID is not a supported configuration, and I agree it should
>>> not be unless the SATA disk cache is deactivated (which gives bad
>>> performance). However, in some very small setups or dev setups, it might
>>> still be interesting.
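>>>
>>> In case it saves someone a lookup, the conversion I have in mind is roughly
>>> the following (a sketch to be double-checked; it implies wiping the existing
>>> local SR on that device first, and the device path is only an example):
>>>
>>>   xe host-list                      # note the host uuid
>>>   xe sr-create host-uuid=<host-uuid> type=ext content-type=user \
>>>       name-label="Local ext SR" device-config:device=/dev/sdb
>>>
>>> With a file-based ext SR the VHDs live on a plain ext3 filesystem, so iozone
>>> can then be run directly against the SR mount point under /var/run/sr-mount.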
>>>
>>> Cheers, and keep up the good work!
>>>
>>> Denis
>>>
>>>>
>>>>
>>>> On 16.09.2012 20:43, SpamMePlease PleasePlease wrote:
>>>> > All,
>>>> >
>>>> > I've installed XCP 1.1 from the latest available ISO on a Hetzner 4S
>>>> > server, using md RAID 1 and LVM for the local storage repository.
>>>> > The problem I'm seeing is extremely poor disk performance - importing
>>>> > a VM file that was exported in ~3m from another XCP 1.1 host (also a
>>>> > Hetzner server, but a somewhat older one, an EQ6) takes up to 2 hours,
>>>> > and in the meantime dom0 becomes almost unusable: the load goes up to 2,
>>>> > it sits at a constant 50% (or higher) of wa(it) and is extremely
>>>> > sluggish.
>>>> >
>>>> > Now, I wouldn't mind the sluggishness of dom0, but 2h for a VM import
>>>> > seems crazy and unacceptable. I've made multiple installations of the
>>>> > server to make sure I am not doing anything wrong, but the same setup
>>>> > works flawlessly on the older machine. I've tested the drives and they
>>>> > seem to be fine, with up to ~170 MB/s throughput on SATA3.
>>>> >
>>>> > Is there anything else I can check to see if it's a hardware problem,
>>>> > or anything that could be configured on dom0 to make it operational
>>>> > and usable?
>>>> >
>>>> > Kind regards,
>>>> > S.
>>>> >

_______________________________________________
Xen-api mailing list
Xen-api@xxxxxxxxxxxxx
http://lists.xen.org/cgi-bin/mailman/listinfo/xen-api


 

