[Xen-users] I/O performance problem using LVM mirrors to back phy: devices

So, we just moved to some much faster hardware: an Intel Q6600 CPU, 8GB
of unbuffered ECC RAM, and ICH7 SATA (2x1TB disks). We were irritated
and puzzled to find that the new setup had really, really slow I/O.

The odd thing is that the performance is fine if you just mount the LV
directly from the Dom0... but if you xm block-attach it to the Dom0 
and then mount it, you get 1/10th the speed.  

We are running CentOS 5.1, kernel 2.6.18-53.1.14.el5.

full writeup:

We noted that the performance of mirrored logical volumes accessed
through xenblk was about 1/10th that of non-mirrored LVs, or of LVs
mirrored with the --corelog option.  Mirrored LVs performed fine when
accessed normally within the dom0, but performance dropped when
accessed via xm block-attach.  This was, to our minds, ridiculous.
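
For anyone trying to reproduce this, the attach-then-mount path being
compared against a direct mount can be sketched roughly as below. This is
only a sketch: the frontend name (xvda) and the mount point are
placeholders, not necessarily what was used here, and the LV path is the
test volume described in this writeup. The script just prints the commands
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: the frontend device name and mount point are assumptions.
# Prints the commands by default; set DRY_RUN=0 to really run them (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# attach_and_mount <backing LV> <frontend device name> <mount point>
attach_and_mount() {
    lv=$1; front=$2; mnt=$3
    # Domain 0 is the target: the dom0 attaches the LV to itself via xenblk.
    run xm block-attach 0 "phy:$lv" "$front" w
    run mkdir -p "$mnt"
    run mount "/dev/$front" "$mnt"
}

attach_and_mount /dev/test/test_mirror xvda /mnt/test/mirror_xenblk
```

Running the same benchmark against the direct mount and against the
xenblk-backed mount gives the two numbers to compare.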

First, we created two mirrored logical volumes in the volume group "test":
one with the default on-disk mirror log and one with the --corelog option,
which keeps the mirror log in memory.

 # lvcreate -m 1 -L 2G -n test_mirror test
 # lvcreate -m 1 --corelog -L 2G -n test_core test

Then we made filesystems and mounted them:

 # mke2fs -j /dev/test/test_mirror
 # mke2fs -j /dev/test/test_core
 # mkdir -p /mnt/test/mirror
 # mkdir -p /mnt/test/core
 # mount /dev/test/test_mirror /mnt/test/mirror
 # mount /dev/test/test_core /mnt/test/core

Next we started oprofile, instructing it to count BUS_IO_WAIT events:

 # opcontrol --start --event=BUS_IO_WAIT:500:0xc0

Then we ran bonnie++ on each filesystem in sequence, stopping oprofile and
saving the session each time. For the volume with the on-disk mirror log:

 # bonnie++ -d /mnt/test/mirror
 # opcontrol --stop
 # opcontrol --save=iowait_mirror
 # opcontrol --reset
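
The per-volume cycle (start oprofile, run bonnie++, stop, save, reset) can
be wrapped in a small loop. The sketch below is not the exact script that
was run; the iowait_<name> session names are an assumption modelled on the
session name passed to opreport, and the commands are only printed unless
DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: session names (iowait_<name>) are an assumption.
# Prints the commands by default; set DRY_RUN=0 to really run them (as root).
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# profile_run <session suffix> <mount point>
profile_run() {
    name=$1; mnt=$2
    run opcontrol --start --event=BUS_IO_WAIT:500:0xc0
    run bonnie++ -d "$mnt"
    run opcontrol --stop
    run opcontrol --save="iowait_$name"
    run opcontrol --reset   # clear samples before the next volume
}

for name in mirror core; do
    profile_run "$name" "/mnt/test/$name"
done
```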

The LV with the corelog displayed negligible iowait, as expected.
However, the other experienced quite a bit:

# opreport -t 1 --symbols session:iowait_mirror
warning: /ahci could not be found.
CPU: Core 2, speed 2400.08 MHz (estimated)
Counted BUS_IO_WAIT events (IO requests waiting in the bus queue) with a unit mask of 0xc0 (All cores) count 500
Processes with a thread ID of 0
Processes with a thread ID of 463
Processes with a thread ID of 14185
samples  %        samples  %        samples  %        app name                           symbol name
32       91.4286  15       93.7500  0              0  xen-syms-2.6.18-53.1.14.el5.debug  pit_read_counter
1         2.8571  0              0  0              0  ahci                               (no symbols)
1         2.8571  0              0  0              0  vmlinux
1         2.8571  0              0  0              0  vmlinux

From this, it seemed clear that the culprit was in the
pit_read_counter function.
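
To pull the top symbol out of a report like this without eyeballing it, a
small awk filter is enough. The sketch below assumes unwrapped rows with
exactly three samples/% column pairs (one per thread ID, as in the report
here) followed by the app name and symbol; top_symbol is a made-up helper
name, and other opreport layouts would need the field numbers adjusted.

```shell
#!/bin/sh
# Sketch only: assumes 3 "samples %" column pairs, then app name, then symbol.
# top_symbol reads opreport --symbols rows on stdin and prints the row with
# the most samples in the first column as "<app> <symbol> <percent>%".
top_symbol() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 7 {
            if ($1 + 0 > max) {
                max = $1 + 0; pct = $2; app = $7
                sym = ""
                # The symbol may contain spaces, e.g. "(no symbols)".
                for (i = 8; i <= NF; i++) sym = sym (i > 8 ? " " : "") $i
            }
        }
        END {
            if (max > 0)
                printf "%s %s %s%%\n", app, (sym == "" ? "(unknown)" : sym), pct
        }
    '
}

# Rows taken from the report in this post.
top_symbol <<'EOF'
samples  %        samples  %        samples  %        app name  symbol name
32       91.4286  15       93.7500  0        0        xen-syms-2.6.18-53.1.14.el5.debug pit_read_counter
1         2.8571  0        0        0        0        ahci (no symbols)
1         2.8571  0        0        0        0        vmlinux
EOF
```

On a live system the same function can be fed directly, e.g.
opreport -t 1 --symbols session:iowait_mirror | top_symbol.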

Any ideas on where to take it from here?  

Credit to Chris Takemura <chris@xxxxxxxxx> for reproducing the problem with
oprofile, and for the writeup.
