
Re: [Xen-users] Disk i/o on Dom0 suddenly too slow


On Wed, Jul 10, 2013 at 11:07 AM, Micky <mickylmartin@xxxxxxxxx> wrote:
> First off, thanks for checking.
> Secondly, I have managed to resolve the disk dumping issues from LVM
> snapshots and preliminary tests are satisfactory.
> Turns out, the default scheduler CFQ was not suited for this workload.
> Dom0: echo deadline > /sys/block/sda/queue/scheduler
> DomU: echo noop > /sys/block/xvda/queue/scheduler
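> For reference, you can check which scheduler is active before and after
> the switch; the bracketed entry in the sysfs line is the current one.
> A quick sketch (the sample line below is hypothetical; on a real host
> read /sys/block/sda/queue/scheduler):

```shell
# Parse the active scheduler (the bracketed entry) from the sysfs line.
# The sample string is hypothetical; on a real host you would read
# /sys/block/sda/queue/scheduler instead.
line="noop anticipatory [cfq] deadline"
active=$(echo "$line" | grep -o '\[[a-z]*\]' | tr -d '[]')
echo "$active"   # prints cfq
```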
> If you need reasons, let me know and I'll explain the findings further.
> Since I am using a megaraid controller, I looked at LSI's
> recommendations and tweaked the kernel further.
> Overall this gave me a 50% performance boost on cheap Seagate disks.
> No more sluggishness!!
> About the script:
> 1) Good catch. That was indeed the purpose of creating $ddpid. Seems
> like a typo.
> 2) We use RHEL/CentOS in production, so I have never hit that issue
> and didn't consider it. But you could do something like:
> [[ $(ps -p $(pidof lvdisplay) -o etimes=) -gt 300 ]] && <do something>
> if it executes for more than 5 minutes.
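> Fleshed out, that check could look like this (a sketch only; the
> process name "lvdisplay" and the 300 s limit match the example above,
> and a single matching pid is assumed; "etimes" needs a reasonably
> recent procps):

```shell
# Watchdog sketch: kill a process that has run longer than a limit.
# Assumes a single matching pid; "etimes" needs a reasonably recent procps.
kill_if_stuck() {
    pid=$1
    limit=$2
    elapsed=$(ps -o etimes= -p "$pid" 2>/dev/null | tr -d ' ')
    if [ -n "$elapsed" ] && [ "$elapsed" -gt "$limit" ]; then
        kill "$pid"
    fi
}

# A freshly started process is well under the limit, so it survives:
sleep 30 &
pid=$!
kill_if_stuck "$pid" 300
kill -0 "$pid" && echo "still running"
kill "$pid"
```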
> 3) My tests at the time showed that a 512k snapshot chunk size gave
> more speed to dd writes. But now, after switching to the deadline
> scheduler, I get the best results without specifying the -c parameter
> to lvcreate, dd'ing with bs=100M. Also, there's no need for ionice
> since it only works with CFQ.
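> The snapshot-and-dump flow might look like the sketch below; the VG/LV
> names and sizes are placeholders, and a scratch file stands in for the
> snapshot device so the dd step can actually run:

```shell
# Snapshot-and-dump sketch (VG/LV names and sizes are placeholders):
#   lvcreate -s -L 10G -n domu-snap /dev/vg0/domu-disk
#   dd if=/dev/vg0/domu-snap of=/backup/domu.img bs=100M
#   lvremove -f /dev/vg0/domu-snap
# Here a scratch file stands in for the snapshot device so the dd step
# can actually run.
src=$(mktemp)
dst=$(mktemp)
dd if=/dev/zero of="$src" bs=1M count=8 2>/dev/null
dd if="$src" of="$dst" bs=100M 2>/dev/null   # one large read/write per chunk
cmp -s "$src" "$dst" && echo "copy verified"
rm -f "$src" "$dst"
```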
> 4) It takes the same amount of CPU time either way, though. Dumping
> and compressing large chunks at the same time through pipes and
> stdout can cause weird issues with FIFOs. IMHO, why risk corrupt
> backups when the only real way to test a backup is to restore it!
> The certainty of knowing a backup isn't dirty is worth a little more
> I/O expense!
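> One advantage of compressing in a separate step is that the archive
> can at least be sanity-checked afterwards, e.g. (a throwaway file
> stands in for the real image here):

```shell
# After writing a compressed dump, gzip -t verifies the stream's
# integrity without unpacking it to disk. A throwaway file stands in
# for the real image.
f=$(mktemp)
echo "payload" | gzip > "$f"
gzip -t "$f" && echo "archive OK"
rm -f "$f"
```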
> 5) Affirmative. That is why two separate config variables exist there:
>> My script is currently much simpler, I simply create the snapshots and
>> remove the old ones (no full copies of the snapshots/etc).
> Seems fine. In my case there are more than a few nodes and tens of
> domains, so the above works pretty well for me as a short-term backup
> strategy!
>> I use backuppc which I've got working for one system to snapshot the VM,
>> mount the image, backup with rsync, then umount and remove the snapshot. I
>> still like to keep a full image snapshot, and even better to send that raw
>> image offsite.
> I use Burp from inside the domU.
>> It would be interesting to hear if you have any additional 
>> information/comments?
> Well, I started with a few small machines, and one after another the
> SSDs died on me, either due to firmware problems or bad blocks. I
> tried Crucial, switched to Intel and then Samsung. The latter were
> the ones that ran fine the longest. Now I just use them for personal
> laptops.
>> Another scenario I shutdown the VM (using an image file), then simply copy
>> the file via some tools into chunks of 100M, then startup the VM.
> Seems fine from an administration point of view, but people have
> become uptime-conscious these days.
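> For completeness, the chunked copy could be done with split and
> reassembled with cat; a small scratch file stands in for the real VM
> image in this sketch:

```shell
# Chunked-copy sketch: split an image into 100M pieces and reassemble.
# A small scratch file stands in for the real VM image.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=3 2>/dev/null
split -b 100M "$img" "$img.part-"
cat "$img.part-"* > "$img.rebuilt"
cmp -s "$img" "$img.rebuilt" && echo "chunks OK"
rm -f "$img" "$img.part-"* "$img.rebuilt"
```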
>> In my opinion, gluster will add a lot of overhead anyway, and maybe is not
>> sufficiently stable, and certainly I don't know it well enough to put into
>> production. While LVM + MD + DRBD are all simple, low overhead, well
>> understood, etc... Each read/write with LVM/MD/DRBD is simply a remap
>> process to a physical device read/write, while glusterfs seems more of a
>> filesystem with more overhead/complexity.
> And I haven't played much with DRBD, so this is only a guess. My
> understanding of network-backed domain I/O is that unless you have
> high-speed disks, fast network equipment, or preferably a SAN, the
> domains will suffer from I/O latency if there are more than a few of
> them. Plain gigabit switches and so-called 6Gb/s SAS drives simply
> aren't sufficient.
>> Running multiple VM's on a single storage device, especially spinning disks,
>> seems to be challenging to ensure the right performance with all the
>> contention/etc... Using SSD's should be a lot simpler/easier, but LVM
>> performance is making that really difficult, and I still don't understand
>> why performance is so horrible. At some point, I'll join the LVM list and
>> investigate in more detail, but I've got "good enough" performance so far,
>> and have other higher priority issues on my list...
> So true. Try the workaround I mentioned above of switching the
> scheduler to noop or deadline, and see if you find any improvements.
>> Thanks again.
> Quite welcome!

Xen-users mailing list


