
Re: [Xen-users] Disk i/o on Dom0 suddenly too slow



Adam:
P.S. 
https://github.com/bassu/xen-scripts/commit/20294000bee25fa986adfe284fc3d0c2aa11965f

On Wed, Jul 10, 2013 at 11:07 AM, Micky <mickylmartin@xxxxxxxxx> wrote:
> First off, thanks for checking.
> Secondly, I have managed to resolve the issues with dumping disks from
> LVM snapshots, and preliminary tests are satisfactory.
>
> Turns out, the default scheduler CFQ was not suited for this workload.
> Dom0: echo deadline > /sys/block/sda/queue/scheduler
> DomU: echo noop > /sys/block/xvda/queue/scheduler
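>
> If you want the change to stick across reboots (just a sketch here --
> adapt the device names to your layout), either add elevator=deadline to
> the kernel line in grub.conf or re-run the echo from rc.local:
>
> # /etc/rc.d/rc.local on Dom0 -- re-apply the scheduler at boot
> echo deadline > /sys/block/sda/queue/scheduler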
>
> If you need reasons, let me know and I'll explain the findings further.
>
> Since I am using a megaraid controller, I looked at the LSI
> recommendations and tweaked the kernel further.
> Overall this gave me a 50% performance boost on cheap Seagate disks.
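>
> The kind of knobs I mean are the block-layer queue settings; the values
> below are only illustrative, not the literal LSI numbers:
>
> # hypothetical example values -- check your controller's documentation
> echo 512 > /sys/block/sda/queue/nr_requests
> echo 1024 > /sys/block/sda/queue/read_ahead_kb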
>
> No more sluggishness!!
>
> About the script:
>
> 1) Good catch. That was indeed the purpose of creating $ddpid. Seems
> like a typo.
>
> 2) We use RHEL/CentOS in production, so I have never run into such an
> issue and didn't consider it. But you could do something like:
>
> if [[ $(ps -p "$(pidof -s lvdisplay)" -o etimes=) -gt 300 ]]; then
>     : # do something if lvdisplay has been running for more than 5 minutes
> fi
>
> 3) My tests at the time showed that a 512k snapshot chunk size gave
> better speed for dd writes. But now that I have switched to the
> deadline scheduler, I get the best results without passing the
> -c/--chunksize parameter to lvcreate and dd'ing with bs=100M (see the
> sketch after point 5 below). Also, there's no need for ionice since it
> only works with CFQ.
>
> 4) It takes the same amount of CPU time though. Dumping and
> compressing large chunks at the same time through pipes and stdout can
> cause weird issues with FIFOs, so I dump first and compress in a
> separate pass (again, see the sketch below). IMHO, why take a chance
> on corrupt backups when the only real way to test a backup is to
> restore it? The extra certainty of knowing the backup isn't dirty is
> worth a little more I/O expense!
>
> 5) Affirmative. That is why two separate config variables exist there:
> BACKUP_DIR and PROCESS_DIR
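>
> To make points 3) to 5) concrete, the whole flow is roughly the
> following (the VG/LV names, snapshot size and paths are made up for
> illustration; BACKUP_DIR and PROCESS_DIR are the variables from the
> script):
>
> PROCESS_DIR=/var/tmp/xen-backup-work   # scratch space for dumping and compressing
> BACKUP_DIR=/backup/xen                 # final destination for finished archives
>
> # snapshot without -c/--chunksize, dump with big blocks, compress afterwards
> lvcreate -s -L 10G -n snap_domu1 /dev/vg0/domu1
> dd if=/dev/vg0/snap_domu1 of="$PROCESS_DIR/domu1.img" bs=100M
> lvremove -f /dev/vg0/snap_domu1
> gzip "$PROCESS_DIR/domu1.img"
> mv "$PROCESS_DIR/domu1.img.gz" "$BACKUP_DIR/"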
>
>> My script is currently much simpler, I simply create the snapshots and
>> remove the old ones (no full copies of the snapshots/etc).
>
> Seems fine. In my case there are more than a few nodes and tens of
> domains, so the above works pretty well for me as a short-term backup
> strategy!
>
>> I use backuppc which I've got working for one system to snapshot the VM,
>> mount the image, backup with rsync, then umount and remove the snapshot. I
>> still like to keep a full image snapshot, and even better to send that raw
>> image offsite.
>
> I use Burp from inside the DomU.
>
>> It would be interesting to hear if you have any additional 
>> information/comments?
>
> Well, I started with a few small machines, and one after another the
> SSDs died on me, either due to firmware problems or bad blocks. I
> tried Crucial, switched to Intel and then Samsung. The latter were the
> ones that ran fine the longest. Now I just use those in personal
> laptops.
>
>> Another scenario I shutdown the VM (using an image file), then simply copy
>> the file via some tools into chunks of 100M, then startup the VM.
>
> Seems fine from an administration point of view, but people have
> become uptime-conscious these days.
>
>> In my opinion, gluster will add a lot of overhead anyway, and maybe is not
>> sufficiently stable, and certainly I don't know it well enough to put into
>> production. While LVM + MD + DRBD are all simple, low overhead, well
>> understood, etc... Each read/write with LVM/MD/DRBD is simply a remap
>> process to a physical device read/write, while glusterfs seems more of a
>> filesystem with more overhead/complexity.
>
> And I haven't played much with DRBD, so these are only guesses. My
> understanding of network-backed storage for domains is that unless you
> have high-speed disks and network equipment, or preferably a SAN, the
> domains will suffer from I/O latency once there are more than a few of
> them. Plain gigabit switches and so-called 6Gb/s SAS drives simply
> aren't sufficient.
>
>> Running multiple VM's on a single storage device, especially spinning disks,
>> seems to be challenging to ensure the right performance with all the
>> contention/etc... Using SSD's should be a lot simpler/easier, but LVM
>> performance is making that really difficult, and I still don't understand
>> why performance is so horrible. At some point, I'll join the LVM list and
>> investigate in more detail, but I've got "good enough" performance so far,
>> and have other higher priority issues on my list...
>
> So true. Try the workaround I mentioned above of switching the
> scheduler to noop or deadline, and see if you find any improvements.
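>
> You can check which scheduler is currently active with (sda here
> stands for whatever device your LVM sits on):
>
> cat /sys/block/sda/queue/scheduler   # the entry in [brackets] is active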
>
>> Thanks again.
> Quite welcome!

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users


 

