Re: [Xen-users] Disk i/o on Dom0 suddenly too slow

Adam: P.S. https://github.com/bassu/xen-scripts/commit/20294000bee25fa986adfe284fc3d0c2aa11965f

On Wed, Jul 10, 2013 at 11:07 AM, Micky <mickylmartin@xxxxxxxxx> wrote:
> First off, thanks for checking.
> Secondly, I have managed to resolve the disk-dumping issues with LVM
> snapshots, and preliminary tests are satisfactory.
>
> It turns out the default CFQ scheduler was not suited for this workload.
> Dom0: echo deadline > /sys/block/sda/queue/scheduler
> DomU: echo noop > /sys/block/xvda/queue/scheduler
>
> If you need the reasons, let me know and I'll explain the findings further.
>
> Since I am using a MegaRAID controller, I looked at LSI's recommendations
> and tweaked the kernel further. Overall this gave me a 50% performance
> boost on cheap Seagate disks.
>
> No more sluggishness!!
>
> About the script:
>
> 1) Good catch. That was indeed the purpose of creating $ddpid. Seems
> like a typo.
>
> 2) We use RHEL/CentOS in production, so I have never had such an issue
> and didn't consider it. But you could do something like:
>
>   [[ $(ps -p $(pidof lvdisplay) -o etimes:1=) -gt 300 ]] && <do something>
>
> i.e. react if lvdisplay has been running for more than 5 minutes.
>
> 3) My tests at the time showed that a 512k snapshot chunk size gave more
> speed to dd writes. But now, after switching to the deadline scheduler,
> I get the best results without specifying the -c (chunk size) parameter
> to lvcreate and dd'ing with bs=100M. Also, there's no need for ionice,
> since it only has an effect under CFQ.
>
> 4) It takes the same amount of CPU time either way, though. Dumping and
> compressing large chunks at the same time through pipes and stdout can
> cause weird issues with FIFOs. IMHO, why risk a corrupt backup when the
> only real way to test a backup is to restore it? The extra certainty of
> not having a dirty backup is worth a little more I/O expense!
>
> 5) Affirmative. That is why two separate config variables exist there:
> BACKUP_DIR and PROCESS_DIR. (A rough sketch of the whole flow is below.)
>
>> My script is currently much simpler, I simply create the snapshots and
>> remove the old ones (no full copies of the snapshots/etc).
>
> Seems fine. In my case there are more than a few nodes and tens of
> domains, so the above works pretty well for me as a short-term backup
> strategy!
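
To make 3), 4) and 5) concrete, here is a rough sketch of the flow I am
describing. The VG/LV names, snapshot size and directories are made-up
examples, not taken from the script in the commit above, so adjust them
to your own setup:

    #!/bin/bash
    # Sketch: dump an LVM-backed domU disk to a processing dir, compress
    # it as a separate step, then move the finished archive aside.
    VG=vg0                    # example volume group
    LV=domu1-disk             # example LV backing the domU
    PROCESS_DIR=/backup/tmp   # scratch space for the raw dump
    BACKUP_DIR=/backup/done   # where finished archives end up

    # Snapshot without -c; with the deadline scheduler the default
    # chunk size gave me the best results.
    lvcreate -s -L 10G -n ${LV}-snap /dev/${VG}/${LV} || exit 1

    # Dump with large blocks. No ionice -- it only matters under CFQ.
    dd if=/dev/${VG}/${LV}-snap of=${PROCESS_DIR}/${LV}.raw bs=100M

    # Compress afterwards instead of piping dd into gzip, so one stage
    # failing can't silently leave a corrupt archive behind.
    gzip ${PROCESS_DIR}/${LV}.raw
    mv ${PROCESS_DIR}/${LV}.raw.gz ${BACKUP_DIR}/

    # Drop the snapshot.
    lvremove -f /dev/${VG}/${LV}-snap

The real script obviously does more (error handling, $ddpid tracking and
so on); this is just the overall shape.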

>> I use backuppc which I've got working for one system to snapshot the VM,
>> mount the image, backup with rsync, then umount and remove the snapshot.
>> I still like to keep a full image snapshot, and even better to send that
>> raw image offsite.

> I use Burp from inside the DomU.

>> It would be interesting to hear if you have any additional
>> information/comments?

> Well, I started with a few small machines, and one after another the SSDs
> died on me, either due to firmware problems or bad blocks. I tried
> Crucial, switched to Intel and then Samsung; the latter were the ones
> that ran fine the longest. Now I just use them for personal laptops.

>> Another scenario: I shut down the VM (it uses an image file), then simply
>> copy the file via some tools into chunks of 100M, then start the VM up
>> again.

> Seems fine from an administration point of view, but people have become
> uptime-conscious these days.

>> In my opinion, gluster will add a lot of overhead anyway, and maybe is
>> not sufficiently stable, and certainly I don't know it well enough to put
>> into production. While LVM + MD + DRBD are all simple, low overhead, well
>> understood, etc... Each read/write with LVM/MD/DRBD is simply a remap to
>> a physical device read/write, while glusterfs seems more of a filesystem
>> with more overhead/complexity.

> I haven't played much with DRBD, so these are only guesses. My
> understanding of network-backed domain I/O is that unless you have
> high-speed disks and network equipment, or preferably a SAN, the domains
> will suffer from I/O latency once there are more than a few of them.
> Plain gigabit switches and so-called 6Gb/s SAS drives simply aren't
> sufficient.

>> Running multiple VMs on a single storage device, especially spinning
>> disks, seems to be challenging to ensure the right performance with all
>> the contention/etc... Using SSDs should be a lot simpler/easier, but LVM
>> performance is making that really difficult, and I still don't understand
>> why performance is so horrible. At some point, I'll join the LVM list and
>> investigate in more detail, but I've got "good enough" performance so
>> far, and have other higher priority issues on my list...

> So true. Try the workaround I mentioned above of switching the scheduler
> to noop or deadline, and see if you find any improvement.
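
If you want to try it, this is roughly what I mean. sda and xvda are just
example devices, so check what you actually have, and the kernel-parameter
route assumes a RHEL/CentOS-era grub setup:

    # Show the available schedulers; the active one is in [brackets]
    cat /sys/block/sda/queue/scheduler

    # Switch at runtime (Dom0 example; use noop on xvda inside the DomU)
    echo deadline > /sys/block/sda/queue/scheduler

    # To make it stick across reboots, either re-apply it from rc.local:
    echo 'echo deadline > /sys/block/sda/queue/scheduler' >> /etc/rc.local

    # ...or boot the Dom0 kernel with elevator=deadline on its command
    # line in your grub config.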

>> Thanks again.

> Quite welcome!

_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxx
http://lists.xen.org/xen-users