
Re: [Xen-users] add_random must be set to 1 for me - Archlinux HVM x64 - XenServer 7 Latest Patched

On 2016-10-25 13:40, WebDawg wrote:
On Tue, Oct 25, 2016 at 6:29 AM, Austin S. Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
On 2016-10-24 14:53, WebDawg wrote:

On Wed, Oct 19, 2016 at 2:23 PM, Austin S. Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
So adding 3 more vCPUs, for a total of 4 on the domU, by itself
speeds up the dd write to xvda to 20 MB a second.  But the I/O
load also adds sy (system CPU time) load to almost all the CPUs
(the CPU load has been sy load the entire time).  All in all it sticks
to about 200-300% CPU use at this point.

The only thing that changes anything at this point is adding
oflag=direct to dd.  When I add that, CPU use drops dramatically and
write speed goes much higher.  Still, the CPU use compared to Debian
is unchanged.

OK, this suggests the issue is somewhere in the caching in the guest OS.  My
first thoughts knowing that are:
1. How much RAM does the VM have?
2. What value does `cat /proc/sys/vm/vfs_cache_pressure` show?
3. Are you doing anything with memory ballooning?

 /proc/sys/vm/vfs_cache_pressure shows a value of 100 on both domU's.



So I do not know what the deal is but when you mentioned ballooning I
was looking at the memory settings of the domU.  These are the
settings that I had:

Static:  128 MiB/ 2 GiB
Dynamic:  2 GiB/ 4 GiB

I cannot even set this at command line.  I wanted to replicate the
problem after I fixed it and tried this:

xe vm-memory-limits-set dynamic-max=400000000 dynamic-min=200000000
static-max=200000000 static-min=16777216 name-label=domU-name


Error parameters: Memory limits must satisfy: static_min <=
dynamic_min <= dynamic_max <= static_max

The dynamic MAX was bigger than the static MAX, which is impossible to
set, but somehow happened.
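
For reference, the ordering from that error message can be checked by hand before calling xe. This is only a minimal sketch, assuming all four limits are given in bytes; check_memory_limits is a hypothetical helper and not part of the xe CLI.

```shell
#!/bin/sh
# Hypothetical helper (not part of xe): checks the ordering quoted in the
# error message: static_min <= dynamic_min <= dynamic_max <= static_max.
# Arguments, in bytes: static_min dynamic_min dynamic_max static_max
check_memory_limits() {
    if [ "$1" -le "$2" ] && [ "$2" -le "$3" ] && [ "$3" -le "$4" ]; then
        echo "ok"
    else
        echo "invalid"
        return 1
    fi
}

# Values from the rejected xe command: dynamic-max exceeds static-max.
check_memory_limits 16777216 200000000 400000000 200000000 || true

# 128 MiB / 2 GiB / 2 GiB / 2 GiB, the settings reported as working.
check_memory_limits 134217728 2147483648 2147483648 2147483648
```

The first call prints "invalid" and the second prints "ok".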

I do not know if it happened from import from XenServer 6.5 to
XenServer 7, or the multiple software products I was using to manage
it or just something got corrupted.

So I have been checking it all out after setting everything to this:

Static:  128 MiB/ 2 GiB
Dynamic:  2 GiB/ 2 GiB

It is all working as expected now!

Like I said...I cannot understand how the dynamic MAX was bigger than
the static MAX when XenServer does not allow you to set this.  Does
anyone have any experience in setting these bad settings and can
explain why I was having such bad CPU use issues?
I have no idea how it could have happened, or why it was causing what you were seeing to happen.

Debian is:  3.16.0-4-amd64

archlinux is:  4.8.4-1-ARCH #1 SMP PREEMPT Sat Oct 22 18:26:57 CEST
2016 x86_64 GNU/Linux

scsi_mod.use_blk_mq=0 and it looks like it did nothing.  My
/proc/cmdline shows that it is there and it should be doing
something....but my scheduler setting in the queue dir still says

Looking into this, I think the none is a result of the Xen PVHVM
block front driver?

Probably.  I've not been able to find any way to turn it off for the Xen PV
block device driver (which doesn't really surprise me, xen-blkfront has
multiple (virtual) 'hardware' queues, and stuff like that is exactly what
blk-mq was designed to address (although it kind of sucks for anything like
that except NVMe devices right now)).
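
As an aside on reading that file: on the legacy block layer the scheduler file in the queue dir lists every available scheduler and marks the active one with square brackets, while a blk-mq device on kernels of this era shows just the bare word none. A small sketch of pulling out the active entry follows; active_scheduler is a hypothetical helper, and the xvda path is assumed from the device name mentioned earlier in the thread.

```shell
#!/bin/sh
# Hypothetical helper: print the active scheduler given the contents of a
# queue/scheduler file, e.g. "noop deadline [cfq]" or "none".
active_scheduler() {
    case "$1" in
        *\[*\]*)
            # Drop everything up to the first '[', then from the first ']'.
            rest=${1#*\[}
            echo "${rest%%\]*}"
            ;;
        *)
            # No brackets: the file holds a single bare value.
            echo "$1"
            ;;
    esac
}

# Assumed path; only read when the device actually exists.
if [ -r /sys/block/xvda/queue/scheduler ]; then
    active_scheduler "$(cat /sys/block/xvda/queue/scheduler)"
fi
```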

If I remember right, the kernel command-line options to tune this are
not in the docs yet :/  At least not the ones I was looking at.
Well, I've been looking straight at the code in Linux, and can't find it either (although I could just be missing it, C is not my language of choice, and I have even less experience reading kernel code).

If someone could shed some insight why enabling IO generation/linking
of timing/entropy data to /dev/random makes the 'system work' this
would be great.  Like I said, I am just getting into this and I will
be doing more tuning if I can.

ONCE AGAIN, I am wrong here.  add_random does nothing to help me
anymore.  In fact I cannot find any setting under queue that does
anything to help, at least in what I am trying to fix.

I am sorry for this false information.
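
For anyone comparing queue settings between two domUs, a sketch like the one below dumps a few of them in one go. show_queue_settings is a hypothetical helper, the list of file names is just a sample of what normally sits under queue, and the xvda path is an assumption based on the device named earlier.

```shell
#!/bin/sh
# Hypothetical helper: print name=value for a few tunables found in a
# block device's queue directory, skipping any that are not present.
show_queue_settings() {
    dir=$1
    for name in scheduler add_random rotational nr_requests; do
        if [ -r "$dir/$name" ]; then
            printf '%s=%s\n' "$name" "$(cat "$dir/$name")"
        fi
    done
}

# Assumed device path; prints nothing if the directory does not exist.
show_queue_settings /sys/block/xvda/queue
```

Running it on both the Debian and the Arch domU and diffing the output is a quick way to see whether any queue setting actually differs.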

I'm kind of surprised at this though, since I've
got half a dozen domains running fine with blk-mq getting within 1% of
disk access speed the host sees (and the host is using blk-mq too, both
the device-mapper layer, and the lower block layer).  Some info about the
rest of the storage stack might be helpful (ie, what type of backing
are you using for the VM disks (on LVM, MD RAID, flat partitions, flat
files, etc), what Xen driver (raw disk, blktap, something else?), and
what are you accessing in the VM (raw disk, partition, LVM volume, etc))?

This is a RAID 6 SAS Array.

The kernel that I am using (archlinux: linux), is all vanilla except
for, it looks like, one patch:


That patch changes from 7 to

These are the results from some tests:

 dd if=/dev/zero of=/root/test2 bs=1M oflag=direct
 19.7 MB/s
 CPU:  20%

 dd if=/dev/zero of=/root/test2 bs=1M
 2.5 MB/s
 CPU:  100%

This brings one other thing to mind: What filesystem are you using in the
domU?  My guess is that this is some kind of interaction between the
filesystem and the blk-mq code.  One way to check that would be to try
writing directly to a disk from the domU instead of through the filesystem.

I am using ext4.
Even aside from the fact that you figured out what was causing the issue, the fact that you're using ext4 pretty much completely rules out the filesystem as a potential cause.

This is sy CPU / system CPU use; so something in the kernel?

On the Debian domU almost no CPU is used.

I am also thinking that 20MB/s is bad in general for my RAID6 as
almost nothing is reading and writing to it.  But one thing at a time
and the only reason I mention it, is that it might help to figure out
this issue.

So, now that it looks like I have figured out what is wrong, but not
figured out how it got that wrong:  Does anyone have any pointers for
increasing the speed of the Local Storage array?  I know I can add the
backup battery, but even without that...a SAS RAID6 running at 20 MB a
second in domU seems so slow....
It really depends on a bunch of factors. Things like how many disks are in the array, how many arrays the HBA is managing, whether it is set up for multipath, how much RAM dom0 and the domU have, what kind of memory bandwidth you have, and even how well the HBA driver is written can all have an impact.

In dom0 I get about 20-21 MB a second with dd oflag=direct
I would expect this to be better without oflag=direct. Direct I/O tends to slow things down because it completely bypasses the page cache _and_ it pretty much forces things to be synchronous (which means it slows down writes _way_ more than it slows down reads).

As a couple of points of comparison, using the O_DIRECT flag on the output file (which is what oflag=direct does in dd) cuts write speed by about 25% on the high-end SSD I have in my laptop, and roughly 30% on the consumer grade SATA drives in my home server system.

And, as a general rule, performance with O_DIRECT isn't a good reference point for most cases, since very little software uses it without being configured to do so, and pretty much everything in widespread use that can use it is set up so it's opt-in (and usually used with AIO as well, which dd can't emulate).
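
To put the buffered and direct numbers side by side, something like the sketch below may help. It is not a proper benchmark: dd_write is a hypothetical wrapper, the 8 MiB size and the temporary file are assumptions chosen to keep the run short, and count= is added so the write is bounded. conv=fsync makes dd flush the file before it reports a rate, so the buffered figure reflects the disk rather than only the page cache.

```shell
#!/bin/sh
# Hypothetical wrapper around dd for a bounded write test.
# $1 = target file, $2 = size in MiB, remaining arguments go to dd.
# Prints only dd's last line (the summary, or the error if dd failed).
dd_write() {
    _file=$1
    _mib=$2
    shift 2
    dd if=/dev/zero of="$_file" bs=1M count="$_mib" "$@" 2>&1 | tail -n 1
}

tmpfile=$(mktemp)

# Buffered write, flushed to disk before dd reports its rate.
dd_write "$tmpfile" 8 conv=fsync

# O_DIRECT write; dd reports an error here on filesystems that do not
# support O_DIRECT (tmpfs on some kernels, for example).
dd_write "$tmpfile" 8 oflag=direct

rm -f "$tmpfile"
```

For numbers that mean anything, point it at a file on the filesystem under test and use a size well above what the controller cache can absorb.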

I think XenServer or Xen has some type of disk io load balancing stuff
or am I wrong?
Not that I know of. All the I/O scheduling is done by Domain 0 (or at least all the scheduling on the host side). You might try booting dom0 with blk-mq disabled if it isn't already.
