Re: [Xen-users] add_random must be set to 1 for me - Archlinux HVM x64 - XenServer 7 Latest Patched

On Wed, Oct 19, 2016 at 2:23 PM, Austin S. Hemmelgarn
<ahferroin7@xxxxxxxxx> wrote:
> Just to clarify, is this 100% utilization of one CPU, or 100% utilization of
> _all_ CPU's involved?  In the first case, that points to an issue with how
> the kernel is dispatching requests, while the second case points elsewhere
> (I'm not sure exactly where though).

Adding 3 more vCPUs, for a total of 4 on the domU, speeds up the dd
write to xvda to 20MB a second all by itself.  But the IO load also
adds sy (system CPU time) load to almost all of the CPUs (the CPU load
has been sy load the entire time).  All in all it sits at about
200-300% CPU use at this point.

>> I also get about 2-4MB a second IO.
>> I can make this go away by doing this:
>> echo 1 > /sys/block/xvda/queue/add_random
This is a lie!  I am sorry, sorry, sorry.  I tried it many, many times
before I sent this email and it seemed like it was making things
happen.  I tried it all last week and again today, though, and now it
does nothing.

The only thing that changes anything at this point is adding
oflag=direct to dd.  When I add that, CPU use drops dramatically and
write speed goes much higher.  Still, this does not change how the CPU
use compares to Debian.
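
For anyone who wants to reproduce the comparison, here is a minimal
sketch of the two runs as a bounded test.  The helper name
dd_write_test, the count argument and conv=fsync are my assumptions and
were not part of the runs I did; the count only keeps dd from filling
the disk, and conv=fsync makes the buffered run flush to disk before dd
prints its rate.

```shell
#!/bin/sh
# Hypothetical helper (not from the original runs): write COUNT MiB of
# zeros to TARGET and print dd's summary line. Any extra arguments are
# passed straight through to dd.
dd_write_test() {
    target=$1
    count=$2
    shift 2
    # conv=fsync: flush file data before dd exits, so the buffered run
    # is not just measuring the page cache.
    dd if=/dev/zero of="$target" bs=1M count="$count" conv=fsync "$@" 2>&1 | tail -n 1
}

# Buffered run (goes through the page cache):
#   dd_write_test /root/test2 256
# Direct run (bypasses the page cache):
#   dd_write_test /root/test2 256 oflag=direct
```

Running both against the same file, one after the other, should show
whether the gap between buffered and direct writes is still there.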

Debian is:  3.16.0-4-amd64

archlinux is:  4.8.4-1-ARCH #1 SMP PREEMPT Sat Oct 22 18:26:57 CEST
2016 x86_64 GNU/Linux

>> Reading
>> https://wiki.archlinux.org/index.php/Improving_performance#Tuning_IO_schedulers
>> The Archlinux wiki still talks about enabling the block-multiqueue
>> layer by using scsi_mod.use_blk_mq=1 but I did not do that so it must
>> be just enabled now or something?
> You should be able to use that same kernel parameter with a value of 0 to
> disable it.  My guess is that the Arch developers toggled the default in the
> kernel config.  If you're using a PV block interface though (unless it's
> PV-SCSI), you may not be able to turn it off since there doesn't appear to
> be any switch (or at least I haven't found one) to disable it for the
> regular Xen PV block interface.
I tried using scsi_mod.use_blk_mq=0 and it looks like it did nothing.  My
/proc/cmdline shows that it is there and it should be doing
something....but my scheduler setting in the queue dir still says

Looking into this, I think the "none" is a result of the Xen PVHVM
block front driver?
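
A rough way to check this, as a sketch only: print the scheduler line
for the device and see whether it has an mq directory in sysfs, which a
device driven through blk-mq normally has.  The function name
show_blk_info and the SYSFS_ROOT override are made up for illustration;
the override exists only so the function can be tried against a fake
tree.

```shell
#!/bin/sh
# Hypothetical helper: report the I/O scheduler line and the presence
# of the blk-mq "mq" sysfs directory for one block device.
show_blk_info() {
    dev=$1
    root=${SYSFS_ROOT:-/sys}   # assumption: override only for testing
    queue="$root/block/$dev/queue"

    if [ -r "$queue/scheduler" ]; then
        printf 'scheduler: %s\n' "$(cat "$queue/scheduler")"
    else
        printf 'scheduler: (not readable)\n'
    fi

    if [ -d "$root/block/$dev/mq" ]; then
        echo "blk-mq dir: present"
    else
        echo "blk-mq dir: absent"
    fi
}

# Example: show_blk_info xvda
```

Since xvda is not a SCSI device, it would also make sense that
scsi_mod.use_blk_mq has no effect on it.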

>> If someone could shed some insight why enabling IO generation/linking
>> of timing/entropy data to /dev/random makes the 'system work' this
>> would be great.  Like I said, I am just getting into this and I will
>> be doing more tuning if I can.
ONCE AGAIN, I am wrong here.  add_random does nothing to help me
anymore.  In fact, I cannot find any setting under queue that does
anything to help, at least for what I am trying to fix.

I am sorry for this false information.

> I'm kind of surprised at this though, since I've
> got half a dozen domains running fine with blk-mq getting within 1% of the
> disk access speed the host sees (and the host is using blk-mq too, both in
> the device-mapper layer, and the lower block layer).  Some info about the
> rest of the storage stack might be helpful (ie, what type of backing storage
> are you using for the VM disks (on LVM, MD RAID, flat partitions, flat
> files, etc), what Xen driver (raw disk, blktap, something else?), and what
> are you accessing in the VM (raw disk, partition, LVM volume, etc))?

This is a RAID 6 SAS Array.

The kernel that I am using (archlinux: linux) is all vanilla except
for, it looks like, one patch:


That patch changes from 7 to

These are the results from some tests:

 dd if=/dev/zero of=/root/test2 bs=1M oflag=direct
 19.7 MB/s
 CPU:  20%

 dd if=/dev/zero of=/root/test2 bs=1M
 2.5 MB/s
 CPU:  100%

This is sy CPU / system CPU use; so something in the kernel?
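
To put a number on the sy share while dd is running, something like the
following could be used.  It is only a sketch: the function name
sy_percent is made up, and it takes two samples of the aggregate "cpu"
line from /proc/stat and reports how much of the elapsed CPU time was
system time.  Only the first eight counters (user through steal) are
summed, because the guest counters are already included in user and
nice.

```shell
#!/bin/sh
# Hypothetical helper: given two "cpu ..." lines from /proc/stat,
# print the percentage of elapsed CPU time spent in system mode.
sy_percent() {
    printf '%s\n%s\n' "$1" "$2" | awk '
        {
            total[NR] = 0
            # Fields 2..9: user nice system idle iowait irq softirq steal
            for (i = 2; i <= NF && i <= 9; i++) total[NR] += $i
            sys[NR] = $4
        }
        END {
            dt = total[2] - total[1]
            if (dt <= 0) { print "0.0"; exit }
            printf "%.1f\n", 100 * (sys[2] - sys[1]) / dt
        }'
}

# Example, sampling over 5 seconds while dd is running:
#   a=$(grep "^cpu " /proc/stat); sleep 5; b=$(grep "^cpu " /proc/stat)
#   sy_percent "$a" "$b"
```

Note that this is averaged over all vCPUs, so one fully busy vCPU out
of four shows up as about 25 here, not 100.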

On the Debian domU almost no CPU is used.

I am also thinking that 20MB/s is bad in general for my RAID6, as
almost nothing else is reading from or writing to it.  But one thing at
a time; the only reason I mention it is that it might help to figure
out this issue.

