After some additional debugging, I found out that on this machine
the speed is fine when running the stock CentOS 6 kernel (2.6.32).
With a 4.9.x or 4.18.x kernel, the speed degrades again.
[Adding Roger]
On Mon, 2018-10-08 at 13:10 +0200, Jean-Louis Dupond wrote:
Hi,
We are hitting some I/O limitation on some of our Xen hypervisors.
The hypervisors are running CentOS 6 with Xen 4.6.6-12.el6 and
4.9.105+ kernels.
The hypervisors are attached to the SAN network over 10G links, and
there is no congestion at all.
Storage is exported via iSCSI and we use multipathd for failover.
We see a write speed of roughly 200MB/sec, but a read speed of only
20-30MB/sec on a LUN on the SAN.
This is while testing this on dom0. Same speeds on domU.
If I do the same test on a Xen 4.4.4-34.el6 hypervisor to the same LUN
(but attached with 1G), I max out the link (100MB/sec read/write).
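
For reference, a sequential read test of the kind described above could
be sketched roughly as follows. The thread does not say which tool was
used for the measurements, so this is only an illustration: the function
name, the block size, and the choice of Python are all assumptions.
Note that it reads through the OS page cache, so a second run on the
same data may report inflated numbers.

```python
import time


def read_throughput_mb_s(path, block_size=1024 * 1024, max_bytes=None):
    """Read `path` sequentially and return throughput in MB/s.

    Hypothetical helper, not a tool mentioned in the thread.
    1 MB is taken as 10**6 bytes.
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python-level buffering only; the kernel
    # page cache is still in the path.
    with open(path, "rb", buffering=0) as f:
        while max_bytes is None or total < max_bytes:
            chunk = f.read(block_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    if elapsed <= 0:
        return float("inf")
    return total / 1e6 / elapsed
```

Calling it with the path of a large file or block device on the LUN
(and a `max_bytes` cap of a few GB) on both hosts would give numbers
that can be compared directly.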
Right. But, if I've understood correctly, you're changing two things (I
mean between the two tests), i.e., the hypervisor and the NIC.
(BTW, is dom0 kernel the same, or does that also change?).
This makes it harder to narrow things down to where the problem could
be.
What would be useful to see would be the results of running:
- Xen 4.4.4-34.el6, with 4.9.105+ dom0 kernel on the 10G NIC / host,
and compare this with Xen 4.6.6-12.el6, with the same kernel on the
same NIC / host;
- Xen 4.6.6-12.el6, with 4.9.105+ dom0 kernel on the 1G NIC / host,
and compare this with Xen 4.4.4-34.el6, with the same kernel on the
same NIC / host.
This will tell us whether there is a regression between Xen 4.4.x and
Xen 4.6.x (as that is _the_only_ thing that varies).
And this is assuming the versions of the dom0 kernels, and of all the
other components involved are the same. If they're not, we need to go
checking, changing one component at a time.
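
The "change one component at a time" idea can be sketched as a small
script that lists every configuration and keeps only the pairs that
differ in exactly one factor. This is just an illustration: the factor
names are made up, "nic" stands for the NIC / host combination, and the
versions are the ones mentioned in this thread.

```python
from itertools import combinations, product

# Factors taken from this thread; the dictionary keys are illustrative.
FACTORS = {
    "xen": ["4.4.4-34.el6", "4.6.6-12.el6"],
    "dom0_kernel": ["4.9.105+"],
    "nic": ["1G", "10G"],
}


def configurations(factors):
    """Return every combination of factor values as a list of dicts."""
    names = list(factors)
    return [dict(zip(names, values)) for values in product(*factors.values())]


def single_factor_pairs(configs):
    """Return (factor, a, b) for each pair differing in exactly one factor."""
    pairs = []
    for a, b in combinations(configs, 2):
        differing = [name for name in a if a[name] != b[name]]
        if len(differing) == 1:
            pairs.append((differing[0], a, b))
    return pairs


if __name__ == "__main__":
    for factor, a, b in single_factor_pairs(configurations(FACTORS)):
        print(f"varies only {factor}: {a} vs {b}")
```

Only the pairs reported as varying "xen" say something about the
hypervisor itself. Adding another dom0 kernel version to the list
extends the matrix in the same way.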
So it really looks like the Xen 4.6 hypervisors are hitting some
bottleneck, but we haven't been able to locate it yet :)
There seem to be issues, but from the tests you've performed so far, I
don't think we can conclude the problem is in Xen. And we need to know
at least where the problem most likely is, in order to have any chance
to find it! :-)
The hypervisor's dom0 has 8 vCPU and 8GB RAM, which should be plenty!
Probably. But, just in case, have you tried increasing, e.g., the
number of dom0's vcpus? Are things like vcpu-pinning or similar features
being used? Is the host a NUMA box? (Or, more generally, what are the
characteristics of the host[s]?)
Regards,
Dario
_______________________________________________
Xen-users mailing list
Xen-users@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-users