
Re: [Xen-devel] [PATCH 4/4] xen/arm: update the docs about heterogeneous computing

On 19/02/2018 20:28, Stefano Stabellini wrote:
On Sat, 17 Feb 2018, Julien Grall wrote:
Hi,

On 17/02/2018 00:31, Stefano Stabellini wrote:
On Fri, 16 Feb 2018, Julien Grall wrote:
On 16/02/2018 21:15, Stefano Stabellini wrote:
On Fri, 16 Feb 2018, Julien Grall wrote:
On 16/02/2018 20:50, Stefano Stabellini wrote:
On Fri, 16 Feb 2018, Julien Grall wrote:
Hi Stefano,

On 15/02/18 23:17, Stefano Stabellini wrote:
Update the documentation of the hmp-unsafe option to explain how
to
use
it safely, together with the right cpu affinity setting, on
big.LITTLE
systems.

Also update the warning message to point users to the docs.

Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
CC: jbeulich@xxxxxxxx
CC: konrad.wilk@xxxxxxxxxx
CC: tim@xxxxxxx
CC: wei.liu2@xxxxxxxxxx
CC: andrew.cooper3@xxxxxxxxxx
CC: George.Dunlap@xxxxxxxxxxxxx
CC: ian.jackson@xxxxxxxxxxxxx

---
      docs/misc/xen-command-line.markdown | 10 +++++++++-
      xen/arch/arm/smpboot.c              |  9 +++++----
      2 files changed, 14 insertions(+), 5 deletions(-)
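(The xen/arch/arm/smpboot.c hunk carrying the updated warning is not quoted in this excerpt. Purely as a hedged sketch of the kind of secondary-CPU check such a warning typically lives in — the identifiers opt_hmp_unsafe, current_cpu_data and stop_cpu() are assumptions for illustration, not a quote of the actual patch:

    /* Secondary CPU bring-up (sketch): if this CPU's MIDR differs from
     * the boot CPU's, it is a different core type (e.g. big vs LITTLE).
     * Refuse to bring it up unless the user passed hmp-unsafe, and point
     * them at the documentation updated by this patch. */
    if ( current_cpu_data.midr.bits != boot_cpu_data.midr.bits &&
         !opt_hmp_unsafe )
    {
        printk(XENLOG_ERR "CPU%u MIDR (0x%x) does not match boot CPU MIDR (0x%x).\n"
               "Disabling this CPU; pass hmp-unsafe to override (see docs).\n",
               smp_processor_id(), current_cpu_data.midr.bits,
               boot_cpu_data.midr.bits);
        stop_cpu();
    }
)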

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 2184cb9..a1ebeea 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -1007,7 +1007,15 @@ Control Xens use of the APEI Hardware Error Source Table, should one be found.
 
 Say yes at your own risk if you want to enable heterogenous computing
 (such as big.LITTLE). This may result to an unstable and insecure
-platform. When the option is disabled (default), CPUs that are not
+platform, unless you manually specify the cpu affinity of all domains so
+that all vcpus are scheduled on the same class of pcpus (big or LITTLE
+but not both). vcpu migration between big cores and LITTLE cores is not
+supported. Thus, if the first 4 pcpus are big and the last 4 are LITTLE,
+all domains need to have either cpus = "0-3" or cpus = "4-7" in their VM
+config. Moreover, dom0_vcpus_pin needs to be passed on the Xen command
+line.
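(To make the documented setup concrete — a minimal sketch, assuming the layout above, pcpus 0-3 big and pcpus 4-7 LITTLE. A guest confined to the big cluster would carry in its xl config file:

    # Hard-pin every vCPU of this guest to the big cluster (pcpus 0-3).
    vcpus = 4
    cpus = "0-3"

and dom0's vCPUs would be kept on one cluster by booting Xen with something like:

    dom0_max_vcpus=4 dom0_vcpus_pin

on the hypervisor command line. dom0_vcpus_pin pins dom0 vCPU i to pcpu i, so the dom0_max_vcpus=4 count — an assumption here — is what keeps dom0 inside the big cluster.)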

In your example here you suggest having all the vCPUs of a guest either on big or on LITTLE cores. How about giving an example where the guest can have 2 LITTLE vCPUs and one big vCPU?

I would rather discourage it at the moment, given that it requires more complex cpu affinity settings, or vcpu pinning. Also, I am afraid that without matching corresponding topology information on the guest device tree, guests might not work as expected in such a scenario.

What do you think?
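(For reference, the "more complex cpu affinity settings" would look roughly like this in xl syntax — a sketch only, assuming the same 0-3 big / 4-7 LITTLE layout, and using the list form of cpus, which assigns one pcpu set per vCPU as described in xl.cfg:

    # Hypothetical 3-vCPU guest: vCPUs 0-1 confined to the LITTLE
    # cluster, vCPU 2 confined to the big cluster.
    vcpus = 3
    cpus = ["4-7", "4-7", "0-3"]

or, after the domain is created, with the xl vcpu-pin command, e.g.:

    xl vcpu-pin guest 2 0-3

where "guest" is a hypothetical domain name.)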

You already know my view on this. I would rather strongly discourage anyone from pinning all vCPUs of a domain to big cores. We should avoid providing users with shortcuts that could have a potentially damaging impact on their platform without telling them.

Do you have a link to a doc somewhere that provides more details about
this? We could add a link to it here to inform users. It would be
useful.

This is quite well described in
https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html#CPU-Allocation
(see "cpus").

OK, I'll add the link in a new big.LITTLE doc. Also, do you have any documentation or link about the big cores being potentially damaging? It would be good to provide information about that too in the big.LITTLE doc.
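(For readers following the link: the CPU-Allocation section distinguishes hard affinity ("cpus") from soft affinity ("cpus_soft"). On big.LITTLE only hard affinity helps, since soft affinity is merely a scheduler preference and would still let a vCPU land on the other class of core. A sketch, with the same assumed layout:

    cpus = "4-7"          # hard affinity: vCPUs may only run on LITTLE pcpus
    cpus_soft = "4-5"     # soft affinity: preferred pcpus within the hard set
)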

I don't have specific documentation to point to, but I would be quite interested to know what documentation you have regarding why always running on big cores is safe.

Hi Julien,

If I go on Amazon and buy a big.LITTLE board, and it then overheats and breaks due to the big cores, I would call this a malfunction and return the item expecting a refund.

Why would they refund you? You ran software that has not been validated with their board. The vendor will nicely tell you to look for other software and will not give you a refund.


Unless the hardware vendor states explicitly that the big cores cannot
be used all the time, then this use-case falls within the reasonable
usage of the platform. In fact, it is using a piece of the hardware the
way it was designed to be used. If the hardware itself is unstable, it
should be documented in the vendor's docs, and I would like to add a
link to it so that users are appropriately warned.

In normal circumstances, you have software controlling the overheating. But in the case of Xen, who is going to do that job? If it is the firmware, and assuming it does not need to be taught anything, then this is likely going to work out of the box with Xen. If it is the OS/hypervisor, then you are going to get into trouble.

As you can see, we already have different expectations of how the hardware should behave.



I provided you quite a few insights into why this may not be safe on all platforms, and we all remember those phones burning you when playing a game or watching a video. So I don't feel the Xen Project should encourage those setups by default.

I would recommend you read the thread about big.LITTLE in Xen from 2016:
https://lists.xenproject.org/archives/html/xen-devel/2016-09/msg01802.html

A few interesting things from that conversation:

"big.LITTLE is a generic term to have 'power-hungry and powerful cores' (big) with slower and battery-saving cores (LITTLE)."

"The use case of big.LITTLE is that big cores are used for short periods of burst and little cores are used for the rest (e.g. listening to audio, fetching mail...). If you want to reduce latency when switching between big and little CPUs, you may want to put them within the same cluster."

These two sentences are good, and I will copy/paste them into the new doc, but they still don't clarify the safety of using the big cores.

You assume the software stack is correct. However, we clearly know that CPU frequency/power management on Xen on Arm is not there... Given that, you can't even assume that the basic functionality of the board will work properly when running Xen.

Cheers,

--
Julien Grall
