
Re: [Xen-devel] [PATCH 4/4] xen/arm: update the docs about heterogeneous computing



On Mon, 19 Feb 2018, Julien Grall wrote:
> On 19/02/2018 20:28, Stefano Stabellini wrote:
> > On Sat, 17 Feb 2018, Julien Grall wrote:
> > > Hi,
> > > 
> > > On 17/02/2018 00:31, Stefano Stabellini wrote:
> > > > On Fri, 16 Feb 2018, Julien Grall wrote:
> > > > > On 16/02/2018 21:15, Stefano Stabellini wrote:
> > > > > > On Fri, 16 Feb 2018, Julien Grall wrote:
> > > > > > > On 16/02/2018 20:50, Stefano Stabellini wrote:
> > > > > > > > On Fri, 16 Feb 2018, Julien Grall wrote:
> > > > > > > > > Hi Stefano,
> > > > > > > > > 
> > > > > > > > > On 15/02/18 23:17, Stefano Stabellini wrote:
> > > > > > > > > > > Update the documentation of the hmp-unsafe option to explain how to use
> > > > > > > > > > > it safely, together with the right cpu affinity setting, on big.LITTLE
> > > > > > > > > > > systems.
> > > > > > > > > > 
> > > > > > > > > > Also update the warning message to point users to the docs.
> > > > > > > > > > 
> > > > > > > > > > Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > > > > > > > > > CC: jbeulich@xxxxxxxx
> > > > > > > > > > CC: konrad.wilk@xxxxxxxxxx
> > > > > > > > > > CC: tim@xxxxxxx
> > > > > > > > > > CC: wei.liu2@xxxxxxxxxx
> > > > > > > > > > CC: andrew.cooper3@xxxxxxxxxx
> > > > > > > > > > CC: George.Dunlap@xxxxxxxxxxxxx
> > > > > > > > > > CC: ian.jackson@xxxxxxxxxxxxx
> > > > > > > > > > 
> > > > > > > > > > ---
> > > > > > > > > >       docs/misc/xen-command-line.markdown | 10 +++++++++-
> > > > > > > > > >       xen/arch/arm/smpboot.c              |  9 +++++----
> > > > > > > > > >       2 files changed, 14 insertions(+), 5 deletions(-)
> > > > > > > > > > 
> > > > > > > > > > > diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> > > > > > > > > > > index 2184cb9..a1ebeea 100644
> > > > > > > > > > > --- a/docs/misc/xen-command-line.markdown
> > > > > > > > > > > +++ b/docs/misc/xen-command-line.markdown
> > > > > > > > > > > @@ -1007,7 +1007,15 @@ Control Xens use of the APEI Hardware Error Source Table, should one be found.
> > > > > > > > > > >  Say yes at your own risk if you want to enable heterogenous computing
> > > > > > > > > > >  (such as big.LITTLE). This may result to an unstable and insecure
> > > > > > > > > > > -platform. When the option is disabled (default), CPUs that are not
> > > > > > > > > > > +platform, unless you manually specify the cpu affinity of all domains so
> > > > > > > > > > > +that all vcpus are scheduled on the same class of pcpus (big or LITTLE
> > > > > > > > > > > +but not both). vcpu migration between big cores and LITTLE cores is not
> > > > > > > > > > > +supported. Thus, if the first 4 pcpus are big and the last 4 are LITTLE,
> > > > > > > > > > > +all domains need to have either cpus = "0-3" or cpus = "4-7" in their VM
> > > > > > > > > > > +config. Moreover, dom0_vcpus_pin needs to be passed on the Xen command
> > > > > > > > > > > +line.
> > > > > > > > > 
> > > > > > > > > > In your example here you suggest to have all the vCPUs of a guest to
> > > > > > > > > > either on big or LITTLE cores. How about giving an example where the
> > > > > > > > > > guest can have 2 LITTLE vCPUs and one big vCPU?
> > > > > > > > 
> > > > > > > > > I would rather discourage it at the moment, given that it requires more
> > > > > > > > > complex cpu affinity settings, or vcpu pinning. Also, I am afraid that
> > > > > > > > > without matching corresponding topology information on the guest device
> > > > > > > > > tree, guests might not work as expected in such a scenario.
> > > > > > > > 
> > > > > > > > What do you think?
> > > > > > > 
> > > > > > > > You already know my view on this. I would rather strongly discourage
> > > > > > > > anyone pinning all vCPUs of a domain to big cores. We should avoid to
> > > > > > > > provide shortcuts to use that could have potentially damageable impact
> > > > > > > > on their platform without telling them.
> > > > > > 
> > > > > > > Do you have a link to a doc somewhere that provides more details about
> > > > > > > this? We could add a link to it here to inform users. It would be
> > > > > > > useful.
> > > > > 
> > > > > > This is quite well described in
> > > > > > https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html#CPU-Allocation
> > > > > > see "cpus".
> > > > 
> > > > OK, I'll add the link in a new big.LITTLE doc. Also, do you have any
> > > > documentation or link about big core being potentially damaging? It
> > > > would be good to provide information about that too in the big.LITTLE
> > > > doc.
> > > 
> > > > I don't have specific documentation to point on it but I would quite
> > > > interesting to know what is your documentation regarding why always
> > > > running on big is safe.
> > 
> > Hi Julien,
> > 
> > If I go on Amazon, I buy a big.LITTLE board, then it overheats and
> > breaks due to the big cores, I would call this a malfunction and return
> > the item expecting a refund.
> 
> Why would they refund you? You run software that has not been proofed with
> their board. The vendor will nicely tell you to look for another software and
> will not give you the refund.

I guess it depends on the board. I don't think many dev boards (like
Pine64) require you to use a specific kernel version or hypervisor
version on them (I hope!).


> > Unless the hardware vendor states explicitly that the big cores cannot
> > be used all the time, then this use-case falls within the reasonable
> > usage of the platform. In fact, it is using a piece of the hardware the
> > way it was designed to be used. If the hardware itself is unstable, it
> > should be documented in the vendor's docs, and I would like to add a
> > link to it so that users are appropriately warned.
> In normal circumstance, you have software controlling the overheat. But in
> case of Xen who is going to do that job? If it is the firmware and assuming it
> does not need to be taught, then this is likely going to work out-of-box with
> Xen. If it is the OS/Hypervisor, then you are going to get into trouble.
> 
> As you can see we already have different expectation on how the hardware
> should behave.

I am starting to see your point. What if we add the following statement?
It should be non-controversial and informative:

"Big cores are more powerful than LITTLE cores, but often use much more
power. Typically, they are recommended for burst activity, especially in
battery powered environments. Please check your vendor's big.LITTLE
and power management documentation."

As I was thinking about this, I realized that the same issue could occur
even with just the first patch
(https://marc.info/?l=xen-devel&m=151873668223723). If the first cpu
type is big, we would default to using only big cpus all the time, right?
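
To make the affinity rule from the proposed doc text concrete, here is a
minimal Python sketch (not part of the patch) that checks whether a
domain's cpus string stays within one class of pcpus. It assumes the
layout from the example (pcpus 0-3 are big, 4-7 are LITTLE) and only
handles plain lists and ranges such as "0-3" or "0,2". The function
names are hypothetical, and the other forms accepted by xl.cfg are not
covered.

```python
# Assumed layout, taken from the example in the proposed doc text:
# the first 4 pcpus are big and the last 4 are LITTLE.
BIG = set(range(0, 4))
LITTLE = set(range(4, 8))


def parse_cpus(spec):
    """Parse a simple affinity string like "0-3" or "0,2,4-5" into a set of pcpu ids.

    Only plain numbers and N-M ranges are handled; this is a simplification
    and not the full syntax accepted by xl.cfg.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            low, high = part.split("-", 1)
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return cpus


def core_class(spec, big=BIG, little=LITTLE):
    """Return "big" or "LITTLE" if every pcpu in spec is of one class, else None."""
    cpus = parse_cpus(spec)
    if cpus <= big:
        return "big"
    if cpus <= little:
        return "LITTLE"
    return None  # mixes classes, or names pcpus outside the assumed layout
```

With this layout, "0-3" and "4-7" each map to a single class, while
"2-5" spans both and would have to be rejected.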


 

