
Re: [Xen-devel] [PATCH 4/4] xen/arm: update the docs about heterogeneous computing



On Mon, 19 Feb 2018, Julien Grall wrote:
> On 19/02/2018 21:05, Stefano Stabellini wrote:
> > On Mon, 19 Feb 2018, Julien Grall wrote:
> > > On 19/02/2018 20:28, Stefano Stabellini wrote:
> > > > On Sat, 17 Feb 2018, Julien Grall wrote:
> > > > > Hi,
> > > > > 
> > > > > On 17/02/2018 00:31, Stefano Stabellini wrote:
> > > > > > On Fri, 16 Feb 2018, Julien Grall wrote:
> > > > > > > On 16/02/2018 21:15, Stefano Stabellini wrote:
> > > > > > > > On Fri, 16 Feb 2018, Julien Grall wrote:
> > > > > > > > > On 16/02/2018 20:50, Stefano Stabellini wrote:
> > > > > > > > > > On Fri, 16 Feb 2018, Julien Grall wrote:
> > > > > > > > > > > Hi Stefano,
> > > > > > > > > > > 
> > > > > > > > > > > On 15/02/18 23:17, Stefano Stabellini wrote:
> > > > > > > > > > > > Update the documentation of the hmp-unsafe option to explain how to
> > > > > > > > > > > > use it safely, together with the right cpu affinity setting, on
> > > > > > > > > > > > big.LITTLE systems.
> > > > > > > > > > > >
> > > > > > > > > > > > Also update the warning message to point users to the docs.
> > > > > > > > > > > >
> > > > > > > > > > > > Signed-off-by: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> > > > > > > > > > > > CC: jbeulich@xxxxxxxx
> > > > > > > > > > > > CC: konrad.wilk@xxxxxxxxxx
> > > > > > > > > > > > CC: tim@xxxxxxx
> > > > > > > > > > > > CC: wei.liu2@xxxxxxxxxx
> > > > > > > > > > > > CC: andrew.cooper3@xxxxxxxxxx
> > > > > > > > > > > > CC: George.Dunlap@xxxxxxxxxxxxx
> > > > > > > > > > > > CC: ian.jackson@xxxxxxxxxxxxx
> > > > > > > > > > > > 
> > > > > > > > > > > > ---
> > > > > > > > > > > >  docs/misc/xen-command-line.markdown | 10 +++++++++-
> > > > > > > > > > > >  xen/arch/arm/smpboot.c              |  9 +++++----
> > > > > > > > > > > >  2 files changed, 14 insertions(+), 5 deletions(-)
> > > > > > > > > > > > 
> > > > > > > > > > > > diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> > > > > > > > > > > > index 2184cb9..a1ebeea 100644
> > > > > > > > > > > > --- a/docs/misc/xen-command-line.markdown
> > > > > > > > > > > > +++ b/docs/misc/xen-command-line.markdown
> > > > > > > > > > > > @@ -1007,7 +1007,15 @@ Control Xens use of the APEI Hardware Error Source Table, should one be found.
> > > > > > > > > > > >  Say yes at your own risk if you want to enable heterogenous computing
> > > > > > > > > > > >  (such as big.LITTLE). This may result to an unstable and insecure
> > > > > > > > > > > > -platform. When the option is disabled (default), CPUs that are not
> > > > > > > > > > > > +platform, unless you manually specify the cpu affinity of all domains so
> > > > > > > > > > > > +that all vcpus are scheduled on the same class of pcpus (big or LITTLE
> > > > > > > > > > > > +but not both). vcpu migration between big cores and LITTLE cores is not
> > > > > > > > > > > > +supported. Thus, if the first 4 pcpus are big and the last 4 are LITTLE,
> > > > > > > > > > > > +all domains need to have either cpus = "0-3" or cpus = "4-7" in their VM
> > > > > > > > > > > > +config. Moreover, dom0_vcpus_pin needs to be passed on the Xen command
> > > > > > > > > > > > +line.
> > > > > > > > > > > 
> > > > > > > > > > > In your example here you suggest having all the vCPUs of a guest on
> > > > > > > > > > > either big or LITTLE cores. How about giving an example where the
> > > > > > > > > > > guest can have 2 LITTLE vCPUs and one big vCPU?
> > > > > > > > > > 
> > > > > > > > > > I would rather discourage it at the moment, given that it requires
> > > > > > > > > > more complex cpu affinity settings, or vcpu pinning. Also, I am afraid
> > > > > > > > > > that without matching corresponding topology information on the guest
> > > > > > > > > > device tree, guests might not work as expected in such a scenario.
> > > > > > > > > > 
> > > > > > > > > > What do you think?
> > > > > > > > > 
> > > > > > > > > You already know my view on this. I would rather strongly discourage
> > > > > > > > > anyone from pinning all vCPUs of a domain to big cores. We should avoid
> > > > > > > > > providing users with shortcuts that could have a potentially damaging
> > > > > > > > > impact on their platform without telling them.
> > > > > > > > 
> > > > > > > > Do you have a link to a doc somewhere that provides more details about
> > > > > > > > this? We could add a link to it here to inform users. It would be
> > > > > > > > useful.
> > > > > > > 
> > > > > > > This is quite well described in
> > > > > > > https://xenbits.xen.org/docs/unstable/man/xl.cfg.5.html#CPU-Allocation
> > > > > > > (see "cpus").
> > > > > > 
> > > > > > OK, I'll add the link in a new big.LITTLE doc. Also, do you have any
> > > > > > documentation or link about big cores being potentially damaging? It
> > > > > > would be good to provide information about that too in the big.LITTLE
> > > > > > doc.
> > > > > 
> > > > > I don't have specific documentation to point to, but I would be quite
> > > > > interested to know what documentation you have on why always running
> > > > > on big is safe.
> > > > 
> > > > Hi Julien,
> > > > 
> > > > If I go on Amazon, I buy a big.LITTLE board, then it overheats and
> > > > breaks due to the big cores, I would call this a malfunction and return
> > > > the item expecting a refund.
> > > 
> > > Why would they refund you? You run software that has not been validated on
> > > their board. The vendor will nicely tell you to look for other software and
> > > will not give you a refund.
> > 
> > I guess it depends on the board. I don't think many dev boards (like
> > Pine64) require you to use a specific kernel version or hypervisor
> > version on them (I hope!).
> 
> The problem I have seen is that some vendors decided to offload some firmware
> tasks to the Operating System. This means that Xen needs to handle those
> drivers in order to get full support of the board.
> 
> > 
> > 
> > > > Unless the hardware vendor states explicitly that the big cores cannot
> > > > be used all the time, then this use-case falls within the reasonable
> > > > usage of the platform. In fact, it is using a piece of the hardware the
> > > > way it was designed to be used. If the hardware itself is unstable, it
> > > > should be documented in the vendor's docs, and I would like to add a
> > > > link to it so that users are appropriately warned.
> > > Under normal circumstances, you have software controlling overheating. But
> > > in the case of Xen, who is going to do that job? If it is the firmware, and
> > > assuming it does not need to be taught, then this is likely going to work
> > > out of the box with Xen. If it is the OS/Hypervisor, then you are going to
> > > get into trouble.
> > >
> > > As you can see, we already have different expectations of how the hardware
> > > should behave.
> 
> Hmmm I should have finished that paragraph. I meant that I would choose the
> more conservative way in the documentation because I would not assume we have
> the full stack working on Xen nowadays. If someone knows her platform is fine
> to always run on big cores, then she can still do it.
> 
> What I want to avoid is providing a way that we are not 100% sure will work on
> all the platforms.
> 
> > 
> > I am starting to see your point. What if we add the following statement,
> > it should be non-controversial and informative:
> > 
> > "Big cores are more powerful than LITTLE cores, but often use much more
> > power. Typically, they are recommended for burst activity, especially in
> > battery powered environments. Please check your vendor's big.LITTLE
> > and power management documentation."
> 
> Sounds good to me.
> 
> > 
> > As I was thinking about this, I realized that the same issue could occur
> > even with just the first patch
> > (https://marc.info/?l=xen-devel&m=151873668223723). If the first cpu
> > type is big, we would default to use only big cpus all the time, right?
> 
> Hmmm you are right. However we have no easy way to know whether you boot on
> big or little CPUs :/. Shall we update the warning?

Yes, I'll point users to big.LITTLE.txt, where we have the space to
explain the problem properly.
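
As a rough illustration of the affinity rule in the patch text (every
domain's cpus setting must stay within one class of pcpus), here is a
minimal Python sketch. It is not part of Xen or xl: parse_cpu_list and
affinity_class are hypothetical helpers, the parser only understands plain
lists and ranges such as "0-3" or "0,2,4-5" rather than the full xl.cfg
syntax, and the 0-3 big / 4-7 LITTLE layout is just the example assumed in
the patch.

```python
def parse_cpu_list(spec):
    """Expand a simple list such as "0-3" or "0,2,4-5" into a set of pcpu ids.

    Hypothetical helper: handles only plain ids and ranges, not the full
    syntax accepted by the xl.cfg "cpus" option.
    """
    cpus = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            first, last = part.split("-", 1)
            cpus.update(range(int(first), int(last) + 1))
        else:
            cpus.add(int(part))
    return cpus


def affinity_class(spec, big, little):
    """Return "big" or "LITTLE" if spec stays within one class, else None."""
    cpus = parse_cpu_list(spec)
    if cpus and cpus <= big:
        return "big"
    if cpus and cpus <= little:
        return "LITTLE"
    return None


# Assumed layout from the patch's example: pcpus 0-3 are big, 4-7 are LITTLE.
BIG = set(range(0, 4))
LITTLE = set(range(4, 8))

if __name__ == "__main__":
    for spec in ("0-3", "4-7", "2-5"):
        print(spec, affinity_class(spec, BIG, LITTLE))
```

A setting such as cpus = "2-5" straddles both classes, so the check returns
None for it; that is the configuration the documentation warns against.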
