
Re: [Xen-devel] cpufreq implementation for OMAP under xen hypervisor.



On Tue, 9 Sep 2014, Oleksandr Dmytryshyn wrote:
> On Fri, Sep 5, 2014 at 12:56 AM, Stefano Stabellini
> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> > On Thu, 4 Sep 2014, Oleksandr Dmytryshyn wrote:
> >> Hi to all.
> >>
> >> I want to implement the cpufreq driver in the following way:
> >> 1. The cpufreq governor will be implemented in Xen.
> >> 2. dom0 will only change the cpu frequency and voltage of the physical cpus.
> >>
> >> But there are some nuances:
> >> 1. The dom0 driver should read information about operating points
> >> (frequencies and voltages) and the cpu supply source from the device
> >> tree for each physical cpu. In the OMAP case this driver expects
> >> those settings to be located in the /cpus/cpu@0/ node. But the
> >> hypervisor creates a cpu node for each vcpu in the dom0 device tree,
> >> so that information is lost in dom0.
> >> 2. What about the case where we have several physical cpus (for example 2)
> >> with different operating points and we give only one cpu to dom0?
> >>
> >> How should I transfer all information from the original cpu@xxxxxx@n nodes
> >> about all physical cpus to the dom0 kernel driver? Maybe additional
> >> nodes should be created by the hypervisor in the device tree for dom0
> >> and named pcpu@xxxxxxx@n?
> >
> > If we do that, wouldn't we require changes to the core OMAP drivers or
> > cpu initialization code in Linux (to parse "pcpu" instead of "cpu"
> > nodes)?  I don't expect they would be easy to upstream or maintain going
> > forward.
> >
> > I am trying to think of an alternative, such as passing the real cpu
> > nodes to dom0 but then adding status = "disabled", but I am not sure
> > whether Linux checks the status for cpu nodes. In addition this scheme
> > wouldn't support the case where dom0 has more vcpus than pcpus on the
> > system. Granted it is not very common and might even be detrimental to
> > performance, but we should be able to support it.
> >
> > Ian, what do you think about this?
> >
> >
> >
> >> Oleksandr Dmytryshyn | Product Engineering and Development
> >> GlobalLogic
> >> M +38.067.382.2525
> >> www.globallogic.com
> >>
> >> http://www.globallogic.com/email_disclaimer.txt
> >>
> >>
> >> On Tue, Sep 2, 2014 at 9:46 PM, Andrii Tseglytskyi
> >> <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> >> >
> >> > Hi Stefano,
> >> >
> >> > Thank you for the explanation.
> >> > I think this requires more and deeper investigation, but for sure dom0
> >> > must be able to do this.
> >> > Let us investigate this.
> >> >
> >> > Thank you,
> >> >
> >> > Regards,
> >> > Andrii
> >> >
> >> > On Tue, Sep 2, 2014 at 9:39 PM, Stefano Stabellini
> >> > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >> > > On Tue, 2 Sep 2014, Andrii Tseglytskyi wrote:
> >> > >> On Tue, Sep 2, 2014 at 4:00 AM, Stefano Stabellini
> >> > >> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >> > >> > On Fri, 29 Aug 2014, Andrii Tseglytskyi wrote:
> >> > >> >> Hi,
> >> > >> >>
> >> > >> >> Stefano, Ian,
> >> > >> >>
> >> > >> >> Could you please clarify the following point:
> >> > >> >>
> >> > >> >> I agree that the decision about a frequency change should be taken by
> >> > >> >> the Xen hypervisor. But what about changing the hardware frequency?
> >> > >> >> In general, when the frequency is raised (for example from 1 GHz to
> >> > >> >> 1.5 GHz), the sequence for ARM kernels looks like the following:
> >> > >> >>
> >> > >> >> 1) The cpufreq governor decides that the frequency should be changed.
> >> > >> >> This decision is taken after analysing CPU performance data, taking
> >> > >> >> into account the governor policy.
> >> > >> >> 2) The cpufreq governor asks the cpufreq driver for the new frequency.
> >> > >> >> 3) The cpufreq driver compares the current and target frequencies and
> >> > >> >> asks the cpufreq regulator for a voltage change.
> >> > >> >> 4) The cpufreq regulator sends an i2c command to a standalone
> >> > >> >> microchip, which is responsible for changing the voltage.
> >> > >> >> 5) The cpufreq driver asks the clock framework for the new frequency
> >> > >> >> of the CPU clock.
> >> > >> >> 6) The clock framework performs frequency sanity checks, taking into
> >> > >> >> account clock parents and clock divider settings, and calls the
> >> > >> >> platform specific "set_frequency" callback.
> >> > >> >> 7) The platform specific callback performs the proper HW register
> >> > >> >> configuration for the newly selected frequency.
> >> > >> >>
> >> > >> >> Also there are some special cases - for example for OMAP5+ when
> >> > >> >> frequency is changed to 1.5 GHz+, two additional HW IPs should be
> >> > >> >> triggered (ABB and DCC, if someone is familiar with OMAP5+ )
> >> > >> >>
> >> > >> >> So, for generic ARM kernel we have 3 entities to change frequency:
> >> > >> >>
> >> > >> >> - cpufreq governor
> >> > >> >> - cpufreq driver
> >> > >> >> - cpufreq regulator
> >> > >> >>
> >> > >> >> + 2 additional IP for OMAP5+
> >> > >> >> - ABB
> >> > >> >> - DCC
> >> > >> >>
> >> > >> >> Taking all of the above into account, it looks like it would be
> >> > >> >> better to implement only the cpufreq governor in Xen. Xen will take
> >> > >> >> the decision about the new frequency, and the dom0 kernel will
> >> > >> >> perform the other steps. Dom0 contains all the generic and platform
> >> > >> >> specific frameworks needed for changing the frequency.
> >> > >> >>
> >> > >> >> What do you think ?
> >> > >> >
> >> > >> > Keep in mind that the architecture must be able to handle the case
> >> > >> > where dom0 has only 1 or 2 vcpus on a 4 or 8 core system with
> >> > >> > multiple physical cpus.
> >> > >> > Could dom0 change the frequency of a physical core or a physical cpu
> >> > >> > it is not even running on? If that is not a problem, because cpus and
> >> > >> > frequency changing are decoupled enough in Linux to allow it, then
> >> > >> > I am OK with it. But I suspect they are not.
> >> > >> >
> >> > >>
> >> > >> Not sure that I got your point correctly - dom0 will change the
> >> > >> frequency of the physical CPU.
> >> > >> In the case of OMAP this change affects both ARM physical cpus - the
> >> > >> change is coupled.
> >> > >> On other ARM platforms the change may not be coupled (I've heard that
> >> > >> Snapdragon can change cpu frequencies independently on each physical
> >> > >> cpu).
> >> > >
> >> > > Let me explain with a concrete example.
> >> > >
> >> > > Let's suppose that the platform has 2 physical cpus, each cpu has 4
> >> > > cores.  Let's also suppose that dom0 has only 2 vcpus, currently
> >> > > running on core0 and core1 of cpu0.
> >> > >
> >> > > In this case would dom0 be able to change the frequency of core3 of
> >> > > cpu1, given that it is not even running on it?
> >> > > If it can be done without any hacks, then we can go ahead with this
> >> > > approach.
> >> >
> >> >
> >> >
> >> > --
> >> >
> >> > Andrii Tseglytskyi | Embedded Dev
> >> > GlobalLogic
> >> > www.globallogic.com
> >>
> 
> Hi to all.
> 
> I've done the following work to check how dom0 can change the frequency of
> the cpus.
> 1. I've written a small HACK in the hypervisor which copies all settings
> from the physical cpu0 to the vcpu0, so the kernel can read information
> about OPP states and the power supply.
> 2. I've turned on the cpufreq driver in the dom0 kernel.
> 3. I've turned on all needed regulators.
> 4. I've reverted a patch which disables CPUFREQ in the dom0 kernel when
> it is running under Xen.
> 
> Now the dom0 kernel has 2 virtual cpus and the board has 2 physical cpus.
> The CPUFREQ driver uses the vcpus as the source to calculate the frequency
> and changes this frequency on the physical cpus.
> 
> I've checked that in this case the dom0 kernel can change the frequency of
> the cpus.
> 
> I want to disable the CPUFREQ governor in dom0 and leave only the
> cpu0-cpufreq driver (the driver for OMAP processors). The CPUFREQ governor
> will be implemented in Xen.
> 
> Here are some questions:
> 
> 1. How can we implement a mechanism which allows the hypervisor to give
> commands to the CPUFREQ driver in dom0 (i.e. 'set frequency')?

We could set up an event channel for that. Dom0 would receive an
evtchn_irq interrupt and check what needs to be done.
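
To make the flow concrete, here is a minimal sketch of that idea as a toy
model. It is not Xen or Linux code: SharedRequest, EventChannel,
Dom0CpufreqStub and xen_governor_set_frequency are hypothetical names, and
the "shared page" is assumed to hold a single pending target frequency in
kHz. It only shows the ordering: Xen writes the request, signals the event
channel, and dom0 reads the request in its handler and decides what to do.

```python
from typing import Callable, Iterable, Optional


class SharedRequest:
    """Hypothetical stand-in for memory shared between Xen and dom0."""

    def __init__(self) -> None:
        # None means that no request is pending.
        self.target_khz: Optional[int] = None


class EventChannel:
    """Toy event channel: notify() simply runs the bound handler."""

    def __init__(self) -> None:
        self._handler: Optional[Callable[[], None]] = None

    def bind(self, handler: Callable[[], None]) -> None:
        self._handler = handler

    def notify(self) -> None:
        if self._handler is not None:
            self._handler()


class Dom0CpufreqStub:
    """Hypothetical dom0 side: receives requests, has no governor logic."""

    def __init__(self, shared: SharedRequest, supported_khz: Iterable[int]) -> None:
        self.shared = shared
        self.supported_khz = sorted(supported_khz)
        self.current_khz = self.supported_khz[0]

    def on_event(self) -> None:
        """Runs when the event channel fires; checks what needs to be done."""
        request = self.shared.target_khz
        if request is None:
            return  # spurious notification, nothing pending
        self.shared.target_khz = None  # consume the request
        if request in self.supported_khz:
            # A real driver would program the regulator and the clock here.
            self.current_khz = request


def xen_governor_set_frequency(shared: SharedRequest,
                               channel: EventChannel,
                               khz: int) -> None:
    """Hypothetical hypervisor side: publish the request, then signal dom0."""
    shared.target_khz = khz
    channel.notify()
```

With a stub created for the frequencies 500000, 1000000 and 1500000 kHz and
bound to the channel, calling xen_governor_set_frequency(shared, channel,
1500000) leaves the stub's current_khz at 1500000. A request for a frequency
that is not in the list is consumed and ignored.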


> 2. Could we impose a limitation that the number of VCPUs in domain0 must
> be equal to the number of physical CPUs, and that each domain0 VCPU must be
> pinned to the respective physical CPU? This limitation is used for the
> existing Domain0 based cpufreq.

I think that the limitation would drastically limit the usefulness of the
solution. I wouldn't want to have to resort to this.
Nowadays it is extremely common to have dom0 running with fewer vcpus
than the number of pcpus in the system.

Unless there is a way to tell Linux to leave the vcpus idle all the time.


> 3. If we use this limitation, can I simply copy all settings from the
> physical cpus to the corresponding vcpus in the device tree in Xen? In this
> case the existing kernel code will not be modified, and a simple CPUFREQ
> driver will be written which will receive commands from Xen and redirect
> them to the cpu0-cpufreq driver (or another registered cpufreq driver for
> any cpu).
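
For reference, a minimal sketch of what question 3 describes, under the
1:1 pinning limitation from question 2. It models device tree nodes as
plain dictionaries and is not Xen code. The function name
copy_cpufreq_props, the property list in CPUFREQ_PROPS and the pinning
argument are assumptions made for illustration only.

```python
import copy
from typing import Dict, List

# Properties assumed to matter to the dom0 cpufreq driver. The exact set
# depends on the platform binding and is an assumption here.
CPUFREQ_PROPS = ("operating-points", "clock-latency", "clocks", "cpu0-supply")


def copy_cpufreq_props(pcpu_nodes: List[dict],
                       vcpu_nodes: List[dict],
                       pinning: Dict[int, int]) -> List[dict]:
    """Return new vcpu nodes carrying the cpufreq properties of their pcpu.

    pinning maps a vcpu index to the pcpu index it is pinned to. It must be
    one-to-one and cover every vcpu and every pcpu, which is the limitation
    discussed in question 2.
    """
    count = len(pcpu_nodes)
    if len(vcpu_nodes) != count:
        raise ValueError("number of vcpus must equal number of pcpus")
    expected = list(range(count))
    if sorted(pinning.keys()) != expected or sorted(pinning.values()) != expected:
        raise ValueError("pinning must map each vcpu to a distinct pcpu")

    result = copy.deepcopy(vcpu_nodes)  # leave the caller's nodes untouched
    for vcpu, pcpu in pinning.items():
        for prop in CPUFREQ_PROPS:
            if prop in pcpu_nodes[pcpu]:
                result[vcpu][prop] = copy.deepcopy(pcpu_nodes[pcpu][prop])
    return result
```

The check at the top is the part that matters for this discussion: as soon
as dom0 has fewer vcpus than the system has pcpus, there is no vcpu node
left to carry the settings of the remaining pcpus.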
