
Re: [Xen-devel] cpufreq implementation for OMAP under xen hypervisor.



The issue with that limitation is that it doesn't scale well on large
systems. You really wouldn't want dom0 to have 18 vcpus on a Xeon E5
because it would badly affect performance. Even on an 8-core SoC it
would be best to assign fewer than 8 vcpus to dom0.
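
To make the limitation concrete, here is a rough sketch of the example
from earlier in this thread (2 physical cpus with 4 cores each, dom0
with 2 vcpus on core0 and core1 of cpu0). The names PHYSICAL_CORES,
cores_without_dom0_vcpu and is_one_to_one are made up for illustration;
this is not Xen or Linux code, just a way to count which physical cores
a cpufreq driver in dom0 would never run on.

```python
from itertools import product

# Hypothetical topology taken from the example in this thread:
# 2 physical cpus, 4 cores each, identified as (cpu, core).
PHYSICAL_CORES = set(product(range(2), range(4)))


def cores_without_dom0_vcpu(vcpu_placement):
    """Return the physical cores that no dom0 vcpu currently runs on."""
    return PHYSICAL_CORES - set(vcpu_placement.values())


def is_one_to_one(vcpu_placement):
    """True when every physical core hosts exactly one dom0 vcpu."""
    placed = list(vcpu_placement.values())
    return len(placed) == len(set(placed)) and set(placed) == PHYSICAL_CORES


# dom0 has 2 vcpus, running on core0 and core1 of cpu0.
dom0 = {0: (0, 0), 1: (0, 1)}
uncovered = cores_without_dom0_vcpu(dom0)
print(sorted(uncovered))
print(is_one_to_one(dom0))
```

With this placement six of the eight cores, including core3 of cpu1,
have no dom0 vcpu on them, which is exactly the case a 1:1-only design
cannot handle.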

On Wed, 10 Sep 2014, Vitaly Chernooky wrote:
> I've intensively discussed my suggestions here and now it is transparent to 
> me that we should not try to use Cpufreq on ARM
> SoCs without direct 1:1 pcpu:vcpu mapping in dom0. So if someone want to 
> break 1:1 mapping he should forget Cpufreq.
> With best regards,
> 
> 
> On Wed, Sep 10, 2014 at 12:58 AM, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>       On Tue, 9 Sep 2014, Vitaly Chernooky wrote:
>       > Hi All!
>       >
>       > On Fri, Sep 5, 2014 at 12:56 AM, Stefano Stabellini <stefano.stabellini@xxxxxxxxxxxxx> wrote:
>       >    On Thu, 4 Sep 2014, Oleksandr Dmytryshyn wrote:
>       >    > Hi to all.
>       >    >
>       >    > I want to implement cpufreq driver in the next way:
>       >    > 1. Cpufreq governor will be implemented in the Xen
>       >    > 2. dom0 will only change cpu frequency and voltage of the physical cpus
>       >    >
>       >    > But there are some nuances:
>       >    > 1. dom0 driver should read an information about operation points
>       >    > (frequencies and voltages) and cpu supply source from the device tree for each
>       >    > physical cpu. In the omap processor case this driver suspects that
>       >    > those settings
>       >    > located in the /cpus/cpu@0/ node. But hypervisor creates an cpu node
>       >    > for each vcpu
>       >    > for kernel dom0 in the device tree and those information is lost in the dom0.
>       >    > 2. What about this case if we will have some physical cpus with different
>       >    > operation points (for example 2 cpus) and we give only one cpu for dom0?
>       >    >
>       >    > How should I transfer all information from the original cpu@xxxxxx@n nodes
>       >    > about all physical cpus to the kernel dom0 driver? Maybe an additional
>       >    > nodes should be created by the hypervisor in the device tree for dom0
>       >    > and named as pcpu@xxxxxxx@n?
>       >
>       >    If we do that, wouldn't we require changes to the core OMAP drivers or
>       >    cpu initialization code in Linux (to parse "pcpu" instead of "cpu"
>       >    nodes)? I don't expect they would be easy to upstream or maintain going
>       >    forward.
>       >
>       >    I am trying to think of an alternative, such as passing the real cpu
>       >    nodes to dom0 but then adding status = "disabled", but I am not sure
>       >    whether Linux checks the status for cpu nodes. In addition this scheme
>       >    wouldn't support the case where dom0 has more vcpus than pcpus on the
>       >    system. Granted it is not very common and might even be detrimental for
>       >    performances, but we should be able to support it.
>       >
>       >
>       > In case where dom0 has more vcpus than pcpus on the
>       > system, the dom0 kernel is the most bug-prone place for pcpu cpufreq governor. So I still believe that separate driver
>       > domain with direct 1:1 vcpu:pcpu mapping is the best place for cpufreq governor. But it also reasonable to run cpufreq
>       > governor as userspace daemon in dom0.
>       >
>       > Also what do you think about PM QoS support? On bare metal cpufreq is tightly integrated with PM QoS and intensively
>       > cooperate in frequency scaling.
> 
> Device PM needs to be done in Dom0.
> CPU an Platform level PM architecturally belongs to Xen, but I do
> understand that to do that in Xen we would need to add lots of code to
> the hypervisor. There is no silver bullet here.
> 
> A driver domain with 1:1 vcpu:pcpu mapping could work, but what kernel
> are you going to use for that? Linux? Wouldn't Linux be too big for a
> cpufreq driver domain, especially in embedded deployments? I think it
> would need at least 32MB to run.
> 
> 
> > With best regards,
> >
> >    Ian, what do you think about this?
> >
> >
> >
> >    > Oleksandr Dmytryshyn | Product Engineering and Development
> >    > GlobalLogic
> >    > M +38.067.382.2525
> >    > www.globallogic.com
> >    >
> >    > http://www.globallogic.com/email_disclaimer.txt
> >    >
> >    >
> >    > On Tue, Sep 2, 2014 at 9:46 PM, Andrii Tseglytskyi
> >    > <andrii.tseglytskyi@xxxxxxxxxxxxxxx> wrote:
> >    > >
> >    > > Hi Stefano,
> >    > >
> >    > > Thank you for explanation.
> >    > > I think this requires more and deeper investigation, but for sure dom0
> >    > > must be able to do this.
> >    > > Let us investigate this.
> >    > >
> >    > > Thank you,
> >    > >
> >    > > Regards,
> >    > > Andrii
> >Â Â Â Â> >
> >    > > On Tue, Sep 2, 2014 at 9:39 PM, Stefano Stabellini
> >    > > <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >    > > > On Tue, 2 Sep 2014, Andrii Tseglytskyi wrote:
> >    > > >> On Tue, Sep 2, 2014 at 4:00 AM, Stefano Stabellini
> >    > > >> <stefano.stabellini@xxxxxxxxxxxxx> wrote:
> >    > > >> > On Fri, 29 Aug 2014, Andrii Tseglytskyi wrote:
> >    > > >> >> Hi,
> >    > > >> >>
> >    > > >> >> Stefano, Ian,
> >    > > >> >>
> >    > > >> >> Could you please clarify the following point:
> >    > > >> >>
> >    > > >> >> I agree that decision about frequency change should be taken by Xen
> >    > > >> >> hypervisor. But what about hardware frequency changing?
> >    > > >> >> In general when frequency changed to bigger value (for example from 1
> >    > > >> >> GHz to 1.5 GHz) for ARM kernels sequence looks like the following:
> >    > > >> >>
> >    > > >> >> 1) cpufreq governor decides that frequency should be changed. This
> >    > > >> >> decision is taken after analysing of CPU performance data taking in
> >    > > >> >> account governor policy.
> >    > > >> >> 2) cpufreq governor asks cpufreq driver about new frequency.
> >    > > >> >> 3) cpufreq driver compares current and target frequencies and asks
> >    > > >> >> cpufreq regulator about voltage change.
> >    > > >> >> 4) cpufreq regulator send i2c command to standalone microchip, which
> >    > > >> >> is responsible for voltage changing.
> >    > > >> >> 5) cpufreq driver asks clock framework about new frequency for CPU clock
> >    > > >> >> 6) clock framework performs frequency sanity checks, taking in account
> >    > > >> >> clock parents and clock divider settings, and call platform specific
> >    > > >> >> "set_frequency" callback.
> >    > > >> >> 7) platform specific callback performs proper HW registers
> >    > > >> >> configuration for newly selected frequency
> >    > > >> >>
> >    > > >> >> Also there are some special cases - for example for OMAP5+ when
> >    > > >> >> frequency is changed to 1.5 GHz+, two additional HW IPs should be
> >    > > >> >> triggered (ABB and DCC, if someone is familiar with OMAP5+ )
> >    > > >> >>
> >    > > >> >> So, for generic ARM kernel we have 3 entities to change frequency:
> >    > > >> >>
> >    > > >> >> - cpufreq governor
> >    > > >> >> - cpufreq driver
> >    > > >> >> - cpufreq regulator
> >    > > >> >>
> >    > > >> >> + 2 additional IP for OMAP5+
> >    > > >> >> - ABB
> >    > > >> >> - DCC
> >    > > >> >>
> >    > > >> >> Taking in account all above, it looks like it would be better to
> >    > > >> >> implement only Xen cpufreq governor. Xen will take a decision about
> >    > > >> >> new frequency, and kernel dom0 will perform other steps. Dom0 contains
> >    > > >> >> all generic and platform specific frameworks, needed for frequency
> >    > > >> >> changing.
> >    > > >> >>
> >    > > >> >> What do you think ?
> >    > > >> >
> >    > > >> > Keep in mind that the architecture must be able to handle the case where
> >    > > >> > dom0 has only 1 or 2 vcpus on a 4 or 8 cores system with multiple
> >    > > >> > physical cpus.
> >    > > >> > Could dom0 change the frequency of a physical core or a physical cpu is
> >    > > >> > not even running on? If that is not a problem, because cpus and
> >    > > >> > frequency changing are decoupled enough in Linux to allow it, then I am
> >    > > >> > OK with it. But I suspect they are not.
> >    > > >> >
> >    > > >>
> >    > > >> Not sure that I got your point correctly - dom0 will change frequency
> >    > > >> on physical CPU.
> >    > > >> And in case of OMAP - this changing affects on both ARM physical cpus
> >    > > >> - changing is coupled.
> >    > > >> In case of other ARM platforms - changing may be not coupled (I've
> >    > > >> heard that Snapdragon can change cpu freqs independently on each
> >    > > >> physical cpu)
> >    > > >
> >    > > > Let me explain with a concrete example.
> >    > > >
> >    > > > Let's suppose that the platform has 2 physical cpus, each cpu has 4
> >    > > > cores. Let's also supposed that dom0 has only 2 vcpus, currently
> >    > > > running on core0 and core1 of cpu0.
> >    > > >
> >    > > > In this case would dom0 be able to change the frequency of core3 of
> >    > > > cpu1, given that is not even running on it?
> >    > > > If it can be done without any hacks, then we can go ahead with this
> >    > > > approach.
> >    > >
> >    > >
> >    > >
> >    > > --
> >    > >
> >    > > Andrii Tseglytskyi | Embedded Dev
> >    > > GlobalLogic
> >    > > www.globallogic.com
> >    >
> >
> >    _______________________________________________
> >    Xen-devel mailing list
> >    Xen-devel@xxxxxxxxxxxxx
> >    http://lists.xen.org/xen-devel
> >
> >
> >
> >
> > --
> > Vitaly Chernooky | Senior Developer - Product Engineering and Development
> > GlobalLogic
> > P +380.44.4929695 ext.1136 M +380.98.7920568 S cvv_2k
> > www.globallogic.com
> >
> > http://www.globallogic.com/email_disclaimer.txt
> >
> >
> 
> 
> 
> 
> --
> Vitaly Chernooky | Senior Developer - Product Engineering and Development
> GlobalLogic
> P +380.44.4929695 ext.1136 M +380.98.7920568 S cvv_2k
> www.globallogic.com
> 
> http://www.globallogic.com/email_disclaimer.txt
> 
> 
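
For reference, the frequency change sequence quoted above (steps 1 to 7)
can be sketched roughly as follows. This is only an illustration under
stated assumptions: OPP_TABLE, set_frequency, set_voltage and set_clock
are hypothetical names, the operating points are invented values, and
none of this is the Linux cpufreq, regulator or clock API. The quoted
steps only cover raising the frequency; the reverse order when lowering
it (clock first, then voltage) is general DVFS practice and is an
assumption here, not something stated in the thread.

```python
# Hypothetical operating points: frequency in kHz -> voltage in microvolts.
# The values are invented for illustration only.
OPP_TABLE = {
    1_000_000: 1_060_000,
    1_500_000: 1_250_000,
}


def set_frequency(current_khz, target_khz, opp_table, set_voltage, set_clock):
    """Apply a frequency change that a governor has already decided on.

    set_voltage stands in for the regulator (steps 3-4) and set_clock
    for the clock framework and platform callback (steps 5-7).
    """
    if target_khz not in opp_table:
        raise ValueError(f"no operating point for {target_khz} kHz")
    if target_khz == current_khz:
        return current_khz
    if target_khz > current_khz:
        # Scaling up: raise the voltage first, then the clock.
        set_voltage(opp_table[target_khz])
        set_clock(target_khz)
    else:
        # Scaling down (assumed order): lower the clock first, then the voltage.
        set_clock(target_khz)
        set_voltage(opp_table[target_khz])
    return target_khz


# Record the order of the hardware operations for a 1 GHz -> 1.5 GHz change.
trace = []
new_khz = set_frequency(
    1_000_000,
    1_500_000,
    OPP_TABLE,
    set_voltage=lambda uv: trace.append(("voltage", uv)),
    set_clock=lambda khz: trace.append(("clock", khz)),
)
print(new_khz, trace)
```

The governor decision (steps 1-2) is deliberately left outside
set_frequency, since that is the part proposed to live in Xen while the
remaining steps stay in dom0.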
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 

