
Re: [Xen-devel] cpufreq implementation for OMAP under xen hypervisor.



On Wed, Sep 10, 2014 at 10:31 PM, Konrad Rzeszutek Wilk
<konrad.wilk@xxxxxxxxxx> wrote:
> On Wed, Sep 10, 2014 at 07:35:47PM +0100, Stefano Stabellini wrote:
>> On Wed, 10 Sep 2014, Andrii Tseglytskyi wrote:
>> > Hi,
>> >
>> > On Wed, Sep 10, 2014 at 12:42 PM, Ian Campbell <Ian.Campbell@xxxxxxxxxx> wrote:
>> > >
>> > > On Tue, 2014-09-09 at 22:41 +0100, Stefano Stabellini wrote:
>> > > > On Tue, 9 Sep 2014, Ian Campbell wrote:
>> > > > > On Thu, 2014-09-04 at 22:56 +0100, Stefano Stabellini wrote:
>> > > > > > I am trying to think of an alternative, such as passing the real
>> > > > > > cpu nodes to dom0 but then adding status = "disabled", but I am
>> > > > > > not sure whether Linux checks the status for cpu nodes.
>> > > > >
>> > > > > status = "disabled" is defined to have a specific (i.e. non-default)
>> > > > > meaning for cpu nodes, Julien mentioned this when I tried to add a
>> > > > > similar patch to Xen to ignore them. I think it basically means
>> > > > > "present but not running, you should start them!".
>> > > > >
>> > > > > >  In addition this scheme
>> > > > > > wouldn't support the case where dom0 has more vcpus than pcpus on
>> > > > > > the system. Granted it is not very common and might even be
>> > > > > > detrimental for performance, but we should be able to support it.
>> > > > >
>> > > > > It's a bit of an edge case, for sure. I guess it wouldn't be totally
>> > > > > unreasonable to say that if you use this sort of configuration you
>> > > > > may not get cpufreq support.
>> > > > >
>> > > > > > Ian, what do you think about this?
>> > > > >
>> > > > > All the options suck in one way or another AFAICT. I think we are
>> > > > > going to be looking for the least bad solution not necessarily a good
>> > > > > one.
>> > > > >
>> > > > > Fundamentally are we trying to avoid having to have an i2c subsystem
>> > > > > etc in the hypervisor to be able to change the voltages before/after
>> > > > > changing the frequency?
>> > > > >
>> > > > > We can't just say "that's part of the cpufreq driver" since different
>> > > > > boards using the same SoC might use different voltage regulators,
>> > > > > over i2c or some other bus etc, so we end up with a matrix.
>> > > > >
>> > > > > It's arguable that we should be letting dom0 poke at that regulator
>> > > > > functionality anyway, at least not all of it. Taking that ability
>> > > > > away would necessarily imply more platform specific functionality in
>> > > > > the hypervisor.
>> > > >
>> > > > Right.
>> > > > I am afraid that in order to avoid more code in Xen, we end up with an
>> > > > unmaintainable interface and unupstreamable hacks in dom0.
>> > >
>> > > That's what I'm worried about too. Hence I'm wondering if we should just
>> > > do this in the hypervisor.
>> > >
>> > > Although there are a myriad of them the parts used to do voltage control
>> > > tend to be fairly simple.
>> > >
>> > > One concern I have is that i2c busses also tend to have other things on
>> > > them which dom0 might legitimately access (e.g. rtc), I'm not sure what
>> > > to suggest here.
>> >
>> > I would try to avoid i2c transactions in Xen. The I2C driver is quite
>> > complicated in the Linux kernel. It consists of several parts: a common
>> > core plus platform-specific code. I'm pretty sure Xen should not handle
>> > this. I think that establishing an event channel for frequency changes is
>> > a good idea. It would be good to try to implement this. In the process of
>> > implementation we will see what needs to be resolved.
>>
>> OK, that's reasonable.
>>
>>
>> > The only question here is how to pass physical cpu to dom0.
>>
>> We can use a device tree based interface to pass the information to
>> dom0, but requiring a number of dom0 vcpus equal to the number of
>> physical cpus and in addition to that having to pin the vcpus each to a
>> different pcpu is quite a stringent limitation. However I don't know the
>> frequency changing interfaces in Linux well enough to know how hard
>> it would be to lift it.
>>
>>
>> > Regarding x86.
>> > I'm not sure, but maybe the ACPI interface encapsulates voltage changes as well?
>>
>> I think so (but I am not an expert on that).
>
> The usual states are P and C states. The P states are the closest to what you
> are looking at:
>
> struct acpi_processor_px {
>         u64 core_frequency;     /* megahertz */
>         u64 power;      /* milliWatts */
>         u64 transition_latency; /* microseconds */
>         u64 bus_master_latency; /* microseconds */
>         u64 control;    /* control value */
>         u64 status;     /* success indicator */
> };
>
>>
>>
>>
>> > Regards,
>> > Andrii
>> >
>> >
>> > --
>> >
>> > Andrii Tseglytskyi | Embedded Dev
>> > GlobalLogic
>> > www.globallogic.com
>> >
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxx
>> http://lists.xen.org/xen-devel


                    Cpufreq driver implementation.
                                 ____________
                                /            \
                                | xenpm tool |
                                \____________/
 Dom0 kernel user-space
---------------------------------------------------------------------------

                          ________________               _____
                         /                \             /     \  CPU
                         | DevTree Parser |          /->| ARM | driver
                         \________________/          |  \_____/
 Dom0 kernel                                         |     |
-----------------------------------------------------|-----|---------------
                                                     |     |
              _____________________________________  |     |
             |     __________        ___________   | |     |
             |    /          \      /           \  | |     |
             |    | ondemand |      | userspace |  | |     |
 Registered  |    \__________/      \___________/  | |     |
  cpufreq    |   _____________       ___________   | |     |
 governor    |  /             \     /           \  | |     |
             |  | performance |     | powersave |  | |     |
             |  \_____________/     \___________/  | |     |
             |_____________________________________| |     |
                               ^                     |     |
                               |                     |     |
                         ______|_______              |     |
                        /              \             |     |  Change
                        | cpufreq core |-------------/     | frequency
                        \______________/ set/get freq      |
                                         commands          |
 Xen                                                       |
-----------------------------------------------------------|--------------
 Hardware                                                __V__
                                                        |     |
                                                        | CPU |
                                                        |_____|


Description of the implementation:
The cpufreq core and the registered cpufreq governors are located in Xen. Dom0
has a CPU driver that can only change the frequency of the physical CPUs. In
addition, this driver can change the CPU regulator voltage. I'll reuse some
ACPI-specific variables for ARM. This way I need only minimal modifications to
the Xen cpufreq driver, and all utilities (such as xenpm) will work without
modification of the Xen code. In the first implementation the xenpm tool won't
show information about C-states, but it can show information about P-states
and can change cpufreq parameters and the governor.
The DevTree parser is part of the CPU driver in Dom0, and it will read
information from the /cpus/cpu@0/private_data path instead of the original
/cpus path.
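
To illustrate the reuse of the ACPI-specific variables mentioned above, here is
a rough sketch of how the Dom0 driver could translate ARM operating points into
P-state entries shaped like the acpi_processor_px structure quoted earlier in
this thread. This is only an assumption about how it might look: struct
opp_entry, opp_to_px() and the choice to store the operating point index in the
control/status fields are hypothetical and are not taken from existing Xen or
Linux code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same fields as the acpi_processor_px structure quoted in this thread. */
struct px_state {
    uint64_t core_frequency;     /* megahertz */
    uint64_t power;              /* milliwatts */
    uint64_t transition_latency; /* microseconds */
    uint64_t bus_master_latency; /* microseconds */
    uint64_t control;            /* control value */
    uint64_t status;             /* success indicator */
};

/* Hypothetical operating point, as the Dom0 driver might read it from
 * the device tree. */
struct opp_entry {
    uint32_t freq_khz; /* CPU frequency in kHz */
    uint32_t volt_uv;  /* regulator voltage in microvolts */
};

/*
 * Fill px[] from opp[] and return the number of P-states written.
 * Assumption: "control" and "status" carry the index of the operating
 * point, so that Dom0 can look up the matching voltage later. The voltage
 * itself is not exposed to the hypervisor in this sketch.
 */
size_t opp_to_px(const struct opp_entry *opp, size_t n,
                 uint32_t latency_us, struct px_state *px)
{
    assert(opp != NULL && px != NULL);

    for (size_t i = 0; i < n; i++) {
        px[i].core_frequency = opp[i].freq_khz / 1000; /* kHz -> MHz */
        px[i].power = 0;                 /* not known in this sketch */
        px[i].transition_latency = latency_us;
        px[i].bus_master_latency = 0;    /* not meaningful here */
        px[i].control = i;
        px[i].status = i;
    }
    return n;
}
```

The resulting table is the kind of data that would then be handed to the
hypervisor; how exactly it is packed into the hypercall arguments is left out
here.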

Steps of the initialization:
1. Xen copies all cpu@0..cpu@N nodes (from the input device tree), with their
properties, to the /cpus/cpu@0/private_data node (in the device tree for Dom0).
Thus we can have any number of VCPUs in Dom0 and still give it the information
about all physical CPUs in the private_data node.

2. The driver in Dom0 will parse the /cpus/cpu@0/private_data path instead of
the /cpus path and pass the CPU parameters to the hypervisor via the
XENPF_set_processor_pminfo hypercall. (Some parameters are calculated in the
Dom0 driver and cannot be calculated in the hypervisor.)

3. The cpufreq core driver in the hypervisor will communicate with Dom0 via
some interface (an event channel can be used to notify Dom0) and give commands
to the CPU driver in Dom0. Those commands are set/get frequency, etc.
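
As a rough sketch of step 3, the set/get commands could be handled in the Dom0
CPU driver along the following lines. Everything here is hypothetical: the
request/response layout, the names (cpufreq_request, dom0_handle_request, etc.)
and the in-memory struct cpu_hw that stands in for the real clock and regulator
are assumptions made for illustration, and the event channel transport itself
is not shown. The ordering of the voltage change follows the "before/after
changing the frequency" point raised earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical commands sent by the Xen cpufreq core to the Dom0 driver. */
enum cpufreq_cmd { CPUFREQ_CMD_GET_FREQ, CPUFREQ_CMD_SET_FREQ };

struct cpufreq_request {
    enum cpufreq_cmd cmd;
    uint32_t pcpu;      /* physical CPU index */
    uint32_t freq_khz;  /* target frequency, used by SET only */
};

struct cpufreq_response {
    int status;         /* 0 on success, -1 on error */
    uint32_t freq_khz;  /* frequency after handling the request */
};

struct opp { uint32_t freq_khz; uint32_t volt_uv; };

/* Stand-in for the real clock and regulator state of one physical CPU. */
struct cpu_hw {
    const struct opp *table;
    size_t n_opp;
    uint32_t cur_freq_khz;
    uint32_t cur_volt_uv;
};

static const struct opp *find_opp(const struct cpu_hw *hw, uint32_t freq_khz)
{
    for (size_t i = 0; i < hw->n_opp; i++)
        if (hw->table[i].freq_khz == freq_khz)
            return &hw->table[i];
    return NULL;
}

struct cpufreq_response dom0_handle_request(struct cpu_hw *cpus, size_t n_cpus,
                                            const struct cpufreq_request *req)
{
    struct cpufreq_response rsp = { -1, 0 };

    assert(cpus != NULL && req != NULL);
    if (req->pcpu >= n_cpus)
        return rsp;                      /* unknown physical CPU */

    struct cpu_hw *hw = &cpus[req->pcpu];
    rsp.freq_khz = hw->cur_freq_khz;

    switch (req->cmd) {
    case CPUFREQ_CMD_GET_FREQ:
        rsp.status = 0;
        break;
    case CPUFREQ_CMD_SET_FREQ: {
        const struct opp *target = find_opp(hw, req->freq_khz);
        if (target == NULL)
            break;                       /* unsupported frequency */
        if (target->freq_khz > hw->cur_freq_khz) {
            /* Speeding up: raise the voltage first, then the clock. */
            hw->cur_volt_uv = target->volt_uv;
            hw->cur_freq_khz = target->freq_khz;
        } else {
            /* Slowing down: lower the clock first, then the voltage. */
            hw->cur_freq_khz = target->freq_khz;
            hw->cur_volt_uv = target->volt_uv;
        }
        rsp.status = 0;
        rsp.freq_khz = hw->cur_freq_khz;
        break;
    }
    default:
        break;                           /* unknown command */
    }
    return rsp;
}
```

In a real driver the two assignments to cur_volt_uv and cur_freq_khz would be
calls into the regulator and clock frameworks, and the response would be sent
back to the hypervisor through whatever interface is chosen.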

Can I implement the cpufreq driver in this way?

Oleksandr Dmytryshyn | Product Engineering and Development
GlobalLogic
M +38.067.382.2525
www.globallogic.com

http://www.globallogic.com/email_disclaimer.txt



 

