[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 3/8] ACPI: processor: add __acpi_processor_[un]register_driver helpers.



> From: Konrad Rzeszutek Wilk [mailto:konrad.wilk@xxxxxxxxxx]
> Sent: Saturday, January 14, 2012 6:25 AM
> 
> > > > sure. VCPU!=PCPU requirement is orthogonal to the basic part for gearing
> > > > ACPI information to Xen.
> 
> .. snip..
> > > >
> > > > >  1). For new distros (Ubuntu, Fedora), the default is all VCPUs.
> > > >
> > > > good to know that.
> > > >
> > > > >      Enterprising users might use dom0_max_vcpus to limit the VCPU
> > > count,
> > > > >      but most won't.
> > > > >      Which mean we can concentrate on bringing the _Pxx/_Cxx
> parsing
> > > > >      up to the hypervisor. Which is really neccessary on any chipset
> > > > >      which has the notion of TurboBoost (otherwise the Xen scheduler
> > > > >      won't pick this up and won't engage this mode in certain
> > > > >      workloads).
> 
> .. snip..
> 
> > yes, this is a big question. First, I'd like to give my sincere thanks to 
> > you and
> > Liang for help push this part to Linux upstream. You've done a great job. 
> > :-)
> 
> Thanks!
> > Unfortunately I can't afford the time in the short term, due to extremely 
> > busy
> > tasks in other projects, at least in the whole Q1. Really sorry about that. 
> > :/
> 
> Bummer.
> >
> > I would very appreciate your help if you could continue lending some time
> here,
> > since you've done plenty of works on the cleanup. The majority of the tricky
> > part is related to VCPU/PCPU handling. If putting that part aside, the
> remaining
> > logic should be much simpler.
> 
> I was trying to figure out how difficult it would be to just bring Pxx states 
> to
> the Xen hypervisor using the existing ACPI interfaces. And while it did not 
> pass
> all the _Pxx states (seems that all the _PCT, _PSS, _PSD, _PPC flags need to
> be enabled in the hypercall to make this work), it demonstrates what I had in
> mind.
> 
> 
> #include <linux/device.h>
> #include <linux/kernel.h>
> #include <linux/module.h>
> #include <linux/init.h>
> #include <linux/types.h>
> #include <acpi/acpi_bus.h>
> #include <acpi/acpi_drivers.h>
> #include <acpi/processor.h>
> #include <linux/cpumask.h>
> 
> #include <xen/interface/platform.h>
> #include <asm/xen/hypercall.h>
> 
> #define DRV_NAME      "ACPI_PXX"
> #define DRV_CLASS     "ACPI_PXX_CLASS"
> MODULE_AUTHOR("Konrad Rzeszutek Wilk");
> MODULE_DESCRIPTION("ACPI Processor Driver to send data to Xen
> hypervisor");
> MODULE_LICENSE("GPL");
> 
> static int parse_acpi_cxx(struct acpi_processor *_pr)
> {
>       struct acpi_processor_cx *cx;
>       int i;
> 
>       for (i = 1; i <= _pr->power.count; i++) {
>               cx = &_pr->power.states[i];
>               if (!cx->valid)
>                       continue;
>               pr_info("%s: %d %d %d 0x%x\n", __func__,
>                       cx->type, cx->latency, cx->power, (u32)cx->address);
>       }
>       /* TODO: Under Xen, the C-states information is not present.
>        * Figure out why. */

it's possible related to this long thread:

http://lists.xen.org/archives/html/xen-devel/2011-08/msg00511.html

IOW, Xen doesn't export mwait capability to dom0, which impacts _PDC setting.
Final solution is to have a para-virtualized PDC call for that.

>       return 0;
> }
> static int push_pxx_to_hypervisor(struct acpi_processor *_pr)
> {
>       int ret = -EINVAL;
>       struct xen_platform_op op = {
>               .cmd                    = XENPF_set_processor_pminfo,
>               .interface_version      = XENPF_INTERFACE_VERSION,
>               .u.set_pminfo.id        = _pr->acpi_id,
>               .u.set_pminfo.type      = XEN_PM_PX,
>       };
>       struct xen_processor_performance *xen_perf;
>       struct xen_processor_px *xen_states, *xen_px = NULL;
>       struct acpi_processor_px *px;
>       unsigned i;
> 
>       xen_perf = &op.u.set_pminfo.perf;
>       xen_perf->flags = XEN_PX_PSS;
> 
>       xen_perf->state_count = _pr->performance->state_count;
>       xen_states = kzalloc(xen_perf->state_count *
>                            sizeof(struct xen_processor_px), GFP_KERNEL);
>       if (!xen_states)
>               return -ENOMEM;
> 
>       for (i = 0; i < _pr->performance->state_count; i++) {
>               xen_px = &(xen_states[i]);
>               px = &(_pr->performance->states[i]);
> 
>               xen_px->core_frequency = px->core_frequency;
>               xen_px->power = px->power;
>               xen_px->transition_latency = px->transition_latency;
>               xen_px->bus_master_latency = px->bus_master_latency;
>               xen_px->control = px->control;
>               xen_px->status = px->status;
>       }
>       set_xen_guest_handle(xen_perf->states, xen_states);
>       ret = HYPERVISOR_dom0_op(&op);
>       kfree(xen_states);
>       return ret;
> }
> static int parse_acpi_pxx(struct acpi_processor *_pr)
> {
>       struct acpi_processor_px *px;
>       int i;
> 
>       for (i = 0; i < _pr->performance->state_count;i++) {
>               px = &(_pr->performance->states[i]);
>               pr_info("%s: [%d]: %d, %d, %d, %d, %d, %d\n", __func__,
>                       i, (u32)px->core_frequency, (u32)px->power,
>                       (u32)px->transition_latency,
>                       (u32)px->bus_master_latency,
>                       (u32)px->control, (u32)px->status);
>       }
>       if (xen_initial_domain())
>               return push_pxx_to_hypervisor(_pr);
>       return 0;
> }
> static int parse_acpi_data(void)
> {
>       int cpu;
>       int err = -ENODEV;
>       struct acpi_processor *_pr;
>       struct cpuinfo_x86 *c = &cpu_data(0);
> 
>       /* TODO: Under AMD, the information is populated
>        * using the powernow-k8 driver which does an MSR_PSTATE_CUR_LIMIT
>        * MSR which returns the wrong value so the population of 'processors'
>        * has bogus data. So only run this under Intel for right now. */
>       if (!cpu_has(c, X86_FEATURE_EST))
>               return -ENODEV;
>       for_each_possible_cpu(cpu) {
>               _pr = per_cpu(processors, cpu);
>               if (!_pr)
>                       continue;
> 
>               if (_pr->flags.power)
>                       (void)parse_acpi_cxx(_pr);
> 
>               if (_pr->performance->states)
>                       err = parse_acpi_pxx(_pr);
>               if (err)
>                       break;
>       }
>       return -ENODEV; /* force it to unload */
> }
> static int __init acpi_pxx_init(void)
> {
>       return parse_acpi_data();
> }
> static void __exit acpi_pxx_exit(void)
> {
> }
> module_init(acpi_pxx_init);
> module_exit(acpi_pxx_exit);

the prerequisites for this module to work correctly, is that dom0 has the right
configurations to have all necessary Cx/Px information ready before this 
module is loaded. That may mean enabling full CONFIG_CPU_IDLE and 
CONFIG_CPUFREQ,
which in current form may add some negative impact, e.g. dom0 will try to 
control
Px/Cx to conflict with Xen. So some tweaks may be required in that part.

given our purpose now, is to come up a cleaner approach which tolerate some
assumptions (e.g. #VCPU of dom0 == #PCPU), there's another option following this
trend (perhaps compensate your idea). We can register a Xen-cpuidle and 
xen-cpufreq driver to current Linux cpuidle and cpufreq framework, which plays 
mainly two roles:
        - a dummy driver to prevent dom0 touching actual Px/Cx
        - parse ACPI Cx/Px information to Xen, in a similar way you did above

there may have some other trickiness, but the majority code will be 
self-contained.

Thanks
Kevin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.