[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [RFC XEN PATCH] x86/mwait-idle: Use ACPI for CPUs without hardcoded C-state table
On 18.07.2023 13:04, Simon Gaiser wrote: > mwait-idle includes a hardcoded config for many CPUs. But some are > missing, for example Tiger Lake. Linux' driver reads the config from > ACPI in those cases. This adds this to Xen's implementation. > > The Linux driver also has a feature to combine the internal table with > the infos from ACPI. This is not implemented here, for CPUs with > internal config nothing is changed. > > Signed-off-by: Simon Gaiser <simon@xxxxxxxxxxxxxxxxxxxxxx> > --- > > Sending this as RFC to get feedback on implementing it this way. I tried > to keep this change small and to avoid changing the existing code path > for CPUs with a config included in the driver. On the other hand this > makes it look a bit hacky. > > I'm not quite sure if initializing mwait-idle in set_cx_pminfo this way > is correct. For example set_cx has some smp_wmb call I'm not sure when > it's needed, so might be very well missing something. > > What also surprised me is that the existing code in mwait-idle first > calls cpuidle_current_governor->enable(processor_powers[cpu]) and later > setup the C-state config in processor_powers[cpu]. This seems the be the > wrong order (but, I think, current not important since > menu_enable_device doesn't use that part of the struct). > > When I brought up the topic of this patch the first time Jan had an > interesting questions [1]: > >> It hasn't become clear to me why Linux now has two CPU idle drivers >> consuming ACPI data (intel_idle and the purely ACPI-based one). > > I'm not quite sure either. Linux' intel_idle.c states: > > intel_idle is a cpuidle driver that loads on all Intel CPUs with > MWAIT in lieu of the legacy ACPI processor_idle driver. The intent > is to make Linux more efficient on these processors, as intel_idle > knows more than ACPI, as well as make Linux more immune to ACPI BIOS > bugs. > > The commit that first added ACPI support to the Linux driver [2] says: > > The main functional difference between intel_idle with this change > and the ACPI processor driver is that the former sets the target > residency to be equal to the exit latency (provided by _CST) for > C1-type C-states and to 3 times the exit latency value for the other > C-state types, whereas the latter obtains the target residency by > multiplying the exit latency by the same number (2 by default) for > all C-state types. Therefore it is expected that in general using > the former instead of the latter on the same system will lead to > improved energy-efficiency. > > This sounds less impressive and doesn't explain why not to just change > the standard ACPI driver to use the better latency settings. On the > Linux what might play also a role is that the mwait driver also gained > the option to combine the internal settings with what it reads from > ACPI. That would be probably harder to include in the generic ACPI > driver. > > This also raises the option to change the latency setting in Xen's > generic ACPI driver for the affected CPUs instead of touching > mwait-idle. > > Currently I'm interested in this driver mainly for S0ix support. I did > nearly all my testing while using the mwait-idle driver to keep > differences to Linux as small as possible. (At first by hacking in some > config for the Tiger Lake CPU of my test system. Now with this patch.). > At some point I observed worse S0ix residency with Xen's generic ACPI > idle driver than with mwait-idle. But when I tried to reproduce this for > writing this e-mail I wasn't able to reproduce this measurement and got > the same residency for both idle drivers. So either I did messed up my > previous measurements or I have some unaccounted changes in my test > setup. Taking all together perhaps more of an argument to see about making the ACPI driver better. If with the same ACPI data the mwait-idle one can (maybe) achieve better results, surely there's a way for the ACPI driver to achieve the same, likely with less of a change? Therefore only a few basic comments. > @@ -1360,24 +1381,27 @@ long set_cx_pminfo(uint32_t acpi_id, struct > xen_processor_power *power) > set_cx(acpi_power, &xen_cx); > } > > - if ( !cpu_online(cpu_id) ) > - { > - uint32_t apic_id = x86_cpu_to_apicid[cpu_id]; > - > - /* > - * If we've just learned of more available C states, wake the CPU if > - * it's parked, so it can go back to sleep in perhaps a deeper state. > - */ > - if ( park_offline_cpus && apic_id != BAD_APICID ) > - { > - unsigned long flags; > - > - local_irq_save(flags); > - apic_wait_icr_idle(); > - apic_icr_write(APIC_DM_NMI | APIC_DEST_PHYSICAL, apic_id); > - local_irq_restore(flags); > + if ( cpu_id == 0 && pm_idle_save == NULL ) { > + /* Now that we have the ACPI info from dom0, try again to setup > + * mwait-idle*/ > + ret = mwait_idle_init(&cpu_nfb, true); > + if (ret >= 0) { > + unsigned int cpu; > + /* mwait-idle took over, call it's initializer for all CPUs*/ > + for_each_present_cpu ( cpu ) > + { > + cpu_nfb.notifier_call(&cpu_nfb, CPU_UP_PREPARE, (void > *)(long)cpu); > + cpu_nfb.notifier_call(&cpu_nfb, CPU_ONLINE, (void > *)(long)cpu); > + if ( !cpu_online(cpu) ) { > + repark_cpu(cpu); > + } > + } Is this safe against a CPU coming online or going offline? > --- a/xen/arch/x86/cpu/mwait-idle.c > +++ b/xen/arch/x86/cpu/mwait-idle.c > @@ -71,13 +71,15 @@ > #undef PREFIX > #define PREFIX "mwait-idle: " > > +#define pr_err(fmt...) printk(KERN_ERR fmt) > + > #ifdef DEBUG > # define pr_debug(fmt...) printk(KERN_DEBUG fmt) > #else > # define pr_debug(fmt...) > #endif > > -static __initdata bool opt_mwait_idle = true; > +static bool opt_mwait_idle = true; > boolean_param("mwait-idle", opt_mwait_idle); The variable still isn't written post-boot, is it? If so, you want to use __ro_after_init rather than dropping the placement attribute altogether. > @@ -92,7 +94,7 @@ static unsigned int mwait_substates; > * exclusive C-states, this parameter has no effect. > */ > static unsigned int __ro_after_init preferred_states_mask; > -static char __initdata preferred_states[64]; > +static char preferred_states[64]; > string_param("preferred-cstates", preferred_states); Same here, I suppose. > @@ -1151,6 +1153,9 @@ static const struct idle_cpu idle_cpu_snr = { > .c1e_promotion = C1E_PROMOTION_DISABLE, > }; > > +static struct idle_cpu __read_mostly idle_cpu_acpi = { > +}; No need for an (empty) initializer. There are also numerous style issues throughout the patch, which would want correcting in case there's a convincing argument towards going this route. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |