|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v6] xenpm: Add get-intel-temp subcommand
On 12.02.2026 11:19, Teddy Astie wrote:
> Le 09/02/2026 à 15:21, Jan Beulich a écrit :
>> On 09.02.2026 11:31, Teddy Astie wrote:
>>> @@ -93,6 +96,7 @@ void show_help(void)
>>> " units default to
>>> \"us\" if unspecified.\n"
>>> " truncates
>>> un-representable values.\n"
>>> " 0 lets the
>>> hardware decide.\n"
>>> + " get-intel-temp [cpuid] get Intel CPU
>>> temperature of <cpuid> or all\n"
>>
>> Sorry, thinking about it only now: Do we really want to build in the vendor
>> name to a command? "get-temp" would allow for adding an AMD implementation
>> later on?
>
> AMD CPUs expose a PCI device that can be interacted with to get
> temperatures; which is then exposed in hwmon interface of Linux. That
> wouldn't be practical to implement hwmon interfaces in xenpm.
You understand though that AMD only was an example (the most natural one)?
>>> @@ -1354,6 +1358,131 @@ void enable_turbo_mode(int argc, char *argv[])
>>> errno, strerror(errno));
>>> }
>>>
>>> +static int fetch_dts_temp(xc_interface *xch, uint32_t cpu, bool package,
>>> int *temp)
>>> +{
>>> + xc_resource_entry_t entries[] = {
>>> + { .idx = package ? MSR_PACKAGE_THERM_STATUS :
>>> MSR_IA32_THERM_STATUS },
>>> + { .idx = MSR_TEMPERATURE_TARGET },
>>> + };
>>> + struct xc_resource_op ops = {
>>> + .cpu = cpu,
>>> + .entries = entries,
>>> + .nr_entries = ARRAY_SIZE(entries),
>>> + };
>>> + int tjmax;
>>> +
>>> + int ret = xc_resource_op(xch, 1, &ops);
>>> +
>>> + switch ( ret )
>>> + {
>>> + case -1:
>>> + /* xc_resource_op returns -1 in out of memory scenarios */
>>> + errno = -ENOMEM;
>>
>> And xc_resource_op() doesn't itself set / inherit a properly set errno?
>> We don't want to override what the C library may have set.
>
> I'm not sure what to do then. I guess we want to reset errno before
> entering xc_resource_op (so we won't report stale errno), but we would
> still need to consider cases where xc_resource_op returns failure
> without errno having being set ?
Should xc_resource_op() perhaps better be corrected to behave consistently
wrt the setting of errno?
>>> + case 1:
>>> + {
>>> + /*
>>> + * The CPU doesn't support MSR_TEMPERATURE_TARGET, we assume it's
>>> 100
>>> + * which is correct aside a few selected Atom CPUs. Check Linux
>>> + * kernel's coretemp.c for more information.
>>> + */
>>> + static bool has_reported_once = false;
>>> +
>>> + if ( !has_reported_once )
>>> + {
>>> + fprintf(stderr, "MSR_TEMPERATURE_TARGET is not supported,
>>> assume "
>>> + "tjmax = 100, readings may be incorrect.\n");
>>> + has_reported_once = true;
>>> + }
>>> +
>>> + tjmax = 100;
>>> + break;
>>> + }
>>> +
>>> + case 2:
>>> + tjmax = (entries[1].val >> 16) & 0xff;
>>> + break;
>>> +
>>> + default:
>>> + if ( ret > 0 )
>>> + {
>>> + fprintf(stderr, "Got unexpected xc_resource_op return value:
>>> %d", ret);
>>> + errno = -EINVAL;
>>> + }
>>> + else
>>> + errno = ret;
>>
>> Why would this be? How do you know "ret" holds a value suitable for putting
>> in errno?
>
> Aside -1 in out of memory situations when xc_resource_op returns -1, in
> other failure cases, it returns a hypercall (e.g multicall_op) return
> code, which is supposed to match a errno.
As above - generally all layers want to deal with errno in a consistent
manner. If bugs in lower layers are too intrusive to fix right away, such
workarounds will at least want commenting upon (so it is easy to determine
at what point they can be removed again).
Jan
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |