
Re: [Xen-devel] [PATCH v2 04/23] x86: Don't use potentially incorrect CPUID values for topology information


  • To: Simon Gaiser <simon@xxxxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 7 Aug 2023 12:04:41 +0200
  • Cc: Wei Liu <wei.liu2@xxxxxxxxxx>, KarimAllah Ahmed <karahmed@xxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Jan H. Schönherr <jschoenh@xxxxxxxxx>, Matt Wilson <msw@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>, Anthony Liguori <aliguori@xxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Mon, 07 Aug 2023 10:04:56 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 07.08.2023 11:58, Simon Gaiser wrote:
> Jan Beulich:
>> On 07.08.2023 10:18, Simon Gaiser wrote:
>>> Anthony Liguori:
>>>> From: Jan H. Schönherr <jschoenh@xxxxxxxxx>
>>>>
>>>> Intel says for CPUID leaf 0Bh:
>>>>
>>>>   "Software must not use EBX[15:0] to enumerate processor
>>>>    topology of the system. This value in this field
>>>>    (EBX[15:0]) is only intended for display/diagnostic
>>>>    purposes. The actual number of logical processors
>>>>    available to BIOS/OS/Applications may be different from
>>>>    the value of EBX[15:0], depending on software and platform
>>>>    hardware configurations."
>>>>
>>>> And yet, we're using them to derive the number of cores in a package
>>>> and the number of siblings in a core.
>>>>
>>>> Derive the number of siblings and cores from EAX instead, which is
>>>> intended for that.
>>>>
>>>> Signed-off-by: Jan H. Schönherr <jschoenh@xxxxxxxxx>
>>>> ---
>>>>  xen/arch/x86/cpu/common.c | 4 ++--
>>>>  1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/xen/arch/x86/cpu/common.c b/xen/arch/x86/cpu/common.c
>>>> index e9588b3..22f392f 100644
>>>> --- a/xen/arch/x86/cpu/common.c
>>>> +++ b/xen/arch/x86/cpu/common.c
>>>> @@ -479,8 +479,8 @@ void detect_extended_topology(struct cpuinfo_x86 *c)
>>>>     initial_apicid = edx;
>>>>  
>>>>     /* Populate HT related information from sub-leaf level 0 */
>>>> -   core_level_siblings = c->x86_num_siblings = LEVEL_MAX_SIBLINGS(ebx);
>>>>     core_plus_mask_width = ht_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
>>>> +   core_level_siblings = c->x86_num_siblings = 1 << ht_mask_width;
>>>>  
>>>>     sub_index = 1;
>>>>     do {
>>>> @@ -488,8 +488,8 @@ void detect_extended_topology(struct cpuinfo_x86 *c)
>>>>  
>>>>             /* Check for the Core type in the implemented sub leaves */
>>>>             if ( LEAFB_SUBTYPE(ecx) == CORE_TYPE ) {
>>>> -                   core_level_siblings = LEVEL_MAX_SIBLINGS(ebx);
>>>>                     core_plus_mask_width = BITS_SHIFT_NEXT_LEVEL(eax);
>>>> +                   core_level_siblings = 1 << core_plus_mask_width;
>>>
>>>
>>> On the i5-1135G7 (4 cores with each 2 threads) I'm currently testing on
>>> I noticed that this change leads to core_level_siblings == 16 and
>>> therefore x86_max_cores == 8. If read from ebx like before this change
>>> and what Linux is still doing [1] it reads core_level_siblings == 8 (as
>>> expected?).
>>>
>>> What's the expected semantic here? If it's to derive the actual number
>>> of siblings (and therefore cores) in a package as the commit message
>>> suggests, the new code can't work even ignoring the example from my test
>>> system. It will always produce powers of 2, so can't get it right on a
>>> system with, say, 6 cores.
>>
>> The only use of the variable in question is in this statement:
>>
>>       c->x86_max_cores = (core_level_siblings / c->x86_num_siblings);
>>
>> Note the "max" in the name. This is how many _could_ be there, not how
>> many _are_ there, aiui.
> 
> I'm indeed not quite sure on the intended semantic, hence the question
> (although it's not clear to me what case that "could" would cover here).

"Could" covers the various reasons why APIC IDs may not be contiguous.
Consider a 6-core system: The APIC IDs need to cover at least 8 cores there.
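
To put numbers on that, here is a rough sketch (not the Xen code: id_bits()
and max_cores_for() are made-up helpers, and the thread/core counts are just
example inputs) of how the field widths translate into the "max" value when
each APIC ID field is sized to the next power of two:

```c
/* Hypothetical helper: smallest number of bits able to hold n distinct IDs. */
static unsigned int id_bits(unsigned int n)
{
    unsigned int bits = 0;

    while ((1u << bits) < n)
        bits++;
    return bits;
}

/*
 * Hypothetical helper: the value x86_max_cores would end up with if both
 * sibling counts are derived from the shift widths, as the patch does.
 */
static unsigned int max_cores_for(unsigned int threads_per_core,
                                  unsigned int cores)
{
    unsigned int ht_mask_width = id_bits(threads_per_core);
    unsigned int core_plus_mask_width = ht_mask_width + id_bits(cores);
    unsigned int num_siblings = 1u << ht_mask_width;
    unsigned int core_level_siblings = 1u << core_plus_mask_width;

    return core_level_siblings / num_siblings;
}
```

With 2 threads per core and 6 cores this yields 8, i.e. the number of core
IDs the APIC ID layout has room for, not the number of cores present.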

> It doesn't have to be identical but Linux says for its version of the
> variable:
> 
>     The number of cores in a package. This information is retrieved via
>     CPUID.
> 
> And if I look at its usage in set_nr_sockets in Xen:
> 
>     nr_sockets = last_physid(phys_cpu_present_map)
>                  / boot_cpu_data.x86_max_cores
>                  / boot_cpu_data.x86_num_siblings + 1;

This validly uses the field in the "max" sense, not in the "actual" one.
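
As an illustration only: the sketch below repeats the quoted arithmetic on
plain integers (nr_sockets_from() is a name I made up), under the assumption
of a hypothetical two-socket box with 6 cores and 2 threads per core, where
the second socket's APIC IDs start at 16 and the last thread hence has ID 27.

```c
/* Same arithmetic as the quoted set_nr_sockets() expression. */
static unsigned int nr_sockets_from(unsigned int last_physid,
                                    unsigned int max_cores,
                                    unsigned int num_siblings)
{
    return last_physid / max_cores / num_siblings + 1;
}
```

Under those assumptions nr_sockets_from(27, 8, 2) gives 2, whereas feeding
in the "actual" core count, nr_sockets_from(27, 6, 2), gives 3.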

> It seems to also be used in this meaning. At least on my test system
> I get last_physid == 7 (as I would have expected for 8 logical cpus). So
> dividing this by the 4 (number of cores) and 2 (threads per core) is
> what I think was intended here.

Would you mind providing raw data from your system: Both the raw CPUID
output for the leaf/leaves of interest here and the APIC IDs of all
threads?
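
For reference, a small sketch of how I'd expect the raw leaf 0Bh registers
to be picked apart. The three macros are written down from memory and should
be checked against common.c; struct leafb_level and decode_leafb() are
hypothetical names used only here.

```c
#include <stdint.h>

/* Field extraction, as I recall the macros used by detect_extended_topology(). */
#define LEAFB_SUBTYPE(ecx)          (((ecx) >> 8) & 0xff)
#define BITS_SHIFT_NEXT_LEVEL(eax)  ((eax) & 0x1f)
#define LEVEL_MAX_SIBLINGS(ebx)     ((ebx) & 0xffff)

/* Hypothetical container for one decoded sub-leaf. */
struct leafb_level {
    unsigned int type;                /* 1 = SMT, 2 = Core */
    unsigned int shift;               /* EAX[4:0]: APIC ID shift to next level */
    unsigned int siblings_ebx;        /* EBX[15:0]: display/diagnostic only */
    unsigned int siblings_from_shift; /* 1 << shift, as computed by the patch */
};

static struct leafb_level decode_leafb(uint32_t eax, uint32_t ebx, uint32_t ecx)
{
    struct leafb_level l;

    l.type = LEAFB_SUBTYPE(ecx);
    l.shift = BITS_SHIFT_NEXT_LEVEL(eax);
    l.siblings_ebx = LEVEL_MAX_SIBLINGS(ebx);
    l.siblings_from_shift = 1u << l.shift;
    return l;
}
```

E.g. made-up register values eax=0x4, ebx=0x8, ecx=0x201 would decode to a
Core level with shift 4, 8 siblings per EBX, and 16 when derived from the
shift.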

Jan



 

