
Re: HVM guest only brings up a single vCPU


  • To: Julien Grall <julien@xxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 27 Aug 2021 12:52:25 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
  • Delivery-date: Fri, 27 Aug 2021 10:52:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 27.08.2021 12:35, Julien Grall wrote:
> Hi Jan,
> 
> On 27/08/2021 07:28, Jan Beulich wrote:
>> On 27.08.2021 01:42, Andrew Cooper wrote:
>>> On 26/08/2021 22:00, Julien Grall wrote:
>>>> Hi Andrew,
>>>>
>>>> While doing more testing today, I noticed that only one vCPU would be
>>>> brought up in an HVM guest with Xen 4.16 on my setup (QEMU):
>>>>
>>>> [    1.122180]
>>>> ================================================================================
>>>> [    1.122180] UBSAN: shift-out-of-bounds in
>>>> oss/linux/arch/x86/kernel/apic/apic.c:2362:13
>>>> [    1.122180] shift exponent -1 is negative
>>>> [    1.122180] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.14.0-rc7+ #304
>>>> [    1.122180] Hardware name: Xen HVM domU, BIOS 4.16-unstable 06/07/2021
>>>> [    1.122180] Call Trace:
>>>> [    1.122180]  dump_stack_lvl+0x56/0x6c
>>>> [    1.122180]  ubsan_epilogue+0x5/0x50
>>>> [    1.122180]  __ubsan_handle_shift_out_of_bounds+0xfa/0x140
>>>> [    1.122180]  ? cgroup_kill_write+0x4d/0x150
>>>> [    1.122180]  ? cpu_up+0x6e/0x100
>>>> [    1.122180]  ? _raw_spin_unlock_irqrestore+0x30/0x50
>>>> [    1.122180]  ? rcu_read_lock_held_common+0xe/0x40
>>>> [    1.122180]  ? irq_shutdown_and_deactivate+0x11/0x30
>>>> [    1.122180]  ? lock_release+0xc7/0x2a0
>>>> [    1.122180]  ? apic_id_is_primary_thread+0x56/0x60
>>>> [    1.122180]  apic_id_is_primary_thread+0x56/0x60
>>>> [    1.122180]  cpu_up+0xbd/0x100
>>>> [    1.122180]  bringup_nonboot_cpus+0x4f/0x60
>>>> [    1.122180]  smp_init+0x26/0x74
>>>> [    1.122180]  kernel_init_freeable+0x183/0x32d
>>>> [    1.122180]  ? _raw_spin_unlock_irq+0x24/0x40
>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>> [    1.122180]  kernel_init+0x17/0x140
>>>> [    1.122180]  ? rest_init+0x330/0x330
>>>> [    1.122180]  ret_from_fork+0x22/0x30
>>>> [    1.122244]
>>>> ================================================================================
>>>> [    1.123176] installing Xen timer for CPU 1
>>>> [    1.123369] x86: Booting SMP configuration:
>>>> [    1.123409] .... node  #0, CPUs:      #1
>>>> [    1.154400] Callback from call_rcu_tasks_trace() invoked.
>>>> [    1.154491] smp: Brought up 1 node, 1 CPU
>>>> [    1.154526] smpboot: Max logical packages: 2
>>>> [    1.154570] smpboot: Total of 1 processors activated (5999.99
>>>> BogoMIPS)
>>>>
>>>> I have tried a PV guest (same setup) and the kernel could bring up all
>>>> the vCPUs.
>>>>
>>>> Digging down, Linux will set smp_num_siblings to 0 (via
>>>> detect_ht_early()) and as a result will skip all the CPUs. The value
>>>> is retrieved from a CPUID leaf, so it sounds like we don't set that
>>>> leaf correctly.
>>>>
>>>> FWIW, I have also tried on Xen 4.11 and could spot the same issue.
>>>> Does this ring any bells for you?
>>>
>>> The CPUID data we give to guests is generally nonsense when it comes to
>>> topology.  By any chance does the hardware you're booting this on not
>>> have hyperthreading enabled/active to begin with?
>>
>> Well, I'd put the question slightly differently: What CPUID data does
>> qemu supply to Xen here? I could easily see us making an assumption
>> somewhere that is met by all hardware but is theoretically wrong to
>> make and not met by qemu, which then leads to further issues with what
>> we expose to our guest.
> I have pasted the output from cpuid on a baremetal Linux here:

"baremetal" still meaning it was running on qemu, not itself baremetal?

> https://pastebin.com/WvaXiXuL

   miscellaneous (1/ebx):
      process local APIC physical ID = 0x0 (0)
      maximum IDs for CPUs in pkg    = 0x0 (0)
      CLFLUSH line size              = 0x8 (8)
      brand index                    = 0x0 (0)
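For context, that "maximum IDs for CPUs in pkg" field is bits 23:16 of
leaf 1 EBX, which is what Linux's detect_ht_early() uses to seed
smp_num_siblings. A minimal userspace sketch of the same extraction
(an illustration only, not the kernel code), to dump what a guest
actually sees:

/*
 * Read CPUID leaf 1 and extract EBX[23:16], the "maximum IDs for CPUs
 * in pkg" field that detect_ht_early() turns into smp_num_siblings.
 * Build with: gcc -O2 -o leaf1 leaf1.c (x86 only).
 */
#include <stdio.h>
#include <cpuid.h>

int main(void)
{
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) {
        fprintf(stderr, "CPUID leaf 1 not available\n");
        return 1;
    }

    printf("leaf 1 EBX[23:16] (max logical CPU IDs per package) = %u\n",
           (ebx >> 16) & 0xff);
    return 0;
}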

As suspected, the field is zero, and hence will remain zero after being
multiplied by 2. I suppose the patch sent earlier should then get you
further.
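
For completeness, the UBSAN splat at the top is the downstream symptom
of that zero: apic_id_is_primary_thread() builds a sibling mask from
fls(smp_num_siblings) - 1, which is -1 when smp_num_siblings is 0. A
self-contained illustration of that arithmetic (paraphrasing the kernel
logic, not quoting it):

#include <stdio.h>

/* Mirrors the kernel's fls(): position of the most significant set bit,
 * 1-based, and 0 for an input of 0. */
static int my_fls(unsigned int x)
{
    int r = 0;

    while (x) {
        r++;
        x >>= 1;
    }
    return r;
}

int main(void)
{
    unsigned int smp_num_siblings = 0;  /* what detect_ht_early() ends up with here */
    int shift = my_fls(smp_num_siblings) - 1;

    /*
     * The kernel then does roughly: mask = (1U << shift) - 1;
     * With shift == -1 that is the "shift exponent -1 is negative"
     * UBSAN reports above, and in practice the resulting mask makes
     * every non-boot CPU look like a secondary thread, matching the
     * single-CPU bring-up in the log.
     */
    printf("shift exponent = %d\n", shift);
    return 0;
}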

Jan
