
Re: cpupool / credit2 misuse of xfree() (was: Re: [BUG] Xen causes a host hang by using xen-hptool cpu-offline)


  • To: Juergen Gross <jgross@xxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Mon, 1 Aug 2022 10:41:46 +0200
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, "Gao, Ruifeng" <ruifeng.gao@xxxxxxxxx>
  • Delivery-date: Mon, 01 Aug 2022 08:41:58 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 01.08.2022 09:38, Juergen Gross wrote:
> On 27.07.22 08:32, Jan Beulich wrote:
>> On 27.07.2022 03:19, Gao, Ruifeng wrote:
>>> Problem Description:
>>> Trying to execute "/usr/local/sbin/xen-hptool cpu-offline <cpuid>", the 
>>> host will hang immediately.
>>>
>>> Version-Release and System Details:
>>> Platform: Ice Lake Server
>>> Host OS: Red Hat Enterprise Linux 8.3 (Ootpa)
>>> Kernel: 5.19.0-rc6
>>> HW: Intel(R) Xeon(R) Gold 6336Y CPU @ 2.40GHz
>>> Xen Version: 4.17-unstable (ab2977b027-dirty)
>>>
>>> Reproduce Steps:
>>> 1. Boot from Xen and check the information:
>>> [root@icx-2s1 ~]# xl info
>>> host                   : icx-2s1
>>> release                : 5.19.0-rc6
>>> xen_version            : 4.17-unstable
>>> xen_caps               : xen-3.0-x86_64 hvm-3.0-x86_32 hvm-3.0-x86_32p 
>>> hvm-3.0-x86_64
>>> platform_params        : virt_start=0xffff800000000000
>>> xen_changeset          : Thu Jul 14 19:45:36 2022 +0100 git:ab2977b027-dirty
>>> 2. Execute the cpu-offline command, here cpuid is 48 as an example:
>>> [root@icx-2s1 ~]# /usr/local/sbin/xen-hptool cpu-offline 48
>>>
>>> Actual Results:
>>> The host will hang immediately.
>>
>> Well, it crashes (which is an important difference). Also, you've
>> hidden the important details (which would allow one to easily
>> identify what area the issue is in) quite well in the attachment.
>>
>> Jürgen (and possibly George / Dario),
>>
>> this
>>
>> (XEN) Xen call trace:
>> (XEN)    [<ffff82d04023be76>] R xfree+0x150/0x1f7
>> (XEN)    [<ffff82d040248795>] F 
>> common/sched/credit2.c#csched2_free_udata+0xc/0xe
>> (XEN)    [<ffff82d040259169>] F schedule_cpu_rm+0x38d/0x4b3
>> (XEN)    [<ffff82d0402430ca>] F 
>> common/sched/cpupool.c#cpupool_unassign_cpu_finish+0x17e/0x22c
>> (XEN)    [<ffff82d04021d402>] F 
>> common/sched/cpupool.c#cpu_callback+0x3fb/0x4dc
>> (XEN)    [<ffff82d040229fc3>] F notifier_call_chain+0x6b/0x96
>> (XEN)    [<ffff82d040204df7>] F 
>> common/cpu.c#cpu_notifier_call_chain+0x1b/0x33
>> (XEN)    [<ffff82d040204e33>] F common/cpu.c#_take_cpu_down+0x24/0x2b
>> (XEN)    [<ffff82d040204e43>] F common/cpu.c#take_cpu_down+0x9/0x10
>> (XEN)    [<ffff82d040231517>] F 
>> common/stop_machine.c#stopmachine_action+0x86/0x96
>> (XEN)    [<ffff82d040231cc5>] F common/tasklet.c#do_tasklet_work+0x72/0xa5
>> (XEN)    [<ffff82d040231f42>] F do_tasklet+0x58/0x8a
>> (XEN)    [<ffff82d040320b60>] F arch/x86/domain.c#idle_loop+0x8d/0xee
>> (XEN)
>> (XEN)
>> (XEN) ****************************************
>> (XEN) Panic on CPU 48:
>> (XEN) Assertion '!in_irq() && (local_irq_is_enabled() || num_online_cpus() 
>> <= 1)' failed at common/xmalloc_tlsf.c:704
>> (XEN) ****************************************
>>
>> is pointing at the problem quite clearly. Conceptually I think it
>> has always been wrong to call xfree() from stop-machine context. It
>> just so happened that we got away with that so far, because the CPU
>> being brought down was the only one using the respective functions
>> (and hence there was no other risk of locking issues).
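>>
>> For reference, the failing check is ASSERT_ALLOC_CONTEXT() in
>> common/xmalloc_tlsf.c, which (modulo exact wording) boils down to
>>
>>     /* Allocating/freeing with interrupts off is tolerable only if
>>      * no other CPU is online (i.e. no cross-CPU locking can bite). */
>>     #define ASSERT_ALLOC_CONTEXT() \
>>         ASSERT(!in_irq() && (local_irq_is_enabled() || \
>>                              num_online_cpus() <= 1))
>>
>> Stop-machine context runs with interrupts off on all online CPUs,
>> so any xfree() reaching the pool allocator from there trips the
>> assertion.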
>>
>> The question is whether we want to continue building upon this (and
>> hence the involved assertion would need to "learn" to ignore
>> stop-machine context) or whether instead the freeing of the memory
>> here can be deferred, e.g. to be taken care of by the CPU driving
>> the offlining process.
> 
> This is even more complicated.
> 
> I think ASSERT_ALLOC_CONTEXT() will trigger more often, especially
> with core scheduling enabled. In fact I think this is the reason why
> I've seen very rare, strange failures with core scheduling when
> trying cpu hotplug operations, as there are even xmalloc() calls in
> stop-machine context.
> 
> I'm seeing the following possibilities:
> 
> 1) Pre-allocating the needed data and deferring freeing of no longer
>     needed data when taking a cpu down. Apart from some refactoring
>     in common/sched/cpupool.c and common/sched/core.c this should be
>     doable.
> 
> 2) In case stop_machine() is called for action on only one cpu,
>     allow memory allocation and freeing with interrupts off, and
>     flush the TLBs locally when enabling interrupts again (see the
>     sketch below). This would require rather limited changes, but
>     wouldn't be as clean as the other approach.
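> 
>     As a purely illustrative sketch of 2) (the helper name is
>     invented, not existing code):
> 
>     /* Hypothetical predicate: are we the lone CPU executing a
>      * stop_machine() action covering just this one cpu? */
>     bool in_single_cpu_stopmachine(void);
> 
>     #define ASSERT_ALLOC_CONTEXT()                        \
>         ASSERT(!in_irq() &&                               \
>                (local_irq_is_enabled() ||                 \
>                 num_online_cpus() <= 1 ||                 \
>                 in_single_cpu_stopmachine()))
> 
>     plus the allocation/freeing paths would need to note that a TLB
>     flush was skipped, and perform it locally once interrupts get
>     enabled again.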
> 
> Any preferences? I'd be fine with both variants and could write the
> patches.

I'd prefer 1 over 2, but in the unlikely event that 1 ends up unwieldy
I could live with an extensively commented form of 2.
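
To illustrate the shape I'd expect for 1 (all names below invented, a
sketch only, not an actual patch): the free hooks invoked from
schedule_cpu_rm() would merely record what is to be freed, e.g.

    /* Hypothetical: memory which stop-machine context must not
     * xfree() itself gets queued here instead. */
    struct sched_rm_deferred {
        void *vpriv;   /* per-unit scheduler data */
        void *ppriv;   /* per-CPU scheduler data */
    };

    /* Run later on the CPU driving the offlining, with interrupts
     * enabled again, to do the actual freeing. */
    static void sched_rm_cleanup(struct sched_rm_deferred *def)
    {
        xfree(def->vpriv);
        xfree(def->ppriv);
    }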

Jan