|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH] xen/x86: Change stub page freeing to fix smt=0
On 03/06/2026 3:03 pm, Jason Andryuk wrote:
> On 2026-06-02 03:01, Roger Pau Monné wrote:
>> On Mon, Jun 01, 2026 at 05:07:52PM -0400, Jason Andryuk wrote:
>>> On 2026-06-01 13:00, Roger Pau Monné wrote:
>>>> On Tue, May 26, 2026 at 04:31:14PM -0400, Jason Andryuk wrote:
>>>>> A single stubs page is initialized with 0xcc and re-used, with
>>>>> multiple
>>>>> CPUs each using a portion of the shared page. In cpu_smpboot_free(),
>>>>> each stubs area is checked against 0xcc. When all are set to
>>>>> 0xcc, the
>>>>> page is freed.
>>>>>
>>>>> Booting a system with smt=0, CPU0 is initially setup, allocating the
>>>>> stubs page and initializing to 0xcc. When more CPUs are brought up,
>>>>> CPU1 is initialized and then immediately brough offline as it is the
>>>>> sibling of CPU0. Since the page was initially memset with 0xcc,
>>>>> cpu_smpboot_free() finds all stubs as 0xcc and frees the page.
>>>>> However, the page is still assigned to CPU0 and continues to be
>>>>> assigned
>>>>> to other CPUs.
>>>>>
>>>>> Meanwhile the page can be reallocated, which can lead to misbehavior.
>>>>> The particular instance was the stubs page re-used as a page table
>>>>> which
>>>>> later faulted when the entry was all 0xcc.
>>>>>
>>>>> Change to initializing the page as 0xd6/STUB_BUF_FREE, and
>>>>> initializing
>>>>> individual stubs as 0xcc/STUB_BUF_USED. 0xd6 now indicates
>>>>> unused, and
>>>>> 0xcc indicates used/assigned. When freeing a CPU, the stub is set to
>>>>> 0xd6, and the page is freed if all stubs are 0xd6. Initializing with
>>>>> STUB_BUF_FREE lets cpu_smpboot_free() a page that was only ever
>>>>> partially used.
>>>>>
>>>>> 0xd6/UDB is a 1 byte invalid opcode, which is similar to the existing
>>>>> use of 0xcc. 0xd6 is used to identify bug frames, but the stub addr
>>>>> (e.g. 0xffff82d07fffe000) fails the is_active_kernel_text()
>>>>> check. It
>>>>> should be okay to use here.
>>>>>
>>>>> Fixes: 7a66ac8d1633 ("x86: move syscall trampolines off the stack")
>>>>> Signed-off-by: Jason Andryuk <jason.andryuk@xxxxxxx>
>>>>> ---
>>>>> It would be nice to use get_page()/put_page() to let count_info
>>>>> handle
>>>>> reference counting, but they require an owning domain.
>>>>>
>>>>> The listed Fixes introduced the use of 0xcc, but the smt commit
>>>>> may have
>>>>> made it more problematic.
>>>>> Fixes: d8f974f1a646 ("x86: command line option to avoid use of
>>>>> secondary hyper-threads")
>>>>
>>>> Speaking with Andrew, we believe it might be easier to simply forego
>>>> the freeing of the page, possibly something like:
>>>>
>>>> diff --git a/xen/arch/x86/smpboot.c b/xen/arch/x86/smpboot.c
>>>> index ff05955bae40..62c6cbf4b561 100644
>>>> --- a/xen/arch/x86/smpboot.c
>>>> +++ b/xen/arch/x86/smpboot.c
>>>> @@ -990,19 +990,12 @@ static void cpu_smpboot_free(unsigned int
>>>> cpu, bool remove)
>>>> {
>>>> mfn_t mfn = _mfn(per_cpu(stubs.mfn, cpu));
>>>> unsigned char *stub_page = map_domain_page(mfn);
>>>> - unsigned int i;
>>>> memset(stub_page + STUB_BUF_CPU_OFFS(cpu), 0xcc,
>>>> STUB_BUF_SIZE);
>>>> - for ( i = 0; i < STUBS_PER_PAGE; ++i )
>>>> - if ( stub_page[i * STUB_BUF_SIZE] != 0xcc )
>>>> - break;
>>>> unmap_domain_page(stub_page);
>>>> destroy_xen_mappings(per_cpu(stubs.addr, cpu) & PAGE_MASK,
>>>> (per_cpu(stubs.addr, cpu) |
>>>> ~PAGE_MASK) + 1);
>>>> per_cpu(stubs.addr, cpu) = 0;
>>>> - per_cpu(stubs.mfn, cpu) = 0;
>>>> - if ( i == STUBS_PER_PAGE )
>>>> - free_domheap_page(mfn_to_page(mfn));
>>>> }
>>>> if ( IS_ENABLED(CONFIG_PV32) )
>>
>> I think I've made an oversight in the code above: if all 32 CPUs
>> sharing the same stubs page are offlined, the reference to the stubs
>> page is possibly lost (if CPUs are not parked) and a new stubs page
>> would be allocated if any of those CPUs is brought back online, thus
>> leaking the previous allocation. The simplest way to solve this would
>> be to introduce an array that indexes the stub pages, and replace the
>> logic in cpu_smpboot_alloc() that figures out whether stubs.mfn is set
>> for adjacent CPUs.
>
> Right, but I thought Andrew's point was that offlining 32 CPUs is
> unrealistic, so don't even bother tracking. If CPUs are offlined (and
> you somehow keep running), you can leak the page.
Perhaps I should rephrase that slightly.
I don't think we want to fully leak the page (after all, there *is* a
reference to it staying in l2_xenmap[]), but I also think we should
bother having logic attempting to free it. Software-offlining 32
adjacent threads is unrealistic, and not worth the effort (particularly
when the result is this).
Rework the code to use l2_xenmap[] as the source of truth, and allocate
a new ownerless page if the pagetables say one doesn't exist. When a
CPU comes up, it can derive addr from its CPU number, and pull MFN out
of the pagetable.
~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |