[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()

On 15.10.2020 16:02, Andrew Cooper wrote:
> On 15/10/2020 09:50, Jan Beulich wrote:
>> On 14.10.2020 20:47, Andrew Cooper wrote:
>>> cpu_smpboot_alloc() is designed to be idempotent with respect to partially
>>> initialised state.  This occurs for S3 and CPU parking, where enough state 
>>> to
>>> handle NMIs/#MCs needs to remain valid for the entire lifetime of Xen, even
>>> when we otherwise want to offline the CPU.
>>> For simplicity between various configuration, Xen always uses shadow stack
>>> mappings (Read-only + Dirty) for the guard page, irrespective of whether
>>> CET-SS is enabled.
>>> Unfortunately, the CET-SS changes in memguard_guard_stack() broke 
>>> idempotency
>>> by first writing out the supervisor shadow stack tokens with plain writes,
>>> then changing the mapping to being read-only.
>>> This ordering is strictly necessary to configure the BSP, which cannot have
>>> the supervisor tokens be written with WRSS.
>>> Instead of calling memguard_guard_stack() unconditionally, call it only when
>>> actually allocating a new stack.  Xenheap allocates are guaranteed to be
>>> writeable, and the net result is idempotency WRT configuring stack_base[].
>>> Fixes: 91d26ed304f ("x86/shstk: Create shadow stacks")
>>> Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>> ---
>>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>> CC: Wei Liu <wl@xxxxxxx>
>>> This can more easily be demonstrated with CPU hotplug than S3, and the 
>>> absence
>>> of bug reports goes to show how rarely hotplug is used.
>>> v2:
>>>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for S3,
>>>    turn the BSP shadow stack into regular mappings, and #DF as soon as the 
>>> TLB
>>>    shootdown completes.
>> The code change looks correct to me, but since I don't understand
>> this part I'm afraid I may be overlooking something. I understand
>> the "turn the BSP shadow stack into regular mappings" relates to
>> cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
>> didn't think we come through cpu_smpboot_free() for the BSP upon
>> entering or leaving S3.
> The v1 really did fix Marek's repro of the problem.
> The only possible way this can occur is if, somewhere, there is a call
> to cpu_smpboot_free() for CPU0 with remove=0 on the S3 path

I didn't think it was the BSP's stack that got written to, but the
first AP's before letting it run.




Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.