[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v2] x86/smpboot: Don't unconditionally call memguard_guard_stack() in cpu_smpboot_alloc()

On 15.10.2020 18:38, Andrew Cooper wrote:
> On 15/10/2020 16:16, Jan Beulich wrote:
>> On 15.10.2020 16:02, Andrew Cooper wrote:
>>> On 15/10/2020 09:50, Jan Beulich wrote:
>>>> On 14.10.2020 20:47, Andrew Cooper wrote:
>>>>> cpu_smpboot_alloc() is designed to be idempotent with respect to partially
>>>>> initialised state.  This occurs for S3 and CPU parking, where enough 
>>>>> state to
>>>>> handle NMIs/#MCs needs to remain valid for the entire lifetime of Xen, 
>>>>> even
>>>>> when we otherwise want to offline the CPU.
>>>>> For simplicity between various configuration, Xen always uses shadow stack
>>>>> mappings (Read-only + Dirty) for the guard page, irrespective of whether
>>>>> CET-SS is enabled.
>>>>> Unfortunately, the CET-SS changes in memguard_guard_stack() broke 
>>>>> idempotency
>>>>> by first writing out the supervisor shadow stack tokens with plain writes,
>>>>> then changing the mapping to being read-only.
>>>>> This ordering is strictly necessary to configure the BSP, which cannot 
>>>>> have
>>>>> the supervisor tokens be written with WRSS.
>>>>> Instead of calling memguard_guard_stack() unconditionally, call it only 
>>>>> when
>>>>> actually allocating a new stack.  Xenheap allocates are guaranteed to be
>>>>> writeable, and the net result is idempotency WRT configuring stack_base[].
>>>>> Fixes: 91d26ed304f ("x86/shstk: Create shadow stacks")
>>>>> Reported-by: Marek Marczykowski-Górecki <marmarek@xxxxxxxxxxxxxxxxxxxxxx>
>>>>> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>>>>> ---
>>>>> CC: Jan Beulich <JBeulich@xxxxxxxx>
>>>>> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>>>>> CC: Wei Liu <wl@xxxxxxx>
>>>>> This can more easily be demonstrated with CPU hotplug than S3, and the 
>>>>> absence
>>>>> of bug reports goes to show how rarely hotplug is used.
>>>>> v2:
>>>>>  * Don't break S3/CPU parking in combination with CET-SS.  v1 would, for 
>>>>> S3,
>>>>>    turn the BSP shadow stack into regular mappings, and #DF as soon as 
>>>>> the TLB
>>>>>    shootdown completes.
>>>> The code change looks correct to me, but since I don't understand
>>>> this part I'm afraid I may be overlooking something. I understand
>>>> the "turn the BSP shadow stack into regular mappings" relates to
>>>> cpu_smpboot_free()'s call to memguard_unguard_stack(), but I
>>>> didn't think we come through cpu_smpboot_free() for the BSP upon
>>>> entering or leaving S3.
>>> The v1 really did fix Marek's repro of the problem.
>>> The only possible way this can occur is if, somewhere, there is a call
>>> to cpu_smpboot_free() for CPU0 with remove=0 on the S3 path
>> I didn't think it was the BSP's stack that got written to, but the
>> first AP's before letting it run.
> Oh yes - my analysis was wrong.  The CPU notifier for CPU 1 to come up
> runs on CPU 0.
> So only the --- text was wrong.  Are you happy with the fix now?

Indeed I am:
Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>




Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.