[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH 03/22] x86/dom0: only disable SMAP for the PV dom0 build
On 30/07/2024 11:55 am, Roger Pau Monné wrote: > On Mon, Jul 29, 2024 at 06:51:31PM +0100, Andrew Cooper wrote: >> On 29/07/2024 5:18 pm, Roger Pau Monné wrote: >>> On Mon, Jul 29, 2024 at 04:52:22PM +0100, Andrew Cooper wrote: >>>> On 29/07/2024 12:53 pm, Jan Beulich wrote: >>>>> On 26.07.2024 17:21, Roger Pau Monne wrote: >>>>>> The PVH dom0 builder doesn't switch page tables and has no need to run >>>>>> with >>>>>> SMAP disabled. >>>>>> >>>>>> Put the SMAP disabling close to the code region where it's necessary, as >>>>>> it >>>>>> then becomes obvious why switch_cr3_cr4() is required instead of >>>>>> write_ptbase(). >>>>>> >>>>>> Note removing SMAP from cr4_pv32_mask is not required, as we never jump >>>>>> into >>>>>> guest context, and hence updating the value of cr4_pv32_mask is not >>>>>> relevant. >>>>> I'm okay-ish with that being dropped, but iirc the goal was to keep the >>>>> variable in sync with CPU state. >>>> Removing SMAP from cr4_pv32_mask is necessary. >>>> >>>> Otherwise IST vectors will reactive SMAP behind the back of the dombuilder. >>>> >>>> This will probably only manifest in practice in a CONFIG_PV32=y build, >>> Sorry, I'm possibly missing some context here. When running the dom0 >>> builder we switch to the guest page-tables, but not to the guest vCPU, >>> (iow: current == idle) and hence the context is always the Xen >>> context. >> Correct. >> >>> Why would the return path of the IST use cr4_pv32_mask when the >>> context in which the IST happened was the Xen one, and the current >>> vCPU is the idle one (a 64bit PV guest from Xen's PoV). >>> >>> My understanding is that cr4_pv32_mask should only be used when the >>> current context is running a 32bit PV vCPU. >> This logic is evil to follow, because you need to look at both >> cr4_pv32_mask and XEN_CR4_PV32_BITS to see both halves of it. >> >> Notice how cr4_pv32_restore() only ever OR's cr4_pv32_mask into %cr4? >> >> CR4_PV32_RESTORE is called from every entry path which *might* have come >> from a 32bit PV guest, and it always results in Xen having SMEP/SMAP >> active (as applicable). This includes NMI. >> >> The change is only undone in compat_restore_all_guest(), where >> XEN_CR4_PV32_BITS is cleared from %cr4 iff returning to Ring1/2. This >> is logic cunningly disguised in the use of the Parity flag. >> >> >> Because the NMI handler does reactive SMEP/SMAP (based on the value in >> cr4_pv32_mask), and returning to Xen does not pass through >> compat_restore_all_guest(), taking an NMI in the middle of of the >> dombuilder will reactive SMAP behind your back. > After further conversations with Andrew we believe the current > disabling of X86_CR4_SMAP in %cr4 during dom0 build is not safe. > > Regardless of whether cr4_pv32_mask is properly adjusted return to Xen > context from interrupt would be done with SMAP enabled if > X86_FEATURE_XEN_SMAP is set. Sorry - that's not what I intended to convey. The logic prior to this patch is safe. SMAP is cleared from cr4_pv32_mask before clearing CR4.SMAP, and reinstated in the opposite order. Therefore, an NMI hitting the region won't reactivate SMAP because it's not (instantaniously) set in cr4_pv32_mask. Arguably it wants some barrier()'s for clarity, and an explanation of why this works. The problem your patch has is that by not clearing SMAP from cr4_pv32_mask, it becomes unsafe iff an NMI/#MC/#DB hits the region. > I will send a new patch that uses stac/clac in order to disable SMAP > (if required) around the dom0 builder code that switches to the guest > page-tables. Either way - this is a much cleaner solution. ~Andrew
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |