RFC: Proposal for supporting EL1 MPU region context switch in Xen
# Proposal for supporting EL1 MPU region context switch in Xen

This proposal introduces a design for supporting EL1 MPU region context switch for guests.

## Purpose

We would like to support the PMSAv8-64 translation regime at EL1 for Xen guests. We would also like to configure the number of supported MPU regions on a per-guest basis.

## Interface:

We propose to add a new device tree property `mpu`, which specifies the number of MPU memory regions that a guest is allowed to use and governs whether the EL1 MPU and the PMSAv8-64 translation regime are enabled at EL1 for a guest.

The property is specified as follows:

- mpu

  Optional. A 32-bit integer specifying the value returned by accesses to MPUIR_EL1.REGION (or MPUIR.REGION on AArch32). This property also governs whether the EL1 MPU and the PMSAv8-64 translation regime are enabled at EL1.

  Behavior:

  - `mpu = <0>;`

    Disables the EL1 MPU and the PMSAv8-64 translation regime at EL1. This is also the default behavior if the `mpu` property is omitted.

  - `mpu;` (property present with no value)

    Enables the EL1 MPU and the PMSAv8-64 translation regime at EL1. The value returned by accesses to MPUIR_EL1.REGION (or MPUIR.REGION on AArch32) will match the actual hardware register value.

  - `mpu = <N>;` (N > 0)

    Enables the EL1 MPU and the PMSAv8-64 translation regime at EL1. The value returned by accesses to MPUIR_EL1.REGION (or MPUIR.REGION on AArch32) is set to N. This value must not exceed the actual hardware-supported number of regions.

Domain creation will fail and the system will halt if:

- A non-zero value is specified but exceeds the hardware-supported number of MPU regions.
- A non-zero value is specified but the kernel is not built with `CONFIG_MPU`.

A new field - `nr_mpu_regions` - will be added to the arm-specific `struct arch_domain` in xen/arch/arm/include/asm/domain.h to store the value of the `mpu` device tree property. The field has type uint8_t.
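The property rules above can be sketched as a small resolution function. This is only an illustration under stated assumptions: `resolve_mpu_regions`, `enum mpu_prop_status` and all parameter names are hypothetical and are not existing Xen identifiers. The sketch also assumes that a bare `mpu;` on a build without `CONFIG_MPU` is rejected in the same way as an explicit non-zero value, which the text above does not spell out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical result codes for this sketch; not Xen identifiers. */
enum mpu_prop_status {
    MPU_PROP_OK = 0,
    MPU_PROP_ERR_TOO_MANY = -1,  /* request exceeds the hardware region count */
    MPU_PROP_ERR_NO_CONFIG = -2, /* non-zero request without CONFIG_MPU */
};

/*
 * present:    the `mpu` property exists in the domain node
 * has_value:  the property carries a 32-bit cell (false for a bare `mpu;`)
 * value:      the cell value, ignored when has_value is false
 * hw_regions: region count reported by the hardware MPUIR register
 * config_mpu: whether the build has CONFIG_MPU enabled
 *
 * On success, *nr_mpu_regions receives the value to store in the
 * per-domain field; on failure it is left untouched.
 */
static enum mpu_prop_status
resolve_mpu_regions(bool present, bool has_value, uint32_t value,
                    uint8_t hw_regions, bool config_mpu,
                    uint8_t *nr_mpu_regions)
{
    uint32_t requested;

    assert(nr_mpu_regions != NULL);

    /* Omitted property behaves like `mpu = <0>;`. */
    if (!present) {
        *nr_mpu_regions = 0;
        return MPU_PROP_OK;
    }

    /* A bare `mpu;` mirrors the hardware value. */
    requested = has_value ? value : hw_regions;

    if (requested == 0) {
        *nr_mpu_regions = 0;
        return MPU_PROP_OK;
    }

    if (!config_mpu)
        return MPU_PROP_ERR_NO_CONFIG;

    if (requested > hw_regions)
        return MPU_PROP_ERR_TOO_MANY;

    /* requested <= hw_regions <= 255, so the narrowing is lossless. */
    *nr_mpu_regions = (uint8_t)requested;
    return MPU_PROP_OK;
}
```

Checking against `hw_regions` before narrowing is what keeps a 32-bit property value safe to store in a `uint8_t` field.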

## Trapping and Emulation:

In order to control the number of regions supported by the EL1 MPU we must trap accesses to MPUIR_EL1. Additionally, to prevent access to or modification of MPU regions outside the range of the configured number of supported regions, we must also trap accesses to PRENR_EL1, PRSELR_EL1, PRBAR_EL1, PRBAR<n>_EL1, PRLAR_EL1, PRLAR<n>_EL1 (AArch64) and PRSELR, PRBAR, PRBAR<n>, PRLAR, PRLAR<n> (AArch32).

### Trapping accesses to MPUIR_EL1 (AArch64) and MPUIR (AArch32):

Access to MPUIR_EL1/MPUIR will be trapped and emulated, returning the value of the `nr_mpu_regions` field.

- On AArch64: if HCR_EL2.TID1 == 1, EL1 accesses to MPUIR_EL1, REVIDR_EL1, AIDR_EL1 are trapped to EL2 [^1]. We will emulate these as follows:
  - MPUIR_EL1: return value of `nr_mpu_regions` field
  - REVIDR_EL1: Unmodified value read from hardware
  - AIDR_EL1: Unmodified value read from hardware
- On AArch32: if HCR.TID1 == 1, EL1 accesses to MPUIR, TCMTR, TLBTR, REVIDR, AIDR are trapped to Hyp mode [^2]. We will emulate these as follows:
  - MPUIR: return value of `nr_mpu_regions` field
  - TCMTR: Unmodified value read from hardware
  - TLBTR: Unmodified value read from hardware
  - REVIDR: Unmodified value read from hardware
  - AIDR: Unmodified value read from hardware

### Trapping accesses to virtual memory control registers

Accesses to the PMSAv8-64 virtual memory control registers from EL1 must also be trapped to EL2, to prevent modification of MPU regions outside the range of the configured number of supported regions.

- On AArch64: if HCR_EL2.TVM == 1, EL1 write accesses to virtual memory control registers are trapped to EL2 [^1].
  We will emulate these as follows:
  - SCTLR_EL1: Unmodified value written to hardware
  - TTBR0_EL1: Unmodified value written to hardware
  - TTBR1_EL1: Unmodified value written to hardware
  - TCR_EL1: Unmodified value written to hardware
  - ESR_EL1: Unmodified value written to hardware
  - FAR_EL1: Unmodified value written to hardware
  - AFSR0_EL1: Unmodified value written to hardware
  - AFSR1_EL1: Unmodified value written to hardware
  - MAIR_EL1: Unmodified value written to hardware
  - AMAIR_EL1: Unmodified value written to hardware
  - CONTEXTIDR_EL1: Unmodified value written to hardware
  - PRENR_EL1: If value has any set bits in positions corresponding to MPU regions >= `nr_mpu_regions`, i.e. `value & ~((1U << nr_mpu_regions) - 1)` is non-zero, the write is ignored. Otherwise, the unmodified value is written to hardware. This expression only applies as written when `nr_mpu_regions` < 32; a shift by 32 or more is undefined for a 32-bit operand in C.
  - PRSELR_EL1: If value >= `nr_mpu_regions` the write causes a guest crash. This deviates from the TRM, which states that the value of the register becomes UNKNOWN. Otherwise, the unmodified value is written to hardware.
  - PRBAR_EL1: Unmodified value written to hardware
  - PRBAR<n>_EL1: If `n` is such that the value of `PRSELR_EL1.REGION<7:4>:n` >= `nr_mpu_regions` the write causes a guest crash. This deviates from the TRM, which states that invalid writes make the values of all PRBAR_EL1 registers UNKNOWN. Otherwise the unmodified value is written to hardware.
  - PRLAR_EL1: Unmodified value written to hardware
  - PRLAR<n>_EL1: If `n` is such that the value of `PRSELR_EL1.REGION<7:4>:n` >= `nr_mpu_regions` the write causes a guest crash. This deviates from the TRM, which states that invalid writes make the values of all PRLAR_EL1 registers UNKNOWN. Otherwise the unmodified value is written to hardware.
- On AArch64: if HCR_EL2.TRVM == 1, EL1 read accesses to virtual memory control registers are trapped to EL2 [^1].
  We will emulate these as follows:
  - SCTLR_EL1: Unmodified value read from hardware
  - TTBR0_EL1: Unmodified value read from hardware
  - TTBR1_EL1: Unmodified value read from hardware
  - TCR_EL1: Unmodified value read from hardware
  - ESR_EL1: Unmodified value read from hardware
  - FAR_EL1: Unmodified value read from hardware
  - AFSR0_EL1: Unmodified value read from hardware
  - AFSR1_EL1: Unmodified value read from hardware
  - MAIR_EL1: Unmodified value read from hardware
  - AMAIR_EL1: Unmodified value read from hardware
  - CONTEXTIDR_EL1: Unmodified value read from hardware
  - PRENR_EL1: Unmodified value read from hardware
  - PRSELR_EL1: Unmodified value read from hardware
  - PRBAR_EL1: Unmodified value read from hardware
  - PRBAR<n>_EL1: If `n` is such that the value of `PRSELR_EL1.REGION<7:4>:n` >= `nr_mpu_regions` the read causes a guest crash. This deviates from the TRM, which states that invalid reads return an UNKNOWN value. Otherwise the unmodified value is read from hardware.
  - PRLAR_EL1: Unmodified value read from hardware
  - PRLAR<n>_EL1: If `n` is such that the value of `PRSELR_EL1.REGION<7:4>:n` >= `nr_mpu_regions` the read causes a guest crash. This deviates from the TRM, which states that invalid reads return an UNKNOWN value. Otherwise the unmodified value is read from hardware.
- On AArch32: if HCR.TVM == 1, EL1 write accesses to memory control registers are trapped to Hyp mode [^2].
  We will emulate these as follows:
  - SCTLR: Unmodified value written to hardware
  - DFSR: Unmodified value written to hardware
  - IFSR: Unmodified value written to hardware
  - DFAR: Unmodified value written to hardware
  - IFAR: Unmodified value written to hardware
  - ADFSR: Unmodified value written to hardware
  - AIFSR: Unmodified value written to hardware
  - PRRR: Unmodified value written to hardware
  - NMRR: Unmodified value written to hardware
  - MAIR0: Unmodified value written to hardware
  - MAIR1: Unmodified value written to hardware
  - AMAIR0: Unmodified value written to hardware
  - AMAIR1: Unmodified value written to hardware
  - CONTEXTIDR: Unmodified value written to hardware
  - PRSELR: If value >= `nr_mpu_regions` the write causes a guest crash. Otherwise, the unmodified value is written to hardware.
  - PRBAR: Unmodified value written to hardware
  - PRBAR<n>: If `n` >= `nr_mpu_regions` the write causes a guest crash. Otherwise the unmodified value is written to hardware.
  - PRLAR: Unmodified value written to hardware
  - PRLAR<n>: If `n` >= `nr_mpu_regions` the write causes a guest crash. Otherwise the unmodified value is written to hardware.
- On AArch32: if HCR.TRVM == 1, EL1 read accesses to memory control registers are trapped to Hyp mode [^2].
  We will emulate these as follows:
  - SCTLR: Unmodified value read from hardware
  - DFSR: Unmodified value read from hardware
  - IFSR: Unmodified value read from hardware
  - DFAR: Unmodified value read from hardware
  - IFAR: Unmodified value read from hardware
  - ADFSR: Unmodified value read from hardware
  - AIFSR: Unmodified value read from hardware
  - PRRR: Unmodified value read from hardware
  - NMRR: Unmodified value read from hardware
  - MAIR0: Unmodified value read from hardware
  - MAIR1: Unmodified value read from hardware
  - AMAIR0: Unmodified value read from hardware
  - AMAIR1: Unmodified value read from hardware
  - CONTEXTIDR: Unmodified value read from hardware
  - PRSELR: Unmodified value read from hardware
  - PRBAR: Unmodified value read from hardware
  - PRBAR<n>: If `n` >= `nr_mpu_regions` the read causes a guest crash. Otherwise the unmodified value is read from hardware.
  - PRLAR: Unmodified value read from hardware
  - PRLAR<n>: If `n` >= `nr_mpu_regions` the read causes a guest crash. Otherwise the unmodified value is read from hardware.

On context switch, we need to ensure that:

- PRSELR_EL1 (AArch64) or PRSELR (AArch32) is saved/restored.
- Base and Limit registers for all MPU regions up to the number of regions supported by the guest, i.e. regions [0, nr_mpu_regions-1], are saved/restored.

It is not necessary to zero the MPU memory regions outside of this range, as these are rendered inaccessible to a guest via trapping and emulation of the virtual memory control registers.

## Interaction with existing handling of Set/Way operations

In order to handle Set/Way operations the following policy is used [^3]:

- If we trap a S/W operation, we enable VM trapping (HCR_EL2.TVM/HCR.TVM == 1) to detect caches being turned on/off, and do a full clean.
- Once the caches are enabled, we disable VM trapping (HCR_EL2.TVM/HCR.TVM == 0).

This causes an issue, because VM control register trapping will be switched off any time caches change state from disabled to enabled.
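
To make the conflict concrete, the Set/Way policy can be modelled as a tiny state machine. This is a hedged sketch and not the code referenced in [^3]: `struct vcpu_model`, its fields and both functions are hypothetical names. The `needs_vm_trap` flag is an assumption of the sketch: left false it reproduces the policy described above, set true it models a guest that relies on VM trapping staying enabled for the MPU register checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-vCPU model; field names are illustrative only. */
struct vcpu_model {
    bool tvm;           /* models HCR_EL2.TVM / HCR.TVM */
    bool caches_on;     /* last observed guest cache state */
    bool needs_vm_trap; /* guest relies on TVM for MPU register checks */
};

/* A trapped Set/Way operation switches VM trapping on. */
static void on_set_way_trap(struct vcpu_model *v)
{
    assert(v != NULL);
    v->tvm = true;
    /* A full cache clean would be performed here. */
}

/* Called when a trapped control-register write changes the cache state. */
static void on_cache_state_change(struct vcpu_model *v, bool now_on)
{
    assert(v != NULL);
    v->caches_on = now_on;

    /*
     * Policy described above: once caches are enabled, stop trapping.
     * A guest that needs VM trapping keeps it on regardless.
     */
    if (now_on && !v->needs_vm_trap)
        v->tvm = false;
}
```

With `needs_vm_trap` false, a guest that enables its caches leaves the model with `tvm == false`, which is exactly the state in which out-of-range MPU register writes would no longer be checked.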

We propose to address this by not disabling VM control register trapping once caches are enabled for PMSAv8-64 guests.

## Considerations:

- If we zero the non-accessible Base/Limit registers on context switch then we could avoid trapping read accesses to virtual memory control registers. The trade-off here would be additional overhead on context switch due to zeroing the maximum number of architecturally supported MPU regions (255).

[^1] https://developer.arm.com/documentation/ddi0487/latest (G1.3.3)
[^2] https://developer.arm.com/documentation/ddi0487/latest (E2.1.5)
[^3] https://gitlab.com/xen-project/xen/-/blob/staging/xen/arch/arm/p2m.c#L404-L431

---
Cheers,
Hari