[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/pv: inject #UD for entirely missing SYSCALL callbacks
On 26/10/2020 09:40, Jan Beulich wrote: > In the case that no 64-bit SYSCALL callback is registered, the guest > will be crashed when 64-bit userspace executes a SYSCALL instruction, > which would be a userspace => kernel DoS. Similarly for 32-bit > userspace when no 32-bit SYSCALL callback was registered either. > > This has been the case ever since the introduction of 64bit PV support, > but behaves unlike all other SYSCALL/SYSENTER callbacks in Xen, which > yield #GP/#UD in userspace before the callback is registered, and are > therefore safe by default. > > This change does constitute a change in the PV ABI, for the corner case > of a PV guest kernel not registering a 64-bit callback (which has to be > considered a defacto requirement of the unwritten PV ABI, considering > there is no PV equivalent of EFER.SCE). > > It brings the behaviour in line with PV32 SYSCALL/SYSENTER, and PV64 > SYSENTER (safe by default, until explicitly enabled). > > Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > Signed-off-by: Jan Beulich <JBeulich@xxxxxxxx> > --- > v3: > * Split this change off of "x86/pv: Inject #UD for missing SYSCALL > callbacks", to allow the uncontroversial part of that change to go > in. Add conditional "rex64" for UREGS_rip adjustment. (Is branching > over just the REX prefix too much trickery even for an unlikely to be > taken code path?) I find this submission confusion seeing as my v3 is already suitably acked and ready to commit. What I haven't had yet enough free time to do so. > v2: > * Drop unnecessary instruction suffixes > * Don't truncate #UD entrypoint to 32 bits > > --- a/xen/arch/x86/x86_64/entry.S > +++ b/xen/arch/x86/x86_64/entry.S > @@ -33,11 +33,27 @@ ENTRY(switch_to_kernel) > cmoveq VCPU_syscall32_addr(%rbx),%rax > testq %rax,%rax > cmovzq VCPU_syscall_addr(%rbx),%rax > - movq %rax,TRAPBOUNCE_eip(%rdx) > /* TB_flags = VGCF_syscall_disables_events ? TBF_INTERRUPT : 0 */ > btl $_VGCF_syscall_disables_events,VCPU_guest_context_flags(%rbx) > setc %cl > leal (,%rcx,TBF_INTERRUPT),%ecx > + > + test %rax, %rax > +UNLIKELY_START(z, syscall_no_callback) /* TB_eip == 0 => #UD */ > + mov VCPU_trap_ctxt(%rbx), %rdi > + movl $X86_EXC_UD, UREGS_entry_vector(%rsp) > + cmpw $FLAT_USER_CS32, UREGS_cs(%rsp) > + je 0f > + rex64 # subl => subq > +0: > + subl $2, UREGS_rip(%rsp) There was deliberately not a 32bit sub here (see below). As to the construct, I'm having a hard time deciding whether this is an excellent idea, or far too clever for its own good. Some basic perf testing shows that there is a visible difference in execution time of these two paths, which means there is some optimisation being missed in the frontend for the 32bit case. However, the difference is tiny in the grand scheme of things (about 0.4% difference in aggregate time to execute a loop of this pattern with a fixed number of iterations.) However, the 32bit case isn't actually interesting here. A guest can't execute a SYSCALL instruction on/across the 4G->0 boundary because the M2P is mapped NX up to the 4G boundary, so we can never reach this point with %eip < 2. Therefore, the 64bit-only form is the appropriate one to use, which solves any question of cleverness, or potential decode stalls it causes. ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |