[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v3] x86/pv: inject #UD for entirely missing SYSCALL callbacks
On 28.10.2020 22:31, Andrew Cooper wrote: > On 26/10/2020 09:40, Jan Beulich wrote: >> In the case that no 64-bit SYSCALL callback is registered, the guest >> will be crashed when 64-bit userspace executes a SYSCALL instruction, >> which would be a userspace => kernel DoS. Similarly for 32-bit >> userspace when no 32-bit SYSCALL callback was registered either. >> >> This has been the case ever since the introduction of 64bit PV support, >> but behaves unlike all other SYSCALL/SYSENTER callbacks in Xen, which >> yield #GP/#UD in userspace before the callback is registered, and are >> therefore safe by default. >> >> This change does constitute a change in the PV ABI, for the corner case >> of a PV guest kernel not registering a 64-bit callback (which has to be >> considered a defacto requirement of the unwritten PV ABI, considering >> there is no PV equivalent of EFER.SCE). >> >> It brings the behaviour in line with PV32 SYSCALL/SYSENTER, and PV64 >> SYSENTER (safe by default, until explicitly enabled). >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >> Signed-off-by: Jan Beulich <JBeulich@xxxxxxxx> >> --- >> v3: >> * Split this change off of "x86/pv: Inject #UD for missing SYSCALL >> callbacks", to allow the uncontroversial part of that change to go >> in. Add conditional "rex64" for UREGS_rip adjustment. (Is branching >> over just the REX prefix too much trickery even for an unlikely to be >> taken code path?) > > I find this submission confusion seeing as my v3 is already suitably > acked and ready to commit. What I haven't had yet enough free time to > do so. My objection to the other half stands, and hence, I'm afraid, stands in the way of your patch getting committed. Aiui Roger's ack doesn't invalidate my objection, sorry. >> --- a/xen/arch/x86/x86_64/entry.S >> +++ b/xen/arch/x86/x86_64/entry.S >> @@ -33,11 +33,27 @@ ENTRY(switch_to_kernel) >> cmoveq VCPU_syscall32_addr(%rbx),%rax >> testq %rax,%rax >> cmovzq VCPU_syscall_addr(%rbx),%rax >> - movq %rax,TRAPBOUNCE_eip(%rdx) >> /* TB_flags = VGCF_syscall_disables_events ? TBF_INTERRUPT : 0 */ >> btl $_VGCF_syscall_disables_events,VCPU_guest_context_flags(%rbx) >> setc %cl >> leal (,%rcx,TBF_INTERRUPT),%ecx >> + >> + test %rax, %rax >> +UNLIKELY_START(z, syscall_no_callback) /* TB_eip == 0 => #UD */ >> + mov VCPU_trap_ctxt(%rbx), %rdi >> + movl $X86_EXC_UD, UREGS_entry_vector(%rsp) >> + cmpw $FLAT_USER_CS32, UREGS_cs(%rsp) >> + je 0f >> + rex64 # subl => subq >> +0: >> + subl $2, UREGS_rip(%rsp) > > There was deliberately not a 32bit sub here (see below). Funny you should say this when what I've taken as input (v2; I don't think I had seen a v3, or else I would have called this one v4) had "subl", not "subq". Roger's comment was regarding a "mov" with a 32-bit register destination, not this "sub". If you had also noticed and fixed this one, without you having posted v3 (or without me having seen the posting) I couldn't have known. > As to the construct, I'm having a hard time deciding whether this is an > excellent idea, or far too clever for its own good. > > Some basic perf testing shows that there is a visible difference in > execution time of these two paths, which means there is some > optimisation being missed in the frontend for the 32bit case. However, > the difference is tiny in the grand scheme of things (about 0.4% > difference in aggregate time to execute a loop of this pattern with a > fixed number of iterations.) > > However, the 32bit case isn't actually interesting here. A guest can't > execute a SYSCALL instruction on/across the 4G->0 boundary because the > M2P is mapped NX up to the 4G boundary, so we can never reach this point > with %eip < 2. > > Therefore, the 64bit-only form is the appropriate one to use, which > solves any question of cleverness, or potential decode stalls it causes. Good point. Jan
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |