Re: [Xen-devel] [PATCH v2 01/11] x86emul: catch exceptions occurring in stubs



On 13/02/17 16:20, Jan Beulich wrote:
>>>> On 13.02.17 at 14:58, <andrew.cooper3@xxxxxxxxxx> wrote:
>> On 13/02/17 11:40, Jan Beulich wrote:
>>>>>> On 10.02.17 at 17:38, <andrew.cooper3@xxxxxxxxxx> wrote:
>>>> On 01/02/17 11:12, Jan Beulich wrote:
>>>>> Before adding more use of stubs cloned from decoded guest insns, guard
>>>>> ourselves against mistakes there: Should an exception (with the
>>>>> noteworthy exception of #PF) occur inside the stub, forward it to the
>>>>> guest.
>>>> Why exclude #PF?  Nothing in a stub should be hitting a page fault
>>>> in the first place.
>>> To be honest, I've been considering limiting this to just #UD. We
>>> clearly shouldn't hide memory addressing issues, as letting them go
>>> by silently means information leaks. Nevertheless, including #PF
>>> here would be a trivial change to the patch.
>> When I considered this first, my plan was to catch the fault and crash
>> the domain, rather than allow it to continue (FPU exceptions being the
>> exception).
>>
>> One way or another, by the time we encounter a fault in the stubs,
>> something has gone wrong, and crashing the domain is better than
>> crashing the host.  (In fact, I am looking to extend this principle
>> further, e.g. with vmread/vmwrite failures.)
>>
>> I don't see #PF being meaningfully different to #GP or #SS here.  If we
>> get a fault, an action was stopped, but we can never catch the issues
>> which don't fault in the first place.
>>
>> #UD is a little more tricky.  It either means that we created a
>> malformed stub, or we didn't have sufficient feature checking, both of
>> which are emulation bugs.  This could be passed back to the domain, but
>> I'd err on the side of making it more obvious by crashing the domain. 
> Generally yes, but I think here we really should forward at least
> #UD. I can agree with other faults being terminal to the domain,
> which will then also allow #PF to be handled uniformly (as there
> won't be a need to propagate some CR2 value).
>
>> (Perhaps changing behaviour in debug builds?)
> Not here, I would say - this logic should be tested the way it is
> meant to be run in production.

Ok.  Could we at least get a printk() in the case of handing a fault
like this back to the guest, so we stand a chance of noticing the
emulation bug and fixing it?

>
>>>>> ---
>>>>> There's one possible caveat here: A stub invocation immediately
>>>>> followed by another instruction having fault recovery attached to it
>>>>> would not work properly, as the table lookup can only ever find one of
>>>>> the two entries. Such CALL instructions would then need to be followed
>>>>> by a NOP for disambiguation (even if only a slim chance exists for the
>>>>> compiler to emit things that way).
>>>> Why key on return address at all?  %rip being in the stubs should be
>>>> good enough.
>>> Well, we need unique (key-address, recovery-address) tuples,
>>> and key-address can't possibly be an address inside the stub
>>> (both because the address varies between CPUs and because of said
>>> uniqueness requirement).
>> We don't necessarily need to use the extable infrastructure, and you
>> don't appear to be using a unique key at all.
> How am I not? How would both the self test and the emulator
> uses work without unique addresses to key off of?

I'd not followed how you were hooking the information up - the key is
the CALL's return address, which lies outside the stub and is unique
per call site.  Sorry for the noise.

>>>>> @@ -85,15 +86,88 @@ search_one_extable(const struct exceptio
>>>>>  }
>>>>>  
>>>>>  unsigned long
>>>>> -search_exception_table(unsigned long addr)
>>>>> +search_exception_table(const struct cpu_user_regs *regs, bool check_stub)
>>>>>  {
>>>>> -    const struct virtual_region *region = find_text_region(addr);
>>>>> +    const struct virtual_region *region = find_text_region(regs->rip);
>>>>> +    unsigned long stub = this_cpu(stubs.addr);
>>>>>  
>>>>>      if ( region && region->ex )
>>>>> -        return search_one_extable(region->ex, region->ex_end - 1, addr);
>>>>> +        return search_one_extable(region->ex, region->ex_end - 1, regs->rip);
>>>>> +
>>>>> +    if ( check_stub &&
>>>>> +         regs->rip >= stub + STUB_BUF_SIZE / 2 &&
>>>>> +         regs->rip < stub + STUB_BUF_SIZE &&
>>>>> +         regs->rsp > (unsigned long)&check_stub &&
>>>>> +         regs->rsp < (unsigned long)get_cpu_info() )
>>>> How much do we care about accidentally clobbering %rsp in a stub?
>>> I think we can't guard against everything, but we should do the
>>> best we reasonably can. I.e. in the case here, rather than
>>> reading the return (key) address from somewhere outside the
>>> stack (easing a possible attacker's job), don't handle the fault
>>> at all, and instead accept the crash.
>> As before, it would be better overall to result in a domain_crash() than
>> a host crash.
> Yes (see above), but in no case should we do the crashing here
> (or else, even if it may seem marginal, the self test won't work
> anymore). To provide the best functionality to current and possible
> future uses, we should leave the decision for which exceptions
> to crash the guest to the recovery code.

Yes - we should leave the action up to the recovery code.
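I.e. something like this on the emulator side (sketch only; where the
vector comes from is whatever the handler ends up communicating):

    /* Sketch: recovery-side policy - forward #UD, treat the rest as fatal. */
    if ( res.fields.trapnr == TRAP_invalid_op )
        return X86EMUL_EXCEPTION;
    domain_crash(current->domain);
    return X86EMUL_UNHANDLEABLE;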

>
>>> Plus what would you intend to do with the RIP to return to and/or
>>> the extra item on the stack? I'd like the generic mechanism here
> to impose as few restrictions on its potential use cases as
>>> possible, just like the pre-existing mechanism does. The fiddling
>>> with stack and return address is already more than I really feel
>>> happy with, but I can't see a way to do without.
>> An alternative might be to have a per_cpu() variable filled in by the
>> exception handler.  This would avoid any need to play with the return
>> address and stack under the feet of the stub.
> But we need to fiddle with the return address anyway, as we can't
> return to that address.

Ah yes, of course.  I have no idea why this point managed to escape me.

> Leaving the stack as is (i.e. only reading
> the address) won't make things any less awkward for the
> recovery code, as it'll still run on a one-off stack. It is for that
> reason that I preferred to use this stack slot to communicate the
> information, instead of going the per-CPU variable route (which I
> did consider).

Hmm ok.
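For my own notes, the shape I now understand this to have (sketch only -
labels and operand names invented for illustration):

    /*
     * The CALL's return address (.Lret) is the unique key for the
     * extable entry.  On a fault in the stub, the handler rewrites the
     * saved return address to point at the fixup (.Lfix) and pushes a
     * token describing the exception, which the fixup then pops.
     */
    asm volatile ( "call *%[stub]\n\t"
                   ".Lret%=:\n\t"
                   ".pushsection .fixup, \"ax\"\n"
                   ".Lfix%=:\n\t"
                   "pop %[exn]\n\t"
                   "jmp .Lret%=\n\t"
                   ".popsection\n\t"
                   _ASM_EXTABLE(.Lret%=, .Lfix%=)
                   : [exn] "+g" (token)
                   : [stub] "r" (stub_va) );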

~Andrew
