
Re: [Xen-devel] [PATCH v2 01/11] x86emul: catch exceptions occurring in stubs



On 01/02/17 11:12, Jan Beulich wrote:
> Before adding more use of stubs cloned from decoded guest insns, guard
> ourselves against mistakes there: Should an exception (with the
> noteworthy exception of #PF) occur inside the stub, forward it to the
> guest.

Why exclude #PF?  Nothing in a stub should be hitting a page fault in the
first place.

>
> Since the exception fixup table entry can't encode the address of the
> faulting insn itself, attach it to the return address instead. This at
> once provides a convenient place to hand the exception information
> back: The return address is being overwritten by it before branching to
> the recovery code.
>
> Take the opportunity and (finally!) add symbol resolution to the
> respective log messages (the new one is intentionally not being coded
> that way, as it covers stub addresses only, which don't have symbols
> associated).
>
> Also take the opportunity and make search_one_extable() static again.
>
> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> There's one possible caveat here: A stub invocation immediately
> followed by another instruction having fault recovery attached to it
> would not work properly, as the table lookup can only ever find one of
> the two entries. Such CALL instructions would then need to be followed
> by a NOP for disambiguation (even if only a slim chance exists for the
> compiler to emit things that way).

Why key on return address at all?  %rip being in the stubs should be
good enough.

>
> TBD: Instead of adding a 2nd search_exception_table() invocation to
>      do_trap(), we may want to consider moving the existing one down:
>      Xen code (except when executing stubs) shouldn't be raising #MF
>      or #XM, and hence fixups attached to instructions shouldn't care
>      about getting invoked for those. With that, doing the HVM special
>      case for them before running search_exception_table() would be
>      fine.
>
> Note that the two SIMD related stub invocations in the insn emulator
> intentionally don't get adjusted here, as subsequent patches will
> replace them anyway.
>
> --- a/xen/arch/x86/extable.c
> +++ b/xen/arch/x86/extable.c
> @@ -6,6 +6,7 @@
>  #include <xen/sort.h>
>  #include <xen/spinlock.h>
>  #include <asm/uaccess.h>
> +#include <xen/domain_page.h>
>  #include <xen/virtual_region.h>
>  #include <xen/livepatch.h>
>  
> @@ -62,7 +63,7 @@ void __init sort_exception_tables(void)
>      sort_exception_table(__start___pre_ex_table, __stop___pre_ex_table);
>  }
>  
> -unsigned long
> +static unsigned long
>  search_one_extable(const struct exception_table_entry *first,
>                     const struct exception_table_entry *last,
>                     unsigned long value)
> @@ -85,15 +86,88 @@ search_one_extable(const struct exceptio
>  }
>  
>  unsigned long
> -search_exception_table(unsigned long addr)
> +search_exception_table(const struct cpu_user_regs *regs, bool check_stub)
>  {
> -    const struct virtual_region *region = find_text_region(addr);
> +    const struct virtual_region *region = find_text_region(regs->rip);
> +    unsigned long stub = this_cpu(stubs.addr);
>  
>      if ( region && region->ex )
> -        return search_one_extable(region->ex, region->ex_end - 1, addr);
> +        return search_one_extable(region->ex, region->ex_end - 1, regs->rip);
> +
> +    if ( check_stub &&
> +         regs->rip >= stub + STUB_BUF_SIZE / 2 &&
> +         regs->rip < stub + STUB_BUF_SIZE &&
> +         regs->rsp > (unsigned long)&check_stub &&
> +         regs->rsp < (unsigned long)get_cpu_info() )

How much do we care about accidentally clobbering %rsp in a stub?

If we encounter a fault with %rip in the stubs, we should obviously
terminate if %rsp is outside of the main stack.  Nothing good can come
from continuing.

> +    {
> +        unsigned long retptr = *(unsigned long *)regs->rsp;
> +
> +        region = find_text_region(retptr);
> +        retptr = region && region->ex
> +                 ? search_one_extable(region->ex, region->ex_end - 1, retptr)
> +                 : 0;
> +        if ( retptr )
> +        {
> +            /*
> +             * Put trap number and error code on the stack (in place of the
> +             * original return address) for recovery code to pick up.
> +             */
> +            *(unsigned long *)regs->rsp = regs->error_code |
> +                ((uint64_t)(uint8_t)regs->entry_vector << 32);
> +            return retptr;

I have found an alternative which has proved very neat in XTF.

By calling the stub like this:

asm volatile ("call *%[stub]" : "=a" (exn) : "a" (0));

and having the fixup write straight into %rax, the stub ends up
behaving as if it had an unsigned long return value.  This avoids the
need for any out-of-line code to recover the exception information and
redirect back as if the call had completed normally.

http://xenbits.xen.org/gitweb/?p=xtf.git;a=blob;f=include/arch/x86/exinfo.h;hb=master

One subtle trap I fell over is that you also need a valid bit to help
distinguish #DE, which always has an error code of 0.

> +        }
> +    }
> +
> +    return 0;
> +}
> +
> +#ifndef NDEBUG
> +static int __init stub_selftest(void)
> +{
> +    static const struct {
> +        uint8_t opc[4];
> +        uint64_t rax;
> +        union stub_exception_token res;
> +    } tests[] __initconst = {
> +        { .opc = { 0x0f, 0xb9, 0xc3, 0xc3 }, /* ud1 */
> +          .res.fields.trapnr = TRAP_invalid_op },
> +        { .opc = { 0x90, 0x02, 0x00, 0xc3 }, /* nop; add (%rax),%al */
> +          .rax = 0x0123456789abcdef,
> +          .res.fields.trapnr = TRAP_gp_fault },
> +        { .opc = { 0x02, 0x04, 0x04, 0xc3 }, /* add (%rsp,%rax),%al */
> +          .rax = 0xfedcba9876543210,
> +          .res.fields.trapnr = TRAP_stack_error },
> +    };
> +    unsigned long addr = this_cpu(stubs.addr) + STUB_BUF_SIZE / 2;
> +    unsigned int i;
> +
> +    for ( i = 0; i < ARRAY_SIZE(tests); ++i )
> +    {
> +        uint8_t *ptr = map_domain_page(_mfn(this_cpu(stubs.mfn))) +
> +                       (addr & ~PAGE_MASK);
> +        unsigned long res = ~0;
> +
> +        memset(ptr, 0xcc, STUB_BUF_SIZE / 2);
> +        memcpy(ptr, tests[i].opc, ARRAY_SIZE(tests[i].opc));
> +        unmap_domain_page(ptr);
> +
> +        asm volatile ( "call *%[stb]\n"
> +                       ".Lret%=:\n\t"
> +                       ".pushsection .fixup,\"ax\"\n"
> +                       ".Lfix%=:\n\t"
> +                       "pop %[exn]\n\t"
> +                       "jmp .Lret%=\n\t"
> +                       ".popsection\n\t"
> +                       _ASM_EXTABLE(.Lret%=, .Lfix%=)
> +                       : [exn] "+m" (res)
> +                       : [stb] "rm" (addr), "a" (tests[i].rax));
> +        ASSERT(res == tests[i].res.raw);
> +    }
>  
>      return 0;
>  }
> +__initcall(stub_selftest);
> +#endif
>  
>  unsigned long
>  search_pre_exception_table(struct cpu_user_regs *regs)
> --- a/xen/arch/x86/traps.c
> +++ b/xen/arch/x86/traps.c
> @@ -802,10 +802,10 @@ void do_trap(struct cpu_user_regs *regs)
>          return;
>      }
>  
> -    if ( likely((fixup = search_exception_table(regs->rip)) != 0) )
> +    if ( likely((fixup = search_exception_table(regs, false)) != 0) )
>      {
> -        dprintk(XENLOG_ERR, "Trap %d: %p -> %p\n",
> -                trapnr, _p(regs->rip), _p(fixup));
> +        dprintk(XENLOG_ERR, "Trap %u: %p [%ps] -> %p\n",
> +                trapnr, _p(regs->rip), _p(regs->rip), _p(fixup));
>          this_cpu(last_extable_addr) = regs->rip;
>          regs->rip = fixup;
>          return;
> @@ -820,6 +820,15 @@ void do_trap(struct cpu_user_regs *regs)
>          return;
>      }
>  
> +    if ( likely((fixup = search_exception_table(regs, true)) != 0) )
> +    {
> +        dprintk(XENLOG_ERR, "Trap %u: %p -> %p\n",
> +                trapnr, _p(regs->rip), _p(fixup));
> +        this_cpu(last_extable_addr) = regs->rip;
> +        regs->rip = fixup;
> +        return;
> +    }
> +
>   hardware_trap:
>      if ( debugger_trap_fatal(trapnr, regs) )
>          return;
> @@ -1567,7 +1576,7 @@ void do_invalid_op(struct cpu_user_regs
>      }
>  
>   die:
> -    if ( (fixup = search_exception_table(regs->rip)) != 0 )
> +    if ( (fixup = search_exception_table(regs, true)) != 0 )
>      {
>          this_cpu(last_extable_addr) = regs->rip;
>          regs->rip = fixup;
> @@ -1897,7 +1906,7 @@ void do_page_fault(struct cpu_user_regs
>          if ( pf_type != real_fault )
>              return;
>  
> -        if ( likely((fixup = search_exception_table(regs->rip)) != 0) )
> +        if ( likely((fixup = search_exception_table(regs, false)) != 0) )
>          {
>              perfc_incr(copy_user_faults);
>              if ( unlikely(regs->error_code & PFEC_reserved_bit) )
> @@ -3841,10 +3850,10 @@ void do_general_protection(struct cpu_us
>  
>   gp_in_kernel:
>  
> -    if ( likely((fixup = search_exception_table(regs->rip)) != 0) )
> +    if ( likely((fixup = search_exception_table(regs, true)) != 0) )
>      {
> -        dprintk(XENLOG_INFO, "GPF (%04x): %p -> %p\n",
> -                regs->error_code, _p(regs->rip), _p(fixup));
> +        dprintk(XENLOG_INFO, "GPF (%04x): %p [%ps] -> %p\n",
> +                regs->error_code, _p(regs->rip), _p(regs->rip), _p(fixup));
>          this_cpu(last_extable_addr) = regs->rip;
>          regs->rip = fixup;
>          return;
> @@ -4120,7 +4129,7 @@ void do_debug(struct cpu_user_regs *regs
>               * watchpoint set on it. No need to bump EIP; the only faulting
>               * trap is an instruction breakpoint, which can't happen to us.
>               */
> -            WARN_ON(!search_exception_table(regs->rip));
> +            WARN_ON(!search_exception_table(regs, false));
>          }
>          goto out;
>      }
> --- a/xen/arch/x86/x86_emulate/x86_emulate.c
> +++ b/xen/arch/x86/x86_emulate/x86_emulate.c
> @@ -676,14 +676,34 @@ do{ asm volatile (
>  #define __emulate_1op_8byte(_op, _dst, _eflags)
>  #endif /* __i386__ */
>  
> +#ifdef __XEN__
> +# define invoke_stub(pre, post, constraints...) do {                    \
> +    union stub_exception_token res_ = { .raw = ~0 };                    \
> +    asm volatile ( pre "\n\tcall *%[stub]\n\t" post "\n"                \
> +                   ".Lret%=:\n\t"                                       \
> +                   ".pushsection .fixup,\"ax\"\n"                       \
> +                   ".Lfix%=:\n\t"                                       \
> +                   "pop %[exn]\n\t"                                     \
> +                   "jmp .Lret%=\n\t"                                    \
> +                   ".popsection\n\t"                                    \
> +                   _ASM_EXTABLE(.Lret%=, .Lfix%=)                       \
> +                   : [exn] "+g" (res_), constraints,                    \
> +                     [stub] "rm" (stub.func) );                         \
> +    generate_exception_if(~res_.raw, res_.fields.trapnr,                \
> +                          res_.fields.ec);                              \
> +} while (0)
> +#else
> +# define invoke_stub(pre, post, constraints...)                         \
> +    asm volatile ( pre "\n\tcall *%[stub]\n\t" post                     \
> +                   : constraints, [stub] "rm" (stub.func) )
> +#endif
> +
>  #define emulate_stub(dst, src...) do {                                  \
>      unsigned long tmp;                                                  \
> -    asm volatile ( _PRE_EFLAGS("[efl]", "[msk]", "[tmp]")               \
> -                   "call *%[stub];"                                     \
> -                   _POST_EFLAGS("[efl]", "[msk]", "[tmp]")              \
> -                   : dst, [tmp] "=&r" (tmp), [efl] "+g" (_regs._eflags) \
> -                   : [stub] "r" (stub.func),                            \
> -                     [msk] "i" (EFLAGS_MASK), ## src );                 \
> +    invoke_stub(_PRE_EFLAGS("[efl]", "[msk]", "[tmp]"),                 \
> +                _POST_EFLAGS("[efl]", "[msk]", "[tmp]"),                \
> +                dst, [tmp] "=&r" (tmp), [efl] "+g" (_regs._eflags)      \
> +                : [msk] "i" (EFLAGS_MASK), ## src);                     \
>  } while (0)
>  
>  /* Fetch next part of the instruction being emulated. */
> @@ -929,8 +949,7 @@ do {
>      unsigned int nr_ = sizeof((uint8_t[]){ bytes });                    \
>      fic.insn_bytes = nr_;                                               \
>      memcpy(get_stub(stub), ((uint8_t[]){ bytes, 0xc3 }), nr_ + 1);      \
> -    asm volatile ( "call *%[stub]" : "+m" (fic) :                       \
> -                   [stub] "rm" (stub.func) );                           \
> +    invoke_stub("", "", "=m" (fic) : "m" (fic));                        \
>      put_stub(stub);                                                     \
>  } while (0)
>  
> @@ -940,13 +959,11 @@ do {
>      unsigned long tmp_;                                                 \
>      fic.insn_bytes = nr_;                                               \
>      memcpy(get_stub(stub), ((uint8_t[]){ bytes, 0xc3 }), nr_ + 1);      \
> -    asm volatile ( _PRE_EFLAGS("[eflags]", "[mask]", "[tmp]")           \
> -                   "call *%[func];"                                     \
> -                   _POST_EFLAGS("[eflags]", "[mask]", "[tmp]")          \
> -                   : [eflags] "+g" (_regs._eflags),                     \
> -                     [tmp] "=&r" (tmp_), "+m" (fic)                     \
> -                   : [func] "rm" (stub.func),                           \
> -                     [mask] "i" (EFLG_ZF|EFLG_PF|EFLG_CF) );            \
> +    invoke_stub(_PRE_EFLAGS("[eflags]", "[mask]", "[tmp]"),             \
> +                _POST_EFLAGS("[eflags]", "[mask]", "[tmp]"),            \
> +                [eflags] "+g" (_regs._eflags), [tmp] "=&r" (tmp_),      \
> +                "+m" (fic)                                              \
> +                : [mask] "i" (EFLG_ZF|EFLG_PF|EFLG_CF));                \
>      put_stub(stub);                                                     \
>  } while (0)
>  
> --- a/xen/include/asm-x86/uaccess.h
> +++ b/xen/include/asm-x86/uaccess.h
> @@ -275,7 +275,16 @@ extern struct exception_table_entry __st
>  extern struct exception_table_entry __start___pre_ex_table[];
>  extern struct exception_table_entry __stop___pre_ex_table[];
>  
> -extern unsigned long search_exception_table(unsigned long);
> +union stub_exception_token {
> +    struct {
> +        uint32_t ec;

ec only needs to be 16 bits wide, which very helpfully lets the whole
token fit into an unsigned long, even for 32bit builds, and leaves 8
bits at the top for extra metadata.

~Andrew

> +        uint8_t trapnr;
> +    } fields;
> +    uint64_t raw;
> +};
> +
> +extern unsigned long search_exception_table(const struct cpu_user_regs *regs,
> +                                            bool check_stub);
>  extern void sort_exception_tables(void);
>  extern void sort_exception_table(struct exception_table_entry *start,
>                                   const struct exception_table_entry *stop);
>
>

