Re: [Xen-devel] [PATCH] x86/xen: Clear user %gs before updating segment descriptors
On Fri, Dec 7, 2018 at 3:15 PM David Woodhouse <dwmw@xxxxxxxxxxxx> wrote:
>
> During a context switch, if a descriptor currently referenced by the
> old process's user %gs is being cleared and Xen preempts the vCPU
> before %gs is set for the new process, a fault may occur.
>
> This fault actually seems to be fairly harmless; xen_failsafe_callback
> will just return to the "faulting" instruction (the random one after the
> vCPU was preempted) with the offending segment register zeroed, and then
> it'll get set again later during the context switch. But it's cleaner
> to avoid it.
>
> If the descriptor referenced by the %gs selector is being touched,
> then include a request to zero the user %gs in the multicall too.
Fine with me.
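
As an aside, for anyone less familiar with selector layout: the new
(flush_gs >> 3) == GDT_ENTRY_TLS_MIN + i test below works because bits
0-1 of a selector hold the RPL, bit 2 is the table indicator, and the
descriptor index lives in bits 3 and up. A minimal user-space sketch of
that decoding (mine, not part of the patch; it assumes the x86-64 value
GDT_ENTRY_TLS_MIN == 12 from asm/segment.h):

    /*
     * Illustrative only: decode %gs the same way the patch's
     * (flush_gs >> 3) == GDT_ENTRY_TLS_MIN + i comparison does.
     */
    #include <stdio.h>

    #define GDT_ENTRY_TLS_MIN 12 /* x86-64 value from asm/segment.h */
    #define GDT_ENTRY_TLS_MAX (GDT_ENTRY_TLS_MIN + 2)

    int main(void)
    {
            unsigned short gs;
            unsigned int index, ti, rpl;

            /* Read the current user %gs selector, much like savesegment(). */
            asm volatile("mov %%gs, %0" : "=r" (gs));

            index = gs >> 3;       /* descriptor slot in the GDT (or LDT) */
            ti    = (gs >> 2) & 1; /* table indicator: 0 = GDT, 1 = LDT   */
            rpl   = gs & 3;        /* requested privilege level           */

            printf("gs=%#x index=%u ti=%u rpl=%u\n", gs, index, ti, rpl);

            if (!ti && index >= GDT_ENTRY_TLS_MIN && index <= GDT_ENTRY_TLS_MAX)
                    printf("gs references TLS slot %u\n", index - GDT_ENTRY_TLS_MIN);
            else
                    printf("gs does not reference a TLS GDT slot\n");

            return 0;
    }

In practice a 64-bit task's user %gs is normally null (64-bit TLS goes
through %fs and the GS base MSRs), so the extra hypercall should mostly
fire for 32-bit compat tasks, whose TLS lives in these GDT slots.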
>
> Signed-off-by: David Woodhouse <dwmw@xxxxxxxxxxxx>
> ---
> v2: Don't accidentally remove the call to xen_mc_batch().
>
> arch/x86/include/asm/xen/hypercall.h | 11 ++++++++
> arch/x86/xen/enlighten_pv.c          | 40 ++++++++++++++++++++++------
> 2 files changed, 43 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/include/asm/xen/hypercall.h b/arch/x86/include/asm/xen/hypercall.h
> index ef05bea7010d..e8b383b24246 100644
> --- a/arch/x86/include/asm/xen/hypercall.h
> +++ b/arch/x86/include/asm/xen/hypercall.h
> @@ -520,4 +520,15 @@ MULTI_stack_switch(struct multicall_entry *mcl,
> trace_xen_mc_entry(mcl, 2);
> }
>
> +static inline void
> +MULTI_set_segment_base(struct multicall_entry *mcl,
> + int reg, unsigned long value)
> +{
> + mcl->op = __HYPERVISOR_set_segment_base;
> + mcl->args[0] = reg;
> + mcl->args[1] = value;
> +
> + trace_xen_mc_entry(mcl, 2);
> +}
> +
> #endif /* _ASM_X86_XEN_HYPERCALL_H */
> diff --git a/arch/x86/xen/enlighten_pv.c b/arch/x86/xen/enlighten_pv.c
> index 2f6787fc7106..2eb9827dab4b 100644
> --- a/arch/x86/xen/enlighten_pv.c
> +++ b/arch/x86/xen/enlighten_pv.c
> @@ -506,7 +506,7 @@ static inline bool desc_equal(const struct desc_struct *d1,
> }
>
> static void load_TLS_descriptor(struct thread_struct *t,
> - unsigned int cpu, unsigned int i)
> + unsigned int cpu, unsigned int i, int flush_gs)
> {
> struct desc_struct *shadow = &per_cpu(shadow_tls_desc, cpu).desc[i];
> struct desc_struct *gdt;
> @@ -516,6 +516,17 @@ static void load_TLS_descriptor(struct thread_struct *t,
> if (desc_equal(shadow, &t->tls_array[i]))
> return;
>
> + /*
> + * If the current user %gs points to a descriptor we're changing,
> + * zero it first to avoid taking a fault if Xen preempts this
> + * vCPU between now and the time that %gs is later loaded with
> + * the new value.
> + */
> + if ((flush_gs >> 3) == GDT_ENTRY_TLS_MIN + i) {
> + mc = __xen_mc_entry(0);
> + MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);
> + }
> +
> *shadow = t->tls_array[i];
>
> gdt = get_cpu_gdt_rw(cpu);
> @@ -527,6 +538,8 @@ static void load_TLS_descriptor(struct thread_struct *t,
>
> static void xen_load_tls(struct thread_struct *t, unsigned int cpu)
> {
> + u16 flush_gs = 0;
> +
> /*
> * XXX sleazy hack: If we're being called in a lazy-cpu zone
> * and lazy gs handling is enabled, it means we're in a
> @@ -537,27 +550,38 @@ static void xen_load_tls(struct thread_struct *t, unsigned int cpu)
> * This will go away as soon as Xen has been modified to not
> * save/restore %gs for normal hypercalls.
> *
> - * On x86_64, this hack is not used for %gs, because gs points
> - * to KERNEL_GS_BASE (and uses it for PDA references), so we
> - * must not zero %gs on x86_64
> - *
> * For x86_64, we need to zero %fs, otherwise we may get an
> * exception between the new %fs descriptor being loaded and
> * %fs being effectively cleared at __switch_to().
> + *
> + * We may also need to zero %gs, if it refers to a descriptor
> + * which we are clearing. Otherwise a Xen vCPU context switch
> + * before it gets reloaded could also cause a fault. Since
> + * clearing the user %gs is another hypercall, do that only if
> + * it's necessary.
> + *
> + * Note: These "faults" end up in xen_failsafe_callback(),
> + * which just returns immediately to the "faulting" instruction
> + * (i.e. the random one after Xen preempted this vCPU) with
> + * the offending segment register zeroed. Which is actually
> + * a perfectly safe thing to happen anyway, as it'll be loaded
> + * again shortly. So maybe we needn't bother?
> */
> if (paravirt_get_lazy_mode() == PARAVIRT_LAZY_CPU) {
> #ifdef CONFIG_X86_32
> lazy_load_gs(0);
> #else
> + savesegment(gs, flush_gs);
> +
> loadsegment(fs, 0);
> #endif
> }
>
> xen_mc_batch();
>
> - load_TLS_descriptor(t, cpu, 0);
> - load_TLS_descriptor(t, cpu, 1);
> - load_TLS_descriptor(t, cpu, 2);
> + load_TLS_descriptor(t, cpu, 0, flush_gs);
> + load_TLS_descriptor(t, cpu, 1, flush_gs);
> + load_TLS_descriptor(t, cpu, 2, flush_gs);
>
> xen_mc_issue(PARAVIRT_LAZY_CPU);
> }
> --
> 2.17.1
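
For reference, the net effect on the batch built by xen_load_tls() is a
single extra entry ahead of the descriptor update, and only when the
saved user %gs selector points at the TLS slot being rewritten. A
condensed sketch of the resulting ordering (my paraphrase of the hunks
above, reusing the existing MULTI_update_descriptor wrapper alongside
the new MULTI_set_segment_base; not code to be applied):

    xen_mc_batch();

    /* Only when the old task's user %gs selects the slot being changed:
     * drop the selector first, so a vCPU preemption cannot fault on a
     * descriptor that is in the middle of being rewritten. */
    if ((flush_gs >> 3) == GDT_ENTRY_TLS_MIN + i) {
            mc = __xen_mc_entry(0);
            MULTI_set_segment_base(mc.mc, SEGBASE_GS_USER_SEL, 0);
    }

    /* Then rewrite the TLS descriptor itself, exactly as before. */
    mc = __xen_mc_entry(0);
    MULTI_update_descriptor(mc.mc, maddr.maddr, t->tls_array[i]);

    xen_mc_issue(PARAVIRT_LAZY_CPU);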