[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v3 1/2] x86/fpu: Combine fpu_ctxt and xsave_area in arch_vcpu



Hi,

On Thu Oct 3, 2024 at 8:38 PM BST, Andrew Cooper wrote:
> On 13/08/2024 3:21 pm, Alejandro Vallejo wrote:
> > @@ -299,44 +299,14 @@ void save_fpu_enable(void)
> >  /* Initialize FPU's context save area */
> >  int vcpu_init_fpu(struct vcpu *v)
> >  {
> > -    int rc;
> > -
> >      v->arch.fully_eager_fpu = opt_eager_fpu;
> > -
> > -    if ( (rc = xstate_alloc_save_area(v)) != 0 )
> > -        return rc;
> > -
> > -    if ( v->arch.xsave_area )
> > -        v->arch.fpu_ctxt = &v->arch.xsave_area->fpu_sse;
> > -    else
> > -    {
> > -        BUILD_BUG_ON(__alignof(v->arch.xsave_area->fpu_sse) < 16);
> > -        v->arch.fpu_ctxt = _xzalloc(sizeof(v->arch.xsave_area->fpu_sse),
> > -                                    
> > __alignof(v->arch.xsave_area->fpu_sse));
> > -        if ( v->arch.fpu_ctxt )
> > -        {
> > -            fpusse_t *fpu_sse = v->arch.fpu_ctxt;
> > -
> > -            fpu_sse->fcw = FCW_DEFAULT;
> > -            fpu_sse->mxcsr = MXCSR_DEFAULT;
> > -        }
> > -        else
> > -            rc = -ENOMEM;
>
> This looks wonky.  It's not, because xstate_alloc_save_area() contains
> the same logic for setting up FCW/MXCSR.
>
> It would be helpful to note this in the commit message.  Something about
> deduplicating the setup alongside deduplicating the pointer.
>

Sure

> > diff --git a/xen/arch/x86/include/asm/domain.h 
> > b/xen/arch/x86/include/asm/domain.h
> > index bca3258d69ac..3da60af2a44a 100644
> > --- a/xen/arch/x86/include/asm/domain.h
> > +++ b/xen/arch/x86/include/asm/domain.h
> > @@ -592,11 +592,11 @@ struct pv_vcpu
> >  struct arch_vcpu
> >  {
> >      /*
> > -     * guest context (mirroring struct vcpu_guest_context) common
> > -     * between pv and hvm guests
> > +     * Guest context common between PV and HVM guests. Includes general 
> > purpose
> > +     * registers, segment registers and other parts of the exception frame.
> > +     *
> > +     * It doesn't contain FPU state, as that lives in xsave_area instead.
> >       */
>
> This new comment isn't really correct either.  arch_vcpu contains the
> PV/HVM union, so it not only things which are common between the two.

It's about cpu_user_regs though, not arch_vcpu?

>
> I'd either leave it alone, or delete it entirely.  It doesn't serve much
> purpose IMO, and it is going to bitrot very quickly (FRED alone will
> change two of the state groups you mention).
>

I'm happy getting rid of it because it's actively confusing in its current
form. That said, I can't possibly believe there's not a single simple
description of cpu_user_regs that everyone can agree on.

> > -
> > -    void              *fpu_ctxt;
> >      struct cpu_user_regs user_regs;
> >  
> >      /* Debug registers. */
> > diff --git a/xen/arch/x86/x86_emulate/blk.c b/xen/arch/x86/x86_emulate/blk.c
> > index e790f4f90056..28b54f26fe29 100644
> > --- a/xen/arch/x86/x86_emulate/blk.c
> > +++ b/xen/arch/x86/x86_emulate/blk.c
> > @@ -11,7 +11,8 @@
> >      !defined(X86EMUL_NO_SIMD)
> >  # ifdef __XEN__
> >  #  include <asm/xstate.h>
> > -#  define FXSAVE_AREA current->arch.fpu_ctxt
> > +#  define FXSAVE_AREA ((struct x86_fxsr *) \
> > +                           (void *)&current->arch.xsave_area->fpu_sse)
>
> This isn't a like-for-like replacement.
>
> Previously FXSAVE_AREA's type was void *.  I'd leave the expression as just
>
>     (void *)&current->arch.xsave_area->fpu_sse
>
> because struct x86_fxsr is not the only type needing to be used here in
> due course.   (There are 8 variations of data layout for older
> instructions.)
>

Sure

> >  # else
> >  #  define FXSAVE_AREA get_fpu_save_area()
> >  # endif
> > diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> > index 5c4144d55e89..850ee31bd18c 100644
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -507,9 +507,16 @@ int xstate_alloc_save_area(struct vcpu *v)
> >      unsigned int size;
> >  
> >      if ( !cpu_has_xsave )
> > -        return 0;
> > -
> > -    if ( !is_idle_vcpu(v) || !cpu_has_xsavec )
> > +    {
> > +        /*
> > +         * This is bigger than FXSAVE_SIZE by 64 bytes, but it helps 
> > treating
> > +         * the FPU state uniformly as an XSAVE buffer even if XSAVE is not
> > +         * available in the host. Note the alignment restriction of the 
> > XSAVE
> > +         * area are stricter than those of the FXSAVE area.
> > +         */
>
> Can I suggest the following?
>
> "On non-XSAVE systems, we allocate an XSTATE buffer for simplicity. 
> XSTATE is backwards compatible to FXSAVE, and only one cacheline larger."
>
> It's rather more concise.
>
> ~Andrew

Sure.

Cheers,
Alejandro



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.