
Re: [Xen-devel] [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves



On Wed, Aug 26, 2015 at 10:47:22AM +0100, Andrew Cooper wrote:
> On 25/08/2015 11:54, Shuai Ruan wrote:
> > This patch adds basic definitions/helpers which will be used in
> > later patches.
> >
> > Signed-off-by: Shuai Ruan <shuai.ruan@xxxxxxxxxxxxxxx>
> 
> Thank you - this series is looking far better now.
> 
> > ---
> >  xen/arch/x86/xstate.c            | 137 +++++++++++++++++++++++++++++++++++++++
> >  xen/include/asm-x86/cpufeature.h |   4 ++
> >  xen/include/asm-x86/domain.h     |   1 +
> >  xen/include/asm-x86/msr-index.h  |   2 +
> >  xen/include/asm-x86/xstate.h     |  11 +++-
> >  5 files changed, 154 insertions(+), 1 deletion(-)
> >
> > diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> > index d5f5e3b..3986515 100644
> > --- a/xen/arch/x86/xstate.c
> > +++ b/xen/arch/x86/xstate.c
> > @@ -29,6 +29,10 @@ static u32 __read_mostly xsave_cntxt_size;
> >  /* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
> >  u64 __read_mostly xfeature_mask;
> >  
> > +unsigned int *xstate_offsets, *xstate_sizes;
> > +static unsigned int xstate_features;
> > +static unsigned int xstate_comp_offsets[sizeof(xfeature_mask)*8];
> 
> All of these should be tagged as __read_mostly.
> 
Ok.
> > +
> >  /* Cached xcr0 for fast read */
> >  static DEFINE_PER_CPU(uint64_t, xcr0);
> >  
> > @@ -65,6 +69,139 @@ uint64_t get_xcr0(void)
> >      return this_cpu(xcr0);
> >  }
> >  
> > +static void setup_xstate_features(void)
> > +{
> > +    unsigned int eax, ebx, ecx, edx, leaf = 0x2;
> > +    unsigned int i;
> > +
> > +    xstate_features = fls(xfeature_mask);
> > +    xstate_offsets = xzalloc_array(unsigned int, xstate_features);
> > +    xstate_sizes = xzalloc_array(unsigned int, xstate_features);
> 
> These can both be plain xmalloc_array(), as we unconditionally set every
> member below.
> 
> As a separate patch (or possibly this one if it isn't too big), you now
> need to modify the xstate initialisation to return int rather than
> void.  You must also check for allocation failures and bail with
> -ENOMEM.  It is fine for an error like this to be fatal to system or AP
> startup.
> 
> Also, merging review from patch 2, this function gets called once per
> pcpu, meaning that you waste a moderate quantity of memory and
> repeatedly clobber the xstate_{offsets,sizes} pointers.  You should pass
> in bool_t bsp and, if bsp, allocate the arrays and read out, while if
> not bsp, check that the ap is identical to the bsp.
> 
Ok.
> > +
> > +    for (i = 0; i < xstate_features ; i++)
> 
> Style: spaces inside the parentheses, and there shouldn't be a space
> before the ';' in "xstate_features ;".
> 
> Also, you can remove i entirely, and use "for ( leaf = 2; leaf <
> xstate_features; ++leaf )"
> 
Sorry.
> > +    {
> > +        cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
> > +
> > +        xstate_offsets[leaf] = ebx;
> > +        xstate_sizes[leaf] = eax;
> 
> You can drop the ebx and eax temporaries, and as ecx and edx are only
> around as discard variables, you can get away with having a single "tmp"
> variable.
> 
> Therefore, something like:
> 
> cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
>             &xstate_offsets[leaf], &tmp, &tmp);
> 
> which also drops the body of the function to a single statement,
> allowing you to drop the braces.
> 
Ok.
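Putting the three suggestions together, the function might end up
looking something like this (rough, untested sketch - the bool_t bsp
parameter and the -ENOMEM/-EINVAL error paths are my reading of the
comments above, not final code):

static int setup_xstate_features(bool_t bsp)
{
    unsigned int leaf, tmp, eax, ebx;

    if ( bsp )
    {
        xstate_features = fls(xfeature_mask);
        xstate_offsets = xmalloc_array(unsigned int, xstate_features);
        xstate_sizes = xmalloc_array(unsigned int, xstate_features);
        if ( !xstate_offsets || !xstate_sizes )
            return -ENOMEM;
    }

    for ( leaf = 2; leaf < xstate_features; ++leaf )
    {
        if ( bsp )
            /* EAX reports the size of this component, EBX its offset. */
            cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
                        &xstate_offsets[leaf], &tmp, &tmp);
        else
        {
            /* APs must agree with what the BSP recorded. */
            cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &tmp, &tmp);
            if ( eax != xstate_sizes[leaf] || ebx != xstate_offsets[leaf] )
                return -EINVAL;
        }
    }

    return 0;
}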
> > +
> > +        leaf++;
> > +    }
> > +}
> > +
> > +static void setup_xstate_comp(void)
> 
> This function can get away with being __init, as it is based solely on
> data written by the BSP.
> 
> Also merging review from patch 2, it only needs calling once.
> 
Ok.
> > +{
> > +    unsigned int xstate_comp_sizes[sizeof(xfeature_mask)*8];
> > +    unsigned int i;
> > +
> > +    /*
> > +     * The FP xstates and SSE xstates are legacy states. They are always
> > +     * in the fixed offsets in the xsave area in either compacted form
> > +     * or standard form.
> > +     */
> > +    xstate_comp_offsets[0] = 0;
> > +    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
> > +
> > +    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> > +
> > +    for (i = 2; i < xstate_features; i++)
> 
> Style - spaces inside the parentheses.
> 
Sorry.
> > +    {
> > +        if ( (1ull << i) & xfeature_mask )
> > +            xstate_comp_sizes[i] = xstate_sizes[i];
> > +        else
> > +            xstate_comp_sizes[i] = 0;
> > +
> > +        if ( i > 2 )
> > +            xstate_comp_offsets[i] = xstate_comp_offsets[i-1]
> > +                                    + xstate_comp_sizes[i-1];
> 
> The entire xstate_comp_sizes array (which is quite large) can be removed
> if you rearrange the loop body as:
> 
> xstate_comp_offsets[i] = xstate_comp_offsets[i-1] + (((1ul << (i-1)) &
> xfeature_mask) ? xstate_sizes[i-1] : 0);
> 
> which also removes the "if ( i > 2 )" (the loop then starts at i = 3,
> so the fixed xstate_comp_offsets[2] isn't clobbered)
> 
> > +    }
> 
> At this point, it is probably worth matching xstate_comp_offsets[i-1]
> against cpuid.0xd[0].ecx.  If hardware and Xen disagree on the maximum
> size of an xsave area, we shouldn't continue.
> 
Ok.
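So the function might become something like the following (untested
sketch; the final cross-check is my interpretation of the comment above
- CPUID leaf 0xd sub-leaf 0 ECX reports the maximum size hardware needs
for an xsave area covering all supported features):

static void __init setup_xstate_comp(void)
{
    unsigned int i, ecx, tmp;

    /*
     * The FP and SSE states are legacy states, always at fixed offsets
     * in both the standard and the compacted format.
     */
    xstate_comp_offsets[0] = 0;
    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;

    for ( i = 3; i < xstate_features; ++i )
        xstate_comp_offsets[i] = xstate_comp_offsets[i - 1] +
            (((1ul << (i - 1)) & xfeature_mask) ? xstate_sizes[i - 1] : 0);

    /* Don't continue if Xen and hardware disagree on the maximum size. */
    cpuid_count(XSTATE_CPUID, 0, &tmp, &tmp, &ecx, &tmp);
    BUG_ON(xstate_comp_offsets[xstate_features - 1] +
           xstate_sizes[xstate_features - 1] > ecx);
}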
> > +}
> > +
> > +static void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
> > +{
> > +    int feature = fls(xstate) - 1;
> 
> In each callsite, you have already calculated fls(xstate) - 1.  It would
> be better to pass an "unsigned int xfeature_idx" and avoid the repeated
> calculation.
> 
Ok.
> > +    if ( !(1ull << feature & xfeature_mask) )
> 
> Style.  This is a mix of binary operators so needs more brackets than this.
> 
> if ( !((1ul << feature) & xfeature_mask) )
> 
Ok.
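I.e. something like this (sketch, following the two comments above;
callers would pass the "fls(feature) - 1" they have already computed):

static void *get_xsave_addr(struct xsave_struct *xsave,
                            unsigned int xfeature_idx)
{
    if ( !((1ul << xfeature_idx) & xfeature_mask) )
        return NULL;

    return (void *)xsave + xstate_comp_offsets[xfeature_idx];
}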
> > +        return NULL;
> > +
> > +    return (void *)xsave + xstate_comp_offsets[feature];
> > +}
> > +
> > +void save_xsave_states(struct vcpu *v, void *dest ,unsigned int size)
> 
> s/ ,/, /
> 
> > +{
> > +    struct xsave_struct *xsave = v->arch.xsave_area;
> > +    u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
> > +    u64 valid;
> > +
> > +    /*
> > +     * Copy legacy XSAVE area, to avoid complications with CPUID
> > +     * leaves 0 and 1 in the loop below.
> > +     */
> > +    memcpy(dest, xsave, FXSAVE_SIZE);
> > +
> > +    /* Set XSTATE_BV */
> > +    *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;
> > +
> > +    /*
> > +     * Copy each region from the possibly compacted offset to the
> > +     * non-compacted offset.
> > +     */
> > +    valid = xstate_bv & ~XSTATE_FP_SSE;
> > +    while ( valid )
> > +    {
> > +        u64 feature = valid & -valid;
> > +        int index = fls(feature) - 1;
> > +        void *src = get_xsave_addr(xsave, feature);
> > +
> > +        if ( src )
> > +        {
> > +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> > +                BUG();
> 
> On second thoughts, this should probably turn into
> 
> ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);
> 
> The size parameter is from a trusted caller, and it will be more obvious
> in the crash report if it does somehow get broken.
> 
Ok.
> > +            memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
> > +        }
> > +
> > +        valid -= feature;
> > +    }
> > +
> > +}
> > +
> > +void load_xsave_states(struct vcpu *v, void *src, unsigned int size)
> 
> const void *src
> 
Ok.
> > +{
> > +    struct xsave_struct *xsave = v->arch.xsave_area;
> > +    u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
> > +    u64 valid;
> > +
> > +    /*
> > +     * Copy legacy XSAVE area, to avoid complications with CPUID
> > +     * leaves 0 and 1 in the loop below.
> > +     */
> > +    memcpy(xsave, src, FXSAVE_SIZE);
> > +
> > +    /* Set XSTATE_BV and possibly XCOMP_BV.  */
> > +    xsave->xsave_hdr.xstate_bv = xstate_bv;
> > +    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum | XSTATE_COMPACTION_ENABLED;
> > +
> > +    /*
> > +     * Copy each region from the non-compacted offset to the
> > +     * possibly compacted offset.
> > +     */
> > +    valid = xstate_bv & ~XSTATE_FP_SSE;
> > +    while ( valid )
> > +    {
> > +        u64 feature = valid & -valid;
> > +        int index = fls(feature) - 1;
> > +        void *dest = get_xsave_addr(xsave, feature);
> > +
> > +        if (dest)
> 
> Style.
> 
> > +        {
> > +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> > +                BUG();
> > +            memcpy(dest, src + xstate_offsets[index], xstate_sizes[index]);
> > +        }
> > +
> > +        valid -= feature;
> > +    }
> > +}
> > +
> >  void xsave(struct vcpu *v, uint64_t mask)
> >  {
> >      struct xsave_struct *ptr = v->arch.xsave_area;
> > diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> > index 9a01563..16b1386 100644
> > --- a/xen/include/asm-x86/cpufeature.h
> > +++ b/xen/include/asm-x86/cpufeature.h
> > @@ -153,6 +153,10 @@
> >  #define X86_FEATURE_RDSEED (7*32+18) /* RDSEED instruction */
> >  #define X86_FEATURE_ADX            (7*32+19) /* ADCX, ADOX instructions */
> > +#define X86_FEATURE_SMAP   (7*32+20) /* Supervisor Mode Access Prevention */
> > +#define XSAVEOPT           (1 << 0)
> > +#define XSAVEC                     (1 << 1)
> > +#define XGETBV1            (1 << 2)
> > +#define XSAVES                     (1 << 3)
> 
> This absolutely must be X86_FEATURE_* for consistency, and you will need
> to introduce a new capability word.
> 
Ok. I will change this in the next version.
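For reference, these are CPUID leaf 0xd, sub-leaf 1, EAX bits 0-3, so
the definitions would become something like the below (the capability
word number is only illustrative - it depends on how many words
cpufeature.h defines by the time the patch lands):

#define X86_FEATURE_XSAVEOPT   (8*32+ 0) /* XSAVEOPT instruction */
#define X86_FEATURE_XSAVEC     (8*32+ 1) /* XSAVEC/XRSTOR compacted format */
#define X86_FEATURE_XGETBV1    (8*32+ 2) /* XGETBV with ECX=1 */
#define X86_FEATURE_XSAVES     (8*32+ 3) /* XSAVES/XRSTORS instructions */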
> ~Andrew
> 
Thanks for the review, Andrew.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
