Re: [Xen-devel] [PATCH V4 1/4] x86/xsaves: add basic definitions/helpers to support xsaves
On 25/08/2015 11:54, Shuai Ruan wrote:
> This patch adds basic definitions/helpers which will be used in
> later patches.
>
> Signed-off-by: Shuai Ruan <shuai.ruan@xxxxxxxxxxxxxxx>

Thank you - this series is looking far better now.

> ---
>  xen/arch/x86/xstate.c            | 137 +++++++++++++++++++++++++++++++++++
>  xen/include/asm-x86/cpufeature.h |   4 ++
>  xen/include/asm-x86/domain.h     |   1 +
>  xen/include/asm-x86/msr-index.h  |   2 +
>  xen/include/asm-x86/xstate.h     |  11 +++-
>  5 files changed, 154 insertions(+), 1 deletion(-)
>
> diff --git a/xen/arch/x86/xstate.c b/xen/arch/x86/xstate.c
> index d5f5e3b..3986515 100644
> --- a/xen/arch/x86/xstate.c
> +++ b/xen/arch/x86/xstate.c
> @@ -29,6 +29,10 @@ static u32 __read_mostly xsave_cntxt_size;
>  /* A 64-bit bitmask of the XSAVE/XRSTOR features supported by processor. */
>  u64 __read_mostly xfeature_mask;
>
> +unsigned int *xstate_offsets, *xstate_sizes;
> +static unsigned int xstate_features;
> +static unsigned int xstate_comp_offsets[sizeof(xfeature_mask)*8];

All of these should be tagged as __read_mostly.

> +
>  /* Cached xcr0 for fast read */
>  static DEFINE_PER_CPU(uint64_t, xcr0);
>
> @@ -65,6 +69,139 @@ uint64_t get_xcr0(void)
>      return this_cpu(xcr0);
>  }
>
> +static void setup_xstate_features(void)
> +{
> +    unsigned int eax, ebx, ecx, edx, leaf = 0x2;
> +    unsigned int i;
> +
> +    xstate_features = fls(xfeature_mask);
> +    xstate_offsets = xzalloc_array(unsigned int, xstate_features);
> +    xstate_sizes = xzalloc_array(unsigned int, xstate_features);

These can both be plain xmalloc_array(), as we unconditionally set
every member which is subsequently read.

As a separate patch (or possibly this one if it isn't too big), you now
need to modify the xstate initialisation to return int rather than
void.  You must also check for allocation failures and bail with
-ENOMEM.  It is fine for an error like this to be fatal to system or AP
startup.

Also, merging review from patch 2, this function gets called once per
pcpu, meaning that you waste a moderate quantity of memory and
repeatedly clobber the xstate_{offsets,sizes} pointers.  You should
pass in bool_t bsp and, if bsp, allocate the arrays and read the values
out, while if not bsp, check that the AP is identical to the BSP.

> +
> +    for (i = 0; i < xstate_features ; i++)

Style: spaces immediately inside the parentheses, and no space before
the semicolon.

Also, you can remove i entirely and use
"for ( leaf = 2; leaf < xstate_features; ++leaf )".

> +    {
> +        cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &ecx, &edx);
> +
> +        xstate_offsets[leaf] = ebx;
> +        xstate_sizes[leaf] = eax;

You can drop the ebx and eax temporaries, and as ecx and edx are only
around as discard variables, you can get away with having a single
"tmp" variable.  Therefore, something like:

    cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
                &xstate_offsets[leaf], &tmp, &tmp);

which also drops the loop body to a single statement, allowing you to
drop the braces.

> +
> +        leaf++;
> +    }
> +}
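To pull all of the above together, this is the shape I would expect the
function to take.  It is an untested sketch (the "tmp" name and the
exact error handling are illustrative only), and the callers would need
updating to propagate the -ENOMEM:

    static int setup_xstate_features(bool_t bsp)
    {
        unsigned int leaf, tmp, eax, ebx;

        if ( bsp )
        {
            xstate_features = fls(xfeature_mask);

            /*
             * Entries 0 and 1 cover the legacy FP/SSE state and are
             * never read, hence plain xmalloc_array() suffices.
             */
            xstate_offsets = xmalloc_array(unsigned int, xstate_features);
            if ( !xstate_offsets )
                return -ENOMEM;

            xstate_sizes = xmalloc_array(unsigned int, xstate_features);
            if ( !xstate_sizes )
                return -ENOMEM;
        }

        for ( leaf = 2; leaf < xstate_features; ++leaf )
        {
            if ( bsp )
                cpuid_count(XSTATE_CPUID, leaf, &xstate_sizes[leaf],
                            &xstate_offsets[leaf], &tmp, &tmp);
            else
            {
                /* Check that this AP matches what the BSP recorded. */
                cpuid_count(XSTATE_CPUID, leaf, &eax, &ebx, &tmp, &tmp);
                BUG_ON(eax != xstate_sizes[leaf]);
                BUG_ON(ebx != xstate_offsets[leaf]);
            }
        }

        return 0;
    }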
> +
> +static void setup_xstate_comp(void)

This function can get away with being __init, as it is based solely on
data written by the BSP.  Also merging review from patch 2, it only
needs calling once.

> +{
> +    unsigned int xstate_comp_sizes[sizeof(xfeature_mask)*8];
> +    unsigned int i;
> +
> +    /*
> +     * The FP xstates and SSE xstates are legacy states. They are always
> +     * in the fixed offsets in the xsave area in either compacted form
> +     * or standard form.
> +     */
> +    xstate_comp_offsets[0] = 0;
> +    xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
> +
> +    xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;
> +
> +    for (i = 2; i < xstate_features; i++)

Style - spaces inside the parentheses.

> +    {
> +        if ( (1ull << i) & xfeature_mask )
> +            xstate_comp_sizes[i] = xstate_sizes[i];
> +        else
> +            xstate_comp_sizes[i] = 0;
> +
> +        if ( i > 2 )
> +            xstate_comp_offsets[i] = xstate_comp_offsets[i-1]
> +                                     + xstate_comp_sizes[i-1];

The entire xstate_comp_sizes array (which is quite large) can be
removed if you start the loop at i = 3 and rearrange the body as:

    xstate_comp_offsets[i] = xstate_comp_offsets[i-1] +
        (((1ul << (i-1)) & xfeature_mask) ? xstate_sizes[i-1] : 0);

which also removes the "if ( i > 2 )".

> +    }

At this point, it is probably worth matching xstate_comp_offsets[i-1]
against cpuid.0xd[0].ecx.  If hardware and Xen disagree on the maximum
size of an xsave area, we shouldn't continue.

> +}
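Folding all of that together, the whole function would end up looking
something like this (again an untested sketch, and the exact form of
the final cross-check is open to debate):

    static void __init setup_xstate_comp(void)
    {
        unsigned int i, ecx, tmp;

        /*
         * The FP and SSE state are legacy, and always at fixed offsets
         * in both the standard and the compacted formats.
         */
        xstate_comp_offsets[0] = 0;
        xstate_comp_offsets[1] = XSAVE_SSE_OFFSET;
        xstate_comp_offsets[2] = FXSAVE_SIZE + XSAVE_HDR_SIZE;

        for ( i = 3; i < xstate_features; i++ )
            xstate_comp_offsets[i] = xstate_comp_offsets[i-1] +
                (((1ul << (i-1)) & xfeature_mask) ? xstate_sizes[i-1] : 0);

        /*
         * A compacted area should never be larger than the maximum
         * standard-format area, which hardware reports in
         * cpuid.0xd[0].ecx.
         */
        cpuid_count(XSTATE_CPUID, 0, &tmp, &tmp, &ecx, &tmp);
        BUG_ON((xstate_comp_offsets[xstate_features - 1] +
                xstate_sizes[xstate_features - 1]) > ecx);
    }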
> +
> +static void *get_xsave_addr(struct xsave_struct *xsave, int xstate)
> +{
> +    int feature = fls(xstate) - 1;

In each callsite, you have already calculated fls(xstate) - 1.  It
would be better to pass an "unsigned int xfeature_idx" and avoid the
repeated calculation.

> +    if ( !(1ull << feature & xfeature_mask) )

Style.  This is a mix of binary operators, so needs more brackets than
this:

    if ( !((1ul << feature) & xfeature_mask) )

> +        return NULL;
> +
> +    return (void *)xsave + xstate_comp_offsets[feature];
> +}
> +
> +void save_xsave_states(struct vcpu *v, void *dest ,unsigned int size)

s/ ,/, /

> +{
> +    struct xsave_struct *xsave = v->arch.xsave_area;
> +    u64 xstate_bv = xsave->xsave_hdr.xstate_bv;
> +    u64 valid;
> +
> +    /*
> +     * Copy legacy XSAVE area, to avoid complications with CPUID
> +     * leaves 0 and 1 in the loop below.
> +     */
> +    memcpy(dest, xsave, FXSAVE_SIZE);
> +
> +    /* Set XSTATE_BV */
> +    *(u64 *)(dest + XSAVE_HDR_OFFSET) = xstate_bv;
> +
> +    /*
> +     * Copy each region from the possibly compacted offset to the
> +     * non-compacted offset.
> +     */
> +    valid = xstate_bv & ~XSTATE_FP_SSE;
> +    while ( valid )
> +    {
> +        u64 feature = valid & -valid;
> +        int index = fls(feature) - 1;
> +        void *src = get_xsave_addr(xsave, feature);
> +
> +        if ( src )
> +        {
> +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> +                BUG();

On second thoughts, this should probably turn into

    ASSERT((xstate_offsets[index] + xstate_sizes[index]) <= size);

The size parameter is from a trusted caller, and it will be more
obvious in the crash report if it does somehow get broken.

> +            memcpy(dest + xstate_offsets[index], src, xstate_sizes[index]);
> +        }
> +
> +        valid -= feature;
> +    }
> +
> +}
> +
> +void load_xsave_states(struct vcpu *v, void *src, unsigned int size)

const void *src

> +{
> +    struct xsave_struct *xsave = v->arch.xsave_area;
> +    u64 xstate_bv = *(u64 *)(src + XSAVE_HDR_OFFSET);
> +    u64 valid;
> +
> +    /*
> +     * Copy legacy XSAVE area, to avoid complications with CPUID
> +     * leaves 0 and 1 in the loop below.
> +     */
> +    memcpy(xsave, src, FXSAVE_SIZE);
> +
> +    /* Set XSTATE_BV and possibly XCOMP_BV. */
> +    xsave->xsave_hdr.xstate_bv = xstate_bv;
> +    xsave->xsave_hdr.xcomp_bv = v->arch.xcr0_accum |
> +                                XSTATE_COMPACTION_ENABLED;
> +
> +    /*
> +     * Copy each region from the non-compacted offset to the
> +     * possibly compacted offset.
> +     */
> +    valid = xstate_bv & ~XSTATE_FP_SSE;
> +    while ( valid )
> +    {
> +        u64 feature = valid & -valid;
> +        int index = fls(feature) - 1;
> +        void *dest = get_xsave_addr(xsave, feature);
> +
> +        if (dest)

Style: spaces inside the parentheses.

> +        {
> +            if ( xstate_offsets[index] + xstate_sizes[index] > size )
> +                BUG();
> +            memcpy(dest, src + xstate_offsets[index], xstate_sizes[index]);
> +        }
> +
> +        valid -= feature;
> +    }
> +}
> +
>  void xsave(struct vcpu *v, uint64_t mask)
>  {
>      struct xsave_struct *ptr = v->arch.xsave_area;
> diff --git a/xen/include/asm-x86/cpufeature.h b/xen/include/asm-x86/cpufeature.h
> index 9a01563..16b1386 100644
> --- a/xen/include/asm-x86/cpufeature.h
> +++ b/xen/include/asm-x86/cpufeature.h
> @@ -153,6 +153,10 @@
>  #define X86_FEATURE_RDSEED      (7*32+18) /* RDSEED instruction */
>  #define X86_FEATURE_ADX         (7*32+19) /* ADCX, ADOX instructions */
>  #define X86_FEATURE_SMAP        (7*32+20) /* Supervisor Mode Access Prevention */
> +#define XSAVEOPT        (1 << 0)
> +#define XSAVEC          (1 << 1)
> +#define XGETBV1         (1 << 2)
> +#define XSAVES          (1 << 3)

These absolutely must be X86_FEATURE_* for consistency, and you will
need to introduce a new capability word.
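As a concrete sketch (I have assumed word 8 is the next free capability
word, and NCAPINTS needs bumping from 8 to 9 to match; the bit
positions come from CPUID.0xd[1].eax):

    /* Intel-defined CPU features, CPUID level 0x0000000d:1 (eax), word 8 */
    #define X86_FEATURE_XSAVEOPT    (8*32+ 0) /* XSAVEOPT instruction */
    #define X86_FEATURE_XSAVEC      (8*32+ 1) /* XSAVEC instruction */
    #define X86_FEATURE_XGETBV1     (8*32+ 2) /* XGETBV with ECX=1 */
    #define X86_FEATURE_XSAVES      (8*32+ 3) /* XSAVES/XRSTORS instructions */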
~Andrew

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel