[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 2/2] xen/arm32: implement VFP context switch



On Thu, 2013-05-30 at 17:01 +0100, Julien Grall wrote:
> +void vfp_save_state(struct vcpu *v)
> +{
> +    uint32_t tmp;
> +
> +    v->arch.vfp.fpexc = READ_CP32(FPEXC);

The docs seem to call for reading this via an explicit VMRS
instruction. 

Looking at the ARM ARM this seems to be an alias for the encoding of an
MRC instruction corresponding to reading FPEXC as you have done. Did you
have a reference for that aliasing? (I'm not finding it in the ARM ARM).

Are you avoiding the mnemonic to avoid issues with binutils providing
the instruction?

> +
> +    WRITE_CP32(v->arch.vfp.fpexc | FPEXC_EN, FPEXC);

This being a CP write, do we need an isb? Will this write complete
before the following read from FPSCR otherwise?

> +
> +    v->arch.vfp.fpscr = READ_CP32(FPSCR);
> +
> +    if ( v->arch.vfp.fpexc & FPEXC_EX ) /* Check for sub-architecture */
> +    {
> +        v->arch.vfp.fpinst = READ_CP32(FPINST);

Do we need to check that the sub arch in FPSID is vfp common subarch 1
(or 2 or 3?) here? Or better if that's the only thing we support we
could check during boot / outside the hot path.

> +
> +        if ( v->arch.vfp.fpexc & FPEXC_FP2V )
> +            v->arch.vfp.fpinst2 = READ_CP32(FPINST2);
> +        /* Disable FPEXC_EX */
> +        WRITE_CP32((v->arch.vfp.fpexc | FPEXC_EN) & ~FPEXC_EX, FPEXC);
> +    }
> +
> +    /* Save {d0-d15} */
> +    asm volatile("stc p11, cr0, [%0], #32*4" : : "r" (v->arch.vfp.fpregs1));

Don't you need a memory clobber of the array at v->arch.vfp.fpregs1?

> +
> +    tmp = READ_CP32(MVFR0);
> +    if ( (tmp & MVFR0_A_SIMD_MASK) == 2 ) /* 32 x 64 bits registers */

tmp isn't really needed here.

> +    {
> +        /* Save {d16-d31} */
> +        asm volatile("stcl p11, cr0, [%0], #32*4" : : "r" 
> (v->arch.vfp.fpregs2));

Likewise a clobber here.

> +    }
> +
> +    WRITE_CP32(v->arch.vfp.fpexc & ~(FPEXC_EN), FPEXC);

Is this needed? On restore we set it to whatever the next VCPU was
using. In particular if barriers turn out to be required it would be
good to omit as many of these as possible, including the ones where we
turn it on if the guest already had it enabled etc.

Do you know what happens to the values of d0..d31  and FPSCR etc when
FPEXC.EN is disabled? I'm wondering if we can avoid saving the registers
in that case, we'd still need to restore to prevent leaking the last
guests state if the VCPU enabled exceptions. I suppose it's not worth it
unless we implement a more full featured lazy saving regime.

> +}
> +
> +void vfp_restore_state(struct vcpu *v)
> +{

Some of the same comments (tmps, barriers etc) apply to restore as to
save.

Is there any chance that any of these loads could cause an FP fault?
e.g. if the guest had an FP exception pending when we saved it, could
reloading the register retrigger it?

> +    uint32_t tmp = READ_CP32(FPEXC);
> +
> +    WRITE_CP32(tmp | FPEXC_EN, FPEXC);
> +
> +    /* Restore {d0-d15} */
> +    asm volatile("ldc p11, cr0, [%0], #32*4" : : "r" (v->arch.vfp.fpregs1));
> +
> +    tmp = READ_CP32(MVFR0);
> +    if ( (tmp & MVFR0_A_SIMD_MASK) == 2 ) /* 32 x 64 bits registers */
> +        /* Restore {d16-d31} */
> +        asm volatile("ldcl p11, cr0, [%0], #32*4" : : "r" 
> (v->arch.vfp.fpregs2));
> +
> +    if ( v->arch.vfp.fpexc & FPEXC_EX )
> +    {
> +        WRITE_CP32(v->arch.vfp.fpinst, FPINST);
> +        if ( v->arch.vfp.fpexc & FPEXC_FP2V )
> +            WRITE_CP32(v->arch.vfp.fpinst2, FPINST2);
> +    }
> +
> +    WRITE_CP32(v->arch.vfp.fpscr, FPSCR);
> +
> +    WRITE_CP32(v->arch.vfp.fpexc, FPEXC);
> +}
> +
> +/*
> + * Local variables:
> + * mode: C
> + * c-file-style: "BSD"
> + * c-basic-offset: 4
> + * indent-tabs-mode: nil
> + * End:
> + */
> diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
> index f08d59a..6d7d6ae 100644
> --- a/xen/include/asm-arm/cpregs.h
> +++ b/xen/include/asm-arm/cpregs.h
> @@ -60,6 +60,14 @@
>   * arguments, which are cp,opc1,crn,crm,opc2.
>   */
>  
> +/* Coprocessor 10 */
> +
> +#define FPSCR           p10,7,c1,c0,0   /* Floating-Point Status and Control 
> Register */
> +#define MVFR0           p10,7,c7,c0,0   /* Media and VFP Feature Register 0 
> */
> +#define FPEXC           p10,7,c8,c0,0   /* Floating-Point Exception Control 
> Register */
> +#define FPINST          p10,7,c9,c0,0   /* Floating-Point Instruction 
> Register */
> +#define FPINST2         p10,7,c10,c0,0  /* Floating-point Instruction 
> Register 2 */
> +
>  /* Coprocessor 14 */
>  
>  /* CP14 CR0: */
> @@ -106,6 +114,7 @@
>  #define NSACR           p15,0,c1,c1,2   /* Non-Secure Access Control 
> Register */
>  #define HSCTLR          p15,4,c1,c0,0   /* Hyp. System Control Register */
>  #define HCR             p15,4,c1,c1,0   /* Hyp. Configuration Register */
> +#define HCPTR           p15,4,c1,c1,2   /* Hyp. Coprocessor Trap Register */

Is this intentionally unused? I don't mind adding unused register
definitions, just wanted to check there wasn't a hunk missing or
anything.

The register resets to 0x0 which is OK but it might be wise to trap all
accesses the coprocessors which we haven't implemented ctx switching
for? So at least we find out about missing ones instead of silently
leaking information between guests and/or corrupting their state.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.