[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v2 2/2] xen/arm32: implement VFP context switch



On 05/31/2013 03:12 PM, Ian Campbell wrote:

> On Thu, 2013-05-30 at 17:01 +0100, Julien Grall wrote:
>> +void vfp_save_state(struct vcpu *v)
>> +{
>> +    uint32_t tmp;
>> +
>> +    v->arch.vfp.fpexc = READ_CP32(FPEXC);
> 
> The docs seem to call for reading this via an explicit VMRS
> instruction. 
> 
> Looking at the ARM ARM this seems to be an alias for the encoding of an
> MRC instruction corresponding to reading FPEXC as you have done. Did you
> have a reference for that aliasing? (I'm not finding it in the ARM ARM).


I didn't find anything on the aliasing. I looked at the linux VFP
include (arch/arm/include/asm/vfp.h).

> Are you avoiding the mnemonic to avoid issues with binutils providing
> the instruction?

This mnemonic can only be used if VFP is enabled by gcc. I think it's
safer to use mrc if we don't want to use VFP on the other part of XEN.

>> +
>> +    WRITE_CP32(v->arch.vfp.fpexc | FPEXC_EN, FPEXC);
> 
> This being a CP write, do we need an isb? Will this write complete
> before the following read from FPSCR otherwise?


I will add an isb and only call this part if VFP is not enabled.

> 
>> +
>> +    v->arch.vfp.fpscr = READ_CP32(FPSCR);
>> +
>> +    if ( v->arch.vfp.fpexc & FPEXC_EX ) /* Check for sub-architecture */
>> +    {
>> +        v->arch.vfp.fpinst = READ_CP32(FPINST);
> 
> Do we need to check that the sub arch in FPSID is vfp common subarch 1
> (or 2 or 3?) here? Or better if that's the only thing we support we
> could check during boot / outside the hot path.

I will add a check during boot to verify if the VFP is in version 3.

>> +
>> +        if ( v->arch.vfp.fpexc & FPEXC_FP2V )
>> +            v->arch.vfp.fpinst2 = READ_CP32(FPINST2);
>> +        /* Disable FPEXC_EX */
>> +        WRITE_CP32((v->arch.vfp.fpexc | FPEXC_EN) & ~FPEXC_EX, FPEXC);
>> +    }
>> +
>> +    /* Save {d0-d15} */
>> +    asm volatile("stc p11, cr0, [%0], #32*4" : : "r" (v->arch.vfp.fpregs1));
> 
> Don't you need a memory clobber of the array at v->arch.vfp.fpregs1?

Indeed. I will fix it on the next version.

>> +
>> +    tmp = READ_CP32(MVFR0);
>> +    if ( (tmp & MVFR0_A_SIMD_MASK) == 2 ) /* 32 x 64 bits registers */
> 
> tmp isn't really needed here.

I will remove it.

> 
>> +    {
>> +        /* Save {d16-d31} */
>> +        asm volatile("stcl p11, cr0, [%0], #32*4" : : "r" 
>> (v->arch.vfp.fpregs2));
> 
> Likewise a clobber here.
> 
>> +    }
>> +
>> +    WRITE_CP32(v->arch.vfp.fpexc & ~(FPEXC_EN), FPEXC);
> 
> Is this needed? On restore we set it to whatever the next VCPU was
> using. In particular if barriers turn out to be required it would be
> good to omit as many of these as possible, including the ones where we
> turn it on if the guest already had it enabled etc.


No if we consider that nobody will use VFP in Xen. I can remove it.

> Do you know what happens to the values of d0..d31  and FPSCR etc when

> FPEXC.EN is disabled? I'm wondering if we can avoid saving the registers
> in that case, we'd still need to restore to prevent leaking the last
> guests state if the VCPU enabled exceptions. I suppose it's not worth it
> unless we implement a more full featured lazy saving regime.


I didn't find something about the state of the registers when VFP is
disabled. I think the registers is not clobbered.
Linux seems to save vfp registers at each context switch only if VFP is
enabled. But we cannot rely on that for the other distributions.

>> +}
>> +
>> +void vfp_restore_state(struct vcpu *v)
>> +{
> 
> Some of the same comments (tmps, barriers etc) apply to restore as to
> save.
> 
> Is there any chance that any of these loads could cause an FP fault?
> e.g. if the guest had an FP exception pending when we saved it, could
> reloading the register retrigger it?

I don't know.

>> +    uint32_t tmp = READ_CP32(FPEXC);
>> +
>> +    WRITE_CP32(tmp | FPEXC_EN, FPEXC);
>> +
>> +    /* Restore {d0-d15} */
>> +    asm volatile("ldc p11, cr0, [%0], #32*4" : : "r" (v->arch.vfp.fpregs1));
>> +
>> +    tmp = READ_CP32(MVFR0);
>> +    if ( (tmp & MVFR0_A_SIMD_MASK) == 2 ) /* 32 x 64 bits registers */
>> +        /* Restore {d16-d31} */
>> +        asm volatile("ldcl p11, cr0, [%0], #32*4" : : "r" 
>> (v->arch.vfp.fpregs2));
>> +
>> +    if ( v->arch.vfp.fpexc & FPEXC_EX )
>> +    {
>> +        WRITE_CP32(v->arch.vfp.fpinst, FPINST);
>> +        if ( v->arch.vfp.fpexc & FPEXC_FP2V )
>> +            WRITE_CP32(v->arch.vfp.fpinst2, FPINST2);
>> +    }
>> +
>> +    WRITE_CP32(v->arch.vfp.fpscr, FPSCR);
>> +
>> +    WRITE_CP32(v->arch.vfp.fpexc, FPEXC);
>> +}
>> +
>> +/*
>> + * Local variables:
>> + * mode: C
>> + * c-file-style: "BSD"
>> + * c-basic-offset: 4
>> + * indent-tabs-mode: nil
>> + * End:
>> + */
>> diff --git a/xen/include/asm-arm/cpregs.h b/xen/include/asm-arm/cpregs.h
>> index f08d59a..6d7d6ae 100644
>> --- a/xen/include/asm-arm/cpregs.h
>> +++ b/xen/include/asm-arm/cpregs.h
>> @@ -60,6 +60,14 @@
>>   * arguments, which are cp,opc1,crn,crm,opc2.
>>   */
>>  
>> +/* Coprocessor 10 */
>> +
>> +#define FPSCR           p10,7,c1,c0,0   /* Floating-Point Status and 
>> Control Register */
>> +#define MVFR0           p10,7,c7,c0,0   /* Media and VFP Feature Register 0 
>> */
>> +#define FPEXC           p10,7,c8,c0,0   /* Floating-Point Exception Control 
>> Register */
>> +#define FPINST          p10,7,c9,c0,0   /* Floating-Point Instruction 
>> Register */
>> +#define FPINST2         p10,7,c10,c0,0  /* Floating-point Instruction 
>> Register 2 */
>> +
>>  /* Coprocessor 14 */
>>  
>>  /* CP14 CR0: */
>> @@ -106,6 +114,7 @@
>>  #define NSACR           p15,0,c1,c1,2   /* Non-Secure Access Control 
>> Register */
>>  #define HSCTLR          p15,4,c1,c0,0   /* Hyp. System Control Register */
>>  #define HCR             p15,4,c1,c1,0   /* Hyp. Configuration Register */
>> +#define HCPTR           p15,4,c1,c1,2   /* Hyp. Coprocessor Trap Register */
> 
> Is this intentionally unused? I don't mind adding unused register
> definitions, just wanted to check there wasn't a hunk missing or
> anything.


Another lost when I have created the patch. I was using it to trap VFP
when I tested the context switch.

> The register resets to 0x0 which is OK but it might be wise to trap all
> accesses the coprocessors which we haven't implemented ctx switching
> for? So at least we find out about missing ones instead of silently
> leaking information between guests and/or corrupting their state.


What about sending an undefined instruction to the guest if the
coprocessor is not implemented?
It seems that some software (sshd, ntpdate) which are using libcrypto,
are trying to access to the cryptographic coprocessor even if it doesn't
exist.

Cheers,

-- 
Julien


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.