[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 13/16] xen/arm: Add support for GIC v3



On Mon, 2014-06-02 at 18:33 +0100, Stefano Stabellini wrote:
> > +static inline void gicv3_restore_lr(int nr_lrs, const struct vcpu *v)
> > +{
> > +    /* Fall through for all the cases */
> > +    switch ( nr_lrs )
> > +    {
> > +    case 16:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[15], ICH_LR15_EL2);
> > +    case 15:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[14], ICH_LR14_EL2);
> > +    case 14:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[13], ICH_LR13_EL2);
> > +    case 13:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[12], ICH_LR12_EL2);
> > +    case 12:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[11], ICH_LR11_EL2);
> > +    case 11:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[10], ICH_LR10_EL2);
> > +    case 10:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[9], ICH_LR9_EL2);
> > +    case 9:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[8], ICH_LR8_EL2);
> > +    case 8:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[7], ICH_LR7_EL2);
> > +    case 7:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[6], ICH_LR6_EL2);
> > +    case 6:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[5], ICH_LR5_EL2);
> > +    case 5:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[4], ICH_LR4_EL2);
> > +    case 4:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[3], ICH_LR3_EL2);
> > +    case 3:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[2], ICH_LR2_EL2);
> > +    case 2:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[1], ICH_LR1_EL2);
> > +    case 1:
> > +        WRITE_SYSREG(v->arch.gic.v3.lr[0], ICH_LR0_EL2);
> > +         break;
> > +    default:
> > +         BUG();
> > +    }
> > +}
> 
> Given that the number of LR registers has grown up quite a bit from
> GICv2, we should optimize this code and only save and restore the
> registers that are actually in used by checking on lr_mask.

I'm not convinced that will be faster, the code above involves a
straight line with no branches, but potentially some unnecessary stores.
Looping or lr_mask involves a loop, a test_bit and most critically a
switch over the lr number (to select the correct sysreg, which is a
static register, not an opcode argument), but has no redundant stores. I
suspect the straight line version will actually be quicker.

But in any case choosing one or the other would require someone to run
the numbers, I'm happy with the variant above in the first instance.

> It would also help performances if we restored them in ascending order.

This might help with memory prefetching, yes, but makes this switch
fall-thru trick harder.

It was suggested that perhaps the lr array should be backwards -- e.g.
ICH_LR0_EL2 is stored in v->arch.gic.v3.lr[15] and ICH_LR15_EL2 is
stored in v->arch.git.v3.lr[0], of course this should be hidden behind
an accessor macro e.g.

#define VCPU_GICV3_LR(v, n) v->arch.git.v3.lrs_descending[MAX_NR_LRS-n-1]

> > +static void gicv3_restore_state(const struct vcpu *v)
> > +{
> > +    gicv3_restore_lr(gicv3_info.nr_lrs, v);
> > +    restore_aprn_regs(&v->arch.gic);
> > +    WRITE_SYSREG32(v->arch.gic.v3.vmcr, ICH_VMCR_EL2);
> > +}
> 
> Shouldn't we restore vmcr first? Before writing to the LRs?

Can you remind me of the dependency?

> Should we save/restore SRE_EL1 too?

We certainly should IMHO.

Ian.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.