Re: [Xen-devel] [PATCH v5 4/6] xen/arm: vgic: Optimize the way to store the target vCPU in the rank
On Mon, 9 Nov 2015, Julien Grall wrote:
> Xen is currently directly storing the value of GICD_ITARGETSR register
> (for GICv2) and GICD_IROUTER (for GICv3) in the rank. This makes the
> emulation of the registers access very simple but makes the code to get
> the target vCPU for a given vIRQ more complex.
>
> While the target vCPU of a vIRQ is retrieved every time a vIRQ is
> injected into the guest, the registers themselves are accessed far less often.
>
> So the data structure should be optimized for the most common case
> rather than the less frequent one.
>
> This patch introduces an array storing the target vCPU for every
> interrupt in the rank, which makes the code to get the target very
> quick. The emulation code now has to generate the GICD_ITARGETSR and
> GICD_IROUTER registers on read access, and to split them on write so
> the targets are stored in this convenient form.
>
> With the new way to store the target vCPU, the structure vgic_irq_rank
> shrinks from 320 bytes to 92 bytes, saving about 228 bytes of memory
> allocated separately per vCPU.
nice
> Note that with these changes, any read of those registers will list only
> the target vCPU used by Xen. As the spec is not clear on whether this is
> a valid choice, OSes which have a different interpretation of the
> spec (i.e. OSes which perform read-modify-write operations on these
> registers) may no longer boot on Xen. Nonetheless, I think this is a fair
> trade between memory usage in Xen (1KB less on a domain using 4 vCPUs
> with no SPIs) and a strict interpretation of the spec (though not all the
> cases are clearly defined).
>
> Furthermore, the implementation of the get_target_vcpu callback is now
> identical for GICv2 and GICv3. Consolidate the implementation in the
> common vGIC code and drop the callback.
>
> Finally, take the opportunity to fix the coding style and to replace
> "irq" with "virq" to make clear that we are dealing with virtual IRQs
> in the sections we are modifying.
>
> Signed-off-by: Julien Grall <julien.grall@xxxxxxxxxx>
>
> ---
> The real reason is that I'd like to drop the vgic_byte_* helpers in
> favor of more generic access helpers, although that would make the code
> to retrieve the priority more complex. So reworking the code to get the
> target vCPU was the best solution.
>
> Changes in v5:
> - NR_TARGET_PER_REG has been renamed to NR_TARGETS_PER_ITARGETSR
> - NR_BIT_PER_TARGET has been renamed to NR_BITS_PER_TARGET
> - Fix comments
> - s/NR_BYTE_PER_IROUTER/NR_BYTES_PER_IROUTER/
> - Remove the duplicate declaration of virq in vgic_store_itargetsr
>
> Changes in v4:
> - Move the behavior changes in 2 separate patches:
> * patch #1: Handle correctly byte write in ITARGETSR
> * patch #2: Don't ignore a write in ITARGETSR if one field is 0
> - Update commit message
>
> Changes in v3:
> - Make clear in the commit message that we rely on a corner case
> of the spec.
> - Typoes
>
> Changes in v2:
> - Remove get_target_vcpu callback
> - s/irq/virq/ to make clear we are using virtual IRQ
> - Update the commit message
> - Rename vgic_generate_* to vgic_fetch
> - Improve code comment
> - Move the introduction of vgic_rank_init in a separate patch
> - Make use of rank->idx
>
> Changes in v1:
> - Patch added
> ---
> xen/arch/arm/vgic-v2.c | 86 +++++++++++++-------------------
> xen/arch/arm/vgic-v3.c | 119 ++++++++++++++++++++++++---------------------
> xen/arch/arm/vgic.c | 45 ++++++++++++-----
> xen/include/asm-arm/vgic.h | 18 +++----
> 4 files changed, 137 insertions(+), 131 deletions(-)
>
> diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
> index b5249ff..4f976d4 100644
> --- a/xen/arch/arm/vgic-v3.c
> +++ b/xen/arch/arm/vgic-v3.c
> @@ -89,18 +89,72 @@ static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t irouter)
> return d->vcpu[vcpu_id];
> }
>
> -static struct vcpu *vgic_v3_get_target_vcpu(struct vcpu *v, unsigned int irq)
> -{
> - struct vcpu *v_target;
> - struct vgic_irq_rank *rank = vgic_rank_irq(v, irq);
> +#define NR_BYTES_PER_IROUTER 8U
Given the way this constant is used, it might be better to define it as
a bit shift of 3, removing the two division operations below.
> +/*
> + * Fetch an IROUTER register based on the offset from IROUTER0. Only one
> + * vCPU will be listed for a given vIRQ.
> + *
> + * Note the offset will be aligned to the appropriate boundary.
> + */
> +static uint64_t vgic_fetch_irouter(struct vgic_irq_rank *rank,
> + unsigned int offset)
> +{
> ASSERT(spin_is_locked(&rank->lock));
>
> - v_target = vgic_v3_irouter_to_vcpu(v->domain, rank->v3.irouter[irq % 32]);
> + /* There is exactly 1 vIRQ per IROUTER */
> + offset /= NR_BYTES_PER_IROUTER;
>
> - ASSERT(v_target != NULL);
> + /* Get the index in the rank */
> + offset &= INTERRUPT_RANK_MASK;
>
> - return v_target;
> + return vcpuid_to_vaffinity(rank->vcpu[offset]);
> +}
> +
> +/*
> + * Store an IROUTER register in a convenient way and migrate the vIRQ
> + * if necessary. This function only deals with IROUTER32 and onwards.
> + *
> + * Note the offset will be aligned to the appropriate boundary.
> + */
> +static void vgic_store_irouter(struct domain *d, struct vgic_irq_rank *rank,
> + unsigned int offset, uint64_t irouter)
> +{
> + struct vcpu *new_vcpu, *old_vcpu;
> + unsigned int virq;
> +
> + /* There is 1 vIRQ per IROUTER */
> + virq = offset / NR_BYTES_PER_IROUTER;
> +
> + /*
> + * IROUTER0-31, used for SGIs/PPIs, are reserved, so this function
> + * should never be called for them.
> + */
> + ASSERT(virq >= 32);
> +
> + /* Get the index in the rank */
> + offset = virq & INTERRUPT_RANK_MASK;
> +
> + new_vcpu = vgic_v3_irouter_to_vcpu(d, irouter);
> + old_vcpu = d->vcpu[rank->vcpu[offset]];
> +
> + /*
> + * From the spec (see 8.9.13 in IHI 0069A), any write with an
> + * invalid vCPU will lead to the interrupt being ignored.
> + *
> + * But the current code to inject an IRQ is not able to cope with
> + * invalid vCPU. So for now, just ignore the write.
> + *
> + * TODO: Respect the spec
> + */
> + if ( !new_vcpu )
> + return;
> +
> + /* Only migrate the IRQ if the target vCPU has changed */
> + if ( new_vcpu != old_vcpu )
> + vgic_migrate_irq(old_vcpu, new_vcpu, virq);
> +
> + rank->vcpu[offset] = new_vcpu->vcpu_id;
> }
>
> static int __vgic_v3_rdistr_rd_mmio_read(struct vcpu *v, mmio_info_t *info,
> @@ -726,8 +780,7 @@ static int vgic_v3_distr_mmio_read(struct vcpu *v, mmio_info_t *info,
> DABT_DOUBLE_WORD);
> if ( rank == NULL ) goto read_as_zero;
> vgic_lock_rank(v, rank, flags);
> - *r = rank->v3.irouter[REG_RANK_INDEX(64,
> - (gicd_reg - GICD_IROUTER), DABT_DOUBLE_WORD)];
> + *r = vgic_fetch_irouter(rank, gicd_reg - GICD_IROUTER);
> vgic_unlock_rank(v, rank, flags);
> return 1;
> case GICD_NSACR ... GICD_NSACRN:
> @@ -812,8 +865,6 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
> struct hsr_dabt dabt = info->dabt;
> struct vgic_irq_rank *rank;
> unsigned long flags;
> - uint64_t new_irouter, old_irouter;
> - struct vcpu *old_vcpu, *new_vcpu;
> int gicd_reg = (int)(info->gpa - v->domain->arch.vgic.dbase);
>
> perfc_incr(vgicd_writes);
> @@ -881,32 +932,8 @@ static int vgic_v3_distr_mmio_write(struct vcpu *v, mmio_info_t *info,
> rank = vgic_rank_offset(v, 64, gicd_reg - GICD_IROUTER,
> DABT_DOUBLE_WORD);
> if ( rank == NULL ) goto write_ignore;
> - new_irouter = r;
> vgic_lock_rank(v, rank, flags);
> -
> - old_irouter = rank->v3.irouter[REG_RANK_INDEX(64,
> - (gicd_reg - GICD_IROUTER),
> - DABT_DOUBLE_WORD)];
> - old_vcpu = vgic_v3_irouter_to_vcpu(v->domain, old_irouter);
> - new_vcpu = vgic_v3_irouter_to_vcpu(v->domain, new_irouter);
> -
> - if ( !new_vcpu )
> - {
> - printk(XENLOG_G_DEBUG
> - "%pv: vGICD: wrong irouter at offset %#08x val %#"PRIregister,
> - v, gicd_reg, r);
> - vgic_unlock_rank(v, rank, flags);
> - /*
> - * TODO: Don't inject a fault to the guest when the MPIDR is
> - * not valid. From the spec, the interrupt should be
> - * ignored.
> - */
> - return 0;
> - }
> - rank->v3.irouter[REG_RANK_INDEX(64, (gicd_reg - GICD_IROUTER),
> - DABT_DOUBLE_WORD)] = new_irouter;
> - if ( old_vcpu != new_vcpu )
> - vgic_migrate_irq(old_vcpu, new_vcpu, (gicd_reg - GICD_IROUTER)/8);
> + vgic_store_irouter(v->domain, rank, gicd_reg - GICD_IROUTER, r);
> vgic_unlock_rank(v, rank, flags);
> return 1;
> case GICD_NSACR ... GICD_NSACRN:
> @@ -1036,7 +1063,6 @@ static const struct mmio_handler_ops vgic_distr_mmio_handler = {
> static int vgic_v3_vcpu_init(struct vcpu *v)
> {
> int i;
> - uint64_t affinity;
> paddr_t rdist_base;
> struct vgic_rdist_region *region;
> unsigned int last_cpu;
> @@ -1045,15 +1071,6 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
> struct domain *d = v->domain;
> uint32_t rdist_stride = d->arch.vgic.rdist_stride;
>
> - /* For SGI and PPI the target is always this CPU */
> - affinity = (MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 3) << 32 |
> - MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 2) << 16 |
> - MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 1) << 8 |
> - MPIDR_AFFINITY_LEVEL(v->arch.vmpidr, 0));
> -
> - for ( i = 0 ; i < 32 ; i++ )
> - v->arch.vgic.private_irqs->v3.irouter[i] = affinity;
> -
> /*
> * Find the region where the re-distributor lives. For this purpose,
> * we look one region ahead as we have only the first CPU in hand.
> @@ -1098,7 +1115,7 @@ static int vgic_v3_vcpu_init(struct vcpu *v)
>
> static int vgic_v3_domain_init(struct domain *d)
> {
> - int i, idx;
> + int i;
>
> /*
> * Domain 0 gets the hardware address.
> @@ -1150,13 +1167,6 @@ static int vgic_v3_domain_init(struct domain *d)
> d->arch.vgic.rdist_regions[0].first_cpu = 0;
> }
>
> - /* By default deliver to CPU0 */
> - for ( i = 0; i < DOMAIN_NR_RANKS(d); i++ )
> - {
> - for ( idx = 0; idx < 32; idx++ )
> - d->arch.vgic.shared_irqs[i].v3.irouter[idx] = 0;
> - }
> -
> /* Register mmio handle for the Distributor */
> register_mmio_handler(d, &vgic_distr_mmio_handler, d->arch.vgic.dbase,
> SZ_64K, NULL);
> @@ -1182,7 +1192,6 @@ static int vgic_v3_domain_init(struct domain *d)
> static const struct vgic_ops v3_ops = {
> .vcpu_init = vgic_v3_vcpu_init,
> .domain_init = vgic_v3_domain_init,
> - .get_target_vcpu = vgic_v3_get_target_vcpu,
> .emulate_sysreg = vgic_v3_emulate_sysreg,
> /*
> * We use both AFF1 and AFF0 in (v)MPIDR. Thus, the max number of CPU
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel