
Re: [Xen-devel] Xen Dom0 boot failure on platform that supports ARM GICv4





On 03/09/18 17:54, Shameerali Kolothum Thodi wrote:
Hi Julien,

Thanks for taking a look at this.

-----Original Message-----
From: Julien Grall [mailto:julien.grall@xxxxxxx]
Sent: 03 September 2018 17:13
To: Shameerali Kolothum Thodi <shameerali.kolothum.thodi@xxxxxxxxxx>;
xen-devel@xxxxxxxxxxxxx
Cc: sstabellini@xxxxxxxxxx; Linuxarm <linuxarm@xxxxxxxxxx>; Andre
Przywara <andre.przywara@xxxxxxx>
Subject: Re: Xen Dom0 boot failure on platform that supports ARM GICv4



On 03/09/18 15:53, Shameerali Kolothum Thodi wrote:
Hi,

Hello,

I am trying to boot Xen (stable-4.11) on one of our ARM64 boards which
has support for GICv4.

But Dom0 (kernel 4.18) boot fails with the below trap:

(XEN) ............done.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch
input to Xen)
(XEN) Freed 304kB init memory.
(XEN) traps.c:2007:d0v0 HSR=0x93800004 pc=0xffff00000841af04
gva=0xffff00000b10ffe8 gpa=0x004000aa10ffe8

Which bit of Linux is trying to access the region?

I think it is gic_iterate_rdists(), as the offset just before this is 0xffe8,
which is GICR_PIDR2.
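(For reference, GICR_PIDR2 lives at offset 0xffe8 of each 64K redistributor
frame, which is why the faulting gpa ends in ...ffe8. A minimal sketch of the
probe gic_iterate_rdists() does on each frame, paraphrased from kernel 4.18's
drivers/irqchip/irq-gic-v3.c rather than copied verbatim:

    #define GICR_PIDR2              0xffe8
    #define GIC_PIDR2_ARCH_MASK     0xf0
    #define GIC_PIDR2_ARCH_GICv3    0x30
    #define GIC_PIDR2_ARCH_GICv4    0x40

    /* Returns true if the frame answers as a GICv3/GICv4 redistributor. */
    static bool frame_is_a_redistributor(void __iomem *rd_base)
    {
        u32 arch = readl_relaxed(rd_base + GICR_PIDR2) & GIC_PIDR2_ARCH_MASK;

        return arch == GIC_PIDR2_ARCH_GICv3 || arch == GIC_PIDR2_ARCH_GICv4;
    }

If that read falls outside any emulated re-distributor, Xen cannot handle it
and the access traps as shown above.)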


After a bit of debugging, it looks like the GICR size used in
vgic_v3_domain_init() is the GICv4 GICR size (256K), and this upsets the
first_cpu calculation.

Can you expand on what you mean by upset? What's wrong with the first_cpu
calculation?

What I meant is that, since this is a GICv4, vgic_v3_hw.regions[i].size is set
to 256K, and since first_cpu is calculated as

        first_cpu += size / GICV3_GICR_SIZE;

the result ends up wrong. What I am seeing is:

(XEN) first_cpu 2
(XEN) first_cpu 4
(XEN) first_cpu 6
(XEN) first_cpu 8
(XEN) first_cpu 10
(XEN) first_cpu 12
(XEN) first_cpu 14
.....
(XEN) first_cpu 192

But the actual number of CPUs is only 96, hence I thought this was wrong.
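(The arithmetic behind that log, assuming one 256K GICv4 redistributor region
per physical CPU, which is what the increments of 2 suggest:

    GICV3_GICR_SIZE      = 128K   (RD_base + SGI_base frames)
    GICv4 region size    = 256K   (adds VLPI_base + reserved frame)
    per-region increment = 256K / 128K = 2
    after 96 regions     : first_cpu = 96 * 2 = 192

i.e. each 256K hardware region could hold two 128K GICv3-style
re-distributors, which is why first_cpu reaches 192 for 96 CPUs.)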

This is perfectly fine. Until recently it was not possible to know the number of vCPUs at domain creation. So the function is computing the first CPU for all the regions.

With the recent change, it would be possible to only compute what is necessary.
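(For reference, the loop being discussed is the hardware-domain path of
vgic_v3_domain_init(); a paraphrased sketch of the Xen 4.11 code, not a
verbatim copy. The patch below inserts its check right after the 'size'
assignment:

    first_cpu = 0;
    for ( i = 0; i < vgic_v3_hw.nr_rdist_regions; i++ )
    {
        paddr_t size = vgic_v3_hw.regions[i].size;

        d->arch.vgic.rdist_regions[i].base = vgic_v3_hw.regions[i].base;
        d->arch.vgic.rdist_regions[i].size = size;

        /* First vCPU whose re-distributor lives in this region. */
        d->arch.vgic.rdist_regions[i].first_cpu = first_cpu;

        first_cpu += size / GICV3_GICR_SIZE;
    }
)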


Since the Dom0 GICv3 is also an emulated one, I think the size should be
restricted to the GICv3 GICR size (128K). I have made the changes below and
am now able to boot Dom0.

But I am not sure this is the right approach to fix the issue. Please let me
know your thoughts.

Thanks,
Shameer

---->8-------------

diff --git a/xen/arch/arm/gic-v3.c b/xen/arch/arm/gic-v3.c
index b2ed0f8..bf028cc 100644
--- a/xen/arch/arm/gic-v3.c
+++ b/xen/arch/arm/gic-v3.c
@@ -1783,7 +1783,8 @@ static int __init gicv3_init(void)
       reg = readl_relaxed(GICD + GICD_TYPER);
       intid_bits = GICD_TYPE_ID_BITS(reg);

-    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions, intid_bits);
+    vgic_v3_setup_hw(dbase, gicv3.rdist_count, gicv3.rdist_regions,
+                                intid_bits, gic_dist_supports_dvis());
       gicv3_init_v2();

       spin_lock_init(&gicv3.lock);
diff --git a/xen/arch/arm/vgic-v3.c b/xen/arch/arm/vgic-v3.c
index 4b42739..0f53d88 100644
--- a/xen/arch/arm/vgic-v3.c
+++ b/xen/arch/arm/vgic-v3.c
@@ -59,18 +59,21 @@ static struct {
       unsigned int nr_rdist_regions;
       const struct rdist_region *regions;
       unsigned int intid_bits;  /* Number of interrupt ID bits */
+    bool dvis;
   } vgic_v3_hw;

   void vgic_v3_setup_hw(paddr_t dbase,
                         unsigned int nr_rdist_regions,
                         const struct rdist_region *regions,
-                      unsigned int intid_bits)
+                      unsigned int intid_bits,
+                      bool dvis)
   {
       vgic_v3_hw.enabled = true;
       vgic_v3_hw.dbase = dbase;
       vgic_v3_hw.nr_rdist_regions = nr_rdist_regions;
       vgic_v3_hw.regions = regions;
       vgic_v3_hw.intid_bits = intid_bits;
+    vgic_v3_hw.dvis = dvis;
   }

    static struct vcpu *vgic_v3_irouter_to_vcpu(struct domain *d, uint64_t irouter)
@@ -1673,6 +1676,9 @@ static int vgic_v3_domain_init(struct domain *d)
           {
               paddr_t size = vgic_v3_hw.regions[i].size;

+            if (vgic_v3_hw.dvis && (size == GICV4_GICR_SIZE))
+                 size = GICV3_GICR_SIZE;

vgic_v3_hw.regions describes the regions in the layout that can hold
re-distributors. You can have multiple re-distributors per region.

The variable size holds the size of the region, not the size of a
re-distributor.

I am not sure I understand why you want to restrict the size of the region
here, because GICV4_GICR_SIZE is a multiple of GICV3_GICR_SIZE. So you should
be able to fit 2 re-distributors per region.

It looks to me like the re-distributor regions are not reported correctly, or
Dom0 thinks it is on GICv4. Can you provide a bit more detail on the function
that causes the crash and some logs from Linux?

Ok. I added a few prints along the vGIC MMIO read path and this is what
happens before the trap:

     vgic_v3_rdistr_mmio_read()
          get_vcpu_from_rdist()  --> returns NULL here for 0x004000aa10ffe8,
                                     which actually belongs to CPU id 48 as
                                     per the log below
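(For context, the lookup that returns NULL is roughly the following; a
paraphrased sketch of get_vcpu_from_rdist() in xen/arch/arm/vgic-v3.c, not a
verbatim copy:

    static struct vcpu *get_vcpu_from_rdist(struct domain *d,
        const struct vgic_rdist_region *region,
        paddr_t gpa, uint32_t *offset)
    {
        unsigned int vcpu_id;
        struct vcpu *v;

        /* Each emulated re-distributor is GICV3_GICR_SIZE (128K) wide. */
        vcpu_id = region->first_cpu + (gpa - region->base) / GICV3_GICR_SIZE;
        if ( unlikely(vcpu_id >= d->max_vcpus) )
            return NULL;                       /* <-- the case hit here */

        v = d->vcpu[vcpu_id];
        *offset = gpa - v->arch.vgic.rdist_base;

        return v;
    }

If the computed vcpu_id is >= d->max_vcpus, the lookup fails and the access
ends up in the trap shown at the top of the thread.)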

Do you mean region id 48? So if I get it correctly, you are trying to access
the re-distributor for vCPU ID 96.

[...]

If I remember correctly there were no logs from Dom0, but I need to
double-check the Dom0 cmdline options to see whether earlycon was set.

I could also enable/add any prints that you think will help and rerun. Please
let me know.

I may have an idea of what is happening. As we populate more regions than
necessary, it is possible that Linux is trying to access them. Would it be
possible to add some debug in the Linux function gic_iterate_rdists() to know
what the kernel is trying to read?
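(A minimal example of the kind of debug that could be added, assuming the
kernel 4.18 layout of gic_iterate_rdists() in drivers/irqchip/irq-gic-v3.c;
the names 'i', 'ptr', 'typer' and the redist_regions fields are taken from
that version and may need adjusting:

    /* At the top of the per-region loop: */
    pr_info("GICv3: probing rdist region %d: phys %pa, virt %p\n",
            i, &gic_data.redist_regions[i].phys_base, ptr);

    /* Just before stepping 'ptr' to the next frame: */
    pr_info("GICv3: region %d: GICR_TYPER %#llx (last=%d, vlpis=%d)\n",
            i, typer, !!(typer & GICR_TYPER_LAST),
            !!(typer & GICR_TYPER_VLPIS));

That would show how far the kernel walks each region and whether it steps by
two or four 64K frames per re-distributor (VLPIS adds the VLPI pages).)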

Cheers,

--
Julien Grall

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

