[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Smoke test failure on Arm (was Re: [PATCH v4 0/8] xen/arm: Emulate ID registers)



On Wed, 6 Jan 2021, Julien Grall wrote:
> On 05/01/2021 22:43, Stefano Stabellini wrote:
> > On Tue, 5 Jan 2021, Julien Grall wrote:
> > > Hi Stefano,
> > > 
> > > On 05/01/2021 18:44, Stefano Stabellini wrote:
> > > > On Tue, 5 Jan 2021, André Przywara wrote:
> > > > > On 05/01/2021 16:06, Julien Grall wrote:
> > > > > > (+ Ian and Andre)
> > > > > > 
> > > > > > Hi Bertrand,
> > > > > > 
> > > > > > On IRC, Ian pointed out that the smoke test was failing on
> > > > > > Cubietruck.
> > > > > > The only patches because the last success and the failure are your
> > > > > > series.
> > > > > > 
> > > > > > This seems to be a very early failure as there is no output from Xen
> > > > > > [1].
> > > > > > 
> > > > > > I originally thought the problem was because some of the ID
> > > > > > registers
> > > > > > (such as ID_PFR2) introduced in patch #2 doesn't exist in Armv7.
> > > > > > 
> > > > > > But per B7.2.1 in ARM DDI 0406C.d, unallocated ID registers should
> > > > > > be
> > > > > > RAZ. So it would result to a crash. Andre confirmed that PFR2 can be
> > > > > > accessed by writing a small baremetal code and booted on Cortex-A7
> > > > > > and
> > > > > > Cortex-A20.
> > > > > > 
> > > > > > So I am not entirely sure what's the problem. Andre kindly accepted
> > > > > > to
> > > > > > try to boot Xen on his board. Hopefully, this will give us a clue on
> > > > > > the
> > > > > > problem.
> > > > > > 
> > > > > > If not, I will borrow a Cubietruck in OssTest and see if I can
> > > > > > reproduce
> > > > > > it and debug it.
> > > > > 
> > > > > 
> > > > > So I just compiled master and staging and ran just that on an
> > > > > Allwinner
> > > > > H3 (Cortex-A7 r0p5). Master boots fine (till it complains about the
> > > > > missing Dom0, as expected). However staging indeed fails:
> > > > > 
> > > > > (XEN) Xen version 4.15-unstable (andprz01@xxxxxxxxxxxx)
> > > > > (arm-slackware-linux-gnueabihf-gcc (GCC) 8.2.0) debug=y  Tue Jan  5
> > > > > 16:09:40 GMT 2021
> > > > > (XEN) Latest ChangeSet: Sun Nov 8 15:59:42 2020 +0100 git:c992efd06a
> > > > > (XEN) build-id: 85d361b8565b90d4e0defe2beb2419e191fd76b4
> > > > > (XEN) CPU0: Unexpected Trap: Undefined Instruction
> > > > > (XEN) ----[ Xen-4.15-unstable  arm32  debug=y   Not tainted ]----
> > > > > (XEN) CPU:    0
> > > > > (XEN) PC:     0026b8c8 identify_cpu+0xc0/0xd4
> > > > > (XEN) CPSR:   600001da MODE:Hypervisor
> > > > > (XEN)      R0: 002acb20 R1: 00000000 R2: 00000000 R3: 11111111
> > > > > (XEN)      R4: 002acb1c R5: 002acb20 R6: 4e000000 R7: 00000000
> > > > > (XEN)      R8: 00000002 R9: 002d8200 R10:00008000 R11:002f7e6c
> > > > > R12:00000080
> > > > > (XEN) HYP: SP: 002f7e68 LR: 002c419c
> > > > > (XEN)
> > > > > (XEN)   VTCR_EL2: 80002646
> > > > > (XEN)  VTTBR_EL2: 00000018e628bb80
> > > > > (XEN)
> > > > > (XEN)  SCTLR_EL2: 30cd187f
> > > > > (XEN)    HCR_EL2: 00000038
> > > > > (XEN)  TTBR0_EL2: 000000004013a000
> > > > > (XEN)
> > > > > (XEN)    ESR_EL2: 00000000
> > > > > (XEN)  HPFAR_EL2: 0003fff0
> > > > > (XEN)      HDFAR: 9d110000
> > > > > (XEN)      HIFAR: 0000a04a
> > > > > (XEN)
> > > > > (XEN) Xen stack trace from sp=002f7e68:
> > > > > (XEN)    00000000 002f7f54 00008000 00000000 00002000 002a4584
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00008000 49ff5000 002d81f0 40000000 00000000
> > > > > 00002000
> > > > > 00000001
> > > > > (XEN)    00000000 50000000 49ffd000 00000000 50000000 00000000
> > > > > 00000000
> > > > > 50000000
> > > > > (XEN)    4c000000 00000000 4e000000 00000000 ffffffff ffffffff
> > > > > 50000000
> > > > > 00000000
> > > > > (XEN)    50000000 00000000 50000000 00000000 00000000 00000000
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00000003 00000000 00000000 ffffffff 00000040
> > > > > ffffffff
> > > > > 00000000
> > > > > (XEN)    00000000 00000000 00000000 002a7000 40008050 0000001a
> > > > > 00000000
> > > > > 49ff5000
> > > > > (XEN)    40008000 3fe08000 00000004 0020006c 00000000 00000000
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00000000 00000000 00000000 00000000 00000000
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00000000 00000000 00000000 00000000 00000000
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00000000 00000000 00000000 00000000 00000000
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00000000 00000000 00000000 00000000 00000000
> > > > > 00000000
> > > > > 00000000
> > > > > (XEN)    00000000 00000000 00000000 00000000 00000000 00000000
> > > > > (XEN) Xen call trace:
> > > > > (XEN)    [<0026b8c8>] identify_cpu+0xc0/0xd4 (PC)
> > > > > (XEN)    [<002c419c>] start_xen+0x778/0xe50 (LR)
> > > > > (XEN)    [<002f7f54>] 002f7f54
> > > > > (XEN)
> > > > > (XEN)
> > > > > (XEN) ****************************************
> > > > > (XEN) Panic on CPU 0:
> > > > > (XEN) CPU0: Unexpected Trap: Undefined Instruction
> > > > > (XEN) ****************************************
> > > > > (XEN)
> > > > > (XEN) Reboot in five seconds...
> > > > > 
> > > > > 
> > > > > The code in question:
> > > > >     26b8c0:       eef63a10        vmrs    r3, mvfr1
> > > > >     26b8c4:       e5803058        str     r3, [r0, #88]   ; 0x58
> > > > > > 26b8c8:       eef53a10        vmrs    r3, mvfr2
> > > > >     26b8cc:       e580305c        str     r3, [r0, #92]   ; 0x5c
> > > > >     26b8d0:       e28bd000        add     sp, fp, #0
> > > > >     26b8d4:       e49db004        pop     {fp}       ; (ldr fp, [sp],
> > > > > #4)
> > > > >     26b8d8:       e12fff1e        bx      lr
> > > > > 
> > > > > And indeed MVFR2 is not mentioned in the ARMv7 ARM, and in contrast to
> > > > > the CP15 CPUID registers this is using the VMRS instruction, so it's
> > > > > not
> > > > > protected by future-proof CPUID register scheme.
> > > > > 
> > > > > Not sure what to do about this, maybe #ifdef'ing this register access
> > > > > with arm64?
> > > > > I guess this comes from the slightly too optimistic code-sharing
> > > > > between
> > > > > arm32 and arm64?
> > > > 
> > > > Yes and #ifdef'ing is what we have been doing with the other registers.
> > > 
> > > There is a catch here. This register is accessible from AArch32 on all
> > > Armv8
> > > HW. It is just not accessible on Armv7.
> > > 
> > > So hiding the MVFR2 behind #ifdef CONFIG_ARM_64 is technically not
> > > correct.
> > > 
> > > I know that we said that we don't officially support Xen on Arm32 on Armv8
> > > HW
> > > (I can't find where it is written though). So we could argue that
> > > shadowing
> > > MVFR2 is not worth it.
> > > 
> > > I do use Armv8 HW to test 32-bit, so I would at least like to get Xen
> > > booting.
> > 
> > Yep, me too.
> > 
> > 
> > > In addition to that, Linux 32-bit doesn't access MVFR2 at the moment.
> > > 
> > > Therefore, a #ifdef may be acceptable for now. However, I would suggest to
> > > introduce name it MAY_BE_UNDEFINED (or similar) that will be used to avoid
> > > reading the system register on 32-bit.
> > > 
> > > For the 32-bit case, I would just hardcode the value based on the Arm (it
> > > looks like for Armv8-A there is only one valid value).
> > > 
> > > IOW, the hack would be self-contained in cpufeature.c.
> > 
> > I think it makes sense that the hack should be self-contained in
> > cpufeature.c, leaving the definition of struct mvfr to 3 register_t also
> > on arm32 as it is today. Also leaving vcpreg.c as it is today so that a
> > guest can try to read mvfr2 without crashing thanks to
> > GENERATE_TID3_INFO(MVFR2, mvfr, 2).
> > 
> > For the arm32 case in cpufeature.c:identify_cpu, the two permitted
> > values are 0 and 0b0100, which one did you have in mind? I take you
> > meant 0 which stands for "not implemented, or no support for
> > miscellaneous features"?
> 
> It looks like I misread the spec. :/ I thought it would only be a single
> possible value.
> 
> Hmmm... I am not sure about the value here as I don't know what would be the
> impact.
> 
> It might be better to detect whether we are running on Armv8 HW or Armv7 HW. I
> haven't figured out an obvious way for that yet.
> 
> On IRC, Andre suggested a couple of options on IRC
> 
>  1) read ISAR5: This doesn't  seem to work as it is UNK on Armv7
>  2) detect the VFP subarchitecture: It is not clear how this can be leveraged
> 
> While skimming through the spec, I noticed that ID_DFR0.CopDbg requires a
> specific value for Armv8.x spec. This is the debug architecture, but maybe we
> can leverage it?
> 
> Any better ideas?

One thing to note is that if Xen has difficulties in distinguishing
between armv7 and armv8/aarch32 a guest will have the same issue. Given
that mvfr2 is not available on armv7, how can a guest safely attempt to
read it?

If somehow the guest figures out that it is running on armv8/aarch32,
then it could read mvfr2, but we could return "not implemented" in that
case, right? In other words, part of me thinks that maybe we are
overcomplicating this, especially because we don't officially support
Xen on armv8/aarch32 as you wrote (we need to update SUPPORT.md).

So I would be tempted to only do the following:

- read mvfr2 on arm64
- never read mvfr2 on aarch32 (no matter armv7 or armv8)
- keep c->mvfr.bits[2] as zero on armv7 and armv8/aarch32 so that if the
  guest attempts to read it, it won't crash but simply return zero


The most important things in regards to armv8/aarch32 are:

- be able to boot Xen armv8/aarch32 for our own arm32 testing of Xen
- be able to boot an armv7 and armv8/aarch32 guest on Xen armv8/aarch64

Both these things should work correctly with the simple approach above.


Your idea of trying to figure out if we are running on armv8/aarch32 via
ID_DFR0.CopDbg is clever but I think it is a bit overkill for a mode of
execution that we don't support. I don't think is worth the effort.

But if you think differently, I can invest some time in investigating
this further, following your line of thought on finding a register to
detect the mode of execution. (However I don't have any hardware to test
this anymore.) Otherwise, if you agree that the simpler approach is
sufficient, then I'll send a patch as described above.

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.