
RE: Porting Xen to Jetson Nano



Hi Julien,

Thanks for the tips. Comments inline...

Regards,
Srini

-----Original Message-----
From: Xen-devel <xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of Julien
Grall
Sent: Thursday, July 23, 2020 11:04 AM
To: Srinivas Bangalore <srini@xxxxxxxxxx>; xen-devel@xxxxxxxxxxxxxxxxxxxx;
Christopher Clark <christopher.w.clark@xxxxxxxxx>
Subject: Re: Porting Xen to Jetson Nano

On 22/07/2020 18:57, Srinivas Bangalore wrote:
> Dear Xen experts,

Hello,

> Would greatly appreciate some hints on how to move forward with this 
> one?

From your first set of logs:

 > Xen version 4.8.5 (srinivas@) (aarch64-linux-gnu-gcc (Ubuntu/Linaro
 > 7.5.0-3ubuntu1~18.04) 7.5.0) debug=n  Sun Jul 19 07:44:00 PDT 2020

I would recommend compiling Xen with debug enabled (CONFIG_DEBUG=y), as it
may give you more information about what's happening.

Xen image rebuilt now with CONFIG_DEBUG=y. I also changed the bootargs as
suggested.

(XEN) MODULE[0]: 00000000fc7f8000 - 00000000fc82d000 Device Tree
(XEN) MODULE[1]: 00000000e1000000 - 00000000e31bc808 Kernel
console=hvc0 earlycon=xenboot rootfstype=ext4 rw rootwait
root=/dev/mmcblk0p1 rdinit=/sbin/init clk_ignore_unused
(XEN)  RESVD[0]: 0000000080000000 - 0000000080020000
(XEN)  RESVD[1]: 00000000e3500000 - 00000000e3535000
(XEN)  RESVD[2]: 00000000fc7f8000 - 00000000fc82d000
(XEN)
(XEN) Command line: console=dtuart sync_console dom0_mem=128M log_lvl=all
guest_loglvl=all console_to_ring
(XEN) Placing Xen at 0x00000000fec00000-0x00000000fee00000
(XEN) Update BOOTMOD_XEN from 0000000080080000-0000000080198e01 =>
00000000fec00000-00000000fed18e01
(XEN) Domain heap initialised
(XEN) Platform: Tegra
(XEN) Taking dtuart configuration from /chosen/stdout-path
(XEN) Looking for dtuart at "/serial@70 Xen 4.8.5
(XEN) Xen version 4.8.5 (srinivas@) (aarch64-linux-gnu-gcc (Ubuntu/Linaro
7.5.0-3ubuntu1~18.04) 7.5.0) debug=y  Thu Jul 23 21:17:23 PDT 2020


Also, aside from the Tegra series, do you have any other patches on top?

No other patches. 

[...]

> (XEN) BANK[0] 0x000000a0000000-0x000000c0000000 (512MB)
> 
> (XEN) Grant table range: 0x000000fec00000-0x000000fec60000
> 
> (XEN) Loading zImage from 00000000e1000000 to
> 00000000a0080000-00000000a223c808
> 
> (XEN) Allocating PPI 16 for event channel interrupt
> 
> (XEN) Loading dom0 DTB to 0x00000000a8000000-0x00000000a803435c

[...]

> 
> (XEN) *** Dumping CPU0 guest state (d0v0): ***
> 
> (XEN) ----[ Xen-4.8.5  arm64  debug=n   Tainted:  C   ]----
> 
> (XEN) CPU:    0
> 
> (XEN) PC:     00000000a0080000

PC is pointing to the entry point of your kernel...

> 
> (XEN) LR:     0000000000000000
> 
> (XEN) SP_EL0: 0000000000000000
> 
> (XEN) SP_EL1: 0000000000000000
> 
> (XEN) CPSR:   000001c5 MODE:64-bit EL1h (Guest Kernel, handler)
> 
> (XEN)      X0: 00000000a8000000  X1: 0000000000000000  X2: 
> 0000000000000000
> 
> (XEN)      X3: 0000000000000000  X4: 0000000000000000  X5: 
> 0000000000000000
> 
> (XEN)      X6: 0000000000000000  X7: 0000000000000000  X8: 
> 0000000000000000
> 
> (XEN)      X9: 0000000000000000 X10: 0000000000000000 X11: 
> 0000000000000000
> 
> (XEN)     X12: 0000000000000000 X13: 0000000000000000 X14: 
> 0000000000000000
> 
> (XEN)     X15: 0000000000000000 X16: 0000000000000000 X17: 
> 0000000000000000
> 
> (XEN)     X18: 0000000000000000 X19: 0000000000000000 X20: 
> 0000000000000000
> 
> (XEN)     X21: 0000000000000000 X22: 0000000000000000 X23: 
> 0000000000000000
> 
> (XEN)     X24: 0000000000000000 X25: 0000000000000000 X26: 
> 0000000000000000
> 
> (XEN)     X27: 0000000000000000 X28: 0000000000000000  FP: 
> 0000000000000000
> 
> (XEN)
> 
> (XEN)    ELR_EL1: 0000000000000000
> 
> (XEN)    ESR_EL1: 00000000
> 
> (XEN)    FAR_EL1: 0000000000000000
> 
> (XEN)
> 
> (XEN)  SCTLR_EL1: 00c50838
> 
> (XEN)    TCR_EL1: 00000000
> 
> (XEN)  TTBR0_EL1: 0000000000000000
> 
> (XEN)  TTBR1_EL1: 0000000000000000
> 
> (XEN)
> 
> (XEN)   VTCR_EL2: 80043594
> 
> (XEN)  VTTBR_EL2: 000100017f0f9000
> 
> (XEN)
> 
> (XEN)  SCTLR_EL2: 30cd183d
> 
> (XEN)    HCR_EL2: 000000008038663f
> 
> (XEN)  TTBR0_EL2: 00000000fecfc000
> 
> (XEN)
> 
> (XEN)    ESR_EL2: 8200000d

... it looks like we are receiving a trap in EL2 because the guest can't
execute the instruction. This is a bit odd, as the p2m (stage-2
page-tables) should be configured to allow execution. It would be useful if
you could dump the p2m walk here. The following patch should do the job (not
compile-tested):

diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
index d578a5c598dd..af1834cdf735 100644
--- a/xen/arch/arm/traps.c
+++ b/xen/arch/arm/traps.c
@@ -2489,9 +2489,14 @@ static void do_trap_instr_abort_guest(struct cpu_user_regs *regs,
           */
          rc = gva_to_ipa(gva, &gpa, GV2M_READ);
          if ( rc == -EFAULT )
+        {
+            printk("Unable to translate 0x%lx\n", gva);
              return; /* Try again */
+        }
      }

+    dump_p2m_walk(current->domain, gpa);
+
      switch ( fsc )
      {
      case FSC_FLT_PERM:
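
For reference, the ESR_EL2 value in the dump above (8200000d) can be split
into its architectural fields to see which case of the switch ( fsc ) it
would reach. The following is a minimal standalone sketch, assuming the
Armv8-A ESR_ELx layout (EC in bits [31:26], IL in bit 25, fault status code
in bits [5:0] for aborts). The helper names are made up for illustration and
are not Xen functions.

```c
#include <stdint.h>
#include <stdio.h>

/* ESR_ELx layout (Armv8-A):
 *   EC  = bits [31:26]  exception class
 *   IL  = bit  [25]     1 if the trapped instruction was 32-bit
 *   ISS = bits [24:0]   for aborts, bits [5:0] are the fault status code
 * All helpers below are hypothetical, for illustration only.
 */
static unsigned int esr_ec(uint32_t esr)  { return (esr >> 26) & 0x3f; }
static unsigned int esr_il(uint32_t esr)  { return (esr >> 25) & 0x1; }
static unsigned int esr_fsc(uint32_t esr) { return esr & 0x3f; }

/* For translation, access flag and permission faults, FSC[1:0] is the
 * page-table level at which the fault was raised. */
static unsigned int fsc_level(unsigned int fsc) { return fsc & 0x3; }

static const char *ec_name(unsigned int ec)
{
    switch ( ec )
    {
    case 0x20: return "instruction abort from a lower EL";
    case 0x21: return "instruction abort from the current EL";
    case 0x24: return "data abort from a lower EL";
    case 0x25: return "data abort from the current EL";
    default:   return "other exception class";
    }
}

static const char *fsc_type_name(unsigned int fsc)
{
    switch ( fsc & 0x3c )   /* drop the level bits */
    {
    case 0x04: return "translation fault";
    case 0x08: return "access flag fault";
    case 0x0c: return "permission fault";
    default:   return "other fault";
    }
}

/* Print a one-line summary of an abort syndrome. */
static void esr_describe(uint32_t esr)
{
    unsigned int fsc = esr_fsc(esr);

    printf("ESR 0x%08x: EC=0x%02x (%s), IL=%u, FSC=0x%02x (%s, level %u)\n",
           (unsigned int)esr, esr_ec(esr), ec_name(esr_ec(esr)),
           esr_il(esr), fsc, fsc_type_name(fsc), fsc_level(fsc));
}
```

Calling esr_describe(0x8200000d) from a small main() reports an instruction
abort from a lower EL with a level 1 permission fault, which matches the
FSC_FLT_PERM case in the hunk above.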

I believe you meant 'dump_p2m_lookup'? I couldn't find 'dump_p2m_walk' in
the source, so I used 'dump_p2m_lookup' (which actually calls
'dump_pt_walk').
Here's the output, truncated since it goes into an infinite loop printing
the same info:
[..]
(XEN) Allocating 1:1 mappings totalling 128MB for dom0:
(XEN) BANK[0] 0x00000088000000-0x00000090000000 (128MB)
(XEN) Grant table range: 0x000000fec00000-0x000000fec68000
(XEN) Loading zImage from 00000000e1000000 to
0000000088080000-000000008a23c808
(XEN) Allocating PPI 16 for event channel interrupt
(XEN) Loading dom0 DTB to 0x000000008fe00000-0x000000008fe34444
(XEN) Scrubbing Free RAM on 1 nodes using 4 CPUs
(XEN) ........done.
(XEN) Initial low memory virq threshold set at 0x4000 pages.
(XEN) Std. Loglevel: All
(XEN) Guest Loglevel: All
(XEN) ***************************************************
(XEN) WARNING: CONSOLE OUTPUT IS SYNCHRONOUS
(XEN) This option is intended to aid debugging of Xen by ensuring
(XEN) that all output is synchronously delivered on the serial line.
(XEN) However it can introduce SIGNIFICANT latencies and affect
(XEN) timekeeping. It is NOT recommended for production use!
(XEN) ***************************************************
(XEN) 3... 2... 1...
(XEN) *** Serial input -> DOM0 (type 'CTRL-a' three times to switch input to
Xen)
(XEN) Freed 296kB init memory.
(XEN) dom0 IPA 0x0000000088080000
(XEN) P2M @ 0000000803fc3d40 mfn:0x17f0f5
(XEN) 0TH[0x0] = 0x004000017f0f377f
(XEN) 1ST[0x2] = 0x02c00000800006fd
(XEN) Mem access check
(XEN) dom0 IPA 0x0000000088080000
(XEN) P2M @ 0000000803fc3d40 mfn:0x17f0f5
(XEN) 0TH[0x0] = 0x004000017f0f377f
(XEN) 1ST[0x2] = 0x02c00000800006fd
(XEN) Mem access check

[..]

I added a printk for 'Mem access check' inside the 'case FSC_FLT_PERM'
branch of the switch ( fsc ) statement that follows the lookup. That's what
you see in the output above.
So it does seem that a permission fault (FSC_FLT_PERM) is being hit
repeatedly.
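
The two entries printed by the walk can also be decoded by hand. Below is a
minimal standalone sketch, assuming a 4KB granule and the Armv8.0 stage-2
block descriptor layout (MemAttr in bits [5:2], S2AP in bits [7:6], SH in
bits [9:8], AF in bit 10, XN in bit 54). The helper names are hypothetical
and are not taken from the Xen source.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Table index for a given level with a 4KB granule:
 * level 0 = IPA[47:39], level 1 = IPA[38:30],
 * level 2 = IPA[29:21], level 3 = IPA[20:12]. */
static unsigned int s2_index(uint64_t ipa, unsigned int level)
{
    return (unsigned int)((ipa >> (39 - 9 * level)) & 0x1ff);
}

/* Stage-2 descriptor fields (assumed Armv8.0 layout, helpers are
 * illustrative only). */
static unsigned int s2_valid(uint64_t pte)   { return pte & 0x1; }
static unsigned int s2_table(uint64_t pte)   { return (pte >> 1) & 0x1; }
static unsigned int s2_memattr(uint64_t pte) { return (pte >> 2) & 0xf; }
static unsigned int s2_ap(uint64_t pte)      { return (pte >> 6) & 0x3; }
static unsigned int s2_sh(uint64_t pte)      { return (pte >> 8) & 0x3; }
static unsigned int s2_af(uint64_t pte)      { return (pte >> 10) & 0x1; }
static unsigned int s2_xn(uint64_t pte)      { return (pte >> 54) & 0x1; }

/* Output address of a level 1 (1GB) block: descriptor bits [47:30]. */
static uint64_t s2_block_base_1g(uint64_t pte)
{
    return pte & 0x0000ffffc0000000ULL;
}

/* Print the decoded fields of a level 1 entry. */
static void s2_describe_l1(uint64_t pte)
{
    printf("valid=%u table=%u base=0x%" PRIx64
           " memattr=0x%x s2ap=%u sh=%u af=%u xn=%u\n",
           s2_valid(pte), s2_table(pte), s2_block_base_1g(pte),
           s2_memattr(pte), s2_ap(pte), s2_sh(pte), s2_af(pte), s2_xn(pte));
}
```

With the IPA from the log, s2_index(0x88080000, 0) is 0 and
s2_index(0x88080000, 1) is 2, matching the 0TH[0x0] and 1ST[0x2] lines.
Feeding 0x02c00000800006fd to s2_describe_l1() gives a valid 1GB block at
0x80000000 with read/write access (S2AP = 3) and bit 54 set. If that bit is
read as XN, it would be consistent with a permission fault on an instruction
fetch, but that reading is only an interpretation of the printed value and
is not confirmed anywhere in this thread.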
 
> 
> (XEN)  HPFAR_EL2: 0000000000000000
> 
> (XEN)    FAR_EL2: 00000000a0080000
> 
> (XEN)
> 
> (XEN) Guest stack trace from sp=0:
> 
> (XEN)   Failed to convert stack to physical address

[...]

> It seems the DOM0 kernel did not get added to the task list?

From a look at the dump, dom0 vCPU0 has been scheduled and is running on
pCPU0.

> 
> Boot args for Xen and Dom0 are here:
> (XEN) Checking for initrd in /chosen
> 
> (XEN) linux,initrd limits invalid: 0000000084100000 >= 
> 0000000084100000
> 
> (XEN) RAM: 0000000080000000 - 00000000fedfffff
> 
> (XEN) RAM: 0000000100000000 - 000000017f1fffff
> 
> (XEN)
> 
> (XEN) MODULE[0]: 00000000fc7f8000 - 00000000fc82d000 Device Tree
> 
> (XEN) MODULE[1]: 00000000e1000000 - 00000000e31bc808 Kernel       
> console=hvc0 earlyprintk=xen earlycon=xen rootfstype=ext4 rw rootwait
> root=/dev/mmcblk0p1 rdinit=/sbin/init

You want to use earlycon=xenboot here.

> 
> (XEN)  RESVD[0]: 0000000080000000 - 0000000080020000
> 
> (XEN)  RESVD[1]: 00000000e3500000 - 00000000e3535000
> 
> (XEN)  RESVD[2]: 00000000fc7f8000 - 00000000fc82d000
> 
> (XEN)
> 
> (XEN) Command line: console=dtuart earlyprintk=xen
> earlycon=uart8250,mmio32,0x70006000 sync_console dom0_mem=512M 
> log_lvl=all guest_loglvl=all console_to_ring

FWIW, earlyprintk and earlycon are not understood by Xen. They are only
useful for Dom0.

BTW, to Christopher's point, the dtb did have some issues. I had to hack the
'interrupt-controller' node to get the GICv2 working.
I have attached the .dts file that I'm using.

Best regards,

--
Julien Grall

Attachment: jetson-nano-b00.dts
Description: Binary data


 

