
Re: [PATCH 0/4] xen/arm: Unbreak ACPI



Hi Alex,

On 30/09/2020 11:38, Alex Bennée wrote:

Julien Grall <julien@xxxxxxx> writes:

Hi Alex,

On 29/09/2020 22:11, Alex Bennée wrote:

Julien Grall <julien@xxxxxxx> writes:

Hi Alex,

On 29/09/2020 16:29, Alex Bennée wrote:

Julien Grall <julien@xxxxxxx> writes:

From: Julien Grall <jgrall@xxxxxxxxxx>

Hi all,

Xen on ARM has been broken for quite a while on ACPI systems. This
series aims to fix it.

Unfortunately I don't have a system with ACPI v6.0 or later (QEMU seems
to only support 5.1). So I only did some light testing.

I was hoping to get more diagnostics out to get it working under QEMU
TCG, so I think I must have missed a step:

     Loading Xen 4.15-unstable ...
     Loading Linux 4.19.0-11-arm64 ...
     Loading initial ramdisk ...
     Using modules provided by bootloader in FDT
     Xen 4.15-unstable (c/s Sat Sep 26 21:55:42 2020 +0100 git:72f3d495d0) EFI loader
     ...silence...

I have a grub installed from testing on a buster base:

     dpkg --status grub-arm64-efi
     Version: 2.04-8

With:

     GRUB_CMDLINE_LINUX_DEFAULT=""
     GRUB_CMDLINE_LINUX="console=ttyAMA0"
     GRUB_CMDLINE_LINUX_XEN_REPLACE="console=hvc0 earlyprintk=xen"
     GRUB_CMDLINE_XEN="loglvl=all guest_loglvl=all com1=115200,8n1,0x3e8,5console=com1,vg"

And I built Xen with --enable-systemd and tweaked the hypervisor .config:

     CONFIG_EXPERT=y
     CONFIG_ACPI=y

So any pointers to make it more verbose would be helpful.

The error is happening before Xen sets up the console. You can get early
output on QEMU if you rebuild Xen with the following .config options:

CONFIG_DEBUG=y
CONFIG_EARLY_UART_CHOICE_PL011=y
CONFIG_EARLY_UART_PL011=y
CONFIG_EARLY_PRINTK=y
CONFIG_EARLY_UART_BASE_ADDRESS=0x09000000
CONFIG_EARLY_UART_PL011_BAUD_RATE=0
CONFIG_EARLY_PRINTK_INC="debug-pl011.inc"

OK, I can see it fails on ACPI, then tries to fall back to FDT, and
then fails to find the GIC:

    (XEN) CMDLINE[00000000f7bbe000]:chosen placeholder root=UUID=cf00cd3a-066b-4146-bedf-f811d3343077 ro console=hvc0 earlyprintk=xen
    (XEN)
    (XEN) Command line: placeholder loglvl=all guest_loglvl=all com1=115200,8n1,0x3e8,5console=com1,vg no-real-mode edd=off
    (XEN) parameter "placeholder" unknown!
    (XEN) parameter "no-real-mode" unknown!
    (XEN) parameter "edd" unknown!
    (XEN) ACPI: RSDP 138560000, 0024 (r2 BOCHS )
    (XEN) ACPI: XSDT 138550000, 004C (r1 BOCHS  BXPCFACP        1       1000013)
    (XEN) ACPI: FACP 138510000, 010C (r5 BOCHS  BXPCFACP        1 BXPC        1)
    (XEN) ACPI: DSDT 138520000, 14A6 (r2 BOCHS  BXPCDSDT        1 BXPC        1)
    (XEN) ACPI: APIC 138500000, 018C (r3 BOCHS  BXPCAPIC        1 BXPC        1)
    (XEN) ACPI: GTDT 1384F0000, 0060 (r2 BOCHS  BXPCGTDT        1 BXPC        1)
    (XEN) ACPI: MCFG 1384E0000, 003C (r1 BOCHS  BXPCMCFG        1 BXPC        1)
    (XEN) ACPI: SPCR 1384D0000, 0050 (r2 BOCHS  BXPCSPCR        1 BXPC        1)
    (XEN) Unsupported FADT revision 5.1, should be 6.0+, will disable ACPI
    (XEN) acpi_boot_table_init: FADT not found (-22)
    (XEN) Domain heap initialised
    (XEN) Booting using Device Tree
    (XEN) Platform: Generic System
    (XEN)
    (XEN) ****************************************
    (XEN) Panic on CPU 0:
    (XEN) Unable to find compatible GIC in the device tree
    (XEN) ****************************************
    (XEN)
    (XEN) Reboot in five seconds...
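
The "Unsupported FADT revision 5.1, should be 6.0+" line above comes from a minimum-revision check. A minimal sketch of such a major.minor comparison follows; the struct and function names are hypothetical and this is not Xen's actual code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical type for illustration only: a FADT major.minor revision. */
struct fadt_rev {
    uint8_t major;
    uint8_t minor;
};

/* Return true if 'have' is at least 'want', comparing major first, then minor. */
static bool fadt_revision_at_least(const struct fadt_rev *have,
                                   const struct fadt_rev *want)
{
    assert(have && want);

    if (have->major != want->major)
        return have->major > want->major;

    return have->minor >= want->minor;
}
```

With the 5.1 table QEMU provides and a minimum of 6.0, this check fails, which matches the log; lowering the minimum to 5.1 would let it pass.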

Despite saying it is going to reboot, it never manages to. Any idea how
it is trying to reset the system?

This is a bit of a chicken-and-egg problem. To know the reset method, you
need to parse the ACPI tables. As we can't parse them, we don't know the
reset method. So Xen will just loop forever.

Well you do get some ACPI tables - downgrading the minimum at least
restores the reset method detection. I wonder if it would be worth
defaulting to PSCI if you don't know rather than hang indefinitely?

The risk is probably low enough to try PSCI even on platforms that don't support it.

Although it might be worth checking whether EL3 is present, to avoid panicking again and again on XGene.
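
The fallback discussed here could look roughly like the sketch below. All names are hypothetical and none of this is Xen's actual reset code. The only architectural fact relied on is that bits [15:12] of ID_AA64PFR0_EL1 are non-zero when EL3 is implemented; the register value is passed in as a plain integer so the logic can be exercised outside a hypervisor.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names for illustration only. */
enum reset_method {
    RESET_NONE,      /* nothing usable: tell the user and spin */
    RESET_PLATFORM,  /* method discovered from ACPI or the device tree */
    RESET_PSCI,      /* best-effort PSCI call into firmware */
};

struct reset_info {
    bool method_known;     /* firmware tables were parsed successfully */
    uint64_t id_aa64pfr0;  /* value read from ID_AA64PFR0_EL1 */
};

/* ID_AA64PFR0_EL1.EL3 lives in bits [15:12]; zero means EL3 is not implemented. */
static bool el3_implemented(uint64_t id_aa64pfr0)
{
    return ((id_aa64pfr0 >> 12) & 0xf) != 0;
}

/* Prefer the discovered method, fall back to PSCI only when EL3 exists. */
static enum reset_method choose_reset_method(const struct reset_info *info)
{
    assert(info);

    if (info->method_known)
        return RESET_PLATFORM;
    if (el3_implemented(info->id_aa64pfr0))
        return RESET_PSCI;
    return RESET_NONE;
}
```

In the RESET_NONE case the panic message could state that the system will not reboot, rather than promising a reboot that never happens.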


FWIW the failure after that is failing to find the GIC - I'm just
looking at the MADT table parsing now. Why am I getting a sense of
déjà vu?

The ACPI code in Xen is based on the first ACPI implementation in Linux. So it is quite possible you encountered the bug there :).

It would probably be good to be more forthcoming with the users and say
it will not reboot.

Also, IIRC, the time subsystem is not yet initialized. So it is
possible that mdelay() doesn't work properly.

Surely that's an architectural subsystem so there is no reason that
couldn't be up and running.

In theory yes, but the code also caters for some interesting/weird platform behavior:
   1) There are (were?) platforms where CNTFRQ was not set correctly.
   2) Some platforms, such as those with the Exynos 5, (used to?) require specific code to enable the arch timer.

We are still using the Arndale for automated testing. So we would need to keep the hacks.

But it would be possible to rework the code and try to make the timer available earlier for well-behaved platforms.
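
As a rough illustration of that rework, the sketch below gates early timer use on the two conditions listed above and converts a delay into counter ticks. The function names and the quirk flag are hypothetical; this is not Xen's time code, and the frequency is passed in as a plain value rather than read from the counter-frequency register.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical check: the timer can be used early only if firmware
 * programmed a non-zero counter frequency and the platform needs no
 * extra code to enable the arch timer.
 */
static bool timer_usable_early(uint32_t cntfrq_hz, bool needs_platform_quirk)
{
    return cntfrq_hz != 0 && !needs_platform_quirk;
}

/* Convert a delay in milliseconds into counter ticks at the given frequency. */
static uint64_t ms_to_ticks(uint32_t cntfrq_hz, uint32_t ms)
{
    /* Caller is expected to have checked timer_usable_early() first. */
    assert(cntfrq_hz != 0);

    return ((uint64_t)cntfrq_hz * ms) / 1000;
}
```

A delay loop would then spin until the counter has advanced by ms_to_ticks() ticks, while platforms that fail the check keep the existing later initialization path.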

Cheers,

--
Julien Grall



 

