
Re: [Xen-devel] [PATCH] xen/arm: vgic-v3: Fix the typo of GICD IRQ active status range



Hi Julien,

On 2020/1/7 19:42, Julien Grall wrote:
Hi,

On 07/01/2020 09:48, Wei Xu wrote:
On 2020/1/7 17:10, Julien Grall wrote:


On 07/01/2020 08:39, Wei Xu wrote:
Hi Stefano,

On 2020/1/7 6:01, Stefano Stabellini wrote:
On Sat, 28 Dec 2019, Wei Xu wrote:
Hi Julien,

On 2019/12/28 16:09, Julien Grall wrote:
Hi,

On 28/12/2019 03:08, Wei Xu wrote:
This patch fixes the typo in the active status range of an IRQ
via GICD. Otherwise Xen fails to handle the MMIO access and
injects a data abort.
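
To illustrate how a range typo can leave most of a register bank unhandled, here is a minimal sketch. It assumes the typo made the end bound of the range equal to its start; the thread does not show the diff, so that is a guess. The offsets are the GICD_ISACTIVER/GICD_ICACTIVER offsets from the GICv3 architecture specification; the helper names are hypothetical and this is not Xen's vGIC code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Distributor register offsets from the GICv3 architecture specification. */
#define GICD_ISACTIVER   0x0300u  /* first set-active register */
#define GICD_ISACTIVERN  0x037Cu  /* last set-active register  */
#define GICD_ICACTIVER   0x0380u  /* first clear-active register */
#define GICD_ICACTIVERN  0x03FCu  /* last clear-active register  */

/* True if 'reg' lies in the bank of 32-bit registers [start, end]. */
static bool in_range32(uint32_t reg, uint32_t start, uint32_t end)
{
    return reg >= start && reg <= end + 3;
}

/* Hypothetical decoder with the intended bounds: covers the whole bank. */
static bool is_isactiver(uint32_t reg)
{
    return in_range32(reg, GICD_ISACTIVER, GICD_ISACTIVERN);
}

/* Assumed form of the typo: the end bound repeats the start bound,
 * so only the first register of the bank matches. */
static bool is_isactiver_typo(uint32_t reg)
{
    return in_range32(reg, GICD_ISACTIVER, GICD_ISACTIVER);
}
```

With the mistyped bounds, an access to offset 0x0304 matches no case, which in an MMIO emulator would typically end up on the unhandled path.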
I saw a similar patch from NXP a month ago and I disagreed with the
approach.

If you look at the context you modified, it says that reading ACTIVER is not supported. While I agree the behavior is not consistent across ACTIVER, injecting a data abort is a perfectly fine behavior to me (though not spec compliant) as we don't implement the registers correctly.

I guess you are sending this patch because you tried Linux 5.4 (or later) on Xen, right? Linux has recently begun to read ACTIVER to check whether an IRQ is active at the HW level while synchronizing IRQs. From my understanding, this is used because there is a window where the interrupt is active at the HW level but the Linux IRQ subsystem is not aware of it.

While the patch below will allow Linux 5.4 to not crash, it is not going to make it fly very far because of the above. So I am not happy to pursue the approach of returning 0.

Yes, I am using Linux 5.5-rc2 :)
Got it and thanks for the explanation.
I am not insistent on this and OK to wait for the update.
Thanks and have a very happy new year!
Hi Wei,

what do you do to reproduce the issue? Are you just booting Linux
5.5-rc2 as dom0 and seeing the issue during boot, or are you doing
something specific?



I directly tested the mainline kernel with defconfig.
And the 5.5-rc5 kernel booting log is as below:

     root@ubuntu:~# dmesg | more
     [    0.000000] Booting Linux on physical CPU 0x0000000000 [0x481fd010]
     [    0.000000] Linux version 5.5.0-rc5 (joyx@Turing-Arch-b) (gcc version 4.9.1 20140505 (prerelease) (crosstool-NG linaro-1.13.1-4.9-2014.05 - Linaro GCC 4.9-2014.05)) #132 SMP PREEMPT Tue Jan 7 15:43:06 CST 2020
     [    0.000000] Xen XEN_VERSION.XEN_SUBVERSION support found
     [    0.000000] efi: Getting EFI parameters from FDT:
     [    0.000000] efi: EFI v2.50 by Xen
     [    0.000000] efi:  ACPI 2.0=0x181d0e70
     [    0.000000] cma: Reserved 32 MiB at 0x000000007e000000
     [    0.000000] ACPI: Early table checksum verification disabled
     [    0.000000] ACPI: RSDP 0x00000000181D0E70 000024 (v02 HISI  )
     [    0.000000] ACPI: XSDT 0x00000000181D0DB0 0000BC (v01 HISI   HIP08    00000000      01000013)

Is that the full log from Linux? If not, can you post it in full?


But to boot with ACPI on our hardware, besides the above change I have also done some hacking on top of
Xen 4.13, as below:

I haven't booted Xen on any ACPI systems recently so there might be bugs in the code. Your changes below are definitely a call to look in more detail at what's wrong.


Yes, my first goal is to get Xen booting with ACPI.


     diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
     index d028ec9..215a291 100644
     --- a/xen/arch/arm/traps.c
     +++ b/xen/arch/arm/traps.c
     @@ -1856,8 +1856,8 @@ static bool try_map_mmio(gfn_t gfn)
              return false;

          /* The hardware domain can only map permitted MMIO regions */
     -    if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn) + 1) )
     -        return false;
     +    /* if ( !iomem_access_permitted(d, mfn_x(mfn), mfn_x(mfn) + 1) ) */
     +    /*     return false; */

Dom0 should be able to map nearly all the address space through this function. The only thing not allowed is the GIC and UART (see acpi_iomem_deny_access).

So why do you want this change? What sort of address is Dom0 trying to map and failing?

Yes, it is the UART address 0x3f00002f8.
Without this, during DOM0 UART initialization, mem_serial_in on the kernel side fails and reports an unhandled fault at 0xffff80001006d2f9 (gva)
because of a memory abort.
Xen printed "HSR=0x930100005 pc=0xffff800010645d94 gva=0xffff80001006d2f9 gpa=0x000003f00002f9" in traps.c.

I assume this is your primary address as specified in the SPCR, right?

Yes.

As only one entity should manage the UART (i.e. Xen or Dom0), we today assume this will be managed by Xen. Xen should expose a partial virtual UART (only a few registers are emulated) to dom0 as a replacement.

This is usually done by the UART driver. Looking at the log you pasted in a separate e-mail:

(XEN) Platform: Generic System
(XEN) Unable to initialize acpi uart: -9
(XEN) Bad console= option 'dtuart'

So Xen didn't manage to initialize the UART. The -9 suggests Xen didn't find a driver for your UART. At the moment, Xen is only able to detect pl011, sbsa and sbsa32 UARTs for ACPI. What is the type of the UART used on your platform?


Thanks!
Got it.
Our UART is 8250.




          return !map_regions_p2mt(d, gfn, 1, mfn, p2m_mmio_direct_c);
      }


     diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
     index 4d6c971..c626f9e 100644
     --- a/xen/arch/arm/mm.c
     +++ b/xen/arch/arm/mm.c
     @@ -1095,7 +1095,7 @@ static bool xen_pt_check_entry(lpae_t entry, mfn_t mfn, unsigned int flags)
              {
                  mm_printk("Changing MFN for a valid entry is not allowed (%#"PRI_mfn" -> %#"PRI_mfn").\n",
                            mfn_x(lpae_get_mfn(entry)), mfn_x(mfn));
     -            return false;
     +            return true;

There is a pretty good reason to prevent modifying the MFN on a valid entry. Indeed, the PT code does not handle the Break-Before-Make sequence (required when updating certain entries) as this is a can of worms.

However, during my testing I never found a place where a valid entry is modified (other than the permissions part). So can you give more details on how you ended up here?

In the full log, I found the RSDP(0x39de0) replaced by XSDT(0x39dd0).
But I did not know why :(

On Arm, we require the fixmap to be empty (by calling clear_fixmap()) before the fixmap can be used for a different mapping.

Looking at the implementation of acpi_os_unmap_memory(), the fixmap is not cleared, which is why you see the warning about changing the MFN on a valid entry.

IMHO, acpi_os_unmap_memory() ought to clear the fixmap. This will make it a much saner interface to use. Would you mind having a look?
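
The interaction described above can be sketched with a toy model. This is not Xen's page-table code: the struct and the toy_* functions are hypothetical stand-ins that only mirror the rule "changing the MFN of a valid entry is refused" and the suggestion that unmapping should clear the slot. The two frame numbers are reused from the RSDP/XSDT values quoted earlier purely as sample inputs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_INVALID_MFN UINT64_MAX

/* Toy stand-in for a single fixmap slot (hypothetical, not Xen's lpae_t). */
struct toy_slot {
    bool valid;
    uint64_t mfn;
};

/* Refuse to change the MFN of an entry that is still valid. */
static bool toy_set_fixmap(struct toy_slot *slot, uint64_t mfn)
{
    if (slot->valid && slot->mfn != mfn)
        return false;
    slot->valid = true;
    slot->mfn = mfn;
    return true;
}

/* Empty the slot so it can be reused for a different mapping. */
static void toy_clear_fixmap(struct toy_slot *slot)
{
    slot->valid = false;
    slot->mfn = TOY_INVALID_MFN;
}

/* Hypothetical unmap helper that clears the slot, as suggested above. */
static void toy_unmap(struct toy_slot *slot)
{
    toy_clear_fixmap(slot);
}
```

In this model, mapping a second frame without an intervening toy_unmap() is rejected, while the sequence map, unmap, map succeeds.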


I will try to look into it.



              }
          }
          /* Sanity check when removing a page. */


     diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
     index 3c899cd..1e83351 100644
     --- a/xen/arch/arm/setup.c
     +++ b/xen/arch/arm/setup.c
     @@ -852,7 +852,8 @@ void __init start_xen(unsigned long boot_phys_offset,
          else
          {
              printk("Booting using ACPI\n");
     -        device_tree_flattened = NULL;
     +        device_tree_flattened = relocate_fdt(fdt_paddr, fdt_size);
     +        dt_unflatten_host_device_tree();

When using ACPI, the DT should not be used. So why do you need this?

I have tried not passing a DT with grub-2.04, while still loading the DOM0 kernel.
The log is as below:

     (XEN) *** LOADING DOMAIN 0 ***
     (XEN) Loading d0 kernel from boot module @ 0000000016221000
     (XEN) Allocating 1:1 mappings totalling 4096MB for dom0:
     (XEN) BANK[0] 0x00000008000000-0x00000010000000 (128MB)
     (XEN) BANK[1] 0x00000020000000-0x00000038000000 (384MB)
     (XEN) BANK[2] 0x00000050000000-0x00000080000000 (768MB)
     (XEN) BANK[3] 0x00202000000000-0x00202080000000 (2048MB)
     (XEN) BANK[4] 0x002020b0000000-0x002020c0000000 (256MB)
     (XEN) BANK[5] 0x00202600000000-0x00202620000000 (512MB)
     (XEN) Grant table range: 0x000000181da000-0x0000001821a000
     (XEN) Allocating PPI 16 for event channel interrupt
     (XEN) Loading zImage from 0000000016221000 to 0000000008080000-00000000099caa00
     (XEN) Loading d0 DTB to 0x000000000fe00000-0x000000000fe0025b
     (XEN) Initial low memory virq threshold set at 0x4000 pages.
     (XEN) Scrubbing Free RAM in background
     (XEN) Std. Loglevel: All
     (XEN) Guest Loglevel: All
     (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
     (XEN) Data Abort Trap. Syndrome=0x6
     (XEN) Walking Hypervisor VA 0x38 on CPU0 via TTBR 0x000000001831a000
     (XEN) 0TH[0x0] = 0x000000001831df7f
     (XEN) 1ST[0x0] = 0x000000001831bf7f
     (XEN) 2ND[0x0] = 0x0000000000000000
     (XEN) CPU0: Unexpected Trap: Data Abort
     (XEN) ----[ Xen-4.13.0-rc  arm64  debug=y   Not tainted ]----
     (XEN) CPU:    0
     (XEN) PC:     00000000002c6398 create_domUs+0x20/0x208
     (XEN) LR:     00000000002c6398
     (XEN) SP:     0000000000307d60
     (XEN) CPSR:   60000249 MODE:64-bit EL2h (Hypervisor, handler)
     (XEN) X0: 0000000000000000 X1: 0000000000000003 X2: 0000000000000000
     (XEN) X3: 0000000000000000 X4: 0000000000000000 X5: 0000000000000024
     (XEN) X6: 0080808080808080 X7: fefefefefefeff09 X8: 7f7f7f7f7f7f7f7f
     (XEN) X9: 731f646b61606d54 X10: 7f7f7f7f7f7f7f7f X11: 0101010101010101
     (XEN) X12: 0000000000000008 X13: 00000000002871b8 X14: 0000000000000020
     (XEN) X15: 00000000004002f8 X16: 00000000002b2000 X17: 00000000002b2000
     (XEN) X18: 00000000002b2000 X19: 000080662f3d7000 X20: 00000000002b1480
     (XEN) X21: 0000000000348430 X22: 0000000000000080 X23: 00000000002a4240
     (XEN) X24: 0000000000000080 X25: 0000000000348000 X26: 00000000002e9078
     (XEN) X27: 000020279c000000 X28: 00000000002e83f0 FP: 0000000000307d60
     (XEN)
     (XEN)   VTCR_EL2: 800d3590
     (XEN)  VTTBR_EL2: 0000000000000000
     (XEN)
     (XEN)  SCTLR_EL2: 30cd183d
     (XEN)    HCR_EL2: 000000008000003a
     (XEN)  TTBR0_EL2: 000000001831a000
     (XEN)
     (XEN)    ESR_EL2: 96000006
     (XEN)  HPFAR_EL2: 0000000000000000
     (XEN)    FAR_EL2: 0000000000000038
     (XEN)
     (XEN) Xen stack trace from sp=0000000000307d60:
     (XEN) 0000000000307de0 00000000002cb5f0 000080662f3d7000 00000000002b1480
     (XEN) 0000000000348430 0000000000000080 00000000002a4240 0000000000000080
     (XEN) 0000000080000000 6d681f6762736876 0000000000307dc0 00000000002bc570
     (XEN) 0000000000307de0 00000000002cb5e0 000080662f3d7000 00000000002b1480
     (XEN) 000000003f14a780 00000000002001b8 00000000181da000 0000000017fda000
     (XEN) 000000001a637000 0000000000000000 00000000004002f8 000000001828cdc8
     (XEN) 0000000000001500 0000000000000001 0000000000000001 000000001828cdc0
     (XEN) 0000000000000000 0000000000003000 000000001a637000 0000003700000000
     (XEN) 0000000000000000 0000000f86db1000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000300000000 0000008000000000 00000040ffffffff
     (XEN) 00000002ffffffff 0000000000000280 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
     (XEN) Xen call trace:
     (XEN)    [<00000000002c6398>] create_domUs+0x20/0x208 (PC)
     (XEN)    [<00000000002c6398>] create_domUs+0x20/0x208 (LR)
     (XEN)    [<00000000002cb5f0>] start_xen+0xc34/0xcbc
     (XEN)    [<00000000002001b8>] arm64/head.o#primary_switched+0x10/0x30
     (XEN)
     (XEN) debugtrace_dump() global buffer starting
     1 cpupool_add_domain(dom=0,pool=0) n_dom 1 rc 0
     (XEN) wrap: 0
     (XEN) debugtrace_dump() global buffer finished
     (XEN)
     (XEN) ****************************************
     (XEN) Panic on CPU 0:
     (XEN) CPU0: Unexpected Trap: Data Abort
     (XEN) ****************************************
     (XEN)
     (XEN) Reboot in five seconds...
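
For readers decoding the trap above, a minimal sketch of how the ESR_EL2 value 96000006 breaks down, following the field layout in the Arm architecture reference manual (EC in bits [31:26], and for data aborts the fault status code in ISS bits [5:0]). The function names are hypothetical and this is not Xen's trap handler.

```c
#include <assert.h>
#include <stdint.h>

/* Exception class: ESR_ELx bits [31:26]. */
static unsigned int esr_ec(uint32_t esr)
{
    return (esr >> 26) & 0x3fu;
}

/* Data fault status code: ISS bits [5:0] of a data abort syndrome. */
static unsigned int esr_dfsc(uint32_t esr)
{
    return esr & 0x3fu;
}

/* DFSC values 0x04..0x07 encode a translation fault at level 0..3.
 * Returns the level, or -1 for any other fault status code. */
static int dfsc_translation_level(unsigned int dfsc)
{
    if (dfsc >= 0x04u && dfsc <= 0x07u)
        return (int)(dfsc - 0x04u);
    return -1;
}
```

Applied to 0x96000006 this gives EC 0x25 (data abort taken without a change of exception level) and a level 2 translation fault, which agrees with the "Syndrome=0x6" line and the zero 2ND entry in the page walk. A faulting address as low as 0x38 would be consistent with a field access through a NULL pointer, though the log alone does not prove that.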


So I passed a DT to Xen with grub-2.02 and hacked the code above, because create_domUs
will report a bug if the chosen node cannot be found.

The function is used for creating multiple domains at boot time from Xen. It is very DT centric at the moment, but an ACPI platform may not come with a DT (though the EFI stub may create one ATM). I can see other issues with dom0less and ACPI, so I think it would be best to just not call the function when using ACPI. Could you try the following patch (not tested nor compiled):

diff --git a/xen/arch/arm/setup.c b/xen/arch/arm/setup.c
index 3c899cd4a0..3e9dc8fe9f 100644
--- a/xen/arch/arm/setup.c
+++ b/xen/arch/arm/setup.c
@@ -985,7 +985,8 @@ void __init start_xen(unsigned long boot_phys_offset,

     system_state = SYS_STATE_active;

-    create_domUs();
+    if ( !acpi_disabled )
+        create_domUs();

     domain_unpause_by_systemcontroller(dom0);

Cheers,



Thanks!
It is not working even after I changed the condition to " if ( acpi_disabled ) ".
My grub 2.04 configuration is as below:

    xen_hypervisor /xen dom0_mem=4G acpi=force loglvl=all guest_loglvl=all
    xen_module /Image rdinit=/init  acpi=force noinitrd root=/dev/sdb1 rw

The log with the condition " if ( acpi_disabled ) " is as follows:

    (XEN) Adding cpu 126 to runqueue 0
    (XEN) Adding cpu 127 to runqueue 0
    (XEN) alternatives: Patching with alt table 00000000002d4f48 -> 00000000002d5764
    (XEN) *** LOADING DOMAIN 0 ***
    (XEN) Loading d0 kernel from boot module @ 0000000016257000
    (XEN) Allocating 1:1 mappings totalling 4096MB for dom0:
    (XEN) BANK[0] 0x00000008000000-0x00000010000000 (128MB)
    (XEN) BANK[1] 0x00000020000000-0x00000038000000 (384MB)
    (XEN) BANK[2] 0x00000050000000-0x00000080000000 (768MB)
    (XEN) BANK[3] 0x00202000000000-0x00202080000000 (2048MB)
    (XEN) BANK[4] 0x002020b0000000-0x002020c0000000 (256MB)
    (XEN) BANK[5] 0x00202600000000-0x00202620000000 (512MB)
    (XEN) Grant table range: 0x000000181c7000-0x00000018207000
    (XEN) Allocating PPI 16 for event channel interrupt
    (XEN) Loading zImage from 0000000016257000 to 0000000008080000-0000000009981200
    (XEN) Loading d0 DTB to 0x000000000fe00000-0x000000000fe0025b
    (XEN) Initial low memory virq threshold set at 0x4000 pages.
    (XEN) Scrubbing Free RAM in background
    (XEN) Std. Loglevel: All
    (XEN) Guest Loglevel: All
    (XEN) *** Serial input to DOM0 (type 'CTRL-a' three times to switch input)
    (XEN) Data Abort Trap. Syndrome=0x6
    (XEN) Walking Hypervisor VA 0x10 on CPU0 via TTBR 0x00000000182ff000
    (XEN) 0TH[0x0] = 0x0000000018302f7f
    (XEN) 1ST[0x0] = 0x0000000018300f7f
    (XEN) 2ND[0x0] = 0x0000000000000000
    (XEN) CPU0: Unexpected Trap: Data Abort
    (XEN) ----[ Xen-4.13.0-rc  arm64  debug=y   Not tainted ]----
    (XEN) CPU:    0
    (XEN) PC:     00000000002b65c8 00000000002b65c8
    (XEN) LR:     00000000002c8e94
    (XEN) SP:     000080662ffcfe00
    (XEN) CPSR:   60000249 MODE:64-bit EL2h (Hypervisor, handler)
    (XEN) X0: 0000000000000000 X1: 000000001a627000 X2: 000000000021e548
    (XEN) X3: 0000000000000000 X4: 00000000002af508 X5: 0000000000000000
    (XEN) X6: 000080662ffd0000 X7: 0000000000000000 X8: 0000000000000000
    (XEN) X9: 0000000000000000 X10: 0000000000000000 X11: 0000000000000000
    (XEN) X12: 0000000000000000 X13: 0000000000000000 X14: 0000000000000000
    (XEN) X15: 00000000004002f8 X16: 00000000002b1000 X17: 00000000002b1000
    (XEN) X18: 00000000002b1000 X19: 0000000000000000 X20: 000000001a624000
    (XEN) X21: 000000001a627000 X22: 000000000021e548 X23: 00000000002b15c0
    (XEN) X24: 00000000002b1000 X25: 000000000021e548 X26: 00000000002e1078
    (XEN) X27: 000020279c000000 X28: 00000000002e0410 FP: 000080662ffcfe00
    (XEN)
    (XEN)   VTCR_EL2: 800d3590
    (XEN)  VTTBR_EL2: 0000000000000000
    (XEN)
    (XEN)  SCTLR_EL2: 30cd183d
    (XEN)    HCR_EL2: 000000008000003a
    (XEN)  TTBR0_EL2: 00000000182ff000
    (XEN)
    (XEN)    ESR_EL2: 96000006
    (XEN)  HPFAR_EL2: 0000000000000000
    (XEN)    FAR_EL2: 0000000000000010
    (XEN)
    (XEN) Xen stack trace from sp=000080662ffcfe00:
    (XEN) 000080662ffcfe50 00000000002c95c0 00000000002e0078 0000000000000001
    (XEN) 000000001a627000 000000001a624000 00000000002b15c0 00000000002b1000
    (XEN) 0000000000000240 0000000000000000 000080662ffcfea0 0000000000266f60
    (XEN) 00000000002b0000 00000000002b0480 0000000000340430 0000000000000080
    (XEN) 00000000002a2c50 0000000000000080 0000000000340000 0000000000266f5c
    (XEN) 00000000002ffde0 00000000002ca260 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) 0000000000000000 0000000000000000 0000000000000000 0000000000000000
    (XEN) Xen call trace:
    (XEN)    [<00000000002b65c8>] 00000000002b65c8 (PC)
    (XEN)    [<00000000002c8e94>] 00000000002c8e94 (LR)
    (XEN)    [<00000000002c95c0>] 00000000002c95c0
    (XEN)    [<0000000000266f60>] setup.c#init_done+0x10/0x20
    (XEN)    [<00000000002ca260>] 00000000002ca260
    (XEN)
    (XEN)
    (XEN) ****************************************
    (XEN) Panic on CPU 0:
    (XEN) CPU0: Unexpected Trap: Data Abort
    (XEN) ****************************************
    (XEN)
    (XEN) Reboot in five seconds...

Best Regards,
Wei



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 

