[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Bug] Bring up Dom0 on Arm board
Hi, I was on vacation, sorry for the late reply. >>>>>> About the mali and sdmmc drivers problem, I compare the log between boot with xen and boot without xen. >>>>>> And found an error log as below: >>>>>> [ 65.517345] arm-scmi firmware:scmi: SCMI Notifications - Core Enabled. >>>>>> (XEN) d0v2 Unhandled SMC/HVC: 0x82000010 >>>>>> [ 66.559382] arm-scmi firmware:scmi: unable to communicate with SCMI >>>>>> [ 66.559516] arm-scmi: probe of firmware:scmi failed with error -95 >>>>>> It seems SCMI driver probe failed. >>>>>> So I did an experiment, disable SCMI driver and rebuild the Linux kernel, >>>>>> boot up in normal way without xen, and reproduces the problem that mali and sdmmc did not bring up. >>>>>> It looks like a high probability SCMI cause the problem. >>>>>> I read the Linux code and targeting located the error -95, >>>>>> It seems SCMI probe failed cause by SMCCC not supported, code as below: >>>>>> static int smc_send_message(struct scmi_chan_info *cinfo, >>>>>> struct scmi_xfer *xfer) >>>>>> { >>>>>> struct scmi_smc *scmi_info = cinfo->transport_info; >>>>>> struct arm_smccc_res res; >>>>>> mutex_lock(&scmi_info->shmem_lock); >>>>>> shmem_tx_prepare(scmi_info->shmem, xfer); >>>>>> if (scmi_info->irq) >>>>>> reinit_completion(&scmi_info->tx_complete); >>>>>> arm_smccc_1_1_invoke(scmi_info->func_id, 0, 0, 0, 0, 0, 0, 0, &res); >>>>>> if (scmi_info->irq) >>>>>> wait_for_completion(&scmi_info->tx_complete); >>>>>> scmi_rx_callback(scmi_info->cinfo, shmem_read_header(scmi_info->shmem)); >>>>>> mutex_unlock(&scmi_info->shmem_lock); >>>>>> /* Only SMCCC_RET_NOT_SUPPORTED is valid error code */ >>>>>> if (res.a0) >>>>>> return -EOPNOTSUPP; >>>>>> return 0; >>>>>> } >>>>>> #define EOPNOTSUPP 95 /* Operation not supported on transport endpoint */ >>>>>> I also check the code where Unhandled SMC/HVC print in xen, >>>>>> and found the log cause by unhandled SMCCC call in function vsmccc_handle_call(). >>>>>> Could it be xen unhandle SMCCC call cause SCMI driver probe failed ? >>>>> >>>>> Yes. The domain would need to talk to the host SCMI. This is not yet >>>>> supported because Xen doesn't provide a mediator (this is necessary to >>>>> ensure the safety of the call). >>>>> >>>>> If you are *only* looking to use the Mali driver in dom0. So you could >>>>> add some code in Xen to forward simply forward the request to the host >>>>> and check if it helps you. >>>> >>>> How can I pass the requset to the host? I'm not familiar with xen code, >>>> is there any >>>> reference code in xen? >>> >>> There are a couple of solution: >>> 1) You request the hypervisor to avoid trapping SVC. This would also >>> need some changes in Linux to force the Mali driver to use SVC rather >>> than HVC. There is a patch on xen-devel, to avoid trapping (see [1]). >>> 2) Add an allow list of the SMCCC operations. There are some examples >>> how to "emulate" SMC call in Xen (see vsmccc_handle_call()). >>> >>> Cheers, >>> >>> [1] >>> https://lore.kernel.org/xen-devel/alpine.DEB.2.21.2106241749310.24906@sstabellini-ThinkPad-T480s/ >> I tried both solutions but it didn't work. >> First way: >> I add the code as the patch, add forward_smc=true to the Xen command line. >> Then boot, but still print the log 'Unhandled SMC/HVC ...', Julien Grall wrote: > Can you confirm whether you ask the mali driver to use SMC call? I add log to print hsr.ec value in do_trap_guest_sync(), and shows call are HSR_EC_HVC64 and HSR_EC_SMC64. But HCR_TSC seems only handle HSR_EC_SMC64 call. >> meanwhile xen >> throw an exception, log as below: >> (XEN) traps.c:1987:d0v3 HSR=0x92000007 pc=0xffffffc0106a3be8 gva=0xffffffc0127ad0a0 gpa=0x000000001000a0 >> [ 5.966596] Unhandled fault at 0xffffffc0127ad0a0 >> [ 5.966619] Mem abort info: >> [ 5.966633] ESR = 0x96000000 >> [ 5.966649] EC = 0x25: DABT (current EL), IL = 32 bits >> [ 5.966666] SET = 0, FnV = 0 >> [ 5.966680] EA = 0, S1PTW = 0 >> [ 5.966694] Data abort info: >> [ 5.966708] ISV = 0, ISS = 0x00000000 >> [ 5.966722] CM = 0, WnR = 0 >> [ 5.966738] swapper pgtable: 4k pages, 39-bit VAs, pgdp=00000000194b6000 >> It seems forward_smc=true did not work, but cause an exception. Julien Grall wrote: > This is indicating that the dom0 is trying to access a region that is > not mapped. > Can you check what the address 0x1000a0 is used for the host layout? Stefano Stabellini wrote: > This is not "Unhandled SMC/HVC". The guest is trying to access address > 0x1000a0 which doesn't seem to be valid in the guest? > > Do you know where 0x1000a0 is coming from? Could it be that one of the > SMC calls returns 0x1000a0 to the guest and the guest tries to access > it, but actually 0x1000a0 is not present in any valid device tree ranges > so it is not accessible from the guest? > If 0x1000a0 is a "special" address returned by the firmware, it needs to > belong to a range described in one of the device tree nodes for it to be > accessible by dom0. I searched 0x1000a0 in kernel code and in dtb files, but found nothing. So far I haven't figured out where 0x1000a0 is coming from. >> Second way: >> I used zynqmp_eemi() in xilinx-zynqmp-eemi.c to handle smc call. >> The smc call seems succeed according to xen log. >> But after run smc call, kernel throw an exception, log as below: >> [ 8.771446] rockchip-pm-domain fd8d8000.power-management:power-controller: >> Looking up pcie-supply from device tree >> [ 8.771485] rockchip-pm-domain fd8d8000.power-management:power-controller: >> Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed >> [ 66.037851] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: >> [ 66.037874] rcu: 3-...0: (0 ticks this GP) idle=196/1/0x4000000000000000 softirq=30/30 fqs=6000 >> [ 66.037892] (detected by 5, t=18002 jiffies, g=-1147, q=81) >> [ 66.037905] Task dump for CPU 3: >> [ 66.037916] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a >> [ 66.037939] Call trace: >> [ 66.037952] __switch_to+0x130/0x1a0 >> [ 66.037968] 0xffffffc0112bc173 >> It seems I need to modify zynqmp_eemi() code to adapt the kerenl, >> But based on what to modify zynqmp_eemi()? Julien Grall wrote: > I am not sure I understand what are the changes you made in Xen. Can you > post the diff? The changes made in xen as below: diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c index b633ff2fe8..83bb063016 100644 --- a/xen/arch/arm/vsmc.c +++ b/xen/arch/arm/vsmc.c @@ -27,6 +27,7 @@ #include <asm/traps.h> #include <asm/vpsci.h> #include <asm/platform.h> +#include <asm/platforms/xilinx-zynqmp-eemi.h> /* Number of functions currently supported by Hypervisor Service. */ #define XEN_SMCCC_FUNCTION_COUNT 3 @@ -280,7 +281,8 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs) handled = handle_sssc(regs); break; case ARM_SMCCC_OWNER_SIP: - handled = platform_smc(regs); + // handled = platform_smc(regs); + handled = zynqmp_eemi(regs); break; case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END: case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END: diff --git a/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h b/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h index cf25a9014d..55e20e99ca 100644 --- a/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h +++ b/xen/include/asm-arm/platforms/xilinx-zynqmp-eemi.h @@ -18,7 +18,7 @@ #include <asm/smccc.h> #define EEMI_FID(fid) ARM_SMCCC_CALL_VAL(ARM_SMCCC_FAST_CALL, \ - ARM_SMCCC_CONV_64, \ + ARM_SMCCC_CONV_32, \ ARM_SMCCC_OWNER_SIP, \ fid) Stefano Stabellini wrote: > In theory, there is no reason why forward_smc=true would not work but > xilinx-zynqmp-eemi.c would work. But anyway, I am appending the changes > to make sure xilinx-zynqmp-eemi.c forwards everything to the firmware. > > It looks like the kernel is throwing an exception because it is blocked > for too long. Maybe the SMC call didn't actually succeed after all. > > diff --git a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c > index 2053ed7ac5..20aa6afb47 100644 > --- a/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c > +++ b/xen/arch/arm/platforms/xilinx-zynqmp-eemi.c > @@ -57,6 +57,8 @@ bool zynqmp_eemi(struct cpu_user_regs *regs) > unsigned int pm_fn = fid & 0xFFFF; > enum pm_ret_status ret; > > + goto forward_to_fw; > + > switch ( fid ) > { > /* Mandatory SMC32 functions. */ I tried this change in xen, still has error in kernel, log as below: [ 8.985102] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up pcie-supply from device tree [ 8.985185] rockchip-pm-domain fd8d8000.power-management:power-controller: Looking up pcie-supply property in node /power-management@fd8d8000/power-controller failed [ 65.911230] rcu: INFO: rcu_sched detected stalls on CPUs/tasks: [ 65.911274] rcu: 0-...0: (1 ticks this GP) idle=1ba/1/0x4000000000000000 softirq=29/29 fqs=6001 [ 65.911311] (detected by 2, t=18003 jiffies, g=-1147, q=82) [ 65.911337] Task dump for CPU 0: [ 65.911360] task:swapper/0 state:R running task stack: 0 pid: 1 ppid: 0 flags:0x0000002a [ 65.911406] Call trace: [ 65.911429] __switch_to+0x130/0x1a0 [ 65.911453] die+0x58/0x20c [ 65.911473] arm64_notify_die+0x88/0x8c [ 65.911497] do_mem_abort+0xa0/0xb8 [ 65.911520] el1_abort+0x3c/0x5c [ 65.911542] el1_sync_handler+0x94/0xd0 [ 65.911563] el1_sync+0x88/0x140 [ 65.911587] copy_from_kernel_nofault+0xc8/0x128 [ 65.911612] aarch64_insn_read+0x34/0x6c [ 65.911635] show_data.constprop.0+0xac/0x128 [ 65.911660] show_regs+0xd8/0x118 [ 65.911681] die+0xcc/0x20c [ 65.911700] arm64_notify_die+0x88/0x8c [ 65.911722] do_mem_abort+0xa0/0xb8 [ 65.911743] el1_abort+0x3c/0x5c [ 65.911764] el1_sync_handler+0x94/0xd0 [ 65.911785] el1_sync+0x88/0x140 [ 65.911808] rockchip_gem_get_ddr_info+0x28/0x40 [ 65.911833] rockchip_drm_init+0x110/0x114 [ 65.911855] do_one_initcall+0xa0/0x1e8 [ 65.911878] kernel_init_freeable+0x2a4/0x2ac [ 65.911902] kernel_init+0x20/0x11c [ 65.911924] ret_from_fork+0x10/0x30 I continue to try the second way, and find that during initialization, SMC/HVC call come from different case, funcid 0x82000010 handle in HSR_EC_HVC64 case, funcid 0x82000009 handle in HSR_EC_SMC64 case. log as below: (XEN) d0v2 do_trap_hvc_smccc trap.... (XEN) d0v2 Unhandled SMC/HVC: 0x82000010 (XEN) d0v3 do_trap_smc trap.... (XEN) d0v3 Unhandled SMC/HVC: 0x82000009 So I change the code in vsmc.c as below: (The changes "goto forward_to_fw" remain) diff --git a/xen/arch/arm/vsmc.c b/xen/arch/arm/vsmc.c index b633ff2fe8..09636aab17 100644 --- a/xen/arch/arm/vsmc.c +++ b/xen/arch/arm/vsmc.c @@ -27,6 +27,7 @@ #include <asm/traps.h> #include <asm/vpsci.h> #include <asm/platform.h> +#include <asm/platforms/xilinx-zynqmp-eemi.h> /* Number of functions currently supported by Hypervisor Service. */ #define XEN_SMCCC_FUNCTION_COUNT 3 @@ -280,7 +281,12 @@ static bool vsmccc_handle_call(struct cpu_user_regs *regs) handled = handle_sssc(regs); break; case ARM_SMCCC_OWNER_SIP: - handled = platform_smc(regs); + if(hsr.ec == HSR_EC_HVC64) + { + handled = zynqmp_eemi(regs); + } + else + handled = platform_smc(regs); break; case ARM_SMCCC_OWNER_TRUSTED_APP ... ARM_SMCCC_OWNER_TRUSTED_APP_END: case ARM_SMCCC_OWNER_TRUSTED_OS ... ARM_SMCCC_OWNER_TRUSTED_OS_END: After this change, mali and sdmmc driver bring up succeed. But I'm confused, is this the xen problem or kernel work SMC/HVC call in a quirks way? Best regards Cailigang
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |