Re: [Xen-devel] [ARM] Bash often segfaults in Dom0 with the latest Xen
On 06/05/2013 02:38 AM, Christoffer Dall wrote:
> On 4 June 2013 15:45, Julien Grall <julien.grall@xxxxxxxxxx> wrote:
>> Hi all,
>>
>> For a couple of weeks I have been tracking an issue with Xen on ARM, with no luck.
>>
>> I have run out of ideas, so I am sending this email to ask the community for advice.
>>
>> Most of the time bash aborts with a random error in dom0:
>> - page fault (data and prefetch abort)
>> - memory corruption (malloc corruption and invalid pointer)
>>
>> It is easy to reproduce by running ./configure on the xen tree.
>>
>> My environment is an arndale board:
>> - linux linaro 13.05 (using arndale_xen_dom0_defconfig and
>>   exynos5250_arndale.dts)
>> - opensuse 12.03 (http://en.opensuse.org/HCL:Arndale)
>> - xen upstream
>>
>> The linux tree can be retrieved from
>> git://xenbits.xen.org/people/julieng/linux-arm.git
>> using the branch linaro-3.10.
>> That branch is based on the linaro tree with some patches for the
>> dts and xen.
>>
>> The issue also occurs on the versatile express, but it is harder to
>> reproduce there. Here the environment is:
>> - linux linaro 13.05 (using vexpress_xen_dom0_defconfig and
>>   vexpress_v2p_ca15_a7.dtb)
>> - ubuntu linaro 13.05
>> - xen upstream
>>
>> I have tried different distributions and linux versions; the issue was
>> the same.
>> I did some testing to narrow down the bug and came to the following
>> test case:
>>
>> Only dom0 is running and each VCPU is pinned to a specific cpu
>> (vcpu0 -> cpu0 and vcpu1 -> cpu1).
>>
>> The patch below removes the WFI trap and as a consequence prevents a
>> VCPU from moving to another physical CPU.
>> =========================================
>> diff --git a/xen/arch/arm/traps.c b/xen/arch/arm/traps.c
>> index 6cfba1a..e89ca15 100644
>> --- a/xen/arch/arm/traps.c
>> +++ b/xen/arch/arm/traps.c
>> @@ -62,7 +62,7 @@ void __cpuinit init_traps(void)
>>      WRITE_SYSREG((vaddr_t)hyp_traps_vector, VBAR_EL2);
>>
>>      /* Setup hypervisor traps */
>> -    WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TWI|HCR_TSC, HCR_EL2);
>> +    WRITE_SYSREG(HCR_PTW|HCR_BSU_OUTER|HCR_AMO|HCR_IMO|HCR_VM|HCR_TSC, HCR_EL2);
>>      isb();
>>  }
>>
>> =========================================
>>
>> If a bash process is assigned to a specific cpu with taskset, the
>> process seems to always run without any issue.
>>
>> taskset -c 0 ./configure
>>
>> I guess it's a caching issue, but each time I tried to play with the
>> caching policy, Linux did not boot.
>>
>> Thanks in advance for any advice.
>
> Some thoughts:
>
> - Does dom0 run with Stage-2 translation? If so, you should be able
> to disable caches in both Hyp mode and for dom0 by manipulating the
> hyp registers to try and exclude caches. If Linux doesn't boot under
> such a configuration, something else is completely broken, as it must
> be transparent to your dom0.
>
> - Are you doing any swapping and/or page reclaiming? I wouldn't
> assume so for dom0, but if you are, you need to maintain the icache
> properly, since it can be aliasing, see
> http://lxr.linux.no/linux+v3.9.4/arch/arm/kvm/mmu.c#L495 (I doubt this
> is the case though)
>
> - All other cache accesses should be coherent across cores and are
> physically indexed/physically tagged so I don't see how this could be
> your issue.

It was only an idea, because I noticed that the memory was often corrupted.

> - Do you always see the crash in user space or kernel space in dom0 or
> is it all over the map?

Only in user space in dom0.
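
For anyone reading the quoted patch: the only change is that HCR_TWI (the WFI trap bit) is dropped from the value written to HCR_EL2. Below is a minimal sketch of that arithmetic. The bit positions are an assumption taken from the ARM architecture's HCR layout as I understand it (VM bit 0, PTW bit 2, IMO bit 4, AMO bit 5, BSU bits 11:10, TWI bit 13, TSC bit 19), and the EX_ names and both helpers are hypothetical, not Xen's own definitions, so check them against the architecture manual and the Xen headers before relying on them.

```c
#include <stdint.h>

/* Assumed HCR bit positions (illustrative; not copied from Xen's headers). */
#define EX_HCR_VM        (UINT32_C(1) << 0)   /* stage-2 translation enable */
#define EX_HCR_PTW       (UINT32_C(1) << 2)   /* protected table walk */
#define EX_HCR_IMO       (UINT32_C(1) << 4)   /* route IRQs to the hypervisor */
#define EX_HCR_AMO       (UINT32_C(1) << 5)   /* route async aborts to the hypervisor */
#define EX_HCR_BSU_OUTER (UINT32_C(2) << 10)  /* barrier upgrade: outer shareable */
#define EX_HCR_TWI       (UINT32_C(1) << 13)  /* trap guest WFI */
#define EX_HCR_TSC       (UINT32_C(1) << 19)  /* trap guest SMC */

/* Value written before the patch: WFI is trapped to the hypervisor. */
static uint32_t ex_hcr_with_wfi_trap(void)
{
    return EX_HCR_PTW | EX_HCR_BSU_OUTER | EX_HCR_AMO | EX_HCR_IMO |
           EX_HCR_VM | EX_HCR_TWI | EX_HCR_TSC;
}

/* Value written after the patch: same flags with the WFI trap cleared. */
static uint32_t ex_hcr_without_wfi_trap(void)
{
    return ex_hcr_with_wfi_trap() & ~EX_HCR_TWI;
}
```

Under these assumptions the two values differ in exactly one bit, so a guest WFI is no longer trapped while the other traps stay in place.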
--
Julien

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel