
Re: [Xen-users] Debugging DomU


  • To: Julien Grall <julien.grall@xxxxxxxxxx>
  • From: "Chris (Christopher) Brand" <chris.brand@xxxxxxxxxxxx>
  • Date: Fri, 29 May 2015 02:54:22 +0000
  • Accept-language: en-US
  • Cc: xen-users <xen-users@xxxxxxxxxxxxx>, Ian Campbell <ian.campbell@xxxxxxxxxx>
  • Delivery-date: Fri, 29 May 2015 02:55:48 +0000
  • List-id: Xen user discussion <xen-users.lists.xen.org>
  • Thread-topic: [Xen-users] Debugging DomU

Hi Julien,

>> I hunted around quite a bit, and didn't find anything. Nothing leaps out in 
>> the list of upstream kernel patches to mmu.c (there's a migration from 
>> meminfo to memblock, which I tried backporting with no effect on 
>> behaviour). In most of the reports of similar panics that I found, the 
>> recommendation was to ensure that u-boot was disabling the L2 cache before 
>> jumping to the kernel, which is presumably not helpful.
>
>Even though the bug occurred in mmu.c, it was caused by a miscalculation 
>in kernel/head.S

__fixup_pv_table() ?

Looking at "git blame" for that file upstream and in my kernel, there are four 
patches that affect the part of the code that is conditional on 
CONFIG_ARM_PATCH_PHYS_VIRT:
e26a9e00afc - this sounds like just an optimisation
7a06192834414 - this just replaces "12" with "PAGE_SHIFT"
e3892e9160 - this says it only affects big-endian
6ebbf2ce437b3 - this should just be an optimisation
None of those sound like likely candidates.

>> Throwing some printk() calls into sanity_check_meminfo() shows that it 
>> decides that all the memory is highmem, and so passes 0 to 
>> memblock_set_current_limit(). That then seems to lead to the failure to find 
>> suitable blocks of memory to allocate, and hence the panic.
>
>That's exactly the problem I had with some CONFIG_VMSPLIT_* options. It was 
>related to Linux computing a wrong offset between the virtual and the physical 
>address.
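
For reference, the arithmetic behind "all the memory is highmem" can be modelled roughly as below. This is only a sketch, not the kernel code: the constants (VMALLOC_END, the 240 MB vmalloc reserve, the 8 MB VMALLOC_OFFSET) are assumed from the 3.14 ARM sources rather than taken from this thread, the 256 MB bank size is hypothetical, and a phys offset of 0 merely stands in for "some wrong value computed in head.S".

```python
# Simplified model of the lowmem/highmem decision in sanity_check_meminfo().
# All constants are assumptions (32-bit ARM, CONFIG_VMSPLIT_3G, 3.14-era layout).
PAGE_OFFSET = 0xC0000000
VMALLOC_END = 0xFF000000
VMALLOC_OFFSET = 8 << 20
VMALLOC_MIN = VMALLOC_END - (240 << 20) - VMALLOC_OFFSET


def lowmem_limit(bank_start, bank_size, phys_offset):
    """Return the physical end of lowmem for one bank, or 0 if it is all highmem."""
    # Physical address corresponding to the lowest vmalloc virtual address.
    vmalloc_limit = (VMALLOC_MIN - PAGE_OFFSET + phys_offset) & 0xFFFFFFFF
    if bank_start >= vmalloc_limit:
        return 0  # whole bank treated as highmem, so the memblock limit stays 0
    return min(bank_start + bank_size, vmalloc_limit)


BANK_START = 0x80000000   # memory base from the DTS
BANK_SIZE = 0x10000000    # hypothetical 256 MB guest

good = lowmem_limit(BANK_START, BANK_SIZE, phys_offset=0x80000000)
bad = lowmem_limit(BANK_START, BANK_SIZE, phys_offset=0x00000000)
print(hex(good), hex(bad))
```

With the correct offset the whole bank is lowmem; with the wrong one the limit collapses to 0, which matches the value passed to memblock_set_current_limit() in the report above.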
 
>> As an experiment, I tried changing the start of memory in the DTS from 
>> 0x80000000. With that change, I can get the same result with 
>> CONFIG_VMSPLIT_3G as I got with the other configs above (PC=0xfff000c). 
>> That seems to indicate that this is the problem you recalled, but that 
>> there's yet another problem I'm hitting afterwards. I *think* I saw it go 
>> from __arm_ioremap_pfn() into do_DataAbort(), but I'm far from certain.
>
>How did you choose the 0x80000000?

That was suggested to me by somebody here. Is it arbitrary? It seems like it 
should be.

>In a previous mail you were saying that you are using a custom kernel based on 
>3.14, right? I'm wondering if the kernel is trying to map a device, which it 
>should not do.

Yes, that's correct. I've attached my dts, which is pretty minimal.

>Can you try to apply the patch below in Xen? It will print any guest data 
>abort not handled by Xen before injecting it to the guest.

Between that patch and more printk debugging, I know where it was dying:
setup_arch()
paging_init()
dma_contiguous_remap()
iotable_init()
early_alloc_aligned()
memset(0xee7fffd0, 0, 0x30)
The output from your patch is:
(XEN) traps.c:2022:d8v0 HSR=0x90000046 pc=0xc025ab80 gva=0xee7fffd0
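
The HSR value can be picked apart by hand. The sketch below assumes the ARMv7-A layout of HSR for data aborts (EC in bits 31:26, IL in bit 25, ISV in bit 24, WnR in bit 6, DFSC in bits 5:0) and only names the few fault status codes relevant here; decode_hsr() is a helper made up for this mail, not something from Xen.

```python
# Minimal HSR decoder for data aborts (field positions assumed from ARMv7-A).
EC_NAMES = {0x24: "data abort from a mode other than Hyp"}
DFSC_NAMES = {
    0x05: "translation fault, level 1",
    0x06: "translation fault, level 2",
    0x07: "translation fault, level 3",
}


def decode_hsr(hsr):
    """Split an HSR value into the fields that matter for a data abort."""
    ec = (hsr >> 26) & 0x3F
    dfsc = hsr & 0x3F
    return {
        "ec": ec,
        "ec_name": EC_NAMES.get(ec, "other"),
        "il": (hsr >> 25) & 1,       # 1 = 32-bit instruction
        "isv": (hsr >> 24) & 1,      # 1 = syndrome fields 23:16 are valid
        "write": bool((hsr >> 6) & 1),
        "dfsc": dfsc,
        "dfsc_name": DFSC_NAMES.get(dfsc, "other"),
    }


fields = decode_hsr(0x90000046)
for key, value in fields.items():
    print(key, value)
```

Under those assumptions, 0x90000046 decodes to a write that took a level 2 translation fault, which is consistent with the memset() to 0xee7fffd0.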

So I'm thinking that this still could be related to the 
__memblock_find_range_top_down(), if it now "succeeds" but still returns 
something invalid...

I applied e26a9e00afc. It made no difference by itself. I then tried tweaking 
the memory base address. With 0x20000000, I saw the same crash I was seeing 
before. With 0x40000000, though, it gets much further, dying in 
gic_init_bases():
(XEN) traps.c:2022:d12v0 HSR=0x93820005 pc=0xc06bf180 gva=0xe0800004
(d12) 1Unhandled fault: debug event (0x222) at 0xe0800004
(d12) 0Internal error: : 222 [#1] SMP ARM
(d12) dModules linked in:
(d12) dCPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.13-1.0pre-g1ba194f963e0-dirty #107
(d12) dtask: c0e1aa20 ti: c0e10000 task.ti: c0e10000
(d12) PC is at gic_init_bases+0x8c/0x2b0
(d12) LR is at 0xffffffff
(d12) pc : [<c06bf178>]    lr : [<ffffffff>]    psr: 600001d3
(d12) sp : c0e11f20  ip : 00000008  fp : c06e05a0
(d12) r10: e0804000  r9 : ffffffff  r8 : defffc68
(d12) r7 : e0800000  r6 : c0e188ac  r5 : 00000010  r4 : c0e18e40
(d12) r3 : 00000000  r2 : e0800000  r1 : ffffffff  r0 : 00000000
(d12) Flags: nZCv  IRQs off  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
(d12) Control: 30c5387d  Table: 40003000  DAC: fffffffd
(d12) 0Process swapper/0 (pid: 0, stack limit = 0xc0e10240)
(d12) 0Stack: (0xc0e11f20 to 0xc0e12000)
(d12) 01f20: 00000000 e0804000 defffc68 c06dc7e8 00000000 e0804000 c0e11f88 00200200
(d12) 01f40: 00100100 c06bf46c 00000000 defffc68 c0e11f88 00000000 c0e11f80 de403200
(d12) 01f60: c0e11f80 c0e11f80 00000000 c06c75f0 de404bc0 c0e1137c 00000000 c0e4bc08
(d12) 01f80: c0e11f80 c0e11f80 c0e11f88 c0e11f88 00000001 c0e4d040 c0e183c0 c0e4d040
(d12) 01fa0: 00000001 ffffffff c06d9470 c0e183c0 00000000 c06aa9c8 ffffffff ffffffff
(d12) 01fc0: c06aa578 00000000 ffffffff 00000000 00000000 c06d9470 30c7387d c0e183f0
(d12) 01fe0: c06d946c c0e1bb30 40007000 420f00f3 00000000 40008084 00000000 00000000
(d12) [<c06bf178>] (gic_init_bases) from [<c06bf46c>] (gic_of_init+0xd0/0x108)
(d12) [<c06bf46c>] (gic_of_init) from [<c06c75f0>] (of_irq_init+0x1cc/0x2e8)
(d12) [<c06c75f0>] (of_irq_init) from [<c06aa9c8>] (start_kernel+0x220/0x378)
(d12) [<c06aa9c8>] (start_kernel) from [<40008084>] (0x40008084)
(d12) 0Code: 0a000004 e3790001 13c9901f 13a05010 (10899005) 
(d12) 4---[ end trace 3406ff24bd97382e ]---
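
Unlike the earlier fault, HSR=0x93820005 has the ISV bit set, so the access itself is described in the syndrome. A rough sketch of reading those extra fields follows, assuming the ARMv7-A layout (SAS in bits 23:22, SSE in bit 21, SRT in bits 19:16, WnR in bit 6); decode_isv() is a name invented for this example.

```python
# Decode the instruction-syndrome fields of a data-abort HSR (ARMv7-A layout assumed).
def decode_isv(hsr):
    """Return access details if the ISV bit is set, otherwise None."""
    if not (hsr >> 24) & 1:
        return None  # syndrome fields are not valid for this abort
    sas = (hsr >> 22) & 0x3
    return {
        "access_bytes": 1 << sas,            # 0 = byte, 1 = halfword, 2 = word
        "sign_extend": bool((hsr >> 21) & 1),
        "srt": (hsr >> 16) & 0xF,            # transfer register number
        "write": bool((hsr >> 6) & 1),
        "dfsc": hsr & 0x3F,
    }


gic_fault = decode_isv(0x93820005)
print(gic_fault)
```

Read this way, the abort is a 4-byte read with r2 as the transfer register and a fault status of 0x05 (level 1 translation fault, if the encoding assumption holds). The faulting address 0xe0800004 is 4 bytes past the value held in r7 and r2 in the register dump.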

Thanks,

Chris

Attachment: config_domu.dts
Description: config_domu.dts


 

