
Re: [Xen-devel] Xen 4.11.1 panic



>>> On 18.12.18 at 23:19, <bouyer@xxxxxxxxxxxxxxx> wrote:
> I tried updating my NetBSD dom0 to 4.11.1 (from 4.11.0 with security patches),

Hmm, the issue stems from the XSA-273 changes, so did you perhaps
mean "with some security patches", and you didn't have those ones
applied?

> and on a 32-bit PV domU shutdown I get (100% reproducible):
> (XEN) Assertion 'preemptible' failed at mm.c:2493                           
> (XEN) ----[ Xen-4.11.1nb0  x86_64  debug=y   Tainted:  C   ]----            
> (XEN) CPU:    1                                                             
> (XEN) RIP:    e008:[<ffff82d08028b192>] free_page_type+0x232/0x790          
> (XEN) RFLAGS: 0000000000010246   CONTEXT: hypervisor (d0v0)                 
> (XEN) rax: 4000000000000000   rbx: 4400000000000001   rcx: 4000000000000000 
> (XEN) rdx: ffff830000000000   rsi: 4400000000000001   rdi: ffff82e004215260 
> (XEN) rbp: ffff82e004215260   rsp: ffff83023704fab8   r8:  0000000000000000 
> (XEN) r9:  0000000000000000   r10: ffff82e000000000   r11: ffff82e004226000 
> (XEN) r12: 0000000000000000   r13: ffff8302135d9000   r14: 10ffffffffffffff 
> (XEN) r15: 1000000000000000   cr0: 000000008005003b   cr4: 0000000000002660 
> (XEN) cr3: 000000022f0f6000   cr2: 00007f7ff60ce7a0
> (XEN) fsb: 00007f7ff7ff36c0   gsb: ffffffff80ca42c0   gss: 0000000000000000
> (XEN) ds: 003f   es: 003f   fs: 0000   gs: 0000   ss: e010   cs: e008
> (XEN) Xen code around <ffff82d08028b192> (free_page_type+0x232/0x790):
> (XEN)  05 00 00 45 85 e4 75 02 <0f> 0b 8b 45 18 85 c0 74 18 89 c0 48 c1 e0 0c 49
> (XEN) Xen stack trace from rsp=ffff83023704fab8:
> (XEN)    ffff83023704fe38 00000000000000ec 4400000000000001 ffff82e004215260
> (XEN)    ffff82e004215240 00ffffffffffffff 10ffffffffffffff 1000000000000000
> (XEN)    ffff82d08028b83d 00ff8300bedfc000 ffff83023704ffff ffff82d000000000
> (XEN)    ffff82e004215260 ffff82e004215240 ffff8302135d9000 0000000000210a92
> (XEN)    ffff820040019000 0200000000000000 ffff82d08028bedf 00000000000001ff
> (XEN)    ffff82e004215240 ffff82d08028b25e ffff830200000000 ffff83023704ffff
> (XEN)    4400000000000001 ffff82e004215240 ffff82e004206c60 00ffffffffffffff
> (XEN)    10ffffffffffffff 1000000000000000 ffff82d08028b83d 0100000000000002
> (XEN)    ffff83023704ffff ffff830200000001 ffff82e004215240 ffff82e004206c60
> (XEN)    0000000000000000 ffff820040015010 0000000000000000 ffff820040015000
> (XEN)    ffff82d08028af30 0000000000000002 ffff82e004206c60 0000000000000000
> (XEN)    ffff820040015010 ffff82d08028b3d1 0000000000210363 ffff8302135d9000
> (XEN)    6400000000000001 ffff82e004206c60 0000000000000000 00ffffffffffffff
> (XEN)    10ffffffffffffff 1000000000000000 ffff82d08028b83d 01ff82d08022697f
> (XEN)    ffff83023704ffff ffff830200000001 ffff82e004206c60 ffff83023704fd10
> (XEN)    ffff8302135d9028 ffff8302135d9000 ffff8302135d9020 ffff82e004206c70
> (XEN)    ffff82d08028bf1f ffff82d080274e6b ffff83023701ec00 e400000000000001
> (XEN)    ffff83023704ffff 8000000000000000 ffff8302135d9000 0000000000000000
> (XEN)    ffff8302135d9018 deadbeefdeadf00d 0000000000000001 00007f7ff7b32004
> (XEN)    ffff82d080278f83 ffff8302135d9000 00007f7ff7b32004 ffff82d080208b2d
> (XEN) Xen call trace:
> (XEN)    [<ffff82d08028b192>] free_page_type+0x232/0x790
> (XEN)    [<ffff82d08028b83d>] mm.c#_put_page_type+0x14d/0x380
> (XEN)    [<ffff82d08028bedf>] mm.c#put_page_from_l2e+0xdf/0x110
> (XEN)    [<ffff82d08028b25e>] free_page_type+0x2fe/0x790
> (XEN)    [<ffff82d08028b83d>] mm.c#_put_page_type+0x14d/0x380
> (XEN)    [<ffff82d08028af30>] mm.c#put_page_from_l3e+0x1a0/0x1d0

The line number above doesn't match any line carrying a corresponding
ASSERT() in plain 4.11.1. There are a few nearby ones, so I can only
guess that it's the one that was recently added (in the
PGT_l2_page_table handling of free_page_type()). Can you please
confirm this against the exact sources you used for your build?

In any event, both Andrew and I must have overlooked the one
crucial place that makes the assertion indeed wrong, namely this
call in put_page_from_l2e():

        int rc = _put_page_type(pg, false, mfn_to_page(_mfn(pfn)));

Not allowing for preemption there is fine if the L2E is pointing to
an L1 table, but is now wrong if the L2E points to another L2,
which surely is the case when you see the assertion trigger.
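
To illustrate the mismatch, here is a much simplified model. It is not
the actual Xen code: the enum and both function names are made up for
this sketch, and the only thing taken from the above is that the call
from put_page_from_l2e() passes false for the preemptible argument
while the L2 handling asserts that it is true.

```c
#include <stdbool.h>

/* Hypothetical, heavily simplified page table levels (not Xen's types). */
enum pt_level { PT_L1 = 1, PT_L2 = 2 };

/*
 * Models the check in question: the L2 handling is assumed to expect a
 * preemptible caller. Returns true where ASSERT(preemptible) would
 * hold, false where it would fire.
 */
static bool free_type_assert_holds(enum pt_level level, bool preemptible)
{
    if (level == PT_L2)
        return preemptible;
    return true; /* L1 case: fine without preemption, as said above */
}

/*
 * Models the call made from put_page_from_l2e(): the preemptible
 * argument is hard-coded to false, whatever the L2E points at.
 */
static bool put_from_l2e_assert_holds(enum pt_level target)
{
    return free_type_assert_holds(target, false);
}
```

In this model the assertion holds for an L2E referencing an L1 table
and fires for one referencing another L2, which would match the call
trace quoted above.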

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel