
Re: [Xen-devel] Linux 2.6.33 crashes on boot as Xen PV domU



On Thu, Feb 25, 2010 at 11:10:45AM -0800, Jeremy Fitzhardinge wrote:
> On 02/25/2010 11:04 AM, Pasi Kärkkäinen wrote:
>> Hello,
>>
>> I just built and tried to boot upstream kernel.org Linux 2.6.33 kernel
>> as Xen PV domU, but that doesn't get very far:
>>
>> http://pasik.reaktio.net/xen/debug/bootlog-linux-2.6.33-xen-pv-domu-x86_64-crash.txt
>>    
>
> Try the attached patch.
>

Yep, this patch fixes the problem, boots OK now.

Thanks! Now some save/restore testing..

-- Pasi

>     J
>
>>
>> Freeing unused kernel memory: 1544k freed
>> Write protecting the kernel read-only data: 10240k
>> Freeing unused kernel memory: 1764k freed
>> BUG: unable to handle kernel paging request at ffff880001447000
>> IP: [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
>> PGD 1a3c067 PUD 1a40067 PMD 138d5067 PTE 10000001447025
>> Oops: 0003 [#1] SMP
>> last sysfs file:
>> CPU 3
>> Pid: 1, comm: swapper Not tainted 2.6.33 #1 /
>> RIP: e030:[<ffffffff8102e9f2>]  [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
>> RSP: e02b:ffff88007dfdbe60  EFLAGS: 00010286
>> RAX: 00000000cccccccc RBX: ffff880001600000 RCX: 0000000000000400
>> RDX: ffff880001447000 RSI: 0000000000000000 RDI: ffff880001447000
>> RBP: ffff88007dfdbe90 R08: 0000000000000000 R09: ffff88007fc04000
>> R10: ffff88007fc04000 R11: 0000000000100000 R12: ffff880001447000
>> R13: 0000000000000400 R14: ffffea0000000000 R15: 00000000cccccccc
>> FS:  0000000000000000(0000) GS:ffff8800139d6000(0000) knlGS:0000000000000000
>> CS:  e033 DS: 0000 ES: 0000 CR0: 000000008005003b
>> CR2: ffff880001447000 CR3: 0000000001a3b000 CR4: 0000000000002620
>> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>> DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000000
>> Process swapper (pid: 1, threadinfo ffff88007dfda000, task ffff88007dfe0000)
>> Stack:
>>   0000000000000000 ffff880000000000 6db6db6db6db6db7 ffffffff81a00000
>> <0>  0000000000a00000 0000000000000000 ffff88007dfdbec0 ffffffff8102ed73
>> <0>  ffffffff81c6aa38 ffffffff81aefdf0 0000000000000100 0000000000000100
>> Call Trace:
>>   [<ffffffff8102ed73>] mark_rodata_ro+0xea/0x151
>>   [<ffffffff810021b9>] init_post+0x30/0x113
>>   [<ffffffff81b0f715>] kernel_init+0x1c3/0x1d2
>>   [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
>>   [<ffffffff81009e91>] ? int_ret_from_sys_call+0x7/0x1b
>>   [<ffffffff8143ae1d>] ? retint_restore_args+0x5/0x6
>>   [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
>> Code: cd 47 00 00 48 c1 e8 0c 4c 89 e2 4c 89 e9 48 6b c0 38 48 81 e2 00 f0 ff ff 31 f6 48 89 d7 4c 01 f0 c7 40 08 01 00 00 00 44 89 f8 <f3> ab 4c 89 e7 49 81 c4 00 10 00 00 e8 bc ca 09 00 48 ff 05 16
>> RIP  [<ffffffff8102e9f2>] free_init_pages+0xb2/0xdb
>>  RSP <ffff88007dfdbe60>
>> CR2: ffff880001447000
>> ---[ end trace 6e676731d52211fa ]---
>> Kernel panic - not syncing: Attempted to kill init!
>> Pid: 1, comm: swapper Tainted: G      D    2.6.33 #1
>> Call Trace:
>>   [<ffffffff81438663>] panic+0x7a/0x13d
>>   [<ffffffff81057609>] ? exit_ptrace+0xa1/0x121
>>   [<ffffffff8105074d>] do_exit+0x7a/0x6f3
>>   [<ffffffff8104d15d>] ? spin_unlock_irqrestore+0xe/0x10
>>   [<ffffffff8104dd76>] ? kmsg_dump+0x12b/0x145
>>   [<ffffffff8143bc31>] oops_end+0xbf/0xc7
>>   [<ffffffff8102f901>] no_context+0x1fc/0x20b
>>   [<ffffffff8102fa94>] __bad_area_nosemaphore+0x184/0x1a7
>>   [<ffffffff81004399>] ? __raw_callee_save_xen_pmd_val+0x11/0x1e
>>   [<ffffffff8102faca>] bad_area_nosemaphore+0x13/0x15
>>   [<ffffffff8143d663>] do_page_fault+0x14f/0x2a0
>>   [<ffffffff8143b0b5>] page_fault+0x25/0x30
>>   [<ffffffff8102e9f2>] ? free_init_pages+0xb2/0xdb
>>   [<ffffffff8102ed73>] mark_rodata_ro+0xea/0x151
>>   [<ffffffff810021b9>] init_post+0x30/0x113
>>   [<ffffffff81b0f715>] kernel_init+0x1c3/0x1d2
>>   [<ffffffff8100aa64>] kernel_thread_helper+0x4/0x10
>>   [<ffffffff81009e91>] ? int_ret_from_sys_call+0x7/0x1b
>>   [<ffffffff8143ae1d>] ? retint_restore_args+0x5/0x6
>>   [<ffffffff8100aa60>] ? kernel_thread_helper+0x0/0x10
>>
>> -- Pasi
>>
>>
>> _______________________________________________
>> Xen-devel mailing list
>> Xen-devel@xxxxxxxxxxxxxxxxxxx
>> http://lists.xensource.com/xen-devel
>>
>>    
>
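
The oops above reports "Oops: 0003" together with "PTE 10000001447025". As a minimal sketch (not kernel code), the check below decodes both values using only the architectural x86 bit positions; the macro and function names are made up for illustration. It shows why the fault reads as a write to a page that is mapped but not writable.

```c
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural positions). */
#define PF_PROT  (1u << 0)   /* set: the faulting page was present */
#define PF_WRITE (1u << 1)   /* set: the access was a write */
#define PF_USER  (1u << 2)   /* set: the access came from user mode */

/* Low flag bits of an x86-64 page-table entry (architectural positions). */
#define PTE_PRESENT (1ull << 0)
#define PTE_RW      (1ull << 1)

/*
 * Hypothetical helper: true when the error code and the PTE together
 * describe a write to a present page whose mapping lacks the RW bit.
 */
static bool is_write_to_readonly(uint32_t error_code, uint64_t pte)
{
    return (error_code & PF_PROT) && (error_code & PF_WRITE) &&
           (pte & PTE_PRESENT) && !(pte & PTE_RW);
}
```

Calling is_write_to_readonly(0x0003, 0x10000001447025ULL) with the values from the log returns true: error code 0003 has the "present" and "write" bits set, and the low bits of the PTE (0x025) have "present" set but RW clear.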

> From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> To: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: "linux-kernel@xxxxxxxxxxxxxxx" <linux-kernel@xxxxxxxxxxxxxxx>,
>       "hpa@xxxxxxxxx" <hpa@xxxxxxxxx>,
>       "rostedt@xxxxxxxxxxx" <rostedt@xxxxxxxxxxx>,
>       "jeremy@xxxxxxxx" <jeremy@xxxxxxxx>, Ingo Molnar <mingo@xxxxxxxxxx>,
>       Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Date: Thu, 18 Feb 2010 11:51:40 -0800
> Subject: Re: [LKML] Re: [PATCH] x86_64: allow sections that are recycled to
>       set _PAGE_RW
> 
> On Tue, 2010-02-16 at 14:13 -0800, Konrad Rzeszutek Wilk wrote:
> > On Sat, Feb 13, 2010 at 12:08:17PM -0800, Suresh Siddha wrote:
> > > The checks in static_protections() for kernel text mapping ensure that
> > > we don't break the 2MB kernel text pages unnecessarily on 64bit kernels
> > > (as it has performance implications). We should be fine as long as the
> > > kernel identity mappings reflect the correct RW permissions.
> > > 
> > > But somehow this is working fine on native kernels but not on Xen pv
> > > guest. Your patch will cause the performance issues that we are
> > 
> > That would not be good.
> > 
> > > addressing using the static protections checks. I will look at this in more
> > > detail on Tuesday.
> > 
> > Great. Thank you for doing that. If you find yourself in a bind, here are
> > some steps on how to build the Xen pv-ops kernel and such:
> > http://wiki.xensource.com/xenwiki/XenParavirtOps
> > 
> > It goes without saying that I would be happy to test your patch when
> > you have one ready.
> 
> x86 folks, can you please queue the appended patch? If you think it is
> too late for 2.6.33, I added a "cc: stable", so that they can pick this
> up for both .32 and .33. Thanks.
> ---
> 
> From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> Subject: x86_64, cpa: don't work hard in preserving kernel text 2M mapping when using 4K already
> 
> We currently enforce the !RW mapping for the kernel mapping that maps
> holes between different text, rodata and data sections. However, kernel
> identity mappings will have different RWX permissions for the pages mapping
> the text and for the pages padding the text and rodata sections (which are
> freed). Hence kernel identity mappings will be broken into smaller pages. For
> 64-bit, kernel text and kernel identity mappings are different, so we can
> enable the protection checks that come with CONFIG_DEBUG_RODATA, as well as
> retain 2MB large page mappings for kernel text.
> 
> Konrad reported a boot failure with the Linux Xen paravirt guest because of
> this. In this paravirt guest case, the kernel text mapping and the kernel
> identity mapping share the same page-table pages. Thus forcing the !RW mapping
> for some of the kernel mappings also causes the kernel identity mappings to be
> read-only, resulting in the boot failure. The Linux Xen paravirt guest also
> uses 4k mappings and does not use 2M mappings.
> 
> Fix this issue and retain large page performance advantage for native kernels
> by not working hard and not enforcing !RW for the kernel text mapping,
> if the current mapping is already using small page mapping.
> 
> Reported-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> Tested-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: stable@xxxxxxxxxx [2.6.32, 2.6.33]
> ---
> 
> index 1d4eb93..cf07c26 100644
> --- a/arch/x86/mm/pageattr.c
> +++ b/arch/x86/mm/pageattr.c
> @@ -291,8 +291,29 @@ static inline pgprot_t static_protections(pgprot_t prot, unsigned long address,
>        */
>       if (kernel_set_to_readonly &&
>           within(address, (unsigned long)_text,
> -                (unsigned long)__end_rodata_hpage_align))
> -             pgprot_val(forbidden) |= _PAGE_RW;
> +                (unsigned long)__end_rodata_hpage_align)) {
> +             unsigned int level;
> +
> +             /*
> +              * Don't enforce the !RW mapping for the kernel text mapping,
> +              * if the current mapping is already using small page mapping.
> +              * No need to work hard to preserve large page mappings in this
> +              * case.
> +              *
> +              * This also fixes the Linux Xen paravirt guest boot failure
> +              * (because of unexpected read-only mappings for kernel identity
> +              * mappings). In this paravirt guest case, the kernel text
> +              * mapping and the kernel identity mapping share the same
> +              * page-table pages. Thus we can't really use different
> +              * protections for the kernel text and identity mappings. Also,
> +              * these shared mappings are made of small page mappings.
> +              * Thus not enforcing the !RW mapping for small page kernel
> +              * text mappings will help the Linux Xen paravirt guest boot
> +              * as well.
> +              */
> +             if (lookup_address(address, &level) && (level != PG_LEVEL_4K))
> +                     pgprot_val(forbidden) |= _PAGE_RW;
> +     }
>  #endif
>  
>       prot = __pgprot(pgprot_val(prot) & ~pgprot_val(forbidden));
> 
> 
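
The decision the patch adds to static_protections() can be modelled outside the kernel. The following is a rough user-space sketch under stated assumptions, not the kernel implementation: the enum, struct and function names are hypothetical, LEVEL_NONE stands in for lookup_address() finding no mapping, and PAGE_RW mirrors the architectural RW bit that _PAGE_RW represents.

```c
#include <stdbool.h>
#include <stdint.h>

#define PAGE_RW (1ull << 1)  /* architectural RW bit of a page-table entry */

/* Hypothetical stand-in for the mapping level reported by lookup_address(). */
enum pg_level { LEVEL_NONE, LEVEL_4K, LEVEL_2M, LEVEL_1G };

/* Hypothetical half-open range [start, end) covering text through rodata. */
struct text_range { uint64_t start, end; };

static bool within(uint64_t addr, uint64_t start, uint64_t end)
{
    return addr >= start && addr < end;
}

/*
 * Returns the protection bits that must be stripped for 'addr'.
 * RW is only forbidden when the kernel has been set read-only, the address
 * lies in the text range, and the current mapping is a large page; a 4K
 * mapping (or no mapping at all) is left alone.
 */
static uint64_t forbidden_bits(bool kernel_set_to_readonly, uint64_t addr,
                               struct text_range text, enum pg_level level)
{
    uint64_t forbidden = 0;

    if (kernel_set_to_readonly && within(addr, text.start, text.end)) {
        if (level != LEVEL_NONE && level != LEVEL_4K)
            forbidden |= PAGE_RW;
    }
    return forbidden;
}
```

With an illustrative range of 0xffffffff81000000 to 0xffffffff81a00000, an address inside it yields PAGE_RW when mapped at LEVEL_2M and 0 when mapped at LEVEL_4K. The second case corresponds to the Xen PV guest described in the commit message, where the shared page tables use small pages and therefore keep their RW permission.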


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel