
Re: [Xen-devel] Debian linux-image-2.6.32-4-xen-amd64 2.6.32-11 doesn't boot with > 4 GiB; resets immediately, no log messages



Hello!

On Fri, Apr 09, 2010 at 11:20:52AM -0700, Jeremy Fitzhardinge wrote:
> On 04/09/2010 11:00 AM, Thomas Schwinge wrote:
> > Before we get to the backtrace, one further detail: this kernel *does*
> > boot if one of the following has happened before: the BIOS memchecker has
> > run, memtest86+ has run, some other kernel has run (though it doesn't
> > always boot in this latter case).  Thus, I wildly guess that some
> > uninitialized data structure (in memory) is dereferenced -- that happens
> > to be in a sane state after memtest86+ et al.
> >   
> 
> OK, I think I see what's happening here...
> 
> >     $ for ip in ffffffff814f6d88 ffffffff81433e38 ffffffff814f6d3d \
> >         ffffffff81433e60 ffffffff815a73ac ffffffff81433f98 ffffffff814f6f85 \
> >         ffffffff8152b2d0 ffffffff814f95fb ffffffff814f8249 ffffffff813f3f5f \
> >         ffffffff813b4119 ffffffff81433f90 ffffffff811ff14f ffffffff8100e361 \
> >         ffffffff8100e343 ffffffff813b4119 ffffffff813f3f5f ffffffff8152a7b0 \
> >         ffffffff814f49d0 ffffffff81001000 ffffffff814f6aca; do \
> >         echo "* $ip:" && \
> >         addr2line -fie debian/build/build_amd64_xen_amd64/vmlinux "$ip"; \
> >       done > ~/shared/tmp/tmp
> >     * ffffffff814f6d88:
> >     xen_release_chunk
> >   
> 
> This is the code which goes through the gaps between the E820 table
> entries looking for pages which Xen has assigned the kernel, but the
> kernel can't use (because they're not covered by E820).  It does this with:
> 
>       for(pfn = start; pfn < end; pfn++) {
>               unsigned long mfn = pfn_to_mfn(pfn);
> 
>               /* Make sure pfn exists to start with */
>               if (mfn == INVALID_P2M_ENTRY || mfn_to_pfn(mfn) != pfn)
>                       continue;
>               ...
> 
> 
> So in theory we're poking at the p2m and m2p tables for random pages
> which may or may not be valid.  So if we do a pfn_to_mfn on a pfn which
> is within the range of valid pfns, but not actually a valid pfn for our
> domain, then the resulting mfn is undefined (and may depend on random
> memory contents, which is why it is affected by what you've previously
> booted).
> 
> We then pass that mfn back to mfn_to_pfn to see if it really does belong
> to us (because it will return the same pfn back).  But it could be
> random garbage, which mfn_to_pfn uses to index an array.
> 
> Normally that would be OK, because it uses:
> 
>       __get_user(pfn, &machine_to_phys_mapping[mfn]);
> 
> to dereference the array.  But at this early stage, none of the kernel's
> exception handlers have been set up, so this will just fault into Xen.
> 
> It would be interesting to confirm this by building your kernel with
> CONFIG_DEBUG_INFO=y in the .config, and verify that the faulting
> instruction is actually this line.

Bingo!

    $ for ip in ffffffff814f6d88 ffffffff81433e38 ffffffff814f6d3d \
        ffffffff81433e60 ffffffff815a73ac ffffffff81433f98 ffffffff814f6f85 \
        ffffffff8152b2d0 ffffffff814f95fb ffffffff814f8249 ffffffff813f3f5f \
        ffffffff813b4119 ffffffff81433f90 ffffffff811ff14f ffffffff8100e361 \
        ffffffff8100e343 ffffffff813b4119 ffffffff813f3f5f ffffffff8152a7b0 \
        ffffffff814f49d0 ffffffff81001000 ffffffff814f6aca ffffffff82fdb000; do \
        echo "* $ip:" && \
        addr2line -fie debian/build/build_amd64_xen_amd64/vmlinux "$ip" && \
        gdb -q --batch --eval-command="x/i 0x$ip" --eval-command="list *0x$ip" \
          debian/build/build_amd64_xen_amd64/vmlinux; \
      done
    * ffffffff814f6d88:
    mfn_to_pfn
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/include/asm/xen/page.h:77
    xen_release_chunk
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/setup.c:63
    0xffffffff814f6d88 <xen_release_chunk+193>: mov    (%rax),%rdx
    0xffffffff814f6d88 is in xen_release_chunk (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/include/asm/xen/page.h:77).
    72          /*
    73           * The array access can fail (e.g., device space beyond end of RAM).
    74           * In such cases it doesn't matter what we return (we return garbage),
    75           * but we must handle the fault without crashing!
    76           */
    77          __get_user(pfn, &machine_to_phys_mapping[mfn]);
    78  
    79          return pfn;
    80  }
    81  
    * ffffffff81433e38:
    _sdata
    ??:0
    0xffffffff81433e38: add    %al,(%rax)
    No source file for address 0xffffffff81433e38.
    * ffffffff814f6d3d:
    pfn_to_mfn
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/include/asm/xen/page.h:50
    xen_release_chunk
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/setup.c:60
    0xffffffff814f6d3d <xen_release_chunk+118>: mov    %rax,%rdx
    0xffffffff814f6d3d is in xen_release_chunk (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/include/asm/xen/page.h:50).
    45  static inline unsigned long pfn_to_mfn(unsigned long pfn)
    46  {
    47          if (xen_feature(XENFEAT_auto_translated_physmap))
    48                  return pfn;
    49  
    50          return get_phys_to_machine(pfn) & ~FOREIGN_FRAME_BIT;
    51  }
    52  
    53  static inline int phys_to_machine_mapping_valid(unsigned long pfn)
    54  {
    * ffffffff81433e60:
    _sdata
    ??:0
    0xffffffff81433e60: add    %al,(%rax)
    No source file for address 0xffffffff81433e60.
    * ffffffff815a73ac:
    idt_table
    ??:0
    0xffffffff815a73ac: add    %al,(%rax)
    No source file for address 0xffffffff815a73ac.
    * ffffffff81433f98:
    _sdata
    ??:0
    0xffffffff81433f98: add    %al,(%rax)
    No source file for address 0xffffffff81433f98.
    * ffffffff814f6f85:
    xen_return_unused_memory
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/setup.c:91
    xen_memory_setup
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/setup.c:173
    0xffffffff814f6f85 <xen_memory_setup+366>:  mov    0x8(%rbx),%rdi
    0xffffffff814f6f85 is in xen_memory_setup (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/setup.c:91).
    86          unsigned long released = 0;
    87          int i;
    88  
    89          for (i = 0; i < e820->nr_map; i++) {
    90                  released += xen_release_chunk(last_end, e820->map[i].addr);
    91                  last_end = e820->map[i].addr + e820->map[i].size;
    92          }
    93  
    94          released += xen_release_chunk(last_end, PFN_PHYS(xen_start_info->nr_pages));
    95  
    * ffffffff8152b2d0:
    ??
    ??:0
    0xffffffff8152b2d0: add    %al,(%rax)
    No source file for address 0xffffffff8152b2d0.
    * ffffffff814f95fb:
    setup_memory_map
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/kernel/e820.c:1463
    0xffffffff814f95fb <setup_memory_map+7>:    mov    $0xffffffff815a8c40,%rdi
    0xffffffff814f95fb is in setup_memory_map (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/kernel/e820.c:1463).
    1458        void __init setup_memory_map(void)
    1459        {
    1460                char *who;
    1461        
    1462                who = x86_init.resources.memory_setup();
    1463                memcpy(&e820_saved, &e820, sizeof(struct e820map));
    1464                printk(KERN_INFO "BIOS-provided physical RAM map:\n");
    1465                e820_print_map(who);
    1466        }
    * ffffffff814f8249:
    setup_arch
    ??:0
    0xffffffff814f8249: cmpw   $0x208,0x34e84(%rip)        # 0xffffffff8152d0d6
    No source file for address 0xffffffff814f8249.
    * ffffffff813f3f5f:
    kallsyms_token_index
    ??:0
    0xffffffff813f3f5f: outsb  %ds:(%rsi),(%dx)
    No source file for address 0xffffffff813f3f5f.
    * ffffffff813b4119:
    kallsyms_token_index
    ??:0
    0xffffffff813b4119: add    %bh,(%rsp,%rsi,1)
    No source file for address 0xffffffff813b4119.
    * ffffffff81433f90:
    _sdata
    ??:0
    0xffffffff81433f90: add    %al,(%rax)
    No source file for address 0xffffffff81433f90.
    * ffffffff811ff14f:
    extract_entropy
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/drivers/char/random.c:865
    0xffffffff811ff14f <extract_entropy+106>:   cmpq   $0x0,0x38(%rbp)
    0xffffffff811ff14f is in extract_entropy (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/drivers/char/random.c:865).
    860         nbytes = account(r, nbytes, min, reserved);
    861 
    862         while (nbytes) {
    863                 extract_buf(r, tmp);
    864 
    865                 if (r->last_data) {
    866                         spin_lock_irqsave(&r->lock, flags);
    867                         if (!memcmp(tmp, r->last_data, EXTRACT_SIZE))
    868                                 panic("Hardware RNG duplicated output!\n");
    869                         memcpy(r->last_data, tmp, EXTRACT_SIZE);
    * ffffffff8100e361:
    __raw_callee_save_xen_irq_disable
    irq.c:0
    0xffffffff8100e361: pop    %r11
    No source file for address 0xffffffff8100e361.
    * ffffffff8100e343:
    __raw_callee_save_xen_restore_fl
    irq.c:0
    0xffffffff8100e343: pop    %r11
    No source file for address 0xffffffff8100e343.
    * ffffffff813b4119:
    kallsyms_token_index
    ??:0
    0xffffffff813b4119: add    %bh,(%rsp,%rsi,1)
    No source file for address 0xffffffff813b4119.
    * ffffffff813f3f5f:
    kallsyms_token_index
    ??:0
    0xffffffff813f3f5f: outsb  %ds:(%rsi),(%dx)
    No source file for address 0xffffffff813f3f5f.
    * ffffffff8152a7b0:
    ??
    ??:0
    0xffffffff8152a7b0: add    %al,(%rax)
    No source file for address 0xffffffff8152a7b0.
    * ffffffff814f49d0:
    setup_command_line
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/init/main.c:395
    start_kernel
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/init/main.c:548
    0xffffffff814f49d0 <start_kernel+219>:      mov    $0xffff880001000000,%rdi
    0xffffffff814f49d0 is in start_kernel (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/init/main.c:395).
    390  * parsing is performed in place, and we should allow a component to
    391  * store reference of name/value for future reference.
    392  */
    393 static void __init setup_command_line(char *command_line)
    394 {
    395         saved_command_line = alloc_bootmem(strlen (boot_command_line)+1);
    396         static_command_line = alloc_bootmem(strlen (command_line)+1);
    397         strcpy (saved_command_line, boot_command_line);
    398         strcpy (static_command_line, command_line);
    399 }
    * ffffffff81001000:
    init_level4_pgt
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/kernel/head_64.S:272
    0xffffffff81001000: movslq (%rax),%esp
    0xffffffff81001000 is at /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/kernel/head_64.S:272.
    267         ENTRY(stack_start)
    268         .quad  init_thread_union+THREAD_SIZE-8
    269         .word  0
    270 
    271 bad_address:
    272         jmp bad_address
    273 
    274         .section ".init.text","ax"
    275 #ifdef CONFIG_EARLY_PRINTK
    276         .globl early_idt_handlers
    * ffffffff814f6aca:
    xen_start_kernel
    /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/enlighten.c:1267
    0xffffffff814f6aca <xen_start_kernel+1609>: add    $0x30,%rsp
    0xffffffff814f6aca is in xen_start_kernel (/media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/source_amd64_xen/arch/x86/xen/enlighten.c:1267).
    1262        #ifdef CONFIG_X86_32
    1263                i386_start_kernel();
    1264        #else
    1265                x86_64_start_reservations((char *)__pa_symbol(&boot_params));
    1266        #endif
    1267        }
    * ffffffff82fdb000:
    ??
    ??:0
    0xffffffff82fdb000: Cannot access memory at address 0xffffffff82fdb000
    No source file for address 0xffffffff82fdb000.
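
The first frame is the m2p lookup at page.h:77, inlined into xen_release_chunk at setup.c:63, which is the line Jeremy suspected. To make that failure mode concrete, here is a minimal userspace sketch of the p2m -> m2p round trip. It is not kernel code and not a proposed fix: the tables, their sizes, and the names INVALID_ENTRY, P2M_ENTRIES, M2P_ENTRIES, pfn_is_ours and count_ours are all made up for illustration. The explicit range check stands in for the fault handling that __get_user would normally provide once the exception handlers are set up.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define INVALID_ENTRY (~0UL)  /* stand-in for INVALID_P2M_ENTRY */
#define P2M_ENTRIES   5UL     /* hypothetical toy table sizes */
#define M2P_ENTRIES   8UL

/*
 * Toy p2m (pfn -> mfn): pfn 2 is a hole, pfn 3 holds garbage,
 * pfn 4 maps to a frame that does not map back to it.
 */
static const unsigned long p2m[P2M_ENTRIES] = {
    3, 5, INVALID_ENTRY, 0xdeadbeefUL, 6
};

/* Toy m2p (mfn -> pfn): only mfn 3 -> pfn 0 and mfn 5 -> pfn 1 point back. */
static const unsigned long m2p[M2P_ENTRIES] = { 9, 9, 9, 0, 9, 1, 9, 9 };

/* Return true only if pfn survives the p2m -> m2p round trip. */
static bool pfn_is_ours(unsigned long pfn)
{
    unsigned long mfn;

    if (pfn >= P2M_ENTRIES)
        return false;

    mfn = p2m[pfn];
    if (mfn == INVALID_ENTRY)
        return false;

    /* A garbage mfn must not be used as an index: nothing here
     * would catch the resulting fault. */
    if (mfn >= M2P_ENTRIES)
        return false;

    return m2p[mfn] == pfn;
}

/* Walk [start, end) the way the release loop does and count valid pfns. */
static unsigned long count_ours(unsigned long start, unsigned long end)
{
    unsigned long pfn, n = 0;

    for (pfn = start; pfn < end; pfn++)
        if (pfn_is_ours(pfn))
            n++;
    return n;
}
```

With these toy tables, count_ours(0, 5) counts only pfns 0 and 1; pfn 3 is rejected by the range check instead of being dereferenced.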


On Fri, Apr 09, 2010 at 02:52:06PM -0400, Konrad Rzeszutek Wilk wrote:
> >     $ gdb -q debian/build/build_amd64_xen_amd64/vmlinux
> >     Reading symbols from /media/data-local/thomas/tmp/linux_xen_/linux-2.6-2.6.32/debian/build/build_amd64_xen_amd64/vmlinux...(no debugging symbols found)...done.
> >     (gdb) x/i ffffffff814f6d88
> 
> You need to prefix it with '0x', so:
>  (gdb) x/i 0xffffffff814f6d88

Indeed.  -- On some days, I should either stay away from the keyboard,
or, even better, start to think some more, as well as read what I've been
typing...  Usually, I do know how to operate GDB.  ;-)


Regards,
 Thomas



 

