Re: [Xen-devel] Debian linux-image-2.6.32-4-xen-amd64 2.6.32-11 doesn't boot with > 4 GiB; resets immediatelly, no log messages
On 04/09/2010 11:00 AM, Thomas Schwinge wrote:
> Before we get to the backtrace, one further detail: this kernel *does*
> boot if one of the following has happened before: the BIOS memchecker has
> run, memtest86+ has run, some other kernel has run (though it doesn't
> always boot in this latter case).  Thus, I wildly guess that some
> uninitialized data structure (in memory) is dereferenced -- that happens
> to be in a sane state after memtest86+ et al.
>

OK, I think I see what's happening here...

> $ for ip in ffffffff814f6d88 ffffffff81433e38 ffffffff814f6d3d
> ffffffff81433e60 ffffffff815a73ac ffffffff81433f98 ffffffff814f6f85
> ffffffff8152b2d0 ffffffff814f95fb ffffffff814f8249 ffffffff813f3f5f
> ffffffff813b4119 ffffffff81433f90 ffffffff811ff14f ffffffff8100e361
> ffffffff8100e343 ffffffff813b4119 ffffffff813f3f5f ffffffff8152a7b0
> ffffffff814f49d0 ffffffff81001000 ffffffff814f6aca; do echo "* $ip:" &&
> addr2line -fie debian/build/build_amd64_xen_amd64/vmlinux "$ip"; done >
> ~/shared/tmp/tmp
> * ffffffff814f6d88:
> xen_release_chunk
>

This is the code which goes through the gaps between the E820 table
entries looking for pages which Xen has assigned the kernel, but the
kernel can't use (because they're not covered by E820).

It does this with:

        for(pfn = start; pfn < end; pfn++) {
                unsigned long mfn = pfn_to_mfn(pfn);

                /* Make sure pfn exists to start with */
                if (mfn == INVALID_P2M_ENTRY || mfn_to_pfn(mfn) != pfn)
                        continue;
                ...

So in theory we're poking at the p2m and m2p tables for random pages
which may or may not be valid.

So if we do a pfn_to_mfn on a pfn which is within the range of valid
pfns, but not actually a valid pfn for our domain, then the resulting
mfn is undefined (and may depend on random memory contents, which is
why it is affected by what you've previously booted).  We then pass
that mfn back to mfn_to_pfn to see if it really does belong to us
(because it will return the same pfn back).
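As a rough illustration of that round-trip check, here is a user-space
model in C.  The table sizes, the table contents and the helper names
pfn_is_ours() and count_owned() are made up for the example;
pfn_to_mfn() and mfn_to_pfn() here are plain array lookups standing in
for the kernel functions of the same name, not the real
implementations.  Both toy tables are fully initialised, so every
lookup stays in bounds.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy sizes, chosen for the example only (assumption). */
#define NR_PFNS 6
#define NR_MFNS 8
#define INVALID_P2M_ENTRY (~0UL)

/* p2m: pseudo-physical frame number -> machine frame number. */
static const unsigned long p2m[NR_PFNS] = {
    5, 2, INVALID_P2M_ENTRY, 7, 0, 3
};

/*
 * m2p: machine frame number -> pseudo-physical frame number.
 * The value 9 marks frames that map to a pfn outside our toy range.
 * mfn 7 names pfn 1, so pfn 3 -> mfn 7 fails the round trip.
 */
static const unsigned long m2p[NR_MFNS] = {
    4, 9, 1, 5, 9, 0, 9, 1
};

/* Stand-in for the kernel's pfn_to_mfn(): a bounds-asserted lookup. */
unsigned long pfn_to_mfn(unsigned long pfn)
{
    assert(pfn < NR_PFNS);
    return p2m[pfn];
}

/* Stand-in for the kernel's mfn_to_pfn(): a bounds-asserted lookup. */
unsigned long mfn_to_pfn(unsigned long mfn)
{
    assert(mfn < NR_MFNS);
    return m2p[mfn];
}

/* Hypothetical helper: the same test as in the quoted loop. */
bool pfn_is_ours(unsigned long pfn)
{
    unsigned long mfn = pfn_to_mfn(pfn);

    /* Make sure pfn exists to start with */
    if (mfn == INVALID_P2M_ENTRY || mfn_to_pfn(mfn) != pfn)
        return false;
    return true;
}

/* Hypothetical helper: count pfns in [start, end) that pass the check. */
unsigned long count_owned(unsigned long start, unsigned long end)
{
    unsigned long pfn, n = 0;

    for (pfn = start; pfn < end; pfn++) {
        if (!pfn_is_ours(pfn))
            continue;
        n++;
    }
    return n;
}
```

With these tables, count_owned(0, NR_PFNS) returns 4: pfns 0, 1, 4 and
5 pass the round trip, pfn 2 has no p2m entry, and pfn 3 maps to an mfn
whose m2p entry names pfn 1.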
But it could be random garbage, which mfn_to_pfn uses to index an
array.  Normally that would be OK, because it uses:

        __get_user(pfn, &machine_to_phys_mapping[mfn]);

to dereference the array.  But at this early stage, none of the
kernel's exception handlers have been set up, so this will just fault
into Xen.

It would be interesting to confirm this by building your kernel with
CONFIG_DEBUG_INFO=y in the .config, and verifying that the faulting
instruction is actually this line.

Thanks,
    J

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel