[PATCH] Re: [Xen-devel] x86-64 problem with invalid page fault in linux 2.6.16-rc1



Hi,

On Fri, 2006-01-20 at 17:02 +0000, Keir Fraser wrote:
> On SMP systems we need the guest to handle spurious page-not-present 
> faults at any time and at any virtual address. This is a side effect of 
> the writable pagetable implementation.
> 
> If the vmalloc_fault path no longer covers all of the kernel virtual 
> address space then a spurious-fault detection needs to be added before 
> oops'ing the kernel.

OK, I've been seeing precisely the same symptoms as Jan on current
xen-unstable HV+dom0.  And with the Fedora kernel build being so
modular, this makes it completely impossible to boot on a 64-bit SMP
box.  But it looks like there's an easy fix.

The x86_64 vmalloc_fault() path already does a soft pagetable walk to
detect whether a fault is genuine or spurious.  It only does that for
the vmalloc area, though; if we can get spurious faults in the module
area as well, then the same test probably needs to be applied there.
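
As a rough illustration of the widened check (a standalone userspace
sketch, not the kernel code itself): the predicate below decides whether
a faulting address should be handed to the soft pagetable walk.  The
PF_* values are the architectural x86 error-code bits; the EX_* range
bounds and the helper names in_range(), wants_soft_walk() and
example_checks() are assumptions made up for this example, standing in
for whatever the kernel headers actually define.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* x86 page-fault error code bits (architectural). */
#define PF_PROT  (1u << 0)  /* fault on a present page (protection) */
#define PF_WRITE (1u << 1)  /* access was a write */
#define PF_USER  (1u << 2)  /* access came from user mode */
#define PF_RSVD  (1u << 3)  /* reserved bit set in a pagetable entry */

/* ASSUMPTION: placeholder bounds for illustration only; the real
 * values come from the kernel's pgtable headers. */
#define EX_VMALLOC_START 0xffffc20000000000ULL
#define EX_VMALLOC_END   0xffffe1ffffffffffULL
#define EX_MODULES_VADDR 0xffffffff88000000ULL
#define EX_MODULES_END   0xfffffffffff00000ULL

/* Half-open range test: start <= addr < end. */
static bool in_range(uint64_t addr, uint64_t start, uint64_t end)
{
    return addr >= start && addr < end;
}

/* Hypothetical helper: true if the fault is a kernel-mode
 * not-present fault inside the vmalloc or module area, i.e. a
 * candidate for the soft pagetable walk. */
static bool wants_soft_walk(uint64_t address, unsigned int error_code)
{
    if (error_code & (PF_RSVD | PF_USER | PF_PROT))
        return false;
    return in_range(address, EX_VMALLOC_START, EX_VMALLOC_END) ||
           in_range(address, EX_MODULES_VADDR, EX_MODULES_END);
}

/* A few sanity checks showing the intended behaviour. */
static void example_checks(void)
{
    assert(wants_soft_walk(EX_VMALLOC_START, 0));
    assert(wants_soft_walk(EX_MODULES_VADDR + 0x1000, PF_WRITE));
    assert(!wants_soft_walk(EX_MODULES_VADDR, PF_PROT));
    assert(!wants_soft_walk(EX_MODULES_END, 0));
}
```

Before the change, only the first in_range() test would apply, so a
not-present fault on a module address would fall straight through to
the oops path.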

And indeed, the attached patch fixes the problem entirely for me (so
far, at least.)

--Stephen


# HG changeset patch
# User sct@xxxxxxxxxxxxxxxxxxxxx
# Node ID 5ff3cb1a144d471c2993567edf07ea78f1599846
# Parent  74e2a7b3fa02aad4788cd35b1ef62147cda464f2
We need to handle spurious page faults in the kernel's module VA range
in order to avoid OOPSing on modular x86_64 SMP builds.

Signed-off-by: Stephen Tweedie <sct@xxxxxxxxxx>

diff -r 74e2a7b3fa02 -r 5ff3cb1a144d linux-2.6-xen-sparse/arch/x86_64/mm/fault-xen.c
--- a/linux-2.6-xen-sparse/arch/x86_64/mm/fault-xen.c   Wed Feb 15 12:21:32 2006 -0500
+++ b/linux-2.6-xen-sparse/arch/x86_64/mm/fault-xen.c   Wed Feb 15 15:02:02 2006 -0500
@@ -367,12 +367,14 @@ asmlinkage void __kprobes do_page_fault(
         */
        if (unlikely(address >= TASK_SIZE64)) {
                /*
-                * Don't check for the module range here: its PML4
+                * Even the module range needs checked here: its PML4
                 * is always initialized because it's shared with the main
-                * kernel text. Only vmalloc may need PML4 syncups.
+                * kernel text, but the writable pagetable code can still
+                * result in spurious faults.
                 */
                if (!(error_code & (PF_RSVD|PF_USER|PF_PROT)) &&
-                     ((address >= VMALLOC_START && address < VMALLOC_END))) {
+                   ((address >= VMALLOC_START && address < VMALLOC_END) ||
+                    (address >= MODULES_VADDR && address < MODULES_END))) {
                        if (vmalloc_fault(address) < 0)
                                goto bad_area_nosemaphore;
                        return;