
[Xen-devel] [PATCH] Fixing PAE SMP dom0 hang at boot time (take 2)


  • To: "xen-devel" <xen-devel@xxxxxxxxxxxxxxxxxxx>
  • From: "Nakajima, Jun" <jun.nakajima@xxxxxxxxx>
  • Date: Wed, 16 Nov 2005 22:46:05 -0800
  • Delivery-date: Thu, 17 Nov 2005 06:46:10 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xensource.com>
  • Thread-index: AcXqzIWO7h3zr+VaTOepFJieMYqifgAAFuTgAByUl1A=
  • Thread-topic: [Xen-devel] [PATCH] Fixing PAE SMP dom0 hang at boot time (take 2)

Nakajima, Jun wrote:
>> And why would we need to take interrupts between loading esp0 and
>> LDT? 
>> 
>>         load_esp0(t, thread);
>> 
>> +       local_irq_enable();
>> +
>>         load_LDT(&init_mm.context);
> 
> I thought it was required to get IPIs working (for load_LDT and the
> other on-going TLB flush activities), but it looks bogus after
> sleeping on it. I'm pretty sure that it resolves the hang, but it's
> hiding an underlying bug.
> 

I've finally root-caused it. It's much deeper than I expected...
Here is what's happening:

void arch_do_createdomain(struct vcpu *v)
{
    ...
    l1_pgentry_t gdt_l1e;
    ...
    d->arch.mm_perdomain_pt = alloc_xenheap_page();
    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE);
    ...

    for ( vcpuid = 0; vcpuid < MAX_VIRT_CPUS; vcpuid++ )
        d->arch.mm_perdomain_pt[
            (vcpuid << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE] = gdt_l1e;

The max value of (vcpuid << PDPT_VCPU_SHIFT) + FIRST_RESERVED_GDT_PAGE
is 1006 (< 1024), but each entry is 8 bytes for PAE (and x86_64), so
alloc_xenheap_page() (i.e. a single page) was not sufficient, and the
loop corrupted the next page, which holds the vcpu_info areas
containing evtchn_upcall_pending for the vcpus. On my machine that hit
vcpu 7 (and 23). At load_LDT we check for pending events in
hypercall_preempt_check(), and the bit was already set for vcpu 7, but
it is never cleared after hypercall4_create_continuation() because
nobody actually set such an event... So it was looping there.
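
To put numbers on the overflow, here is a minimal standalone sketch of
the arithmetic (not Xen code; MAX_VIRT_CPUS = 32, PDPT_VCPU_SHIFT = 5
and FIRST_RESERVED_GDT_PAGE = 14 are assumed values, chosen only
because they reproduce the max index of 1006 above):

/* Standalone sketch; the constants are assumed, not taken from Xen headers. */
#include <stdio.h>

#define MAX_VIRT_CPUS            32
#define PDPT_VCPU_SHIFT          5
#define FIRST_RESERVED_GDT_PAGE  14
#define PAGE_SIZE                4096
#define PTE_SIZE                 8   /* sizeof(l1_pgentry_t) on PAE/x86_64 */

int main(void)
{
    unsigned int max_idx = ((MAX_VIRT_CPUS - 1) << PDPT_VCPU_SHIFT)
                           + FIRST_RESERVED_GDT_PAGE;       /* 1006 */
    unsigned int bytes   = (max_idx + 1) * PTE_SIZE;         /* 8056 */

    printf("max index %u needs %u bytes, but one xenheap page is %u\n",
           max_idx, bytes, PAGE_SIZE);
    /* 8056 > 4096: entries with index >= 512 land in the next page,
     * which holds the vcpu_info areas (evtchn_upcall_pending). */
    return 0;
}

Either way the table needs two pages, which is what the patch below
allocates.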

int do_mmuext_op(
    struct mmuext_op *uops,
    ...
{
    ...

    for ( i = 0; i < count; i++ )
    {
        if ( hypercall_preempt_check() )
        {
            rc = hypercall4_create_continuation(
                __HYPERVISOR_mmuext_op, uops,
                (count - i) | MMU_UPDATE_PREEMPTED, pdone, foreigndom);
            break;
        }
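
As an illustration only (a toy model, not Xen code), the failure mode
boils down to this: the check at the top of the loop always fires
because the stray bit is never cleared, so the hypercall keeps turning
itself into a continuation without ever processing a single op:

/* Toy model of the hang (not Xen code). */
#include <stdio.h>

static int stray_upcall_pending = 1;   /* the corrupted vcpu_info bit */

static int preempt_check(void)
{
    return stray_upcall_pending;       /* always true: nobody clears it */
}

static int mmuext_op_model(int count, int *restarts)
{
    for (int i = 0; i < count; i++) {
        if (preempt_check()) {
            (*restarts)++;             /* "create continuation", retry later */
            return 1;                  /* needs to be re-entered */
        }
        /* ... would process op i here, but this is never reached ... */
    }
    return 0;
}

int main(void)
{
    int restarts = 0;
    /* Bounded here for demonstration; in the real hang this never ends. */
    while (mmuext_op_model(4, &restarts) && restarts < 5)
        ;
    printf("re-entered %d times without completing a single op\n", restarts);
    return 0;
}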

Signed-off-by: Jun Nakajima <jun.nakajima@xxxxxxxxx>

----
diff -r 9c7aeec94f8a xen/arch/x86/domain.c
--- a/xen/arch/x86/domain.c     Tue Nov 15 19:46:48 2005 +0100
+++ b/xen/arch/x86/domain.c     Wed Nov 16 23:23:44 2005 -0700
@@ -252,6 +252,8 @@
     struct domain *d = v->domain;
     l1_pgentry_t gdt_l1e;
     int vcpuid;
+    physaddr_t size;
+    int order;
 
     if ( is_idle_task(d) )
         return;
@@ -265,9 +267,11 @@
     SHARE_PFN_WITH_DOMAIN(virt_to_page(d->shared_info), d);
     set_pfn_from_mfn(virt_to_phys(d->shared_info) >> PAGE_SHIFT,
             INVALID_M2P_ENTRY);
-
-    d->arch.mm_perdomain_pt = alloc_xenheap_page();
-    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE);
+    size = ((((MAX_VIRT_CPUS - 1) << PDPT_VCPU_SHIFT) 
+             + FIRST_RESERVED_GDT_PAGE) * sizeof (l1_pgentry_t));
+    order = get_order_from_bytes(size);
+    d->arch.mm_perdomain_pt = alloc_xenheap_pages(order);
+    memset(d->arch.mm_perdomain_pt, 0, PAGE_SIZE << order);
     set_pfn_from_mfn(virt_to_phys(d->arch.mm_perdomain_pt) >> PAGE_SHIFT,
             INVALID_M2P_ENTRY);
     v->arch.perdomain_ptes = d->arch.mm_perdomain_pt;
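
(For reference, with the numbers above the computed size on PAE/x86_64
is 1006 * 8 = 8048 bytes, so get_order_from_bytes() should return
order 1 and the table now spans two pages, keeping the GDT entries out
of the vcpu_info page.)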

  

Jun
---
Intel Open Source Technology Center

Attachment: pae_smp_2.patch
Description: pae_smp_2.patch

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel

 

