
Re: [Xen-devel] [RFC][PATCH] walking the page lists needs the page_alloc lock



At 14:49 +0100 on 23 Jul (1279896553), Tim Deegan wrote:
> There are a few places in Xen where we walk a domain's page lists
> without holding the page_alloc lock.  They race with updates to the page
> lists, which are normally rare but can be quite common under PoD when
> the domain is close to its memory limit and the PoD reclaimer is busy.
> This patch protects those places by taking the page_alloc lock.
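
The shape of the fix is the same at each site: bracket the walk with the
lock.  Roughly, as a minimal sketch rather than verbatim code from the
tree (inspect() is just a placeholder for the per-page work):

    struct page_info *page;

    spin_lock(&d->page_alloc_lock);
    page_list_for_each ( page, &d->page_list )
    {
        /* Read-only inspection only: nothing in here may sleep,
         * re-enter the allocator, or take the lock again. */
        inspect(page);   /* placeholder */
    }
    spin_unlock(&d->page_alloc_lock);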

I should say that the other place I found is in construct_dom0(), which
I left alone because it (a) can't really race with allocations at that
point in boot and (b) calls process_pending_softirqs() within the
page_list_for_each(), so it couldn't safely hold a spinlock across the
walk anyway (see the sketch below).
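
In outline, that loop looks like this (a sketch, not the actual
construct_dom0() code):

    page_list_for_each ( page, &d->page_list )
    {
        /* ... set up dom0's mappings for this page ... */

        /* Keep softirqs serviced during this long boot-time walk.
         * Softirq handlers may themselves allocate or free pages,
         * so the walk can't be done under d->page_alloc_lock. */
        process_pending_softirqs();
    }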

Tim.

> I think this is OK for the two debug-key printouts - they don't run from
> irq context and look deadlock-free.  The tboot change seems safe too
> unless tboot shutdown functions are called from irq context or with the
> page_alloc lock held.  The p2m one is the scariest but there are already
> code paths in PoD that take the page_alloc lock with the p2m lock held
> so it's no worse than existing code. 
> 
> Signed-off-by: Tim Deegan <Tim.Deegan@xxxxxxxxxx>
> 
> diff -r e8dbc1262f52 xen/arch/x86/domain.c
> --- a/xen/arch/x86/domain.c   Wed Jul 21 09:02:10 2010 +0100
> +++ b/xen/arch/x86/domain.c   Fri Jul 23 14:33:22 2010 +0100
> @@ -139,12 +139,14 @@ void dump_pageframe_info(struct domain *
>      }
>      else
>      {
> +        spin_lock(&d->page_alloc_lock);
>          page_list_for_each ( page, &d->page_list )
>          {
>              printk("    DomPage %p: caf=%08lx, taf=%" PRtype_info "\n",
>                     _p(page_to_mfn(page)),
>                     page->count_info, page->u.inuse.type_info);
>          }
> +        spin_unlock(&d->page_alloc_lock);
>      }
>  
>      if ( is_hvm_domain(d) )
> @@ -152,12 +154,14 @@ void dump_pageframe_info(struct domain *
>          p2m_pod_dump_data(d);
>      }
>  
> +    spin_lock(&d->page_alloc_lock);
>      page_list_for_each ( page, &d->xenpage_list )
>      {
>          printk("    XenPage %p: caf=%08lx, taf=%" PRtype_info "\n",
>                 _p(page_to_mfn(page)),
>                 page->count_info, page->u.inuse.type_info);
>      }
> +    spin_unlock(&d->page_alloc_lock);
>  }
>  
>  struct domain *alloc_domain_struct(void)
> diff -r e8dbc1262f52 xen/arch/x86/mm/p2m.c
> --- a/xen/arch/x86/mm/p2m.c   Wed Jul 21 09:02:10 2010 +0100
> +++ b/xen/arch/x86/mm/p2m.c   Fri Jul 23 14:33:22 2010 +0100
> @@ -1833,6 +1833,7 @@ int p2m_alloc_table(struct domain *d,
>          goto error;
>  
>      /* Copy all existing mappings from the page list and m2p */
> +    spin_lock(&d->page_alloc_lock);
>      page_list_for_each(page, &d->page_list)
>      {
>          mfn = page_to_mfn(page);
> @@ -1848,13 +1849,16 @@ int p2m_alloc_table(struct domain *d,
>  #endif
>               && gfn != INVALID_M2P_ENTRY
>              && !set_p2m_entry(d, gfn, mfn, 0, p2m_ram_rw) )
> -            goto error;
> +            goto error_unlock;
>      }
> +    spin_unlock(&d->page_alloc_lock);
>  
>      P2M_PRINTK("p2m table initialised (%u pages)\n", page_count);
>      p2m_unlock(p2m);
>      return 0;
>  
> +error_unlock:
> +    spin_unlock(&d->page_alloc_lock);
>   error:
>      P2M_PRINTK("failed to initialize p2m table, gfn=%05lx, mfn=%"
>                 PRI_mfn "\n", gfn, mfn_x(mfn));
> diff -r e8dbc1262f52 xen/arch/x86/numa.c
> --- a/xen/arch/x86/numa.c     Wed Jul 21 09:02:10 2010 +0100
> +++ b/xen/arch/x86/numa.c     Fri Jul 23 14:33:22 2010 +0100
> @@ -385,11 +385,13 @@ static void dump_numa(unsigned char key)
>               for_each_online_node(i)
>                       page_num_node[i] = 0;
>  
> +             spin_lock(&d->page_alloc_lock);
>               page_list_for_each(page, &d->page_list)
>               {
>                       i = phys_to_nid((paddr_t)page_to_mfn(page) << PAGE_SHIFT);
>                       page_num_node[i]++;
>               }
> +             spin_unlock(&d->page_alloc_lock);
>  
>               for_each_online_node(i)
>                       printk("    Node %u: %u\n", i, page_num_node[i]);
> diff -r e8dbc1262f52 xen/arch/x86/tboot.c
> --- a/xen/arch/x86/tboot.c    Wed Jul 21 09:02:10 2010 +0100
> +++ b/xen/arch/x86/tboot.c    Fri Jul 23 14:33:22 2010 +0100
> @@ -211,12 +211,14 @@ static void tboot_gen_domain_integrity(c
>              continue;
>          printk("MACing Domain %u\n", d->domain_id);
>  
> +        spin_lock(&d->page_alloc_lock);
>          page_list_for_each(page, &d->page_list)
>          {
>              void *pg = __map_domain_page(page);
>              vmac_update(pg, PAGE_SIZE, &ctx);
>              unmap_domain_page(pg);
>          }
> +        spin_unlock(&d->page_alloc_lock);
>  
>          if ( !is_idle_domain(d) )
>          {
> 
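
For anyone worried about the p2m change: the nesting order it relies on
is the one PoD already uses - p2m lock outside, page_alloc lock inside.
Schematically (a sketch of the ordering, not code from the tree):

    p2m_lock(p2m);                   /* outer lock */
    spin_lock(&d->page_alloc_lock);  /* inner lock, as in the PoD paths */
    /* ... walk or update the domain's page lists ... */
    spin_unlock(&d->page_alloc_lock);
    p2m_unlock(p2m);

Taking the two in the opposite order anywhere would be an ABBA deadlock,
but the patch only ever takes page_alloc inside p2m.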

-- 
Tim Deegan <Tim.Deegan@xxxxxxxxxx>
Principal Software Engineer, XenServer Engineering
Citrix Systems UK Ltd.  (Company #02937203, SL9 0BG)

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel