
Re: [Xen-devel] [PATCH 5/5] just realized that it's broken



>>> "Jan Beulich" <jbeulich@xxxxxxxxxx> 30.01.09 11:48 >>>
>This patch was moving the cpumask out of struct page_info into the page
>itself, but I just realized that this is not valid, since the purpose of the
>mask is to avoid a later TLB flush, hence the base assumption is that the
>page might still be in the TLB of some CPU, thus making it possible for a
>misbehaved guest to still write to that page.
>
>Sorry for the mis-numbering,
>Jan

Actually, after some more consideration I think the patch is correct after
all (minus an issue that has already been in place for a while anyway): any
access a domain may continue to have to a page it is freeing (through stale
TLB entries) is read/execute only. This is because when the type count of a
page (in particular a writeable one) drops to zero, a (domain-)global TLB
flush is performed anyway. Allowing the freeing domain (as well as the one
subsequently allocating) to read the cpumask does not present a problem
in my opinion.
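
As background for why read access is harmless: the mask stored in the freed
page is, together with the page's tlbflush_timestamp, only consulted by the
allocator to decide which CPUs still need a TLB flush before the page is
handed out again. Below is a minimal, self-contained model of that filtering
- an assumption-laden simplification of Xen's tlbflush_clock scheme (ignoring
clock wrap), not the real code.

/*
 * Simplified model of the tlbflush_timestamp filtering that the freed
 * page's cpumask feeds into.  Illustrative sketch only, not Xen code.
 */
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS 4

static unsigned int tlbflush_clock = 1;        /* global flush "clock"     */
static unsigned int cpu_flush_time[NR_CPUS];   /* last full flush, per CPU */

/* A CPU can be skipped if it has done a full flush since the page was freed. */
static bool cpu_needs_flush(unsigned int cpu, unsigned int page_stamp)
{
    return cpu_flush_time[cpu] <= page_stamp;
}

int main(void)
{
    unsigned int page_stamp = tlbflush_clock;  /* page freed "now"         */

    cpu_flush_time[0] = ++tlbflush_clock;      /* CPU0 flushes afterwards  */

    for ( unsigned int cpu = 0; cpu < NR_CPUS; cpu++ )
        printf("CPU%u still needs flushing before reuse: %s\n",
               cpu, cpu_needs_flush(cpu, page_stamp) ? "yes" : "no");
    return 0;
}

A guest that can merely read this data through a stale read-only or
executable mapping cannot cause a CPU to be skipped; only a write to the
mask could, which is exactly what the argument above rules out.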

The issue mentioned above, which I think the current code has even without
that patch (though I may easily have missed some aspect here), is that
there is no enforcement of an immediate TLB flush when a writeable
page's type count drops to zero - instead the flush is deferred until the
end of the current operation or batch. With the removal of the per-domain
lock from these code paths it is no longer valid to defer the flush this
much - it must be carried out before the page in question gets unlocked.
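
Expressed as code, the ordering being argued for would roughly be the
following - a hedged, non-compilable sketch that borrows the
flush_tlb_mask()/page_unlock() names from xen/arch/x86/mm.c, not a proposed
change:

/*
 * Sketch only: surrounding logic elided, names follow xen/arch/x86/mm.c.
 */
static void put_final_writable_type(struct page_info *page, struct domain *d)
{
    /* Last writable type reference is being dropped ... */

    /*
     * ... so the TLB flush must happen here, while the page is still
     * locked, rather than being queued until the end of the current
     * operation or batch: without the per-domain lock, another CPU may
     * otherwise re-type or re-use the page before the deferred flush runs.
     */
    flush_tlb_mask(d->domain_dirty_cpumask);

    page_unlock(page);
}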

And I would think that invalidate_shadow_ldt() has a similar issue, plus
it seems questionable that it does a local flush when acting on the current
vCPU, but a global one for all others.
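
For reference, the asymmetry in question has roughly this shape (a
paraphrase of the tail of invalidate_shadow_ldt() in xen/arch/x86/mm.c, not
a verbatim quote; the exact mask passed in the second case is assumed):

    /* Paraphrased shape of the flush at the end of invalidate_shadow_ldt(). */
    if ( v == current )
        flush_tlb_local();                               /* this CPU only  */
    else
        flush_tlb_mask(v->domain->domain_dirty_cpumask); /* all dirty CPUs */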

Jan

*******************************************************

Move the (potentially large) cpumask field of free pages into the page
itself, thus making sizeof(struct page_info) independent of the
configured number of CPUs, which avoids penalizing systems with few
CPUs just because the hypervisor is able to support many.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxxxx>
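
To put a rough number on the saving (an illustrative calculation with
assumed figures, not part of the patch): with a hypothetical NR_CPUS of
4096 and 4KiB pages, a cpumask_t is 512 bytes, so keeping one in every
struct page_info costs on the order of 128 MiB of frame-table space per
GiB of RAM, regardless of how many CPUs the machine actually has:

/* Back-of-the-envelope illustration only -- NR_CPUS and the page size are
 * assumptions, and this is not Xen code. */
#include <stdio.h>

#define NR_CPUS       4096                    /* hypothetical large build */
#define PAGE_SIZE     4096
#define BITS_PER_LONG (8 * sizeof(unsigned long))

int main(void)
{
    /* cpumask_t is one bit per possible CPU, rounded up to whole longs. */
    size_t mask_bytes = (NR_CPUS + BITS_PER_LONG - 1) / BITS_PER_LONG
                        * sizeof(unsigned long);
    size_t pages_per_gib = (1UL << 30) / PAGE_SIZE;

    printf("cpumask_t: %zu bytes\n", mask_bytes);
    printf("frame-table overhead: ~%zu MiB per GiB of RAM\n",
           mask_bytes * pages_per_gib >> 20);
    return 0;
}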

--- 2009-01-27.orig/xen/common/page_alloc.c     2009-01-29 15:27:38.000000000 +0100
+++ 2009-01-27/xen/common/page_alloc.c  2009-01-29 15:33:09.000000000 +0100
@@ -376,7 +376,7 @@ static struct page_info *alloc_heap_page
         BUG_ON(pg[i].count_info != 0);
 
         /* Add in any extra CPUs that need flushing because of this page. */
-        cpus_andnot(extra_cpus_mask, pg[i].u.free.cpumask, mask);
+        cpus_andnot(extra_cpus_mask, PFN_CPUMASK(&pg[i]), mask);
         tlbflush_filter(extra_cpus_mask, pg[i].tlbflush_timestamp);
         cpus_or(mask, mask, extra_cpus_mask);
 
@@ -425,11 +425,11 @@ static void free_heap_pages(
         if ( (d = page_get_owner(&pg[i])) != NULL )
         {
             pg[i].tlbflush_timestamp = tlbflush_current_time();
-            pg[i].u.free.cpumask     = d->domain_dirty_cpumask;
+            PFN_CPUMASK(&pg[i])      = d->domain_dirty_cpumask;
         }
         else
         {
-            cpus_clear(pg[i].u.free.cpumask);
+            cpus_clear(PFN_CPUMASK(&pg[i]));
         }
     }
 
--- 2009-01-27.orig/xen/include/asm-ia64/mm.h   2009-01-30 08:26:33.000000000 +0100
+++ 2009-01-27/xen/include/asm-ia64/mm.h        2009-01-29 14:24:42.000000000 +0100
@@ -35,8 +35,11 @@ typedef unsigned long page_flags_t;
  * Every architecture must ensure the following:
  *  1. 'struct page_info' contains a 'struct list_head list'.
  *  2. Provide a PFN_ORDER() macro for accessing the order of a free page.
+ *  3. Provide a PFN_CPUMASK() macro for accessing the mask of possibly-tainted
+ *     TLBs.
  */
-#define PFN_ORDER(_pfn)        ((_pfn)->u.free.order)
+#define PFN_ORDER(_pfn)   ((_pfn)->u.free.order)
+#define PFN_CPUMASK(_pfn) ((_pfn)->u.free.cpumask)
 
 #define PRtype_info "016lx"
 
--- 2009-01-27.orig/xen/include/asm-x86/mm.h    2009-01-29 14:03:37.000000000 +0100
+++ 2009-01-27/xen/include/asm-x86/mm.h 2009-01-30 08:28:27.000000000 +0100
@@ -14,8 +14,18 @@
  * Every architecture must ensure the following:
  *  1. 'struct page_info' contains a 'struct page_list_entry list'.
  *  2. Provide a PFN_ORDER() macro for accessing the order of a free page.
+ *  3. Provide a PFN_CPUMASK() macro for accessing the mask of possibly-tainted
+ *     TLBs.
  */
 #define PFN_ORDER(_pfn) ((_pfn)->v.free.order)
+#ifdef __i386__
+# define PFN_CPUMASK(_pfn) ((_pfn)->u.free.cpumask)
+#else
+# define PFN_CPUMASK(_pfn) (*(cpumask_t *)page_to_virt(_pfn))
+# if NR_CPUS > PAGE_SIZE * 8
+#  error Cannot handle this many CPUs.
+# endif
+#endif
 
 /*
  * This definition is solely for the use in struct page_info (and
@@ -72,8 +82,10 @@ struct page_info
 
         /* Page is on a free list: ((count_info & PGC_count_mask) == 0). */
         struct {
+#ifdef __i386__
             /* Mask of possibly-tainted TLBs. */
             cpumask_t cpumask;
+#endif
         } free;
 
     } u;


