[Xen-devel] [PATCHv10 3/4] gnttab: make the grant table lock a read-write lock
In combination with the per-active entry locks, the grant table lock
can be made a read-write lock since in the majority of cases only the
read lock is required. The grant table read lock protects against
changes to the table version or size (which are done with the write
lock held).

The write lock is also required when two active entries must be
acquired.

The double lock is still required when updating IOMMU page tables.

With the lock contention being only on the maptrack lock (unless IOMMU
updates are required), performance and scalability are improved.

Based on a patch originally by Matt Wilson <msw@xxxxxxxxxx>.

Signed-off-by: David Vrabel <david.vrabel@xxxxxxxxxx>
---
v10:
- In gnttab_map_grant_ref(), keep double lock around maptrack update
  if gnttab_need_iommu_mapping(). Use a wmb(), otherwise.
---
 docs/misc/grant-tables.txt    |   30 ++++----
 xen/arch/arm/mm.c             |    4 +-
 xen/arch/x86/mm.c             |    4 +-
 xen/common/grant_table.c      |  156 +++++++++++++++++++++++------------------
 xen/include/xen/grant_table.h |    9 ++-
 5 files changed, 114 insertions(+), 89 deletions(-)

diff --git a/docs/misc/grant-tables.txt b/docs/misc/grant-tables.txt
index 83b3454..9d9d01f 100644
--- a/docs/misc/grant-tables.txt
+++ b/docs/misc/grant-tables.txt
@@ -83,7 +83,7 @@ is complete.
 ~~~~~~~
 Xen uses several locks to serialize access to the internal grant table state.
 
-  grant_table->lock          : lock used to prevent readers from accessing
+  grant_table->lock          : rwlock used to prevent readers from accessing
                                inconsistent grant table state such as current
                                version, partially initialized active table pages,
                                etc.
@@ -91,9 +91,13 @@ is complete.
   active_grant_entry->lock   : spinlock used to serialize modifications to
                                active entries
 
- The primary lock for the grant table is a spinlock. All functions
- that access members of struct grant_table must acquire the lock
- around critical sections.
+ The primary lock for the grant table is a read/write spinlock. All
+ functions that access members of struct grant_table must acquire a
+ read lock around critical sections. Any modification to the members
+ of struct grant_table (e.g., nr_status_frames, nr_grant_frames,
+ active frames, etc.) must only be made if the write lock is
+ held. These elements are read-mostly, and read critical sections can
+ be large, which makes a rwlock a good choice.
 
  The maptrack state is protected by its own spinlock. Any access (read
  or write) of struct grant_table members that have a "maptrack_"
@@ -105,25 +109,25 @@ is complete.
 
  Active entries are obtained by calling active_entry_acquire(gt, ref).
  This function returns a pointer to the active entry after locking its
- spinlock. The caller must hold the grant table lock for the gt in
- question before calling active_entry_acquire(). This is because the
- grant table can be dynamically extended via gnttab_grow_table() while
- a domain is running and must be fully initialized. Once all access to
- the active entry is complete, release the lock by calling
+ spinlock. The caller must hold the grant table read lock before
+ calling active_entry_acquire(). This is because the grant table can
+ be dynamically extended via gnttab_grow_table() while a domain is
+ running and must be fully initialized. Once all access to the active
+ entry is complete, release the lock by calling
  active_entry_release(act).
 
  Summary of rules for locking:
   active_entry_acquire() and active_entry_release() can only be
-  called when holding the relevant grant table's lock. I.e.:
-    spin_lock(&gt->lock);
+  called when holding the relevant grant table's read lock. I.e.:
+    read_lock(&gt->lock);
     act = active_entry_acquire(gt, ref);
     ...
     active_entry_release(act);
-    spin_unlock(&gt->lock);
+    read_unlock(&gt->lock);
 
  Active entries cannot be acquired while holding the maptrack lock.
  Multiple active entries can be acquired while holding the grant table
- lock.
+ _write_ lock.
 
 ********************************************************************************
 
diff --git a/xen/arch/arm/mm.c b/xen/arch/arm/mm.c
index a91ea77..b305041 100644
--- a/xen/arch/arm/mm.c
+++ b/xen/arch/arm/mm.c
@@ -1048,7 +1048,7 @@ int xenmem_add_to_physmap_one(
     switch ( space )
     {
     case XENMAPSPACE_grant_table:
-        spin_lock(&d->grant_table->lock);
+        write_lock(&d->grant_table->lock);
 
         if ( d->grant_table->gt_version == 0 )
             d->grant_table->gt_version = 1;
@@ -1078,7 +1078,7 @@ int xenmem_add_to_physmap_one(
 
         t = p2m_ram_rw;
 
-        spin_unlock(&d->grant_table->lock);
+        write_unlock(&d->grant_table->lock);
         break;
     case XENMAPSPACE_shared_info:
         if ( idx != 0 )
diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
index 5fe08df..8ca38b0 100644
--- a/xen/arch/x86/mm.c
+++ b/xen/arch/x86/mm.c
@@ -4593,7 +4593,7 @@ int xenmem_add_to_physmap_one(
                 mfn = virt_to_mfn(d->shared_info);
             break;
         case XENMAPSPACE_grant_table:
-            spin_lock(&d->grant_table->lock);
+            write_lock(&d->grant_table->lock);
 
             if ( d->grant_table->gt_version == 0 )
                 d->grant_table->gt_version = 1;
@@ -4615,7 +4615,7 @@ int xenmem_add_to_physmap_one(
                     mfn = virt_to_mfn(d->grant_table->shared_raw[idx]);
             }
 
-            spin_unlock(&d->grant_table->lock);
+            write_unlock(&d->grant_table->lock);
             break;
         case XENMAPSPACE_gmfn_range:
         case XENMAPSPACE_gmfn:
diff --git a/xen/common/grant_table.c b/xen/common/grant_table.c
index b8c7a40..d67b7f4 100644
--- a/xen/common/grant_table.c
+++ b/xen/common/grant_table.c
@@ -196,7 +196,7 @@ active_entry_acquire(struct grant_table *t, grant_ref_t e)
 {
     struct active_grant_entry *act;
 
-    ASSERT(spin_is_locked(&t->lock));
+    ASSERT(rw_is_locked(&t->lock));
 
     act = &_active_entry(t, e);
     spin_lock(&act->lock);
@@ -254,23 +254,23 @@ double_gt_lock(struct grant_table *lgt, struct grant_table *rgt)
 {
     if ( lgt < rgt )
     {
-        spin_lock(&lgt->lock);
-        spin_lock(&rgt->lock);
+        write_lock(&lgt->lock);
+        write_lock(&rgt->lock);
     }
     else
     {
         if ( lgt != rgt )
-            spin_lock(&rgt->lock);
-        spin_lock(&lgt->lock);
+            write_lock(&rgt->lock);
+        write_lock(&lgt->lock);
     }
 }
 
 static inline void
 double_gt_unlock(struct grant_table *lgt, struct grant_table *rgt)
 {
-    spin_unlock(&lgt->lock);
+    write_unlock(&lgt->lock);
     if ( lgt != rgt )
-        spin_unlock(&rgt->lock);
+        write_unlock(&rgt->lock);
 }
 
 static inline int
@@ -528,7 +528,7 @@ static int grant_map_exists(const struct domain *ld,
 {
     unsigned int ref, max_iter;
 
-    ASSERT(spin_is_locked(&rgt->lock));
+    ASSERT(rw_is_locked(&rgt->lock));
 
     max_iter = min(*ref_count + (1 << GNTTABOP_CONTINUATION_ARG_SHIFT),
                    nr_grant_entries(rgt));
@@ -568,10 +568,10 @@ static void mapcount(
     *wrc = *rdc = 0;
 
     /*
-     * Must have the remote domain's grant table lock while counting
-     * its active entries.
+     * Must have the remote domain's grant table write lock while
+     * counting its active entries.
      */
-    ASSERT(spin_is_locked(&rd->grant_table->lock));
+    ASSERT(rw_is_write_locked(&rd->grant_table->lock));
 
     for ( handle = 0; handle < lgt->maptrack_limit; handle++ )
     {
@@ -656,7 +656,7 @@ __gnttab_map_grant_ref(
     }
 
     rgt = rd->grant_table;
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
         PIN_FAIL(unlock_out, GNTST_general_error,
@@ -730,7 +730,7 @@ __gnttab_map_grant_ref(
     cache_flags = (shah->flags & (GTF_PAT | GTF_PWT | GTF_PCD) );
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     /* pg may be set, with a refcount included, from __get_paged_frame */
     if ( !pg )
@@ -806,12 +806,13 @@ __gnttab_map_grant_ref(
         goto undo_out;
     }
 
-    double_gt_lock(lgt, rgt);
-
     if ( gnttab_need_iommu_mapping(ld) )
     {
         unsigned int wrc, rdc;
         int err = 0;
+
+        double_gt_lock(lgt, rgt);
+
         /* We're not translated, so we know that gmfns and mfns are
            the same things, so the IOMMU entry is always 1-to-1. */
         mapcount(lgt, rd, frame, &wrc, &rdc);
@@ -837,12 +838,22 @@ __gnttab_map_grant_ref(
 
     TRACE_1D(TRC_MEM_PAGE_GRANT_MAP, op->dom);
 
+    /*
+     * All maptrack entry users check mt->flags first before using the
+     * other fields so just ensure the flags field is stored last.
+     *
+     * However, if gnttab_need_iommu_mapping() then this would race
+     * with a concurrent mapcount() call (on an unmap, for example)
+     * and a lock is required.
+     */
     mt = &maptrack_entry(lgt, handle);
     mt->domid = op->dom;
     mt->ref   = op->ref;
-    mt->flags = op->flags;
+    wmb();
+    write_atomic(&mt->flags, op->flags);
 
-    double_gt_unlock(lgt, rgt);
+    if ( gnttab_need_iommu_mapping(ld) )
+        double_gt_unlock(lgt, rgt);
 
     op->dev_bus_addr = (u64)frame << PAGE_SHIFT;
     op->handle       = handle;
@@ -865,7 +876,7 @@ __gnttab_map_grant_ref(
         put_page(pg);
     }
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, op->ref);
 
@@ -888,7 +899,7 @@ __gnttab_map_grant_ref(
     active_entry_release(act);
 
  unlock_out:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     op->status = rc;
     put_maptrack_handle(lgt, handle);
     rcu_unlock_domain(rd);
@@ -938,18 +949,19 @@ __gnttab_unmap_common(
     }
 
     op->map = &maptrack_entry(lgt, op->handle);
-    spin_lock(&lgt->lock);
+
+    read_lock(&lgt->lock);
 
     if ( unlikely(!op->map->flags) )
     {
-        spin_unlock(&lgt->lock);
+        read_unlock(&lgt->lock);
         gdprintk(XENLOG_INFO, "Zero flags for handle (%d).\n", op->handle);
         op->status = GNTST_bad_handle;
         return;
     }
 
     dom = op->map->domid;
-    spin_unlock(&lgt->lock);
+    read_unlock(&lgt->lock);
 
     if ( unlikely((rd = rcu_lock_domain_by_id(dom)) == NULL) )
     {
@@ -970,7 +982,8 @@ __gnttab_unmap_common(
     TRACE_1D(TRC_MEM_PAGE_GRANT_UNMAP, dom);
 
     rgt = rd->grant_table;
-    double_gt_lock(lgt, rgt);
+
+    read_lock(&rgt->lock);
 
     op->flags = op->map->flags;
     if ( unlikely(!op->flags) || unlikely(op->map->domid != dom) )
@@ -1019,31 +1032,34 @@ __gnttab_unmap_common(
             act->pin -= GNTPIN_hstw_inc;
     }
 
-    if ( gnttab_need_iommu_mapping(ld) )
+ act_release_out:
+    active_entry_release(act);
+ unmap_out:
+    read_unlock(&rgt->lock);
+
+    if ( rc == GNTST_okay && gnttab_need_iommu_mapping(ld) )
     {
         unsigned int wrc, rdc;
         int err = 0;
+
+        double_gt_lock(lgt, rgt);
+
         mapcount(lgt, rd, op->frame, &wrc, &rdc);
         if ( (wrc + rdc) == 0 )
             err = iommu_unmap_page(ld, op->frame);
         else if ( wrc == 0 )
             err = iommu_map_page(ld, op->frame, op->frame, IOMMUF_readable);
+
+        double_gt_unlock(lgt, rgt);
+
         if ( err )
-        {
             rc = GNTST_general_error;
-            goto act_release_out;
-        }
     }
 
     /* If just unmapped a writable mapping, mark as dirtied */
-    if ( !(op->flags & GNTMAP_readonly) )
+    if ( rc == GNTST_okay && !(op->flags & GNTMAP_readonly) )
          gnttab_mark_dirty(rd, op->frame);
 
- act_release_out:
-    active_entry_release(act);
- unmap_out:
-    double_gt_unlock(lgt, rgt);
-
     op->status = rc;
     rcu_unlock_domain(rd);
 }
@@ -1073,8 +1089,8 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
 
     rcu_lock_domain(rd);
     rgt = rd->grant_table;
-    spin_lock(&rgt->lock);
 
+    read_lock(&rgt->lock);
     if ( rgt->gt_version == 0 )
         goto unlock_out;
 
@@ -1140,7 +1156,7 @@ __gnttab_unmap_common_complete(struct gnttab_unmap_common *op)
  act_release_out:
     active_entry_release(act);
  unlock_out:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     if ( put_handle )
     {
@@ -1327,11 +1343,13 @@ gnttab_unpopulate_status_frames(struct domain *d, struct grant_table *gt)
     gt->nr_status_frames = 0;
 }
 
+/*
+ * Grow the grant table. The caller must hold the grant table's
+ * write lock before calling this function.
+ */
 int
 gnttab_grow_table(struct domain *d, unsigned int req_nr_frames)
 {
-    /* d's grant table lock must be held by the caller */
-
     struct grant_table *gt = d->grant_table;
     unsigned int i, j;
 
@@ -1437,7 +1455,7 @@ gnttab_setup_table(
     }
 
     gt = d->grant_table;
-    spin_lock(&gt->lock);
+    write_lock(&gt->lock);
 
     if ( gt->gt_version == 0 )
         gt->gt_version = 1;
@@ -1465,7 +1483,7 @@ gnttab_setup_table(
     }
 
  out3:
-    spin_unlock(&gt->lock);
+    write_unlock(&gt->lock);
  out2:
     rcu_unlock_domain(d);
  out1:
@@ -1507,13 +1525,13 @@ gnttab_query_size(
         goto query_out_unlock;
     }
 
-    spin_lock(&d->grant_table->lock);
+    read_lock(&d->grant_table->lock);
 
     op.nr_frames     = nr_grant_frames(d->grant_table);
     op.max_nr_frames = max_grant_frames;
     op.status        = GNTST_okay;
 
-    spin_unlock(&d->grant_table->lock);
+    read_unlock(&d->grant_table->lock);
 
 
  query_out_unlock:
@@ -1539,7 +1557,7 @@ gnttab_prepare_for_transfer(
     union grant_combo   scombo, prev_scombo, new_scombo;
     int                 retries = 0;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
     {
@@ -1590,11 +1608,11 @@ gnttab_prepare_for_transfer(
         scombo = prev_scombo;
     }
 
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     return 1;
 
  fail:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     return 0;
 }
 
@@ -1787,7 +1805,7 @@ gnttab_transfer(
         TRACE_1D(TRC_MEM_PAGE_GRANT_TRANSFER, e->domain_id);
 
         /* Tell the guest about its new page frame. */
-        spin_lock(&e->grant_table->lock);
+        read_lock(&e->grant_table->lock);
 
         if ( e->grant_table->gt_version == 1 )
         {
@@ -1805,7 +1823,7 @@ gnttab_transfer(
         shared_entry_header(e->grant_table, gop.ref)->flags |=
             GTF_transfer_completed;
 
-        spin_unlock(&e->grant_table->lock);
+        read_unlock(&e->grant_table->lock);
 
         rcu_unlock_domain(e);
 
@@ -1843,7 +1861,7 @@ __release_grant_for_copy(
     released_read = 0;
     released_write = 0;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     act = active_entry_acquire(rgt, gref);
     sha = shared_entry_header(rgt, gref);
@@ -1885,7 +1903,7 @@ __release_grant_for_copy(
     }
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     if ( td != rd )
     {
@@ -1943,7 +1961,7 @@ __acquire_grant_for_copy(
 
     *page = NULL;
 
-    spin_lock(&rgt->lock);
+    read_lock(&rgt->lock);
 
     if ( rgt->gt_version == 0 )
         PIN_FAIL(gt_unlock_out, GNTST_general_error,
@@ -2019,20 +2037,20 @@ __acquire_grant_for_copy(
              * here and reacquire
              */
             active_entry_release(act);
-            spin_unlock(&rgt->lock);
+            read_unlock(&rgt->lock);
 
             rc = __acquire_grant_for_copy(td, trans_gref, rd->domain_id,
                                           readonly, &grant_frame, page,
                                           &trans_page_off, &trans_length, 0);
 
-            spin_lock(&rgt->lock);
+            read_lock(&rgt->lock);
             act = active_entry_acquire(rgt, gref);
 
             if ( rc != GNTST_okay ) {
                 __fixup_status_for_copy_pin(act, status);
                 rcu_unlock_domain(td);
                 active_entry_release(act);
-                spin_unlock(&rgt->lock);
+                read_unlock(&rgt->lock);
                 return rc;
             }
 
@@ -2045,7 +2063,7 @@ __acquire_grant_for_copy(
                 __fixup_status_for_copy_pin(act, status);
                 rcu_unlock_domain(td);
                 active_entry_release(act);
-                spin_unlock(&rgt->lock);
+                read_unlock(&rgt->lock);
                 put_page(*page);
                 return __acquire_grant_for_copy(rd, gref, ldom, readonly,
                                                 frame, page, page_off, length,
@@ -2114,7 +2132,7 @@ __acquire_grant_for_copy(
     *frame = act->frame;
 
     active_entry_release(act);
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
     return rc;
 
  unlock_out_clear:
@@ -2129,7 +2147,7 @@ __acquire_grant_for_copy(
     active_entry_release(act);
 
  gt_unlock_out:
-    spin_unlock(&rgt->lock);
+    read_unlock(&rgt->lock);
 
     return rc;
 }
@@ -2445,7 +2463,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
     if ( gt->gt_version == op.version )
         goto out;
 
-    spin_lock(&gt->lock);
+    write_lock(&gt->lock);
     /* Make sure that the grant table isn't currently in use when we
        change the version number, except for the first 8 entries which
        are allowed to be in use (xenstore/xenconsole keeps them mapped).
@@ -2530,7 +2548,7 @@ gnttab_set_version(XEN_GUEST_HANDLE_PARAM(gnttab_set_version_t) uop)
     gt->gt_version = op.version;
 
 out_unlock:
-    spin_unlock(&gt->lock);
+    write_unlock(&gt->lock);
 
 out:
     op.version = gt->gt_version;
@@ -2586,7 +2604,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) uop,
 
     op.status = GNTST_okay;
 
-    spin_lock(&gt->lock);
+    read_lock(&gt->lock);
 
     for ( i = 0; i < op.nr_frames; i++ )
     {
@@ -2595,7 +2613,7 @@ gnttab_get_status_frames(XEN_GUEST_HANDLE_PARAM(gnttab_get_status_frames_t) uop,
             op.status = GNTST_bad_virt_addr;
     }
 
-    spin_unlock(&gt->lock);
+    read_unlock(&gt->lock);
 out2:
     rcu_unlock_domain(d);
 out1:
@@ -2645,7 +2663,7 @@ __gnttab_swap_grant_ref(grant_ref_t ref_a, grant_ref_t ref_b)
     struct active_grant_entry *act_b = NULL;
     s16 rc = GNTST_okay;
 
-    spin_lock(&gt->lock);
+    write_lock(&gt->lock);
 
     /* Bounds check on the grant refs */
     if ( unlikely(ref_a >= nr_grant_entries(d->grant_table)))
@@ -2689,7 +2707,7 @@ out:
         active_entry_release(act_b);
     if ( act_a != NULL )
         active_entry_release(act_a);
-    spin_unlock(&gt->lock);
+    write_unlock(&gt->lock);
 
     rcu_unlock_domain(d);
 
@@ -2760,12 +2778,12 @@ static int __gnttab_cache_flush(gnttab_cache_flush_t *cflush,
 
     if ( d != owner )
     {
-        spin_lock(&owner->grant_table->lock);
+        read_lock(&owner->grant_table->lock);
 
         ret = grant_map_exists(d, owner->grant_table, mfn, ref_count);
         if ( ret != 0 )
         {
-            spin_unlock(&owner->grant_table->lock);
+            read_unlock(&owner->grant_table->lock);
             rcu_unlock_domain(d);
             put_page(page);
             return ret;
@@ -2785,7 +2803,7 @@ static int __gnttab_cache_flush(gnttab_cache_flush_t *cflush,
         ret = 0;
 
     if ( d != owner )
-        spin_unlock(&owner->grant_table->lock);
+        read_unlock(&owner->grant_table->lock);
     unmap_domain_page(v);
     put_page(page);
 
@@ -3004,7 +3022,7 @@ grant_table_create(
         goto no_mem_0;
 
     /* Simple stuff. */
-    spin_lock_init(&t->lock);
+    rwlock_init(&t->lock);
     spin_lock_init(&t->maptrack_lock);
     t->nr_grant_frames = INITIAL_NR_GRANT_FRAMES;
 
@@ -3114,7 +3132,7 @@ gnttab_release_mappings(
         }
 
         rgt = rd->grant_table;
-        spin_lock(&rgt->lock);
+        read_lock(&rgt->lock);
 
         act = active_entry_acquire(rgt, ref);
         sha = shared_entry_header(rgt, ref);
@@ -3175,7 +3193,7 @@ gnttab_release_mappings(
             gnttab_clear_flag(_GTF_reading, status);
 
         active_entry_release(act);
-        spin_unlock(&rgt->lock);
+        read_unlock(&rgt->lock);
 
         rcu_unlock_domain(rd);
 
@@ -3223,7 +3241,7 @@ static void gnttab_usage_print(struct domain *rd)
     printk(" -------- active -------- -------- shared --------\n");
     printk("[ref] localdom mfn pin localdom gmfn flags\n");
 
-    spin_lock(&gt->lock);
+    read_lock(&gt->lock);
 
     if ( gt->gt_version == 0 )
         goto out;
@@ -3276,7 +3294,7 @@ static void gnttab_usage_print(struct domain *rd)
     }
 
  out:
-    spin_unlock(&gt->lock);
+    read_unlock(&gt->lock);
 
     if ( first )
         printk("grant-table for remote domain:%5d ... "
diff --git a/xen/include/xen/grant_table.h b/xen/include/xen/grant_table.h
index 0b35a5e..f22ebd0 100644
--- a/xen/include/xen/grant_table.h
+++ b/xen/include/xen/grant_table.h
@@ -64,6 +64,11 @@ struct grant_mapping {
 
 /* Per-domain grant information. */
 struct grant_table {
+    /*
+     * Lock protecting updates to grant table state (version, active
+     * entry list, etc.)
+     */
+    rwlock_t              lock;
     /* Table size. Number of frames shared with guest */
     unsigned int          nr_grant_frames;
     /* Shared grant table (see include/public/grant_table.h). */
@@ -84,8 +89,6 @@ struct grant_table {
     unsigned int          maptrack_limit;
     /* Lock protecting the maptrack page list, head, and limit */
     spinlock_t            maptrack_lock;
-    /* Lock protecting updates to active and shared grant tables. */
-    spinlock_t            lock;
     /* The defined versions are 1 and 2. Set to 0 if we don't know
        what version to use yet. */
     unsigned              gt_version;
@@ -103,7 +106,7 @@ gnttab_release_mappings(
     struct domain *d);
 
 /* Increase the size of a domain's grant table.
- * Caller must hold d's grant table write lock.
+ * Caller must hold d's grant table write lock.
  */
 int
 gnttab_grow_table(struct domain *d, unsigned int req_nr_frames);
-- 
1.7.10.4

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
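The comment added in __gnttab_map_grant_ref() relies on readers checking mt->flags before any other maptrack field, so the writer only has to make sure flags is stored last. A minimal sketch of that publish pattern follows, assuming C11 atomics in place of Xen's wmb() and write_atomic(): a release store and an acquire load are used here as a portable stand-in, which is not what the patch itself does. struct mapping, mapping_publish() and mapping_read() are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical stand-in for struct grant_mapping. */
struct mapping {
    uint32_t ref;
    uint16_t domid;
    _Atomic uint16_t flags;   /* 0 means "entry not in use" */
};

/*
 * Writer: fill in the payload fields first, then publish the entry by
 * storing flags last. The release store keeps the earlier plain stores
 * from being reordered after it.
 */
static void mapping_publish(struct mapping *m, uint16_t domid,
                            uint32_t ref, uint16_t flags)
{
    m->domid = domid;
    m->ref = ref;
    atomic_store_explicit(&m->flags, flags, memory_order_release);
}

/*
 * Reader: check flags first and touch the payload only if the entry is
 * in use. Returns 1 and fills *domid and *ref for a valid entry,
 * 0 otherwise.
 */
static int mapping_read(struct mapping *m, uint16_t *domid, uint32_t *ref)
{
    uint16_t flags = atomic_load_explicit(&m->flags, memory_order_acquire);

    if (flags == 0)
        return 0;
    *domid = m->domid;
    *ref = m->ref;
    return 1;
}
```

As the patch comment notes, this ordering is not sufficient when gnttab_need_iommu_mapping() is true: mapcount() walks the maptrack entries concurrently, so the double lock is kept around the update in that case.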