[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 04/15] argo: init, destroy and soft-reset, with enable command line opt
aOn Mon, Jan 7, 2019 at 8:44 AM Christopher Clark <christopher.w.clark@xxxxxxxxx> wrote: > > Initialises basic data structures and performs teardown of argo state > for domain shutdown. > > Inclusion of the Argo implementation is dependent on CONFIG_ARGO. > > Introduces a new Xen command line parameter 'argo': bool to enable/disable > the argo hypercall. Defaults to disabled. > > New headers: > public/argo.h: with definions of addresses and ring structure, including > indexes for atomic update for communication between domain and hypervisor. > > xen/argo.h: to expose the hooks for integration into domain lifecycle: > argo_init: per-domain init of argo data structures for domain_create. > argo_destroy: teardown for domain_destroy and the error exit > path of domain_create. > argo_soft_reset: reset of domain state for domain_soft_reset. > > Adds two new fields to struct domain: > rwlock_t argo_lock; > struct argo_domain *argo; > > In accordance with recent work on _domain_destroy, argo_destroy is > idempotent. It will tear down: all rings registered by this domain, all > rings where this domain is the single sender (ie. specified partner, > non-wildcard rings), and all pending notifications where this domain is > awaiting signal about available space in the rings of other domains. > > A count will be maintained of the number of rings that a domain has > registered in order to limit it below the fixed maximum limit defined here. > > The software license on the public header is the BSD license, standard > procedure for the public Xen headers. The public header was originally > posted under a GPL license at: [1]: > https://lists.xenproject.org/archives/html/xen-devel/2013-05/msg02710.html > > The following ACK by Lars Kurth is to confirm that only people being > employees of Citrix contributed to the header files in the series posted at > [1] and that thus the copyright of the files in question is fully owned by > Citrix. The ACK also confirms that Citrix is happy for the header files to > be published under a BSD license in this series (which is based on [1]). > > Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx> > Acked-by: Lars Kurth <lars.kurth@xxxxxxxxxx> > --- > v2 rewrite locking explanation comment > v2 header copyright line now includes 2019 > v2 self: use ring_info backpointer in pending_ent to maintain npending > v2 self: rename all_rings_remove_info to domain_rings_remove_all > v2 feedback Jan: drop cookie, implement teardown > v2 self: add npending to track number of pending entries per ring > v2 self: amend comment on locking; drop section comments > v2 cookie_eq: test low bits first and use likely on high bits > v2 self: OVERHAUL > v2 self: s/argo_pending_ent/pending_ent/g > v2 self: drop pending_remove_ent, inline at single call site > v1 feedback Roger, Jan: drop argo prefix on static functions > v2 #4 Lars: add Acked-by and details to commit message. > v2 feedback #9 Jan: document argo boot opt in xen-command-line.markdown > v2 bugfix: xsm use in soft-reset prior to introduction > v2 feedback #9 Jan: drop 'message' from do_argo_message_op > v1 #5 feedback Paul: init/destroy unsigned, brackets and whitespace fixes > v1 #5 feedback Paul: Use mfn_eq for comparing mfns. > v1 #5 feedback Paul: init/destroy : use currd > v1 #6 (#5) feedback Jan: init/destroy: s/ENOSYS/EOPNOTSUPP/ > v1 #6 feedback Paul: Folded patch 6 into patch 5. > v1 #6 feedback Jan: drop opt_argo_enabled initializer > v1 $6 feedback Jan: s/ENOSYS/EOPNOTSUPP/g and drop useless dprintk > v1. #5 feedback Paul: change the license on public header to BSD > - ack from Lars at Citrix. > v1. self, Jan: drop unnecessary xen include from sched.h > v1. self, Jan: drop inclusion of public argo.h in private one > v1. self, Jan: add include of public argo.h to argo.c > v1. self, Jan: drop fwd decl of argo_domain in priv header > v1. Paul/self/Jan: add data structures to xlat.lst and compat/argo.h to > Makefile > v1. self: removed allocation of event channel since switching to VIRQ > v1. self: drop types.h include from private argo.h > v1: reorder public argo include position > v1: #13 feedback Jan: public namespace: prefix with xen > v1: self: rename pending ent "id" to "domain_id" > v1: self: add domain_cookie to ent struct > v1. #15 feedback Jan: make cmd unsigned > v1. #15 feedback Jan: make i loop variable unsigned > v1: self: adjust dprintks in init, destroy > v1: #18 feedback Jan: meld max ring count limit > v1: self: use type not struct in public defn, affects compat gen header > v1: feedback #15 Jan: handle upper-halves of hypercall args > v1: add comment explaining the 'magic' field > v1: self + Jan feedback: implement soft reset > v1: feedback #13 Roger: use ASSERT_UNREACHABLE > > docs/misc/xen-command-line.pandoc | 11 + > xen/common/argo.c | 461 > +++++++++++++++++++++++++++++++++++++- > xen/common/domain.c | 20 ++ > xen/include/Makefile | 1 + > xen/include/public/argo.h | 59 +++++ > xen/include/xen/argo.h | 23 ++ > xen/include/xen/sched.h | 6 + > xen/include/xlat.lst | 2 + > 8 files changed, 582 insertions(+), 1 deletion(-) > create mode 100644 xen/include/public/argo.h > create mode 100644 xen/include/xen/argo.h > > diff --git a/docs/misc/xen-command-line.pandoc > b/docs/misc/xen-command-line.pandoc > index a755a67..aea13eb 100644 > --- a/docs/misc/xen-command-line.pandoc > +++ b/docs/misc/xen-command-line.pandoc > @@ -182,6 +182,17 @@ Permit Xen to use "Always Running APIC Timer" support on > compatible hardware > in combination with cpuidle. This option is only expected to be useful for > developers wishing Xen to fall back to older timing methods on newer > hardware. > > +### argo > +> `= <boolean>` > + > +> Default: `false` > + > +Enable the Argo hypervisor-mediated interdomain communication mechanism. > + > +This allows domains access to the Argo hypercall, which supports registration > +of memory rings with the hypervisor to receive messages, sending messages to > +other domains by hypercall and querying the ring status of other domains. > + > ### asid (x86) > > `= <boolean>` > > diff --git a/xen/common/argo.c b/xen/common/argo.c > index 6f782f7..86195d3 100644 > --- a/xen/common/argo.c > +++ b/xen/common/argo.c > @@ -17,7 +17,177 @@ > */ > > #include <xen/errno.h> > +#include <xen/sched.h> > +#include <xen/domain.h> > +#include <xen/argo.h> > +#include <xen/event.h> > +#include <xen/domain_page.h> > #include <xen/guest_access.h> > +#include <xen/time.h> > +#include <public/argo.h> We usually try to sort header includes alphabetically, and I would add a newline between the xen/* and the public/* header includes. > + > +DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t); > +DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t); > + > +/* Xen command line option to enable argo */ > +static bool __read_mostly opt_argo_enabled; > +boolean_param("argo", opt_argo_enabled); I would drop the opt_* prefix, new options added recently don't include the prefix already. > + > +typedef struct argo_ring_id > +{ > + uint32_t port; > + domid_t partner_id; > + domid_t domain_id; > +} argo_ring_id; > + > +/* Data about a domain's own ring that it has registered */ > +struct argo_ring_info > +{ > + /* next node in the hash, protected by L2 */ > + struct hlist_node node; > + /* this ring's id, protected by L2 */ > + struct argo_ring_id id; > + /* L3 */ > + spinlock_t lock; > + /* length of the ring, protected by L3 */ > + uint32_t len; > + /* number of pages in the ring, protected by L3 */ > + uint32_t npage; Can you infer number of pages form the length of the ring, or the other way around? I'm not sure why both need to be stored here. > + /* number of pages translated into mfns, protected by L3 */ > + uint32_t nmfns; > + /* cached tx pointer location, protected by L3 */ > + uint32_t tx_ptr; All this fields are not part of any public structure, so I wonder if it would be better to simply use unsigned int for those, or size_t. > + /* mapped ring pages protected by L3 */ > + uint8_t **mfn_mapping; Why 'uint8_t *', wouldn't it be better to just use 'void *' if it's a mapping? > + /* list of mfns of guest ring, protected by L3 */ > + mfn_t *mfns; > + /* list of struct pending_ent for this ring, protected by L3 */ > + struct hlist_head pending; > + /* number of pending entries queued for this ring, protected by L3 */ > + uint32_t npending; > +}; > + > +/* Data about a single-sender ring, held by the sender (partner) domain */ > +struct argo_send_info > +{ > + /* next node in the hash, protected by Lsend */ > + struct hlist_node node; > + /* this ring's id, protected by Lsend */ > + struct argo_ring_id id; > +}; > + > +/* A space-available notification that is awaiting sufficient space */ > +struct pending_ent > +{ > + /* List node within argo_ring_info's pending list */ > + struct hlist_node node; > + /* > + * List node within argo_domain's wildcard_pend_list. Only used if the > + * ring is one with a wildcard partner (ie. that any domain may send to) > + * to enable cancelling signals on wildcard rings on domain destroy. > + */ > + struct hlist_node wildcard_node; > + /* > + * Pointer to the ring_info that this ent pertains to. Used to ensure > that > + * ring_info->npending is decremented when ents for wildcard rings are > + * cancelled for domain destroy. > + * Caution: Must hold the correct locks before accessing ring_info via > this. > + */ > + struct argo_ring_info *ring_info; > + /* domain to be notified when space is available */ > + domid_t domain_id; > + uint16_t pad; No need for the pad in internal structures. > + /* minimum ring space available that this signal is waiting upon */ > + uint32_t len; > +}; > + > +/* > + * The value of the argo element in a struct domain is > + * protected by the global lock argo_lock: L1 > + */ > +#define ARGO_HTABLE_SIZE 32 > +struct argo_domain > +{ > + /* L2 */ > + rwlock_t lock; > + /* > + * Hash table of argo_ring_info about rings this domain has registered. > + * Protected by L2. > + */ > + struct hlist_head ring_hash[ARGO_HTABLE_SIZE]; > + /* Counter of rings registered by this domain. Protected by L2. */ > + uint32_t ring_count; > + > + /* Lsend */ > + spinlock_t send_lock; > + /* > + * Hash table of argo_send_info about rings other domains have registered > + * for this domain to send to. Single partner, non-wildcard rings. > + * Protected by Lsend. > + */ > + struct hlist_head send_hash[ARGO_HTABLE_SIZE]; > + > + /* Lwildcard */ > + spinlock_t wildcard_lock; > + /* > + * List of pending space-available signals for this domain about wildcard > + * rings registered by other domains. Protected by Lwildcard. > + */ > + struct hlist_head wildcard_pend_list; > +}; > + > +/* > + * Locking is organized as follows: > + * > + * Terminology: R(<lock>) means taking a read lock on the specified lock; > + * W(<lock>) means taking a write lock on it. > + * > + * L1 : The global lock: argo_lock > + * Protects the argo elements of all struct domain *d in the system. > + * It does not protect any of the elements of d->argo, only their > + * addresses. > + * > + * By extension since the destruction of a domain with a non-NULL > + * d->argo will need to free the d->argo pointer, holding W(L1) > + * guarantees that no domains pointers that argo is interested in > + * become invalid whilst this lock is held. > + */ > + > +static DEFINE_RWLOCK(argo_lock); /* L1 */ You also add an argo_lock to each domain struct which doesn't seem to be mentioned here at all. Shouldn't that lock be the one that protects d->argo? (instead of this global lock?) > + > +/* > + * L2 : The per-domain ring hash lock: d->argo->lock > + * Holding a read lock on L2 protects the ring hash table and > + * the elements in the hash_table d->argo->ring_hash, and > + * the node and id fields in struct argo_ring_info in the > + * hash table. > + * Holding a write lock on L2 protects all of the elements of > + * struct argo_ring_info. > + * > + * To take L2 you must already have R(L1). W(L1) implies W(L2) and L3. > + * > + * L3 : The ringinfo lock: argo_ring_info *ringinfo; ringinfo->lock > + * Protects all the fields within the argo_ring_info, aside from the ones > that > + * L2 already protects: node, id, lock. > + * > + * To aquire L3 you must already have R(L2). W(L2) implies L3. > + * > + * Lsend : The per-domain single-sender partner rings lock: > d->argo->send_lock > + * Protects the per-domain send hash table : d->argo->send_hash > + * and the elements in the hash table, and the node and id fields > + * in struct argo_send_info in the hash table. > + * > + * To take Lsend, you must already have R(L1). W(L1) implies Lsend. > + * Do not attempt to acquire a L2 on any domain after taking and while > + * holding a Lsend lock -- acquire the L2 (if one is needed) beforehand. > + * > + * Lwildcard : The per-domain wildcard pending list lock: > d->argo->wildcard_lock > + * Protects the per-domain list of outstanding signals for space availability > + * on wildcard rings. > + * > + * To take Lwildcard, you must already have R(L1). W(L1) implies Lwildcard. > + * No other locks are acquired after obtaining Lwildcard. > + */ IMO I think the locking is overly complicated, and there's no reasoning why so many locks are needed. Wouldn't it be enough to start with a single lock that protects the whole d->argo existence and contents? I would start with a very simple (as simple as possible) locking structure and go improving from there if there are performance bottlenecks. > /* Change this to #define ARGO_DEBUG here to enable more debug messages */ > #undef ARGO_DEBUG > @@ -28,10 +198,299 @@ > #define argo_dprintk(format, ... ) ((void)0) > #endif > > +static void > +ring_unmap(struct argo_ring_info *ring_info) > +{ > + unsigned int i; > + > + if ( !ring_info->mfn_mapping ) > + return; > + > + for ( i = 0; i < ring_info->nmfns; i++ ) > + { > + if ( !ring_info->mfn_mapping[i] ) > + continue; > + if ( ring_info->mfns ) > + argo_dprintk(XENLOG_ERR "argo: unmapping page %"PRI_mfn" from > %p\n", > + mfn_x(ring_info->mfns[i]), > + ring_info->mfn_mapping[i]); > + unmap_domain_page_global(ring_info->mfn_mapping[i]); > + ring_info->mfn_mapping[i] = NULL; > + } As noted in another patch, I would consider mapping this in contiguous virtual address space using vmap, but I'm not going to insist. > +} > + > +static void > +wildcard_pending_list_remove(domid_t domain_id, struct pending_ent *ent) > +{ > + struct domain *d = get_domain_by_id(domain_id); Newline. > + if ( !d ) > + return; > + > + if ( d->argo ) Don't you need to pick d->argo_lock here to prevent d->argo from being removed under your feet? > + { > + spin_lock(&d->argo->wildcard_lock); > + hlist_del(&ent->wildcard_node); > + spin_unlock(&d->argo->wildcard_lock); > + } > + put_domain(d); > +} > + > +static void > +pending_remove_all(struct argo_ring_info *ring_info) > +{ > + struct hlist_node *node, *next; > + struct pending_ent *ent; > + > + hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node) As a side note, it might be interesting to introduce a helper like hlist_first_entry_or_null, that would remove the need to have an extra *next element, and would be the more natural way to drain a hlist (seeing that you have the same pattern in wildcard_rings_pending_remove). > + { > + if ( ring_info->id.partner_id == XEN_ARGO_DOMID_ANY ) > + wildcard_pending_list_remove(ent->domain_id, ent); > + hlist_del(&ent->node); > + xfree(ent); > + } > + ring_info->npending = 0; > +} > + > +static void > +wildcard_rings_pending_remove(struct domain *d) > +{ > + struct hlist_node *node, *next; > + struct pending_ent *ent; > + > + ASSERT(rw_is_write_locked(&argo_lock)); > + > + hlist_for_each_entry_safe(ent, node, next, &d->argo->wildcard_pend_list, > + node) > + { > + hlist_del(&ent->node); > + ent->ring_info->npending--; > + hlist_del(&ent->wildcard_node); > + xfree(ent); > + } > +} > + > +static void > +ring_remove_mfns(const struct domain *d, struct argo_ring_info *ring_info) > +{ > + unsigned int i; > + > + ASSERT(rw_is_write_locked(&d->argo->lock) || > + rw_is_write_locked(&argo_lock)); I think the above requires a comment of why two different locks are used to protect the ring mfns, and why just having one of them locked is enough. > + > + if ( !ring_info->mfns ) > + return; > + > + if ( !ring_info->mfn_mapping ) > + { > + ASSERT_UNREACHABLE(); > + return; > + } > + > + ring_unmap(ring_info); > + > + for ( i = 0; i < ring_info->nmfns; i++ ) > + if ( !mfn_eq(ring_info->mfns[i], INVALID_MFN) ) > + put_page_and_type(mfn_to_page(ring_info->mfns[i])); > + > + xfree(ring_info->mfns); > + ring_info->mfns = NULL; Xen has a handly macro for this, XFREE. That would free the memory and assign the pointer to NULL. > + ring_info->npage = 0; > + xfree(ring_info->mfn_mapping); > + ring_info->mfn_mapping = NULL; > + ring_info->nmfns = 0; > +} > + > +static void > +ring_remove_info(struct domain *d, struct argo_ring_info *ring_info) I think the domain parameter can be constifyed here, since it's only used by ring_remove_mfnsnd that function already expects a const domain struct. > +{ > + ASSERT(rw_is_write_locked(&d->argo->lock) || > + rw_is_write_locked(&argo_lock)); > + > + pending_remove_all(ring_info); > + hlist_del(&ring_info->node); > + ring_remove_mfns(d, ring_info); > + xfree(ring_info); > +} > + > +static void > +domain_rings_remove_all(struct domain *d) > +{ > + unsigned int i; > + > + for ( i = 0; i < ARGO_HTABLE_SIZE; ++i ) > + { > + struct hlist_node *node, *next; > + struct argo_ring_info *ring_info; > + > + hlist_for_each_entry_safe(ring_info, node, next, > + &d->argo->ring_hash[i], node) > + ring_remove_info(d, ring_info); > + } > + d->argo->ring_count = 0; > +} > + > +/* > + * Tear down all rings of other domains where src_d domain is the partner. > + * (ie. it is the single domain that can send to those rings.) > + * This will also cancel any pending notifications about those rings. > + */ > +static void > +partner_rings_remove(struct domain *src_d) > +{ > + unsigned int i; > + > + ASSERT(rw_is_write_locked(&argo_lock)); > + > + for ( i = 0; i < ARGO_HTABLE_SIZE; ++i ) > + { > + struct hlist_node *node, *next; > + struct argo_send_info *send_info; > + > + hlist_for_each_entry_safe(send_info, node, next, > + &src_d->argo->send_hash[i], node) > + { > + struct argo_ring_info *ring_info; > + struct domain *dst_d; > + > + dst_d = get_domain_by_id(send_info->id.domain_id); > + if ( dst_d ) > + { > + ring_info = ring_find_info(dst_d, &send_info->id); > + if ( ring_info ) > + { > + ring_remove_info(dst_d, ring_info); > + dst_d->argo->ring_count--; > + } > + > + put_domain(dst_d); > + } > + > + hlist_del(&send_info->node); > + xfree(send_info); > + } > + } > +} > + > long > do_argo_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1, > XEN_GUEST_HANDLE_PARAM(void) arg2, unsigned long arg3, > unsigned long arg4) > { > - return -ENOSYS; > + struct domain *currd = current->domain; > + long rc = -EFAULT; > + > + argo_dprintk("->do_argo_op(%u,%p,%p,%d,%d)\n", cmd, > + (void *)arg1.p, (void *)arg2.p, (int) arg3, (int) arg4); > + > + if ( unlikely(!opt_argo_enabled) ) > + { > + rc = -EOPNOTSUPP; > + return rc; Why not just 'return -EOPNOTSUPP;'? > + } > + > + domain_lock(currd); > + > + switch (cmd) > + { > + default: > + rc = -EOPNOTSUPP; > + break; > + } > + > + domain_unlock(currd); > + > + argo_dprintk("<-do_argo_op(%u)=%ld\n", cmd, rc); > + > + return rc; > +} > + > +static void > +argo_domain_init(struct argo_domain *argo) > +{ > + unsigned int i; > + > + rwlock_init(&argo->lock); > + spin_lock_init(&argo->send_lock); > + spin_lock_init(&argo->wildcard_lock); > + argo->ring_count = 0; > + > + for ( i = 0; i < ARGO_HTABLE_SIZE; ++i ) > + { > + INIT_HLIST_HEAD(&argo->ring_hash[i]); > + INIT_HLIST_HEAD(&argo->send_hash[i]); > + } > + INIT_HLIST_HEAD(&argo->wildcard_pend_list); > +} > + > +int > +argo_init(struct domain *d) > +{ > + struct argo_domain *argo; > + > + if ( !opt_argo_enabled ) > + { > + argo_dprintk("argo disabled, domid: %d\n", d->domain_id); > + return 0; > + } > + > + argo_dprintk("init: domid: %d\n", d->domain_id); > + > + argo = xmalloc(struct argo_domain); > + if ( !argo ) > + return -ENOMEM; > + > + write_lock(&argo_lock); > + > + argo_domain_init(argo); > + > + d->argo = argo; Where's the d->argo_lock initialization? > + > + write_unlock(&argo_lock); > + > + return 0; > +} > + > +void > +argo_destroy(struct domain *d) > +{ > + BUG_ON(!d->is_dying); > + > + write_lock(&argo_lock); > + > + argo_dprintk("destroy: domid %d d->argo=%p\n", d->domain_id, d->argo); > + > + if ( d->argo ) > + { > + domain_rings_remove_all(d); > + partner_rings_remove(d); > + wildcard_rings_pending_remove(d); > + xfree(d->argo); > + d->argo = NULL; > + } > + write_unlock(&argo_lock); > +} > + > +void > +argo_soft_reset(struct domain *d) > +{ > + write_lock(&argo_lock); > + > + argo_dprintk("soft reset d=%d d->argo=%p\n", d->domain_id, d->argo); > + > + if ( d->argo ) > + { > + domain_rings_remove_all(d); > + partner_rings_remove(d); > + wildcard_rings_pending_remove(d); > + > + if ( !opt_argo_enabled ) > + { > + xfree(d->argo); > + d->argo = NULL; Can opt_argo_enabled change during runtime? > + } > + else > + argo_domain_init(d->argo); > + } > + > + write_unlock(&argo_lock); > } > diff --git a/xen/common/domain.c b/xen/common/domain.c > index c623dae..9596840 100644 > --- a/xen/common/domain.c > +++ b/xen/common/domain.c > @@ -32,6 +32,7 @@ > #include <xen/grant_table.h> > #include <xen/xenoprof.h> > #include <xen/irq.h> > +#include <xen/argo.h> > #include <asm/debugger.h> > #include <asm/p2m.h> > #include <asm/processor.h> > @@ -277,6 +278,10 @@ static void _domain_destroy(struct domain *d) > > xfree(d->pbuf); > > +#ifdef CONFIG_ARGO > + argo_destroy(d); > +#endif Instead of adding such ifdefs you could provide dummy argo_destroy inline functions in argo.h when CONFIG_ARGO is not set. Thanks, Roger. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |