[Xen-devel] [PATCH v2 14/18] argo: implement the notify op
Queries for data about space availability in registered rings and causes a
notification to be sent when space has become available.

The hypercall op populates a supplied data structure with information about
ring state, and if insufficient space is currently available in a given ring,
the hypervisor will record the domain's expressed interest and notify it
when it observes that space has become available.

Checks for free space occur when this notify op is invoked, so it may be
intentionally invoked with no data structure to populate (i.e. a NULL
argument) to trigger such a check and consequent notifications.

copy_field_from_guest_errno is added for guest access, performing the same
operation as copy_field_from_guest, but returning -EFAULT if the copy is
incomplete. Added to common code to simplify code at call sites.

Limit the maximum number of notify requests in a single operation with a
simple fixed limit of 256.

Signed-off-by: Christopher Clark <christopher.clark6@xxxxxxxxxxxxxx>
---
Changes since v1:

v1 #5 (#16) feedback Paul: notify op: use currd in do_argo_message_op
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify
v1 #5 (#16) feedback Paul: notify op: use currd in argo_notify_check_pending
v1 #5 (#16) feedback Paul: notify op: use currd in argo_fill_ring_data_array
v1 #13 (#16) feedback Paul: notify op: do/while: reindent only
v1 #13 (#16) feedback Paul: notify op: do/while: goto
v1: add compat xlat.lst entries
v1: add definition for copy_field_from_guest_errno
v1 #13 feedback Jan: make 'ring data' comment comply with single-line style
v1 feedback #13 Jan: use __copy; so define and use __copy_field_to_guest_errno
v1: #13 feedback Jan: public namespace: prefix with xen
v1: #13 feedback Jan: add blank line after case in do_argo_message_op
v1: self: rename ent id to domain_id
v1: self: ent id -> domain_id
v1: self: drop signal if domain_cookie mismatches
v1: feedback #15 Jan: make loop i unsigned
v1: self: drop unnecessary mb() in argo_notify_check_pending
v1: self: add blank line
v1 #16 feedback Jan: const domain arg to argo_fill_ring_data
v1: feedback #15 Jan: check unused hypercall args are zero
v1 feedback #16 Jan: add comment on space available signal policy
v1: feedback #16 Jan: move declr, drop braces, lower indent
v1: feedback #18 Jan: meld the resource limits into the main commit
v1: feedback #16 Jan: clarify use of magic field
v1: self: use single copy to read notify ring data struct
v1: argo_fill_ring_data: fix dprintk types for port field
v1: self: use %x for printing port as per other print sites
v1: feedback Jan: add comments explaining ring full vs empty
v1: following Jan: fix argo_ringbuf_payload_space calculation for empty ring

 xen/common/argo.c                  | 384 +++++++++++++++++++++++++++++++++++++
 xen/include/asm-arm/guest_access.h |   5 +
 xen/include/asm-x86/guest_access.h |   5 +
 xen/include/public/argo.h          |  67 +++++++
 xen/include/xlat.lst               |   2 +
 5 files changed, 463 insertions(+)

diff --git a/xen/common/argo.c b/xen/common/argo.c
index ed50415..6fbd0a6 100644
--- a/xen/common/argo.c
+++ b/xen/common/argo.c
@@ -29,12 +29,15 @@
 #include <public/argo.h>

 #define ARGO_MAX_RINGS_PER_DOMAIN 128U
+#define ARGO_MAX_NOTIFY_COUNT 256U

 DEFINE_XEN_GUEST_HANDLE(xen_argo_page_descr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_iov_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_send_addr_t);
 DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_t);
+DEFINE_XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t);
 DECLARE_XEN_GUEST_HANDLE_NULL(uint8_t);

 /* pfn type: 64-bit on all architectures */
@@ -137,6 +140,10 @@ argo_hash(const struct xen_argo_ring_id *id)
     return ret;
 }

+static struct argo_ring_info *
+argo_ring_find_info_by_match(const struct domain *d, uint32_t port,
+                             domid_t partner_id, uint64_t partner_cookie);
+
 /*
  * locks
  */
@@ -197,6 +204,28 @@ argo_signal_domain(struct domain *d)
     send_guest_global_virq(d, VIRQ_ARGO_MESSAGE);
 }

+static void
+argo_signal_domid(domid_t domain_id, uint64_t domain_cookie)
+{
+    struct domain *d = get_domain_by_id(domain_id);
+
+    if ( !d )
+        return;
+
+    if ( !d->argo )
+        goto out;
+    /*
+     * The caller holds R(L1) which ensures that d->argo is stable.
+     * Since the domain_cookie is never modified while d->argo is valid
+     * we do not need to acquire R(L2) to read the cookie here.
+     */
+    if ( d->argo->domain_cookie == domain_cookie )
+        argo_signal_domain(d);
+
+ out:
+    put_domain(d);
+}
+
 /*
  * ring buffer
  */
@@ -388,6 +417,60 @@ argo_ringbuf_get_rx_ptr(struct argo_ring_info *ring_info, uint32_t *rx_ptr)
     return 0;
 }

+static uint32_t
+argo_ringbuf_payload_space(struct domain *d, struct argo_ring_info *ring_info)
+{
+    xen_argo_ring_t ring;
+    int32_t ret;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    ring.len = ring_info->len;
+    if ( !ring.len )
+        return 0;
+
+    ring.tx_ptr = ring_info->tx_ptr;
+
+    if ( argo_ringbuf_get_rx_ptr(ring_info, &ring.rx_ptr) )
+        return 0;
+
+    argo_dprintk("argo_ringbuf_payload_space: tx_ptr=%d rx_ptr=%d\n",
+                 ring.tx_ptr, ring.rx_ptr);
+
+    /*
+     * rx_ptr == tx_ptr means that the ring has been emptied, so return
+     * the maximum payload size that can be accepted -- see message size
+     * checking logic in the entry to argo_ringbuf_insert which ensures that
+     * there is always one message slot (of size XEN_ARGO_ROUNDUP(1)) left
+     * available, preventing a ring from being entirely filled. This ensures
+     * that matching ring indexes always indicate an empty ring and not a
+     * full one.
+     * The subtraction here will not underflow due to minimum size constraints
+     * enforced on ring size elsewhere.
+     */
+    if ( ring.rx_ptr == ring.tx_ptr )
+        return ring.len - sizeof(struct xen_argo_ring_message_header)
+               - XEN_ARGO_ROUNDUP(1);
+
+    ret = ring.rx_ptr - ring.tx_ptr;
+    if ( ret < 0 )
+        ret += ring.len;
+
+    /*
+     * The maximum size payload for a message that will be accepted is:
+     * (the available space between the ring indexes)
+     * minus (space for a message header)
+     * minus (space for one message slot)
+     * since argo_ringbuf_insert requires that one message slot be left
+     * unfilled, to avoid filling the ring to capacity and confusing a full
+     * ring with an empty one.
+     */
+    ret -= sizeof(struct xen_argo_ring_message_header);
+    ret -= XEN_ARGO_ROUNDUP(1);
+
+    return (ret < 0) ? 0 : ret;
+}
+
 /*
  * argo_sanitize_ring creates a modified copy of the ring pointers
  * where the rx_ptr is rounded up to ensure it is aligned, and then
@@ -835,6 +918,58 @@ argo_pending_remove_all(struct argo_ring_info *ring_info)
         argo_pending_remove_ent(pending_ent);
 }

+static void
+argo_pending_notify(struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *ent;
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    hlist_for_each_entry_safe(ent, node, next, to_notify, node)
+    {
+        hlist_del(&ent->node);
+        argo_signal_domid(ent->domain_id, ent->domain_cookie);
+        xfree(ent);
+    }
+}
+
+static void
+argo_pending_find(const struct domain *d, struct argo_ring_info *ring_info,
+                  uint32_t payload_space, struct hlist_head *to_notify)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *ent;
+
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    /*
+     * TODO: Current policy here is to signal _all_ of the waiting domains
+     * interested in sending a message of size less than payload_space.
+     *
+     * This is likely to be suboptimal, since once one of them has added
+     * their message to the ring, there may well be insufficient room
+     * available for any of the others to transmit, meaning that they were
+     * woken in vain, which created extra work just to requeue their wait.
+     *
+     * Retain this simple policy for now since it at least avoids starving a
+     * domain of available space notifications because of a policy that only
+     * notified other domains instead. Improvement may be possible;
+     * investigation required.
+     */
+
+    spin_lock(&ring_info->lock);
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( payload_space >= ent->len )
+        {
+            hlist_del(&ent->node);
+            hlist_add_head(&ent->node, to_notify);
+        }
+    }
+    spin_unlock(&ring_info->lock);
+}
+
 static int
 argo_pending_queue(struct argo_ring_info *ring_info, domid_t src_id,
                    uint64_t src_cookie, unsigned int len)
@@ -890,6 +1025,26 @@ argo_pending_requeue(struct argo_ring_info *ring_info, domid_t src_id,
     return argo_pending_queue(ring_info, src_id, src_cookie, len);
 }

+static void
+argo_pending_cancel(struct argo_ring_info *ring_info, domid_t src_id,
+                    uint64_t src_cookie)
+{
+    struct hlist_node *node, *next;
+    struct argo_pending_ent *ent;
+
+    ASSERT(spin_is_locked(&ring_info->lock));
+
+    hlist_for_each_entry_safe(ent, node, next, &ring_info->pending, node)
+    {
+        if ( (ent->domain_id == src_id) &&
+             (ent->domain_cookie == src_cookie) )
+        {
+            hlist_del(&ent->node);
+            xfree(ent);
+        }
+    }
+}
+
 static void
 argo_ring_remove_mfns(const struct domain *d, struct argo_ring_info *ring_info)
 {
@@ -932,6 +1087,110 @@ argo_ring_remove_info(struct domain *d, struct argo_ring_info *ring_info)
     xfree(ring_info);
 }

+/* ring data */
+
+static int
+argo_fill_ring_data(const struct domain *src_d,
+                    XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_ent_hnd)
+{
+    xen_argo_ring_data_ent_t ent;
+    domid_t src_id;
+    struct domain *dst_d;
+    struct argo_ring_info *ring_info;
+    int ret;
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    ret = copy_from_guest_errno(&ent, data_ent_hnd, 1);
+    if ( ret )
+        return ret;
+
+    argo_dprintk("argo_fill_ring_data: ent.ring.domain=%u,ent.ring.port=%x\n",
+                 ent.ring.domain_id, ent.ring.port);
+
+    src_id = src_d->domain_id;
+    ent.flags = 0;
+
+    dst_d = get_domain_by_id(ent.ring.domain_id);
+
+    if ( dst_d && dst_d->argo )
+    {
+        read_lock(&dst_d->argo->lock);
+
+        ring_info = argo_ring_find_info_by_match(dst_d, ent.ring.port, src_id,
+                                                 src_d->argo->domain_cookie);
+
+        if ( ring_info )
+        {
+            uint32_t space_avail;
+
+            ent.flags |= XEN_ARGO_RING_DATA_F_EXISTS;
+            ent.max_message_size =
+                ring_info->len - sizeof(struct xen_argo_ring_message_header) -
+                XEN_ARGO_ROUNDUP(1);
+
+            spin_lock(&ring_info->lock);
+
+            space_avail = argo_ringbuf_payload_space(dst_d, ring_info);
+
+            argo_dprintk("argo_fill_ring_data: port=%x space_avail=%u"
+                         " space_wanted=%u\n",
+                         ring_info->id.addr.port, space_avail,
+                         ent.space_required);
+
+            if ( space_avail >= ent.space_required )
+            {
+                argo_pending_cancel(ring_info, src_id,
+                                    src_d->argo->domain_cookie);
+                ent.flags |= XEN_ARGO_RING_DATA_F_SUFFICIENT;
+            }
+            else
+            {
+                argo_pending_requeue(ring_info, src_id,
+                                     src_d->argo->domain_cookie,
+                                     ent.space_required);
+                ent.flags |= XEN_ARGO_RING_DATA_F_PENDING;
+            }
+
+            spin_unlock(&ring_info->lock);
+
+            if ( space_avail == ent.max_message_size )
+                ent.flags |= XEN_ARGO_RING_DATA_F_EMPTY;
+
+        }
+        read_unlock(&dst_d->argo->lock);
+    }
+
+    if ( dst_d )
+        put_domain(dst_d);
+
+    ret = __copy_field_to_guest_errno(data_ent_hnd, &ent, flags);
+    if ( ret )
+        return ret;
+    ret = __copy_field_to_guest_errno(data_ent_hnd, &ent, max_message_size);
+    if ( ret )
+        return ret;
+
+    return 0;
+}
+
+static int
+argo_fill_ring_data_array(struct domain *currd, int nent,
+                          XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) data_hnd)
+{
+    int ret = 0;
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    while ( !ret && nent-- )
+    {
+        ret = argo_fill_ring_data(currd, data_hnd);
+        guest_handle_add_offset(data_hnd, 1);
+    }
+
+    return ret;
+}
+
 /* ring */

 static int
@@ -1499,6 +1758,116 @@ argo_register_ring(struct domain *currd,
  * io
  */

+static void
+argo_notify_ring(struct domain *d, struct argo_ring_info *ring_info,
+                 struct hlist_head *to_notify)
+{
+    uint32_t space;
+
+    ASSERT(rw_is_locked(&argo_lock));
+    ASSERT(rw_is_locked(&d->argo->lock));
+
+    spin_lock(&ring_info->lock);
+
+    if ( ring_info->len )
+        space = argo_ringbuf_payload_space(d, ring_info);
+    else
+        space = 0;
+
+    spin_unlock(&ring_info->lock);
+
+    if ( space )
+        argo_pending_find(d, ring_info, space, to_notify);
+}
+
+static void
+argo_notify_check_pending(struct domain *currd)
+{
+    unsigned int i;
+    HLIST_HEAD(to_notify);
+
+    ASSERT(rw_is_locked(&argo_lock));
+
+    read_lock(&currd->argo->lock);
+
+    for ( i = 0; i < ARGO_HTABLE_SIZE; i++ )
+    {
+        struct hlist_node *node, *next;
+        struct argo_ring_info *ring_info;
+
+        hlist_for_each_entry_safe(ring_info, node, next,
+                                  &currd->argo->ring_hash[i], node)
+        {
+            argo_notify_ring(currd, ring_info, &to_notify);
+        }
+    }
+
+    read_unlock(&currd->argo->lock);
+
+    if ( !hlist_empty(&to_notify) )
+        argo_pending_notify(&to_notify);
+}
+
+static long
+argo_notify(struct domain *currd,
+            XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd)
+{
+    XEN_GUEST_HANDLE(xen_argo_ring_data_ent_t) ent_hnd;
+    xen_argo_ring_data_t ring_data;
+    int ret = 0;
+
+    read_lock(&argo_lock);
+
+    if ( !currd->argo )
+    {
+        argo_dprintk("!d->argo, ENODEV\n");
+        ret = -ENODEV;
+        goto out;
+    }
+
+    argo_notify_check_pending(currd);
+
+    if ( !guest_handle_is_null(ring_data_hnd) )
+    {
+        ret = copy_from_guest_errno(&ring_data, ring_data_hnd, 1);
+        if ( ret )
+            goto out;
+
+        /*
+         * Before performing a hypervisor write into guest memory, validate
+         * that it is memory that the guest expects these writes into by
+         * checking that the 'magic' field contains the expected value.
+         */
+        if ( ring_data.magic != XEN_ARGO_RING_DATA_MAGIC )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: incorrect ring_data.magic(%"PRIx64") vs (%llx)\n",
+                    ring_data.magic, XEN_ARGO_RING_DATA_MAGIC);
+            ret = -EINVAL;
+            goto out;
+        }
+
+        if ( ring_data.nent > ARGO_MAX_NOTIFY_COUNT )
+        {
+            gprintk(XENLOG_ERR,
+                    "argo: notify entry count(%u) exceeds max(%u)\n",
+                    ring_data.nent, ARGO_MAX_NOTIFY_COUNT);
+            ret = -EACCES;
+            goto out;
+        }
+
+        ent_hnd = guest_handle_for_field(ring_data_hnd,
+                                         xen_argo_ring_data_ent_t, data[0]);
+
+        ret = argo_fill_ring_data_array(currd, ring_data.nent, ent_hnd);
+    }
+
+ out:
+    read_unlock(&argo_lock);
+
+    return ret;
+}
+
 static long
 argo_sendv(struct domain *src_d, const xen_argo_addr_t *src_addr,
            const xen_argo_addr_t *dst_addr,
@@ -1704,6 +2073,21 @@ do_argo_message_op(unsigned int cmd, XEN_GUEST_HANDLE_PARAM(void) arg1,
         break;
     }

+    case XEN_ARGO_MESSAGE_OP_notify:
+    {
+        XEN_GUEST_HANDLE_PARAM(xen_argo_ring_data_t) ring_data_hnd =
+            guest_handle_cast(arg1, xen_argo_ring_data_t);
+
+        if ( unlikely((!guest_handle_is_null(arg2)) || arg3 || arg4) )
+        {
+            rc = -EINVAL;
+            break;
+        }
+
+        rc = argo_notify(currd, ring_data_hnd);
+        break;
+    }
+
     default:
         rc = -EOPNOTSUPP;
         break;
diff --git a/xen/include/asm-arm/guest_access.h b/xen/include/asm-arm/guest_access.h
index 5456d81..fc73572 100644
--- a/xen/include/asm-arm/guest_access.h
+++ b/xen/include/asm-arm/guest_access.h
@@ -126,6 +126,11 @@ int access_guest_memory_by_ipa(struct domain *d, paddr_t ipa, void *buf,
     raw_copy_from_guest(_d, _s, sizeof(*_d)); \
 })

+/* Errno-returning variant of copy_field_from_guest */
+#define copy_field_from_guest_errno(ptr, hnd, field)    \
+    (copy_field_from_guest((ptr), (hnd), field) ?       \
+     -EFAULT : 0)
+
 /*
  * Pre-validate a guest handle.
  * Allows use of faster __copy_* functions.
diff --git a/xen/include/asm-x86/guest_access.h b/xen/include/asm-x86/guest_access.h
index 9176150..09b137a 100644
--- a/xen/include/asm-x86/guest_access.h
+++ b/xen/include/asm-x86/guest_access.h
@@ -131,6 +131,11 @@
     raw_copy_from_guest(_d, _s, sizeof(*_d)); \
 })

+/* Errno-returning variant of copy_field_from_guest */
+#define copy_field_from_guest_errno(ptr, hnd, field)    \
+    (copy_field_from_guest((ptr), (hnd), field) ?       \
+     -EFAULT : 0)
+
 /*
  * Pre-validate a guest handle.
  * Allows use of faster __copy_* functions.
diff --git a/xen/include/public/argo.h b/xen/include/public/argo.h
index d075930..517f615 100644
--- a/xen/include/public/argo.h
+++ b/xen/include/public/argo.h
@@ -32,6 +32,7 @@
 #include "xen.h"

 #define XEN_ARGO_RING_MAGIC 0xbd67e163e7777f2fULL
+#define XEN_ARGO_RING_DATA_MAGIC 0xcce4d30fbc82e92aULL

 #define XEN_ARGO_DOMID_ANY DOMID_INVALID

 /*
@@ -130,6 +131,45 @@ typedef struct xen_argo_ring
  */
 #define XEN_ARGO_ROUNDUP(a) (((a) + 0xf) & ~(typeof(a))0xf)

+/*
+ * Notify flags
+ */
+/* Ring is empty */
+#define XEN_ARGO_RING_DATA_F_EMPTY      (1U << 0)
+/* Ring exists */
+#define XEN_ARGO_RING_DATA_F_EXISTS     (1U << 1)
+/* Pending interrupt exists. Do not rely on this field - for profiling only */
+#define XEN_ARGO_RING_DATA_F_PENDING    (1U << 2)
+/* Sufficient space to queue space_required bytes exists */
+#define XEN_ARGO_RING_DATA_F_SUFFICIENT (1U << 3)
+
+typedef struct xen_argo_ring_data_ent
+{
+    xen_argo_addr_t ring;
+    uint16_t flags;
+    uint16_t pad;
+    uint32_t space_required;
+    uint32_t max_message_size;
+} xen_argo_ring_data_ent_t;
+
+typedef struct xen_argo_ring_data
+{
+    /*
+     * Contents of the 'magic' field are inspected to verify that they contain
+     * an expected value before the hypervisor will perform writes into this
+     * structure in guest-supplied memory.
+     */
+    uint64_t magic;
+    uint32_t nent;
+    uint32_t pad;
+    uint64_t reserved[4];
+#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 199901L
+    xen_argo_ring_data_ent_t data[];
+#elif defined(__GNUC__)
+    xen_argo_ring_data_ent_t data[0];
+#endif
+} xen_argo_ring_data_t;
+
 struct xen_argo_ring_message_header
 {
     uint32_t len;
@@ -209,6 +249,33 @@ struct xen_argo_ring_message_header
  */
 #define XEN_ARGO_MESSAGE_OP_sendv 5

+/*
+ * XEN_ARGO_MESSAGE_OP_notify
+ *
+ * Asks Xen for information about other rings in the system.
+ *
+ * ent->ring is the xen_argo_addr_t of the ring you want information on.
+ * Uses the same ring matching rules as XEN_ARGO_MESSAGE_OP_sendv.
+ *
+ * ent->space_required : if this field is not null then Xen will check
+ * that there is space in the destination ring for this many bytes of payload.
+ * If sufficient space is available, it will set XEN_ARGO_RING_DATA_F_SUFFICIENT
+ * and CANCEL any pending notification for that ent->ring; otherwise it
+ * will schedule a notification event and the flag will not be set.
+ *
+ * These flags are set by Xen when notify replies:
+ * XEN_ARGO_RING_DATA_F_EMPTY       ring is empty
+ * XEN_ARGO_RING_DATA_F_PENDING     notify event is pending *don't rely on this*
+ * XEN_ARGO_RING_DATA_F_SUFFICIENT  sufficient space for space_required is there
+ * XEN_ARGO_RING_DATA_F_EXISTS      ring exists
+ *
+ * arg1: XEN_GUEST_HANDLE(xen_argo_ring_data_t) ring_data (may be NULL)
+ * arg2: NULL
+ * arg3: 0 (ZERO)
+ * arg4: 0 (ZERO)
+ */
+#define XEN_ARGO_MESSAGE_OP_notify 4
+
 /* The maximum size of a guest message that may be sent on an Argo ring. */
 #define XEN_ARGO_MAX_MSG_SIZE ((XEN_ARGO_MAX_RING_SIZE) - \
     (sizeof(struct xen_argo_ring_message_header)) - \
diff --git a/xen/include/xlat.lst b/xen/include/xlat.lst
index 742b546..53f2f50 100644
--- a/xen/include/xlat.lst
+++ b/xen/include/xlat.lst
@@ -153,3 +153,5 @@
 ?	argo_ring			argo.h
 ?	argo_iov			argo.h
 ?	argo_send_addr			argo.h
+?	argo_ring_data_ent		argo.h
+?	argo_ring_data			argo.h
--
2.7.4