
Re: [RFC PATCH 1/2] xen/memory : Add stats_table resource type


  • To: Matias Ezequiel Vara Larsen <matiasevara@xxxxxxxxx>
  • From: George Dunlap <George.Dunlap@xxxxxxxxxx>
  • Date: Fri, 17 Jun 2022 20:54:34 +0000
  • Cc: xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Dario Faggioli <dfaggioli@xxxxxxxx>
  • Delivery-date: Fri, 17 Jun 2022 20:54:55 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Preface: It looks like this may be one of your first hypervisor patches.  So thank you!  FYI there’s a lot that needs fixing up here; please don’t read a tone of annoyance into it; I’m just trying to tell you what needs fixing in the most efficient manner. :-)

On 17 May 2022, at 15:33, Matias Ezequiel Vara Larsen <matiasevara@xxxxxxxxx> wrote:

Allow to map vcpu stats using acquire_resource().

This needs a lot more expansion in terms of what it is that you’re doing in this patch and why.


Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>
---
xen/common/domain.c         | 42 +++++++++++++++++++++++++++++++++++++
xen/common/memory.c         | 29 +++++++++++++++++++++++++
xen/common/sched/core.c     |  5 +++++
xen/include/public/memory.h |  1 +
xen/include/xen/sched.h     |  5 +++++
5 files changed, 82 insertions(+)

diff --git a/xen/common/domain.c b/xen/common/domain.c
index 17cc32fde3..ddd9f88874 100644
--- a/xen/common/domain.c
+++ b/xen/common/domain.c
@@ -132,6 +132,42 @@ static void vcpu_info_reset(struct vcpu *v)
    v->vcpu_info_mfn = INVALID_MFN;
}

+static void stats_free_buffer(struct vcpu * v)
+{
+    struct page_info *pg = v->stats.pg;
+
+    if ( !pg )
+        return;
+
+    v->stats.va = NULL;
+
+    if ( v->stats.va )
+        unmap_domain_page_global(v->stats.va);
+
+    v->stats.va = NULL;

Looks like you meant to delete the first `v->stats.va = NULL` but forgot?

+
+    free_domheap_page(pg);

Pretty sure this will crash.

Unfortunately, page allocation and freeing are somewhat complicated and require a bit of boilerplate.  You can look at the vmtrace_alloc_buffer() code for a template, but the general sequence you want is as follows:

* On the allocate side:

1. Allocate the page

   pg = alloc_domheap_page(d, MEMF_no_refcount);

This will allocate a page with the PGC_allocated bit set and a single reference count.  (If you pass a page with PGC_allocated set to free_domheap_page(), it will crash; which is why I said the above.)

2.  Grab a general reference count for the vcpu struct’s reference to it, as well as a writable type count, so that it doesn’t get used as a special page:

if ( !get_page_and_type(pg, d, PGT_writable_page) )
{
    put_page_alloc_ref(pg);
    /* failure path */
}

* On the free side, don’t call free_domheap_pages() directly.  Rather, drop the allocation reference, then drop your own general and type references, thus:

v->stats.va = NULL;

put_page_alloc_ref(pg);
put_page_and_type(pg);

The issue here is that the whole point of this exercise is to allow guest userspace in dom0 to gain a reference to the page; so we can’t actually free the page until *all* references are gone, including that userspace one in dom0.  The put_page() which brings the reference count to 0 will automatically free the page.
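
Putting those pieces together, a rough and untested sketch of the shape I’d expect the two helpers to take, modelled on vmtrace_alloc_buffer() / vmtrace_free_buffer() (treat the error values and details as placeholders):

static int stats_alloc_buffer(struct vcpu *v)
{
    struct domain *d = v->domain;
    struct page_info *pg;

    pg = alloc_domheap_page(d, MEMF_no_refcount);
    if ( !pg )
        return -ENOMEM;

    /* General ref for v->stats.pg, plus a writable type ref. */
    if ( !get_page_and_type(pg, d, PGT_writable_page) )
    {
        /*
         * The domain can't know about this page yet, so a failure here
         * is a clear indication of something fishy going on.
         */
        put_page_alloc_ref(pg);
        return -ENODATA;
    }

    v->stats.va = __map_domain_page_global(pg);
    if ( !v->stats.va )
    {
        put_page_alloc_ref(pg);
        put_page_and_type(pg);
        return -ENOMEM;
    }

    v->stats.pg = pg;
    clear_page(v->stats.va);

    return 0;
}

static void stats_free_buffer(struct vcpu *v)
{
    struct page_info *pg = v->stats.pg;

    if ( !pg )
        return;

    v->stats.pg = NULL;

    unmap_domain_page_global(v->stats.va);
    v->stats.va = NULL;

    /*
     * Drop the allocation ref plus our own general and type refs; whichever
     * put_page() brings the count to zero (possibly a later one, when the
     * dom0 userspace mapping goes away) actually frees the page.
     */
    put_page_alloc_ref(pg);
    put_page_and_type(pg);
}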


+}
+
+static int stats_alloc_buffer(struct vcpu *v)
+{
+    struct domain *d = v->domain;
+    struct page_info *pg;
+
+    pg = alloc_domheap_page(d, MEMF_no_refcount);
+
+    if ( !pg )
+        return -ENOMEM;
+
+    v->stats.va = __map_domain_page_global(pg);
+    if ( !v->stats.va )
+        return -ENOMEM;
+
+    v->stats.pg = pg;
+    clear_page(v->stats.va);
+    return 0;
+}

The other thing to say about this is that the memory is being allocated unconditionally, even if nobody is planning to read it.  The vast majority of Xen users will not be running xcp-rrd, so it will be pointless overhead.

At a basic level, you want to follow suit with the vmtrace buffers, which are only allocated if the proper domctl flag is set on domain creation.  (We could consider instead, or in addition, having a global Xen command-line parameter which would enable this feature for all domains.)
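
For example, mirroring how the vmtrace buffers are gated on d->vmtrace_size, something like the following (XEN_DOMCTL_CDF_vcpu_stats and d->vcpu_stats_enabled are names I’ve just made up to show the shape; neither exists today):

    /* In domain_create(), from a new createdomain flag: */
    d->vcpu_stats_enabled = !!(config->flags & XEN_DOMCTL_CDF_vcpu_stats);

    /* And at the top of stats_alloc_buffer(): */
    if ( !d->vcpu_stats_enabled )
        return 0;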

+
static void vmtrace_free_buffer(struct vcpu *v)
{
    const struct domain *d = v->domain;
@@ -203,6 +239,9 @@ static int vmtrace_alloc_buffer(struct vcpu *v)
 */
static int vcpu_teardown(struct vcpu *v)
{
+
+    stats_free_buffer(v);
+
    vmtrace_free_buffer(v);

    return 0;
@@ -269,6 +308,9 @@ struct vcpu *vcpu_create(struct domain *d, unsigned int vcpu_id)
    if ( vmtrace_alloc_buffer(v) != 0 )
        goto fail_wq;

+    if ( stats_alloc_buffer(v) != 0 )
+        goto fail_wq;
+
    if ( arch_vcpu_create(v) != 0 )
        goto fail_sched;

diff --git a/xen/common/memory.c b/xen/common/memory.c
index 297b98a562..39de6d9d05 100644
--- a/xen/common/memory.c
+++ b/xen/common/memory.c
@@ -1099,6 +1099,10 @@ static unsigned int resource_max_frames(const struct domain *d,
    case XENMEM_resource_vmtrace_buf:
        return d->vmtrace_size >> PAGE_SHIFT;

+    // WIP: to figure out the correct size of the resource
+    case XENMEM_resource_stats_table:
+        return 1;
+
    default:
        return -EOPNOTSUPP;
    }
@@ -1162,6 +1166,28 @@ static int acquire_vmtrace_buf(
    return nr_frames;
}

+static int acquire_stats_table(struct domain *d,
+                                unsigned int id,
+                                unsigned int frame,
+                                unsigned int nr_frames,
+                                xen_pfn_t mfn_list[])
+{
+    const struct vcpu *v = domain_vcpu(d, id);
+    mfn_t mfn;
+

Maybe I’m paranoid, but I might add an `ASSERT(nr_frames == 1)` here.

+    if ( !v )
+        return -ENOENT;
+
+    if ( !v->stats.pg )
+        return -EINVAL;
+
+    mfn = page_to_mfn(v->stats.pg);
+    mfn_list[0] = mfn_x(mfn);
+
+    printk("acquire_perf_table: id: %d, nr_frames: %d, %p, domainid: %d\n", id, nr_frames, v->stats.pg, d->domain_id);
+    return 1;
+}
+
/*
 * Returns -errno on error, or positive in the range [1, nr_frames] on
 * success.  Returning less than nr_frames contitutes a request for a
@@ -1182,6 +1208,9 @@ static int _acquire_resource(
    case XENMEM_resource_vmtrace_buf:
        return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);

+    case XENMEM_resource_stats_table:
+        return acquire_stats_table(d, id, frame, nr_frames, mfn_list);
+
    default:
        return -EOPNOTSUPP;
    }
diff --git a/xen/common/sched/core.c b/xen/common/sched/core.c
index 8f4b1ca10d..2a8b534977 100644
--- a/xen/common/sched/core.c
+++ b/xen/common/sched/core.c
@@ -264,6 +264,7 @@ static inline void vcpu_runstate_change(
{
    s_time_t delta;
    struct sched_unit *unit = v->sched_unit;
+    uint64_t * runstate;

    ASSERT(spin_is_locked(get_sched_res(v->processor)->schedule_lock));
    if ( v->runstate.state == new_state )
@@ -287,6 +288,10 @@ static inline void vcpu_runstate_change(
    }

    v->runstate.state = new_state;
+
+    // WIP: use a different interface
+    runstate = (uint64_t*)v->stats.va;

I think you should look at xen.git/xen/include/public/hvm/ioreq.h:shared_iopage_t for inspiration.  Basically, you cast the void pointer to that type, and then just access `iopage->vcpu_ioreq[vcpuid]`.   Put it in a public header, and then both the userspace consumer and Xen can use the same structure.

As I said in my response to the cover letter, I think vcpu_runstate_info_t would be something to look at and gain inspiration from.
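
Concretely, I’m imagining something of roughly this shape in a new public header (every name and field below is invented, purely to illustrate the pattern; the actual contents are what we’d need to agree on):

/* Hypothetical xen/include/public/vcpu_stats.h */
struct vcpu_stats {
    /*
     * Version number: bumped to odd before an update and back to even
     * afterwards, so userspace can detect and retry torn reads.
     */
    uint32_t version;
    uint32_t pad;
    uint64_t runstate_running_time;
};
typedef struct vcpu_stats vcpu_stats_t;

struct shared_vcpustatspage {
    struct vcpu_stats vcpu_stats[1];   /* Indexed by vcpu_id, a la vcpu_ioreq[]. */
};
typedef struct shared_vcpustatspage shared_vcpustatspage_t;

Whether that ends up as one shared page per domain indexed by vcpu_id (as shared_iopage_t is) or one page per vcpu (as in this patch) is a separate design question.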

The other thing to say here is that this is a hot path; we don’t want to be copying lots of information around if it’s not going to be used.  So you should only allocate the buffers if specifically enabled, and you should only copy the information over if v->stats.va != NULL.
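
With the invented structure above, the update in vcpu_runstate_change() would then look something like this (the seqcount-style version bumps are just one way of letting the reader detect inconsistent data):

    if ( v->stats.va )
    {
        shared_vcpustatspage_t *page = v->stats.va;
        struct vcpu_stats *stats = &page->vcpu_stats[v->vcpu_id];

        stats->version++;                   /* odd: update in progress */
        smp_wmb();
        stats->runstate_running_time = v->runstate.time[RUNSTATE_running];
        smp_wmb();
        stats->version++;                   /* even: consistent again */
    }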

I think that should be enough to start with; we can nail down more when you send v1. 

Peace,
 -George



 

