
[RFC PATCH] xen/docs: Document acquire resource interface



This commit adds a new document that describes the acquire resource interface.
It is intended as a reference document.

Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>
---
RFC: The current document still contains TODOs. I am not really sure why
different resources are implemented differently. I would like to understand it
better so I can document it and then easily build new resources. I structured
the document in two sections but I am not sure if that is the right way to do
it.

---
 .../acquire_resource_reference.rst            | 337 ++++++++++++++++++
 docs/hypervisor-guide/index.rst               |   2 +
 2 files changed, 339 insertions(+)
 create mode 100644 docs/hypervisor-guide/acquire_resource_reference.rst

diff --git a/docs/hypervisor-guide/acquire_resource_reference.rst b/docs/hypervisor-guide/acquire_resource_reference.rst
new file mode 100644
index 0000000000..a9944aae1d
--- /dev/null
+++ b/docs/hypervisor-guide/acquire_resource_reference.rst
@@ -0,0 +1,337 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Acquire resource reference
+==========================
+
+The acquire resource interface allows a dom0 PV tool to share a resource with a
+domain. Resources are generally represented by pages that are mapped into the
+PV tool's memory space. These pages are accessed by Xen and may or may not be
+accessed by the DomU itself. This document describes the API that PV tools use
+to map such resources. It also describes the software components required to
+create and expose a domain's resource. This is not a tutorial or a how-to
+guide; it merely describes the machinery that already exists in the code
+itself.
+
+.. warning::
+
+    The code in this document may already be out of date; however, it should
+    still be enough to illustrate how the acquire resource interface works.
+
+
+PV tool API
+-----------
+
+This section describes the API that a PV tool uses to map a resource. The API
+is based on the following functions:
+
+* xenforeignmemory_open()
+
+* xenforeignmemory_resource_size()
+
+* xenforeignmemory_map_resource()
+
+* xenforeignmemory_unmap_resource()
+
+The ``xenforeignmemory_open()`` function returns the handle that is used by the
+rest of the functions:
+
+.. code-block:: c
+
+   fh = xenforeignmemory_open(NULL, 0);
+
+The ``xenforeignmemory_resource_size()`` function gets the size of a resource.
+For example, in the following code, we get the size of the
+``XENMEM_resource_vmtrace_buf`` resource:
+
+.. code-block:: c
+
+    rc = xenforeignmemory_resource_size(fh, domid, XENMEM_resource_vmtrace_buf,
+                                        vcpu, &size);
+
+The size of the resource, in bytes, is returned in ``size``.
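+
+Note that ``xenforeignmemory_map_resource()``, described next, takes a number
+of frames rather than a size in bytes, so callers typically convert the value
+returned here, e.g. using the ``XC_PAGE_SHIFT`` constant from the tools
+headers:
+
+.. code-block:: c
+
+    nr_frames = size >> XC_PAGE_SHIFT;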
+
+The ``xenforeignmemory_map_resource()`` function maps a domain's resource. The
+function is declared as follows:
+
+.. code-block:: c
+
+    xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+        xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+        unsigned int id, unsigned long frame, unsigned long nr_frames,
+        void **paddr, int prot, int flags);
+
+The size of the mapping is expressed in number of frames, not bytes. For
+example, **QEMU** uses this function to map the ioreq server pages shared
+between the domain and QEMU:
+
+.. code-block:: c
+
+    fres = xenforeignmemory_map_resource(xen_fmem, xen_domid,
+         XENMEM_resource_ioreq_server, state->ioservid, 0, 2, &addr,
+         PROT_READ | PROT_WRITE, 0);
+
+
+The third parameter corresponds to the resource that is requested from the
+domain, e.g., ``XENMEM_resource_ioreq_server``. The seventh parameter is a
+pointer through which the address of the mapped resource is returned.
+
+Finally, the ``xenforeignmemory_unmap_resource()`` function unmaps the region:
+
+.. code-block:: c
+    :caption: tools/misc/xen-vmtrace.c
+
+    if ( fres && xenforeignmemory_unmap_resource(fh, fres) )
+        perror("xenforeignmemory_unmap_resource()");
+
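+Putting the four functions together, the following is a minimal sketch (not
+taken verbatim from the Xen tree) of a PV tool that maps and then unmaps the
+vmtrace buffer of vcpu 0. ``domid`` is assumed to hold the id of the target
+domain, and ``XC_PAGE_SHIFT`` comes from ``xenctrl.h``:
+
+.. code-block:: c
+
+    #include <err.h>
+    #include <stdio.h>
+    #include <sys/mman.h>
+    #include <xenctrl.h>
+    #include <xenforeignmemory.h>
+
+    xenforeignmemory_handle *fh;
+    xenforeignmemory_resource_handle *fres;
+    size_t size;
+    void *buf = NULL;
+
+    /* Open the interface; the returned handle is used by all other calls. */
+    fh = xenforeignmemory_open(NULL, 0);
+    if ( !fh )
+        err(1, "xenforeignmemory_open()");
+
+    /* Query the size, in bytes, of the vmtrace buffer of vcpu 0. */
+    if ( xenforeignmemory_resource_size(fh, domid, XENMEM_resource_vmtrace_buf,
+                                        0 /* vcpu */, &size) )
+        err(1, "xenforeignmemory_resource_size()");
+
+    /* Map the whole resource; the size is converted from bytes to frames. */
+    fres = xenforeignmemory_map_resource(fh, domid, XENMEM_resource_vmtrace_buf,
+                                         0 /* vcpu */, 0, size >> XC_PAGE_SHIFT,
+                                         &buf, PROT_READ, 0);
+    if ( !fres )
+        err(1, "xenforeignmemory_map_resource()");
+
+    /* ... consume the trace data at 'buf' ... */
+
+    if ( xenforeignmemory_unmap_resource(fh, fres) )
+        perror("xenforeignmemory_unmap_resource()");
+
+    xenforeignmemory_close(fh);
+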
+Sharing a resource with a PV tool
+---------------------------------
+
+In this section, we describe how to build a new resource and share it with a PV
+tool. Resources are defined in ``xen/include/public/memory.h``. In Xen-4.16,
+there are three resources:
+
+.. code-block:: c
+    :caption: xen/include/public/memory.h
+
+    #define XENMEM_resource_ioreq_server 0
+    #define XENMEM_resource_grant_table 1
+    #define XENMEM_resource_vmtrace_buf 2
+
+The ``resource_max_frames()`` function returns the size, in frames, of a
+resource. Each resource provides its own handler to compute the size. This is
+the definition of ``resource_max_frames()``:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/memory.c
+
+    static unsigned int resource_max_frames(const struct domain *d,
+                                            unsigned int type, unsigned int id)
+    {
+        switch ( type )
+        {
+        case XENMEM_resource_grant_table:
+            return gnttab_resource_max_frames(d, id);
+
+        case XENMEM_resource_ioreq_server:
+            return ioreq_server_max_frames(d);
+
+        case XENMEM_resource_vmtrace_buf:
+            return d->vmtrace_size >> PAGE_SHIFT;
+
+        default:
+            return -EOPNOTSUPP;
+        }
+    }
+
+The ``_acquire_resource()`` function invokes the handler that maps the
+resource. It relies on ``type`` to select the right handler:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/memory.c
+
+    static int _acquire_resource(
+        struct domain *d, unsigned int type, unsigned int id,
+        unsigned int frame, unsigned int nr_frames, xen_pfn_t mfn_list[])
+    {
+        switch ( type )
+        {
+        case XENMEM_resource_grant_table:
+            return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
+
+        case XENMEM_resource_ioreq_server:
+            return acquire_ioreq_server(d, id, frame, nr_frames, mfn_list);
+
+        case XENMEM_resource_vmtrace_buf:
+            return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
+
+        default:
+            return -EOPNOTSUPP;
+        }
+    }
+
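+Note that if a new resource is added, both of these functions need to be
+extended. Purely as an illustration, a hypothetical ``XENMEM_resource_foo``
+(neither the resource nor the ``foo_max_frames()`` / ``acquire_foo()`` helpers
+exist in the tree) would be wired in roughly as follows:
+
+.. code-block:: c
+
+    /* In resource_max_frames(): */
+    case XENMEM_resource_foo:
+        return foo_max_frames(d);
+
+    /* In _acquire_resource(): */
+    case XENMEM_resource_foo:
+        return acquire_foo(d, id, frame, nr_frames, mfn_list);
+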
+The per-resource acquire handlers share a common declaration:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/memory.c
+
+    static int acquire_vmtrace_buf(
+        struct domain *d, unsigned int id, unsigned int frame,
+        unsigned int nr_frames, xen_pfn_t mfn_list[])
+    {
+
+The handler returns ``nr_frames`` MFNs in ``mfn_list[]``, starting at offset
+``frame`` within the resource. For example, for the
+``XENMEM_resource_vmtrace_buf`` resource, the handler is defined as follows:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/memory.c
+
+    static int acquire_vmtrace_buf(
+        struct domain *d, unsigned int id, unsigned int frame,
+        unsigned int nr_frames, xen_pfn_t mfn_list[])
+    {
+        const struct vcpu *v = domain_vcpu(d, id);
+        unsigned int i;
+        mfn_t mfn;
+
+        if ( !v )
+            return -ENOENT;
+
+        if ( !v->vmtrace.pg ||
+             (frame + nr_frames) > (d->vmtrace_size >> PAGE_SHIFT) )
+            return -EINVAL;
+
+        mfn = page_to_mfn(v->vmtrace.pg);
+
+        for ( i = 0; i < nr_frames; i++ )
+            mfn_list[i] = mfn_x(mfn) + frame + i;
+
+        return nr_frames;
+    }
+
+Note that the handler only returns the MFNs of pages that have been previously
+allocated in ``vmtrace.pg``. The allocation of the resource happens when each
+vcpu is instantiated. The pages are allocated from the domheap with the
+``MEMF_no_refcount`` flag:
+
+.. What do we require to set this flag?
+
+.. code-block:: c
+
+    v->vmtrace.pg = alloc_domheap_page(d, MEMF_no_refcount);
+
+To access the pages in the context of Xen, we are required to map the page by
+using:
+
+.. code-block:: c
+
+    va_page = __map_domain_page_global(page);
+
+The ``va_page`` pointer can then be used from the context of Xen. After
+allocation, the allocating function verifies each page. For example, the
+following code is from ``vmtrace_alloc_buffer()``, which allocates the vmtrace
+pages for a given vcpu:
+
+.. Why is this verification required after allocation?
+
+.. code-block:: c
+
+    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
+        if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) )
+            /*
+             * The domain can't possibly know about this page yet, so failure
+             * here is a clear indication of something fishy going on.
+             */
+            goto refcnt_err;
+
+The allocated pages are released by first unmapping them, e.g. with
+``unmap_domain_page_global()``, and then freeing them with
+``free_domheap_page()``. Note that the releasing of these resources may vary
+depending on how they are allocated.
+
+Acquire Resources
+-----------------
+
+This section briefly describes the resources that rely on the acquire resource
+interface. These resources are mapped by PV tools such as QEMU.
+
+Intel Processor Trace (IPT)
+```````````````````````````
+
+This resource is named ``XENMEM_resource_vmtrace_buf`` and its size in bytes is
+set in ``d->vmtrace_size``. It contains the traces that IPT generates for each
+vcpu. The pages are allocated during ``vcpu_create()`` and are stored in the
+``vcpu`` structure in ``sched.h``:
+
+.. code-block:: c
+
+    struct {
+        struct page_info *pg; /* One contiguous allocation of d->vmtrace_size */
+    } vmtrace;
+
+During ``vcpu_create()``, ``pg`` is allocated from the domain heap:
+
+.. code-block:: c
+
+    pg = alloc_domheap_pages(d, get_order_from_bytes(d->vmtrace_size),
+                             MEMF_no_refcount);
+
+For a given vcpu, the address of the buffer is loaded into the
+``MSR_RTIT_OUTPUT_BASE`` MSR in ``vmx_restore_guest_msrs()``:
+
+.. code-block:: c
+    :caption: xen/arch/x86/hvm/vmx/vmx.c
+
+    wrmsrl(MSR_RTIT_OUTPUT_BASE, page_to_maddr(v->vmtrace.pg));
+
+The pages are released during vcpu teardown.
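+
+Concretely, the teardown drops, for each page, the references that were taken
+when the buffer was allocated. The following is only a sketch based on the
+allocation code shown earlier, not a verbatim copy of the Xen sources:
+
+.. code-block:: c
+
+    /* Sketch: release the vmtrace buffer of vcpu 'v'.  Each page drops the
+     * allocation reference and the PGT_writable_page type reference that
+     * were taken when the buffer was set up. */
+    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
+    {
+        put_page_alloc_ref(&pg[i]);
+        put_page_and_type(&pg[i]);
+    }
+
+    v->vmtrace.pg = NULL;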
+
+Grant Table
+```````````
+
+Grant tables are represented by the ``XENMEM_resource_grant_table`` resource.
+They are special since guests can map their own grant tables. Dom0 also needs
+to write into a guest's grant table to set up the grants for xenstored and
+xenconsoled. When acquiring the resource, the pages are allocated from the Xen
+heap in ``gnttab_get_shared_frame_mfn()``:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/grant_table.c
+
+    gt->shared_raw[i] = alloc_xenheap_page();
+    share_xen_page_with_guest(virt_to_page(gt->shared_raw[i]), d, SHARE_rw);
+
+These pages are then shared with the guest. Before returning, their virtual
+addresses are converted to MFNs:
+
+.. code-block:: c
+    :linenos:
+
+    for ( i = 0; i < nr_frames; ++i )
+         mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
+
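+From the PV tool side, grant table frames are mapped like any other resource.
+Purely as an illustration, and reusing the ``fh``, ``domid`` and ``addr``
+variables from the earlier examples, mapping the first frame of a guest's
+shared grant table might look as follows
+(``XENMEM_resource_grant_table_id_shared`` selects the shared table rather than
+the status frames):
+
+.. code-block:: c
+
+    fres = xenforeignmemory_map_resource(fh, domid,
+                                         XENMEM_resource_grant_table,
+                                         XENMEM_resource_grant_table_id_shared,
+                                         0, 1, &addr,
+                                         PROT_READ | PROT_WRITE, 0);
+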
+Ioreq server
+````````````
+
+The ioreq server is represented by the ``XENMEM_resource_ioreq_server``
+resource. An ioreq server provides emulated devices to HVM and PVH guests. The
+allocation is done in ``ioreq_server_alloc_mfn()``. The following code
+partially shows the allocation of the pages that represent the ioreq server:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/ioreq.c
+
+    page = alloc_domheap_page(s->target, MEMF_no_refcount);
+
+    iorp->va = __map_domain_page_global(page);
+    if ( !iorp->va )
+        goto fail;
+
+    iorp->page = page;
+    clear_page(iorp->va);
+    return 0;
+
+The function above is invoked from ``ioreq_server_get_frame()``, which is
+called from ``acquire_ioreq_server()``. When the resource is acquired, the
+previously allocated pages are returned as follows:
+
+.. code-block:: c
+
+    *mfn = page_to_mfn(s->bufioreq.page);
+
+The ``ioreq_server_free_mfn()`` function releases the pages as follows:
+
+.. code-block:: c
+    :linenos:
+    :caption: xen/common/ioreq.c
+
+    unmap_domain_page_global(iorp->va);
+    iorp->va = NULL;
+
+    put_page_alloc_ref(page);
+    put_page_and_type(page);
+
+.. TODO: Why aren't unmap() and free() used instead?
diff --git a/docs/hypervisor-guide/index.rst b/docs/hypervisor-guide/index.rst
index e4393b0697..961a11525f 100644
--- a/docs/hypervisor-guide/index.rst
+++ b/docs/hypervisor-guide/index.rst
@@ -9,3 +9,5 @@ Hypervisor documentation
    code-coverage
 
    x86/index
+
+   acquire_resource_reference
-- 
2.25.1
