[RFC PATCH] xen/docs: Document acquire resource interface
This commit adds a new reference document describing the acquire resource
interface.

Signed-off-by: Matias Ezequiel Vara Larsen <matias.vara@xxxxxxxx>
---
RFC: The current document still contains TODOs. I am not really sure why
different resources are implemented differently. I would like to understand it
better so I can document it and then easily build new resources. I structured
the document in two sections but I am not sure if that is the right way to do
it.
---
 .../acquire_resource_reference.rst | 337 ++++++++++++++++++
 docs/hypervisor-guide/index.rst | 2 +
 2 files changed, 339 insertions(+)
 create mode 100644 docs/hypervisor-guide/acquire_resource_reference.rst

diff --git a/docs/hypervisor-guide/acquire_resource_reference.rst b/docs/hypervisor-guide/acquire_resource_reference.rst
new file mode 100644
index 0000000000..a9944aae1d
--- /dev/null
+++ b/docs/hypervisor-guide/acquire_resource_reference.rst
@@ -0,0 +1,337 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Acquire resource reference
+==========================
+
+The acquire resource interface allows a resource to be shared between a
+domain and a dom0 PV tool. Resources are generally represented by pages that
+are mapped into the PV tool's address space. These pages are accessed by Xen
+and may or may not be accessed by the DomU itself. This document describes
+the API used to build PV tools, as well as the software components required
+to create and expose a domain's resource. It is not a tutorial or a how-to
+guide; it merely describes the machinery that is already described in the
+code itself.
+
+.. warning::
+
+   The code in this document may already be out of date; however, it may
+   be enough to illustrate how the acquire resource interface works.
+
+
+PV tool API
+-----------
+
+This section describes the API used to map a resource from a PV tool.
+The API is based on the following functions:
+
+* xenforeignmemory_open()
+
+* xenforeignmemory_resource_size()
+
+* xenforeignmemory_map_resource()
+
+* xenforeignmemory_unmap_resource()
+
+The ``xenforeignmemory_open()`` function returns the handle that is used by
+the rest of the functions:
+
+.. code-block:: c
+
+   fh = xenforeignmemory_open(NULL, 0);
+
+The ``xenforeignmemory_resource_size()`` function gets the size of a
+resource. For example, the following code gets the size of the
+``XENMEM_resource_vmtrace_buf`` resource:
+
+.. code-block:: c
+
+   rc = xenforeignmemory_resource_size(fh, domid, XENMEM_resource_vmtrace_buf,
+                                       vcpu, &size);
+
+The size of the resource is returned in ``size``, in bytes.
+
+The ``xenforeignmemory_map_resource()`` function maps a domain's resource.
+The function is declared as follows:
+
+.. code-block:: c
+
+   xenforeignmemory_resource_handle *xenforeignmemory_map_resource(
+       xenforeignmemory_handle *fmem, domid_t domid, unsigned int type,
+       unsigned int id, unsigned long frame, unsigned long nr_frames,
+       void **paddr, int prot, int flags);
+
+The size of the mapping is expressed as a number of frames. For example,
+**QEMU** uses this function to map the ioreq server between the domain and
+QEMU:
+
+.. code-block:: c
+
+   fres = xenforeignmemory_map_resource(xen_fmem, xen_domid,
+                                        XENMEM_resource_ioreq_server,
+                                        state->ioservid, 0, 2, &addr,
+                                        PROT_READ | PROT_WRITE, 0);
+
+The third parameter corresponds to the resource that we request from the
+domain, e.g., ``XENMEM_resource_ioreq_server``. The seventh parameter is a
+pointer through which the address of the mapped resource is returned.
+
+Finally, the ``xenforeignmemory_unmap_resource()`` function unmaps the
+region:
+
+.. code-block:: c
+   :caption: tools/misc/xen-vmtrace.c
+
+   if ( fres && xenforeignmemory_unmap_resource(fh, fres) )
+       perror("xenforeignmemory_unmap_resource()");
+
+Sharing a resource with a PV tool
+---------------------------------
+
+This section describes how to build a new resource and share it with a PV
+tool. Resources are defined in ``xen/include/public/memory.h``. In Xen 4.16,
+there are three resources:
+
+.. code-block:: c
+   :caption: xen/include/public/memory.h
+
+   #define XENMEM_resource_ioreq_server 0
+   #define XENMEM_resource_grant_table 1
+   #define XENMEM_resource_vmtrace_buf 2
+
+The ``resource_max_frames()`` function returns the size of a resource in
+frames. Each resource may provide a handler to compute this size. This is
+the definition of the ``resource_max_frames()`` function:
+
+.. code-block:: c
+   :linenos:
+   :caption: xen/common/memory.c
+
+   static unsigned int resource_max_frames(const struct domain *d,
+                                           unsigned int type, unsigned int id)
+   {
+       switch ( type )
+       {
+       case XENMEM_resource_grant_table:
+           return gnttab_resource_max_frames(d, id);
+
+       case XENMEM_resource_ioreq_server:
+           return ioreq_server_max_frames(d);
+
+       case XENMEM_resource_vmtrace_buf:
+           return d->vmtrace_size >> PAGE_SHIFT;
+
+       default:
+           return -EOPNOTSUPP;
+       }
+   }
+
+The ``_acquire_resource()`` function invokes the handler that maps the
+resource, relying on ``type`` to select the right one:
+
+.. code-block:: c
+   :linenos:
+   :caption: xen/common/memory.c
+
+   static int _acquire_resource(
+       struct domain *d, unsigned int type, unsigned int id, unsigned int frame,
+       unsigned int nr_frames, xen_pfn_t mfn_list[])
+   {
+       switch ( type )
+       {
+       case XENMEM_resource_grant_table:
+           return gnttab_acquire_resource(d, id, frame, nr_frames, mfn_list);
+
+       case XENMEM_resource_ioreq_server:
+           return acquire_ioreq_server(d, id, frame, nr_frames, mfn_list);
+
+       case XENMEM_resource_vmtrace_buf:
+           return acquire_vmtrace_buf(d, id, frame, nr_frames, mfn_list);
+
+       default:
+           return -EOPNOTSUPP;
+       }
+   }
+
+Note that if a new resource has to be added, these two functions need to be
+modified. The handlers share a common declaration:
+
+.. code-block:: c
+   :caption: xen/common/memory.c
+
+   static int acquire_vmtrace_buf(
+       struct domain *d, unsigned int id, unsigned int frame,
+       unsigned int nr_frames, xen_pfn_t mfn_list[]);
+
+A handler returns up to ``nr_frames`` machine frame numbers in
+``mfn_list[]``. For example, for the ``XENMEM_resource_vmtrace_buf``
+resource, the handler is defined as follows:
+
+.. code-block:: c
+   :linenos:
+   :caption: xen/common/memory.c
+
+   static int acquire_vmtrace_buf(
+       struct domain *d, unsigned int id, unsigned int frame,
+       unsigned int nr_frames, xen_pfn_t mfn_list[])
+   {
+       const struct vcpu *v = domain_vcpu(d, id);
+       unsigned int i;
+       mfn_t mfn;
+
+       if ( !v )
+           return -ENOENT;
+
+       if ( !v->vmtrace.pg ||
+            (frame + nr_frames) > (d->vmtrace_size >> PAGE_SHIFT) )
+           return -EINVAL;
+
+       mfn = page_to_mfn(v->vmtrace.pg);
+
+       for ( i = 0; i < nr_frames; i++ )
+           mfn_list[i] = mfn_x(mfn) + frame + i;
+
+       return nr_frames;
+   }
+
+Note that the handler only returns frames that were previously allocated in
+``vmtrace.pg``. The resource itself is allocated during the instantiation of
+each vcpu.
+The page is allocated from the domain heap with the ``MEMF_no_refcount``
+flag:
+
+.. What do we require to set this flag?
+
+.. code-block:: c
+
+   v->vmtrace.pg = alloc_domheap_page(s->target, MEMF_no_refcount);
+
+To access the page in the context of Xen, it must first be mapped:
+
+.. code-block:: c
+
+   va_page = __map_domain_page_global(page);
+
+The ``va_page`` pointer is then usable in the context of Xen. The function
+that allocates the pages runs the following verification after allocation.
+For example, the following code is from ``vmtrace_alloc_buffer()``, which
+allocates the pages for vmtrace for a given vcpu:
+
+.. Why is this verification required after allocation?
+
+.. code-block:: c
+
+   for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
+       if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) )
+           /*
+            * The domain can't possibly know about this page yet, so failure
+            * here is a clear indication of something fishy going on.
+            */
+           goto refcnt_err;
+
+The allocated pages are released by first using
+``unmap_domain_page_global()`` and then ``free_domheap_page()``. Note that
+the way these resources are released may vary depending on how they were
+allocated.
+
+Acquire Resources
+-----------------
+
+This section briefly describes the resources that rely on the acquire
+resource interface. These resources are mapped by PV tools like QEMU.
+
+Intel Processor Trace (IPT)
+```````````````````````````
+
+This resource is named ``XENMEM_resource_vmtrace_buf`` and its size in bytes
+is set in ``d->vmtrace_size``. It contains the traces generated by the IPT.
+These traces are generated per vcpu. The pages are allocated during
+``vcpu_create()`` and stored in the ``vcpu`` structure in ``sched.h``:
+
+.. code-block:: c
+
+   struct {
+       struct page_info *pg; /* One contiguous allocation of d->vmtrace_size */
+   } vmtrace;
+
+During ``vcpu_create()``, ``pg`` is allocated from the domain heap:
+
+.. code-block:: c
+
+   pg = alloc_domheap_pages(d, get_order_from_bytes(d->vmtrace_size),
+                            MEMF_no_refcount);
+
+For a given vcpu, the page is loaded into the guest at
+``vmx_restore_guest_msrs()``:
+
+.. code-block:: c
+   :caption: xen/arch/x86/hvm/vmx/vmx.c
+
+   wrmsrl(MSR_RTIT_OUTPUT_BASE, page_to_maddr(v->vmtrace.pg));
+
+The pages are released during the vcpu teardown.
+
+Grant Table
+```````````
+
+The grant tables are represented by the ``XENMEM_resource_grant_table``
+resource. Grant tables are special since guests can map them themselves.
+Dom0 also needs to write into the grant table to set up the grants for
+xenstored and xenconsoled. When acquiring the resource, the pages are
+allocated from the Xen heap in ``gnttab_get_shared_frame_mfn()``:
+
+.. code-block:: c
+   :linenos:
+   :caption: xen/common/grant_table.c
+
+   gt->shared_raw[i] = alloc_xenheap_page();
+   share_xen_page_with_guest(virt_to_page(gt->shared_raw[i]), d, SHARE_rw);
+
+The pages are then shared with the guest and converted from virtual
+addresses to mfns before returning:
+
+.. code-block:: c
+   :linenos:
+
+   for ( i = 0; i < nr_frames; ++i )
+       mfn_list[i] = virt_to_mfn(vaddrs[frame + i]);
+
+Ioreq server
+````````````
+
+The ioreq server is represented by the ``XENMEM_resource_ioreq_server``
+resource. An ioreq server provides emulated devices to HVM and PVH guests.
+The allocation is done in ``ioreq_server_alloc_mfn()``. The following code
+partially shows the allocation of the pages that represent the ioreq server:
+
+.. code-block:: c
+   :linenos:
+   :caption: xen/common/ioreq.c
+
+   page = alloc_domheap_page(s->target, MEMF_no_refcount);
+
+   iorp->va = __map_domain_page_global(page);
+   if ( !iorp->va )
+       goto fail;
+
+   iorp->page = page;
+   clear_page(iorp->va);
+   return 0;
+
+The function above is invoked from ``ioreq_server_get_frame()``, which is
+called from ``acquire_ioreq_server()``. When acquiring, the function returns
+the allocated pages as follows:
+
+.. code-block:: c
+
+   *mfn = page_to_mfn(s->bufioreq.page);
+
+The ``ioreq_server_free_mfn()`` function releases the pages as follows:
+
+.. code-block:: c
+   :linenos:
+   :caption: xen/common/ioreq.c
+
+   unmap_domain_page_global(iorp->va);
+   iorp->va = NULL;
+
+   put_page_alloc_ref(page);
+   put_page_and_type(page);
+
+.. TODO: Why are unmap() and free() not used instead?
diff --git a/docs/hypervisor-guide/index.rst b/docs/hypervisor-guide/index.rst
index e4393b0697..961a11525f 100644
--- a/docs/hypervisor-guide/index.rst
+++ b/docs/hypervisor-guide/index.rst
@@ -9,3 +9,5 @@ Hypervisor documentation
    code-coverage
 
    x86/index
+
+   acquire_resource_reference
-- 
2.25.1