[PATCH v4 10/10] docs/guest-guide: document the memory claim hypercalls
Add guest-guide documentation for Xen's memory-claim mechanism and
its two hypercalls:
- the legacy XENMEM_claim_pages (host-wide claims only)
- the new XEN_DOMCTL_claim_memory, which adds NUMA-aware claims
Also document the implementation of claims in the hypervisor.
Signed-off-by: Bernhard Kaindl <bernhard.kaindl@xxxxxxxxxx>
---
.readthedocs.yaml | 13 +-
docs/conf.py | 6 +-
.../dom/DOMCTL_claim_memory-classes.mmd | 51 +++++++
.../dom/DOMCTL_claim_memory-seqdia.mmd | 23 ++++
.../dom/DOMCTL_claim_memory-workflow.mmd | 23 ++++
docs/guest-guide/dom/DOMCTL_claim_memory.rst | 125 ++++++++++++++++++
docs/guest-guide/dom/index.rst | 14 ++
docs/guest-guide/index.rst | 23 ++++
docs/guest-guide/mem/XENMEM_claim_pages.rst | 68 ++++++++++
docs/guest-guide/mem/index.rst | 12 ++
docs/hypervisor-guide/index.rst | 5 +
docs/hypervisor-guide/mm/claims.rst | 114 ++++++++++++++++
docs/hypervisor-guide/mm/index.rst | 10 ++
13 files changed, 485 insertions(+), 2 deletions(-)
create mode 100644 docs/guest-guide/dom/DOMCTL_claim_memory-classes.mmd
create mode 100644 docs/guest-guide/dom/DOMCTL_claim_memory-seqdia.mmd
create mode 100644 docs/guest-guide/dom/DOMCTL_claim_memory-workflow.mmd
create mode 100644 docs/guest-guide/dom/DOMCTL_claim_memory.rst
create mode 100644 docs/guest-guide/dom/index.rst
create mode 100644 docs/guest-guide/mem/XENMEM_claim_pages.rst
create mode 100644 docs/guest-guide/mem/index.rst
create mode 100644 docs/hypervisor-guide/mm/claims.rst
create mode 100644 docs/hypervisor-guide/mm/index.rst
diff --git a/.readthedocs.yaml b/.readthedocs.yaml
index d3aff7662ebf..3be7334c7527 100644
--- a/.readthedocs.yaml
+++ b/.readthedocs.yaml
@@ -8,11 +8,22 @@ build:
tools:
python: "latest"
+ nodejs: "20"
jobs:
post_install:
+ # Required for rendering the mermaid diagrams in the offline
+ # documentation (PDF & ePub) formats.
+ - npm install -g @mermaid-js/mermaid-cli
# Instead of needing a separate requirements.txt
- - python -m pip install --upgrade --no-cache-dir sphinx-rtd-theme
+ - >
+ python -m pip install --upgrade --no-cache-dir sphinx-rtd-theme
+ sphinxcontrib-mermaid
sphinx:
configuration: docs/conf.py
+
+# Build PDF & ePub
+formats:
+ - epub
+ - pdf
\ No newline at end of file
diff --git a/docs/conf.py b/docs/conf.py
index 2fb8bafe6589..9316202d3318 100644
--- a/docs/conf.py
+++ b/docs/conf.py
@@ -61,7 +61,11 @@ needs_sphinx = '1.4'
# Add any Sphinx extension module names here, as strings. They can be
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
-extensions = []
+extensions = ['sphinxcontrib.mermaid']
+
+mermaid_init_js = """
+mermaid.initialize({ theme: 'Neo', startOnLoad: true });
+"""
# Add any paths that contain templates here, relative to this directory.
templates_path = ['_templates']
diff --git a/docs/guest-guide/dom/DOMCTL_claim_memory-classes.mmd
b/docs/guest-guide/dom/DOMCTL_claim_memory-classes.mmd
new file mode 100644
index 000000000000..1406a4919442
--- /dev/null
+++ b/docs/guest-guide/dom/DOMCTL_claim_memory-classes.mmd
@@ -0,0 +1,51 @@
+%% SPDX-License-Identifier: CC-BY-4.0
+classDiagram
+
+class xen_domctl {
+ +uint32_t cmd
+ +uint32_t interface_version
+ +uint32_t domain
+ +xen_domctl_claim_memory
+}
+
+class xen_domctl_claim_memory {
+ +memory_claim_t* claims
+ +uint32_t nr_claims
+ +uint32_t pad
+}
+
+class memory_claim_t {
+ +uint64_aligned_t pages
+ +uint32_t node
+ +uint32_t pad
+}
+
+class xc_domain_claim_memory["xc_domain_claim_memory()"] {
+ +xc_interface* xch
+ +uint32_t domid
+ +uint32_t nr_claims
+ +memory_claim_t* claims
+}
+
+class page_alloc_globals["xen/common/page_alloc.c"] {
+ +unsigned long outstanding_claims
+ +unsigned long node_outstanding_claims[]
+}
+
+class claim["DOMCTL_claim_memory"] {
+ +int claim_memory(d, uinfo)
+ +int domain_set_outstanding_pages(d, pages, node)
+}
+
+class domain["struct domain"] {
+ +unsigned_int outstanding_pages
+ +nodeid_t claim_node
+}
+
+xen_domctl_claim_memory o--> memory_claim_t
+xen_domctl o--> xen_domctl_claim_memory
+xc_domain_claim_memory ..> xen_domctl : populates
+xc_domain_claim_memory ..> claim : calls via <tt>do_domctl()</tt>
+claim ..> xen_domctl_claim_memory : reads
+claim ..> domain : sets
+claim ..> page_alloc_globals : updates outstanding claims
diff --git a/docs/guest-guide/dom/DOMCTL_claim_memory-seqdia.mmd
b/docs/guest-guide/dom/DOMCTL_claim_memory-seqdia.mmd
new file mode 100644
index 000000000000..05d688c59f13
--- /dev/null
+++ b/docs/guest-guide/dom/DOMCTL_claim_memory-seqdia.mmd
@@ -0,0 +1,23 @@
+%% SPDX-License-Identifier: CC-BY-4.0
+sequenceDiagram
+
+actor DomainBuilder
+participant OcamlStub as OCaml stub for<br>xc_domain<br>claim_memory
+participant Libxc as xc_domain<br>claim_memory
+participant Domctl as XEN_DOMCTL<br>claim_memory
+#participant DomainLogic as claim_memory
+participant Alloc as domain<br>set<br>outstanding_pages
+
+DomainBuilder->>OcamlStub: claims
+OcamlStub->>OcamlStub: marshal claims (OCaml to C)
+OcamlStub->>Libxc: claims
+
+Libxc->>Domctl: do_domctl
+
+Domctl->>Domctl: copy_from_guest(claim)
+Domctl->>Domctl: validate claim
+Domctl->>Alloc: set<br>outstanding_pages
+Alloc-->>Domctl: result
+Domctl-->>Libxc: rc
+Libxc-->>OcamlStub: rc
+OcamlStub-->>DomainBuilder: claim_result
\ No newline at end of file
diff --git a/docs/guest-guide/dom/DOMCTL_claim_memory-workflow.mmd
b/docs/guest-guide/dom/DOMCTL_claim_memory-workflow.mmd
new file mode 100644
index 000000000000..372f2bb7a616
--- /dev/null
+++ b/docs/guest-guide/dom/DOMCTL_claim_memory-workflow.mmd
@@ -0,0 +1,23 @@
+%% SPDX-License-Identifier: CC-BY-4.0
+sequenceDiagram
+
+participant Toolstack
+participant Xen
+participant NUMA Node memory
+
+Toolstack->>Xen: XEN_DOMCTL_createdomain
+Toolstack->>Xen: XEN_DOMCTL_max_mem(max_pages)
+
+Toolstack->>Xen: XEN_DOMCTL_claim_memory(pages, node)
+Xen->>NUMA Node memory: Claim pages on node
+Xen-->>Toolstack: Claim granted
+
+Toolstack->>Xen: XEN_DOMCTL_set_nodeaffinity(node)
+
+loop Populate domain memory
+ Toolstack->>Xen: XENMEM_populate_physmap(memflags:node)
+ Xen->>NUMA Node memory: alloc from claimed node
+end
+
+Toolstack->>Xen: XEN_DOMCTL_claim_memory(0, NO_NODE)
+Xen-->>Toolstack: Remaining claims released
diff --git a/docs/guest-guide/dom/DOMCTL_claim_memory.rst
b/docs/guest-guide/dom/DOMCTL_claim_memory.rst
new file mode 100644
index 000000000000..8be37585f02a
--- /dev/null
+++ b/docs/guest-guide/dom/DOMCTL_claim_memory.rst
@@ -0,0 +1,125 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+.. _XEN_DOMCTL_claim_memory:
+
+XEN_DOMCTL_claim_memory
+=======================
+
+This **domctl** command allows a privileged domain to stake a memory claim
+for a domain, like :ref:`XENMEM_claim_pages`, but with added support for
+NUMA-aware memory claims.
+
+A claim entry with a node value of ``XEN_DOMCTL_CLAIM_MEMORY_NO_NODE`` stakes
+a claim for host memory, exactly like :ref:`XENMEM_claim_pages` does.
+
+NUMA-aware memory claims
+------------------------
+
+Memory locality is an important factor for performance in NUMA systems.
+Allocating memory close to the CPU that will use it can reduce latency
+and improve overall performance.
+
+By claiming memory on specific NUMA nodes, toolstacks can ensure that they
+will be able to allocate memory for the domain on those nodes. This is
+particularly beneficial for workloads that are sensitive to memory latency,
+such as in-memory databases.
+
+**Note:** The ABI supports multiple claims for future expansion. At the moment,
+Xen accepts a single claim entry (either a NUMA-aware or host-wide claim).
+
+Implementation notes
+--------------------
+
+As described in :ref:`XENMEM_claim_pages`, Xen keeps track of the number
+of claimed pages in the domain's ``d->outstanding_pages`` counter.
+
+Xen declares a NUMA-aware claim by assigning ``d->claim_node`` to a NUMA node,
+which declares that ``d->outstanding_pages`` is claimed on ``d->claim_node``.
+
+See :ref:`hypervisor-guide` > :ref:`memory_management` > :ref:`memory_claims`
+for details on how claims are handled by the buddy allocator, and on how a
+toolstack can populate the memory of a domain from the claimed node, even if
+it needs to wait for scrubbing to complete.
+
+Used functions & data structures
+--------------------------------
+
+This diagram illustrates the key functions and data structures involved in the
+implementation of the ``domctl`` hypercall command ``XEN_DOMCTL_claim_memory``:
+
+.. mermaid:: DOMCTL_claim_memory-classes.mmd
+ :caption: Diagram: Function and data relationships of XEN_DOMCTL_claim_memory
+
+Call sequence diagram
+---------------------
+
+The following sequence diagram illustrates the call flow for claiming memory
+for a domain using this hypercall command from an OCaml toolstack:
+
+.. mermaid:: DOMCTL_claim_memory-seqdia.mmd
+ :caption: Sequence diagram: Call flow for claiming memory for a domain
+
+Claim workflow
+--------------
+
+The following diagram illustrates a workflow for claiming and populating
+memory:
+
+.. mermaid:: DOMCTL_claim_memory-workflow.mmd
+ :caption: Workflow diagram: Claiming and populating memory for a domain
+
+API example (libxc)
+-------------------
+The following example demonstrates how a toolstack can claim memory before
+building the domain and then release the claim once the memory population
+is complete.
+
+Note: ``memory_claim_t`` contains padding to allow for future expansion.
+Thus, the structure must be zero-initialised to ensure forward compatibility.
+This can be achieved by using the ``XEN_NODE_CLAIM_INIT`` macro (which sets
+the pages and node fields while zero-initialising the padding), by
+zero-initialising the entire structure, or by using a compound literal with
+designated initialisers to set the pages and node fields while
+zero-initialising the rest of the structure.
+
+.. code-block:: C
+
+ #include <xenctrl.h>
+
+ int claim_guest_memory(xc_interface *xch, uint32_t domid,
+ uint64_t pages)
+ {
+ memory_claim_t claim[] = {
+ /*
+ * Example 1:
+ * Uses the ``XEN_NODE_CLAIM_INIT`` macro to zero-initialise the padding
+ * and set the pages and node fields for a NUMA-aware claim on node 0.
+ */
+ XEN_NODE_CLAIM_INIT(pages, 0) /* Claim memory on NUMA node 0 */
+ };
+
+ /* Claim memory from NUMA node 0 for the domain build. */
+ return xc_domain_claim_memory(xch, domid, 1, claim);
+ }
+
+ int release_claim(xc_interface *xch, uint32_t domid)
+ {
+ memory_claim_t claim[] = {
+ /*
+ * Example 2:
+ * Uses a compound literal with designated initialisers to set the
+ * fields to release the claim while zero-initialising the rest
+ * of the structure for forward compatibility.
+ */
+ (memory_claim_t){
+ /*
+ * pages == 0 releases any outstanding claim.
+ * The node field is not used in this case, but must be set to
+ * XEN_DOMCTL_CLAIM_MEMORY_NO_NODE for forward compatibility.
+ */
+ .pages = 0,
+ .node = XEN_DOMCTL_CLAIM_MEMORY_NO_NODE,
+ }
+ };
+
+ /* Release any remaining claim once population is done. */
+ return xc_domain_claim_memory(xch, domid, 1, claim);
+ }
diff --git a/docs/guest-guide/dom/index.rst b/docs/guest-guide/dom/index.rst
new file mode 100644
index 000000000000..445ccf599047
--- /dev/null
+++ b/docs/guest-guide/dom/index.rst
@@ -0,0 +1,14 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Domctl Hypercall
+================
+
+Through domctl hypercalls, toolstacks in privileged domains can perform
+operations related to domain management. This includes operations such as
+creating, destroying, and modifying domains, as well as querying domain
+information.
+
+.. toctree::
+ :maxdepth: 2
+
+ DOMCTL_claim_memory
diff --git a/docs/guest-guide/index.rst b/docs/guest-guide/index.rst
index 5455c67479cf..d9611cd7504d 100644
--- a/docs/guest-guide/index.rst
+++ b/docs/guest-guide/index.rst
@@ -3,6 +3,29 @@
Guest documentation
===================
+Xen exposes a set of hypercalls that allow domains and toolstacks in
+privileged contexts (such as Dom0) to request services from the hypervisor.
+
+Through these hypercalls, privileged domains can perform privileged operations
+such as querying system information, memory and domain management,
+and enabling inter-domain communication via shared memory and event channels.
+
+These hypercalls are documented in the following sections, grouped by their
+functionality. Each section provides an overview of the hypercalls, their
+parameters, and examples of how to use them.
+
+Hypercall API documentation
+---------------------------
+
+.. toctree::
+ :maxdepth: 2
+
+ dom/index
+ mem/index
+
+Hypercall ABI documentation
+---------------------------
+
.. toctree::
:maxdepth: 2
diff --git a/docs/guest-guide/mem/XENMEM_claim_pages.rst
b/docs/guest-guide/mem/XENMEM_claim_pages.rst
new file mode 100644
index 000000000000..7d465d2a87fe
--- /dev/null
+++ b/docs/guest-guide/mem/XENMEM_claim_pages.rst
@@ -0,0 +1,68 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+.. _XENMEM_claim_pages:
+
+XENMEM_claim_pages
+==================
+
+This **xenmem** command allows a privileged guest to stake a memory claim
+for a domain, like :ref:`XEN_DOMCTL_claim_memory`, but without support for
+NUMA-aware memory claims.
+
+Memory claims in Xen
+--------------------
+
+The Xen hypervisor maintains a per-domain counter of outstanding pages,
+which tracks the number of pages claimed, but not yet allocated, for that
+domain.
+
+If the outstanding pages counter is zero, this hypercall allows a privileged
+guest to stake a claim for a specified number of pages of system memory for the
+domain.
+
+If the claim is successful, Xen updates the domain's outstanding pages
+counter to reflect the new claim. Xen then allocates from the pool of
+claimed memory only for domains holding a claim on it.
+
+A domain builder (toolstack in a privileged domain) can then allocate the
+guest memory for the domain, which converts the outstanding claim into
+actual memory of the new domain, backed by physical pages.
+
+Note that the resulting claim is relative to the pages already allocated for
+the domain: the **pages** argument of this hypercall is absolute and must be
+the total number of pages expected to be allocated for the domain, not an
+increment on top of the already allocated pages.
+
+Memory allocations by Xen for the domain also consume the claim, so toolstacks
+should stake a claim that is larger than the guest memory requirement to
+account for Xen's own memory usage. The exact amount of extra memory required
+depends on the configuration and features used by the domain, the host
+architecture and the features enabled by the Xen hypervisor on the host.
+
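Since the pages argument is absolute and must also cover Xen's own per-domain
allocations, a toolstack needs some headroom heuristic. The following is a
minimal illustrative sketch only: the helper name and the 1/64-plus-fixed
headroom are made-up placeholders, not part of any Xen API.

```c
#include <stdint.h>

/*
 * Illustrative helper (NOT a Xen API): compute an absolute claim in
 * pages for XENMEM_claim_pages.  The hypercall's pages argument is the
 * total expected allocation, so the guest memory size is padded with a
 * hypothetical overhead estimate for Xen's own per-domain allocations.
 * The real overhead depends on the domain configuration and host; the
 * 1/64 headroom plus fixed slack used here is only a placeholder.
 */
static uint64_t claim_pages_for_build(uint64_t guest_pages)
{
    uint64_t overhead = guest_pages / 64 + 256; /* placeholder estimate */
    return guest_pages + overhead;
}
```

A claim padded this way simply fails earlier (at claim time rather than
mid-build) if the host cannot satisfy it, which is the point of the mechanism.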
+Life-cycle of a claim
+---------------------
+
+The domain's maximum memory limit must be set prior to staking a claim, as
+the sum of the already allocated pages and the claim must be within that
+limit.
+
+To release the claim after the domain build is complete, call this hypercall
+command with the pages argument set to zero. This releases any remaining
+claim. `libxenguest` does this after the guest memory has been allocated for
+the domain, and Xen also does this when it kills the domain.
+
+API example (libxc)
+-------------------
+The following example demonstrates how a toolstack can claim memory before
+building the domain and then release the claim once the memory population
+is complete.
+
+.. code-block:: C
+
+ #include <xenctrl.h>
+ ...
+ /* Claim memory for the domain build. */
+ int ret = xc_domain_claim_pages(xch, domid, nr_pages);
+
+ /* Build the domain and allocate memory for it. */
+ ...
+
+ /* Release any remaining claim after populating the domain memory. */
+ ret = xc_domain_claim_pages(xch, domid, 0);
diff --git a/docs/guest-guide/mem/index.rst b/docs/guest-guide/mem/index.rst
new file mode 100644
index 000000000000..dabd1fd0153e
--- /dev/null
+++ b/docs/guest-guide/mem/index.rst
@@ -0,0 +1,12 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+
+Memctl Hypercall
+================
+
+The memctl hypercall interface allows guests to perform various control
+operations related to memory management.
+
+.. toctree::
+ :maxdepth: 2
+
+ XENMEM_claim_pages
diff --git a/docs/hypervisor-guide/index.rst b/docs/hypervisor-guide/index.rst
index 520fe01554ab..fef35a1ac4fe 100644
--- a/docs/hypervisor-guide/index.rst
+++ b/docs/hypervisor-guide/index.rst
@@ -1,12 +1,17 @@
.. SPDX-License-Identifier: CC-BY-4.0
+.. _hypervisor-guide:
Hypervisor documentation
========================
+See :ref:`memory_claims` for more details on the implementation of the claims
+mechanism in the hypervisor and its interaction with the buddy allocator.
+
.. toctree::
:maxdepth: 2
code-coverage
+ mm/index
x86/index
arm/index
\ No newline at end of file
diff --git a/docs/hypervisor-guide/mm/claims.rst
b/docs/hypervisor-guide/mm/claims.rst
new file mode 100644
index 000000000000..97eb8a68fb1e
--- /dev/null
+++ b/docs/hypervisor-guide/mm/claims.rst
@@ -0,0 +1,114 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+.. _memory_claims:
+
+Memory Claims
+=============
+
+Overview
+--------
+
+Xen's page allocator supports a **claims** mechanism that allows a domain
+builder to reserve memory before allocation begins, preventing concurrent
+allocations from exhausting available pages mid-build.
+A claim can be global (host-wide) or target a specific NUMA node, ensuring
+that a domain's memory is allocated locally on the same node as its vCPUs.
+
+The host-wide claims check subtracts global claims from total available pages.
+If the domain has claims, its ``d->outstanding_pages`` are added back as
+available (simplified pseudo-code):
+
+.. code:: C
+
+ ASSERT(spin_is_locked(&heap_lock));
+ unsigned long global_avail = total_avail_pages - outstanding_claims
+ + d->outstanding_pages;
+ return alloc_request <= global_avail;
+
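As a toy illustration of the check above (plain C with made-up numbers, not
Xen code): a domain holding a claim may allocate from memory that the
outstanding-claims total makes unavailable to non-claiming domains.

```c
#include <stdbool.h>

/*
 * Toy model of the host-wide claims check (illustration only; the real
 * check reads global state under the heap_lock rather than taking the
 * counters as parameters).
 */
static bool toy_host_allocatable(unsigned long total_avail_pages,
                                 unsigned long outstanding_claims,
                                 unsigned long d_outstanding_pages,
                                 unsigned long alloc_request)
{
    unsigned long global_avail = total_avail_pages - outstanding_claims
                                 + d_outstanding_pages;
    return alloc_request <= global_avail;
}
```

With 1000 pages free and 900 claimed host-wide, a request for 200 pages by a
domain without a claim is refused, while the same request by a domain holding
150 outstanding pages is granted.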
+Similarly, the per-node check enforces node-level claims by subtracting
+outstanding node claims from available node pages, and adding back the domain's
+claim if allocating from the claimed node:
+
+.. code:: C
+
+ ASSERT(spin_is_locked(&heap_lock));
+ unsigned long avail = node_avail_pages(node)
+ - node_outstanding_claims(node)
+ + (node == d->claim_node ? d->outstanding_pages : 0);
+ return alloc_request <= avail;
+
+Simplified pseudo-code for the claims checks in the buddy allocator:
+
+.. code:: C
+
+ struct page_info *get_free_buddy(order, memflags, d) {
+ for ( ; ; ) {
+ node = preferred_node_or_next_node();
+ if (!node_allocatable_request(d, memflags, 1 << order, node))
+ goto try_next_node;
+ /* Find a zone on this node with a suitable buddy */
+ for (zone = highest_zone; zone >= lowest_zone; zone--)
+ for (j = order; j <= MAX_ORDER; j++)
+ if (pg = remove_head(&heap(node, zone, j)))
+ return pg;
+ try_next_node:
+ if (req_node != NUMA_NO_NODE && memflags & MEMF_exact_node)
+ return NULL;
+ /* Fall back to the next node and repeat. */
+ }
+ }
+
+ struct page_info *alloc_heap_pages(d, order, memflags) {
+ if (!host_allocatable_request(d, memflags, 1 << order))
+ return NULL;
+ pg = get_free_buddy(order, memflags, d);
+ if (!pg) /* Retry allowing unscrubbed pages */
+ pg = get_free_buddy(order, memflags|MEMF_no_scrub, d);
+ if (!pg)
+ return NULL;
+ if (pg has dirty pages)
+ scrub_dirty_pages(pg);
+ return pg;
+ }
+
+.. note:: The first ``get_free_buddy()`` pass skips unscrubbed pages and may
+ fall back to other nodes. With ``memflags & MEMF_exact_node``, no fallback
+ occurs, so the first pass may return ``NULL``.
+   The second pass with ``MEMF_no_scrub`` also considers the unscrubbed
+   pages.
+ ``alloc_heap_pages()`` then scrubs them before returning, guaranteeing the
+ domain gets the desired node-local pages even when scrubbing is pending.
+
+   Therefore, toolstacks should set ``MEMF_exact_node`` in ``memflags`` when
+   allocating for a domain with a NUMA-aware claim, e.g. by passing
+   ``XENMEMF_exact_node(node)``.
+
+   For efficient scrubbing, toolstacks might want to run domain builds
+   pinned to a CPU on the target NUMA node, so pages are scrubbed on that
+   node without cross-node traffic, reducing latency and speeding up the
+   domain build.
+
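The two-pass behaviour described above can be sketched as a self-contained
toy model (not Xen code; the struct, flag values, and function bodies are
invented for illustration):

```c
#include <stdbool.h>

/*
 * Toy simulation of the two-pass allocation: each node has clean and
 * dirty (unscrubbed) free pages; an exact-node request never falls back
 * to another node.  Flag values are arbitrary placeholders.
 */
#define TOY_MEMF_exact_node (1u << 0)
#define TOY_MEMF_no_scrub   (1u << 1)

struct toy_node { unsigned long clean, dirty; };

static int toy_get_free_buddy(const struct toy_node *nodes, int nr_nodes,
                              int req_node, unsigned int memflags)
{
    for ( int i = 0; i < nr_nodes; i++ )
    {
        int node = (req_node + i) % nr_nodes;

        if ( nodes[node].clean ||
             ((memflags & TOY_MEMF_no_scrub) && nodes[node].dirty) )
            return node;  /* found a usable buddy on this node */
        if ( memflags & TOY_MEMF_exact_node )
            return -1;    /* no fallback for exact-node requests */
    }
    return -1;
}

static int toy_alloc_heap_pages(const struct toy_node *nodes, int nr_nodes,
                                int req_node, unsigned int memflags)
{
    /* First pass: clean pages only. */
    int node = toy_get_free_buddy(nodes, nr_nodes, req_node, memflags);

    /* Second pass: also consider dirty pages; the caller scrubs them. */
    if ( node < 0 )
        node = toy_get_free_buddy(nodes, nr_nodes, req_node,
                                  memflags | TOY_MEMF_no_scrub);
    return node;
}
```

When the target node holds only dirty pages, an exact-node request fails the
first pass but succeeds on the second; without the exact-node flag, the first
pass instead falls back to another node with clean pages.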
+Data Structures
+---------------
+
+The following diagram shows the relationships between global, per-node,
+and per-domain claim counters, all protected by the global ``heap_lock``.
+
+.. mermaid::
+
+ graph TB
+ subgraph "Protected by the heap_lock"
+ direction TB
+ Global --Sum of--> Per-node
+ Per-node --Sum of--> Per-domain
+ end
+ subgraph Per-domain
+ direction LR
+ claim_node["d->claim_node"]
+ claim_node --claims on--> outstanding_pages["d->outstanding_pages"]
+ end
+ subgraph Per-node
+ direction LR
+ node_outstanding_claims--constrains-->node_avail_pages
+ end
+ subgraph Global
+ direction LR
+ outstanding_claims--constrains-->total_avail_pages
+ end
diff --git a/docs/hypervisor-guide/mm/index.rst
b/docs/hypervisor-guide/mm/index.rst
new file mode 100644
index 000000000000..9b5d60e3181a
--- /dev/null
+++ b/docs/hypervisor-guide/mm/index.rst
@@ -0,0 +1,10 @@
+.. SPDX-License-Identifier: CC-BY-4.0
+.. _memory_management:
+
+Memory Management
+=================
+
+.. toctree::
+ :maxdepth: 2
+
+ claims
--
2.39.5