
[Xen-devel] [PATCH v3 2/7] docs: Improve documentation and parsing for iommu=



Update parse_iommu_param() to uniformly use parse_boolean(), so the sub
booleans behave like other Xen boolean options.  Reposition the
custom_param() to avoid a forward declaration of parse_iommu_param().

Rewrite the command line documentation almost from scratch, including
far more detail.

Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
---
CC: Jan Beulich <JBeulich@xxxxxxxx>
CC: Wei Liu <wei.liu2@xxxxxxxxxx>
CC: Roger Pau Monné <roger.pau@xxxxxxxxxx>
CC: Stefano Stabellini <sstabellini@xxxxxxxxxx>
CC: Julien Grall <julien.grall@xxxxxxx>
CC: Juergen Gross <jgross@xxxxxxxx>

v3:
 * New
---
 docs/misc/xen-command-line.pandoc | 153 ++++++++++++++++++++------------------
 xen/drivers/passthrough/iommu.c   |  63 +++++-----------
 2 files changed, 99 insertions(+), 117 deletions(-)

diff --git a/docs/misc/xen-command-line.pandoc b/docs/misc/xen-command-line.pandoc
index 243193d..ab486e0 100644
--- a/docs/misc/xen-command-line.pandoc
+++ b/docs/misc/xen-command-line.pandoc
@@ -1146,104 +1146,113 @@ detection of systems known to misbehave upon accesses to that port.
 > Default: `new` unless directed-EOI is supported
 
 ### iommu
-> `= List of [ <boolean> | force | required | intremap | intpost | qinval | snoop | sharept | dom0-passthrough | dom0-strict | amd-iommu-perdev-intremap | workaround_bios_bug | igfx | verbose | debug ]`
+    = List of [ <bool>, verbose, debug, force, required,
+                sharept, intremap, intpost,
+                snoop, qinval, igfx, workaround_bios_bug,
+                amd-iommu-perdev-intremap,
+                dom0-{passthrough,strict} ]
 
-> Sub-options:
+    All sub-options are boolean in nature.
 
-> `<boolean>`
+I/O Memory Management Units perform a function similar to the CPU MMU (hence
+the name), but typically exist as a discrete device, integrated as part of a
+PCI Root Complex.  The most common configuration is to have one IOMMU per
+package (for on-die PCIe devices and directly attached PCIe lanes), and one
+IOMMU covering the remaining I/O in the system.
 
-> Default: `on`
-
->> Control the use of IOMMU(s) in the system.
-
-> All other sub-options are of boolean kind and can be prefixed with `no-` to
-> effect the inverse meaning.
-
-> `force` or `required`
+The functionality in an IOMMU commonly falls into two orthogonal categories:
 
-> Default: `false`
-
->> Don't continue booting unless IOMMU support is found and can be initialized
->> successfully.
+1.  DMA remapping which uses a pagetable-like hierarchical structure and maps
+    I/O Virtual Addresses (DFNs - Device Frame Numbers in Xen's terminology)
+    to System Physical Addresses (MFNs - Machine Frame Numbers in Xen's
+    terminology).
 
-> `intremap`
+2.  Interrupt Remapping, which controls incoming Message Signalled Interrupt
+    requests, including their routing to specific CPUs.
 
-> Default: `true`
+IOMMU functionality can be used either to provide a translation which the
+hardware device driver isn't aware of (e.g. PCI Passthrough and a native
+driver inside the guest) or to enforce fine-grained control over the memory
+and interrupts which a device is attempting to access.
 
->> Control the use of interrupt remapping (DMA remapping will always be enabled
->> if IOMMU functionality is enabled).
+By default, IOMMUs are configured for use if they are available.  An overall
+boolean (e.g. `iommu=no`) can override this and leave the IOMMUs disabled.
 
-> `intpost`
+*   The `verbose` and `debug` booleans can be used to print additional
+    diagnostic information.  Neither is active by default.
 
-> Default: `false`
+*   The `force` and `required` booleans are synonymous and, when requested, will
+    prevent Xen from booting if IOMMUs aren't discovered and enabled
+    successfully.
 
->> Control the use of interrupt posting, which depends on the availability of
->> interrupt remapping.
-
-> `qinval` (VT-d)
-
-> Default: `true`
-
->> Control the use of Queued Invalidation.
-
-> `snoop` (Intel)
-
-> Default: `true`
+*   The `sharept` boolean controls whether the IOMMU pagetables are shared with
+    the CPU-side HAP pagetables, or allocated separately.  Sharing reduces the
+    memory overhead, but doesn't work in combination with CPU-side
+    pagefault-based features, e.g. dirty VRAM tracking when a PCI device is
+    assigned.
 
->> Control the use of Snoop Control.
-
-> `sharept`
-
-> Default: `true`
-
->> Control whether CPU and IOMMU page tables should be shared.
-
-> `dom0-passthrough`
-
-> **WARNING: This command line option is deprecated, and superseded by
-> _dom0-iommu=passthrough_ - using both options in combination is undefined.**
-
-> `dom0-strict`
+    Due to implementation choices, sharing pagetables doesn't work on AMD
+    hardware, and this option is ignored.  It is enabled by default on Intel
+    systems.
 
-> **WARNING: This command line option is deprecated, and superseded by
-> _dom0-iommu=strict_ - using both options in combination is undefined.**
+    This option is ignored on ARM, and the pagetables are always shared.
 
-> `amd-iommu-perdev-intremap`
+*   The `intremap` boolean controls the Interrupt Remapping sub-feature, and is
+    active by default on compatible hardware.  On x86 systems, the first
+    generation of IOMMUs only supported DMA remapping, and Interrupt Remapping
+    appeared in the second generation.
 
-> Default: `true`
+*   The `intpost` boolean controls the Posted Interrupt sub-feature.  In
+    combination with APIC acceleration (VT-x APICV, SVM AVIC), the IOMMU can
+    be configured to deliver interrupts from assigned PCI devices directly
+    into the guest, without trapping out into hypervisor context.
 
->> Control whether to set up interrupt remapping data structures per device
->> rather that once for the entire system. Turning this off is making PCI
->> device pass-through insecure and hence unsupported.
+    This option depends on `intremap`, and is disabled by default due to some
+    corner cases in the implementation which have yet to be resolved.
 
-> `workaround_bios_bug` (VT-d)
+The following options are specific to Intel VT-d hardware:
 
-> Default: `false`
+*   The `snoop` boolean controls the Snoop Control sub-feature, and is
+    active by default on compatible hardware.
 
->> Causes DRHD entries without any PCI discoverable devices under them to be
->> ignored (normally IOMMU setup fails if any of the devices listed by a DRHD
->> entry aren't PCI discoverable).
+    An incoming DMA request may specify _Snooped_ (query the CPU caches
+    for the appropriate lines) or _Non-Snooped_ (don't query the CPU
+    caches).  _Non-Snooped_ accesses incur less latency, but
+    behind-the-scenes hypervisor activity can invalidate the
+    expectations of the device driver, and Snoop Control allows the
+    hypervisor to force DMA requests to be _Snooped_ when they would
+    otherwise not be.
 
-> `igfx` (VT-d)
+*   The `qinval` boolean controls the Queued Invalidation sub-feature, and is
+    active by default on compatible hardware.  Queued Invalidation is a
+    feature in second-generation IOMMUs and is a functional prerequisite for
+    Interrupt Remapping.
 
-> Default: `true`
+*   The `igfx` boolean is active by default, and controls whether the
+    IOMMU in front of an Intel Graphics Device is enabled or not.
 
->> Enable IOMMU for Intel graphics devices. The intended usage of this option
->> is `no-igfx`, which is similar to Linux `intel_iommu=igfx_off` option used
->> to workaround graphics issues. If adding `no-igfx` fixes anything, you
->> should file a bug reporting the problem.
+    It is intended as a debugging mechanism for graphics issues, and to
+    be similar to Linux's `intel_iommu=igfx_off` option.  If specifying
+    `no-igfx` fixes anything, please report the problem.
 
-> `verbose`
+*   The `workaround_bios_bug` boolean is disabled by default.  It can be
+    used to ignore errors raised when the ACPI tables list a PCI device
+    which doesn't appear to exist in the system.
 
-> Default: `false`
-
->> Increase IOMMU code's verbosity.
+The following options are specific to AMD-Vi hardware:
 
-> `debug`
+*   The `amd-iommu-perdev-intremap` boolean controls whether the interrupt
+    remapping table is per device (the default), or a single global
+    table for the entire system.
 
-> Default: `false`
+    Using a global table is not security supported as it allows all
+    devices to impersonate each other as far as interrupts are concerned
+    (see XSA-36), but it is a workaround for SP5100 Erratum 28.
 
->> Enable IOMMU debugging code (implies `verbose`).
+**WARNING: The `dom0-passthrough` and `dom0-strict` booleans are both
+deprecated, and superseded by _dom0-iommu={passthrough,strict}_
+respectively - using both the old and new command line options in
+combination is undefined.**
 
 ### iommu_dev_iotlb_timeout
 > `= <integer>`
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index bd1af35..9ac9e05 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -21,34 +21,11 @@
 #include <xen/keyhandler.h>
 #include <xsm/xsm.h>
 
-static int parse_iommu_param(const char *s);
 static void iommu_dump_p2m_table(unsigned char key);
 
 unsigned int __read_mostly iommu_dev_iotlb_timeout = 1000;
 integer_param("iommu_dev_iotlb_timeout", iommu_dev_iotlb_timeout);
 
-/*
- * The 'iommu' parameter enables the IOMMU.  Optional comma separated
- * value may contain:
- *
- *   off|no|false|disable       Disable IOMMU (default)
- *   force|required             Don't boot unless IOMMU is enabled
- *   no-intremap                Disable interrupt remapping
- *   intpost                    Enable VT-d Interrupt posting
- *   verbose                    Be more verbose
- *   debug                      Enable debugging messages and checks
- *   workaround_bios_bug        Workaround some bios issue to still enable
- *                              VT-d, don't guarantee security
- *   dom0-passthrough           No DMA translation at all for Dom0
- *   dom0-strict                No 1:1 memory mapping for Dom0
- *   no-sharept                 Don't share VT-d and EPT page tables
- *   no-snoop                   Disable VT-d Snoop Control
- *   no-qinval                  Disable VT-d Queued Invalidation
- *   no-igfx                    Disable VT-d for IGD devices (insecure)
- *   no-amd-iommu-perdev-intremap Don't use per-device interrupt remapping
- *                              tables (insecure)
- */
-custom_param("iommu", parse_iommu_param);
 bool_t __initdata iommu_enable = 1;
 bool_t __read_mostly iommu_enabled;
 bool_t __read_mostly force_iommu;
@@ -84,50 +61,45 @@ static struct tasklet iommu_pt_cleanup_tasklet;
 static int __init parse_iommu_param(const char *s)
 {
     const char *ss;
-    int val, b, rc = 0;
+    int val, rc = 0;
 
     do {
-        val = !!strncmp(s, "no-", 3);
-        if ( !val )
-            s += 3;
-
         ss = strchr(s, ',');
         if ( !ss )
             ss = strchr(s, '\0');
 
-        b = parse_bool(s, ss);
-        if ( b >= 0 )
-            iommu_enable = b;
-        else if ( !cmdline_strcmp(s, "force") ||
-                  !cmdline_strcmp(s, "required") )
+        if ( (val = parse_bool(s, ss)) >= 0 )
+            iommu_enable = val;
+        else if ( (val = parse_boolean("force", s, ss)) >= 0 ||
+                  (val = parse_boolean("required", s, ss)) >= 0 )
             force_iommu = val;
-        else if ( !cmdline_strcmp(s, "workaround_bios_bug") )
+        else if ( (val = parse_boolean("workaround_bios_bug", s, ss)) >= 0 )
             iommu_workaround_bios_bug = val;
-        else if ( !cmdline_strcmp(s, "igfx") )
+        else if ( (val = parse_boolean("igfx", s, ss)) >= 0 )
             iommu_igfx = val;
-        else if ( !cmdline_strcmp(s, "verbose") )
+        else if ( (val = parse_boolean("verbose", s, ss)) >= 0 )
             iommu_verbose = val;
-        else if ( !cmdline_strcmp(s, "snoop") )
+        else if ( (val = parse_boolean("snoop", s, ss)) >= 0 )
             iommu_snoop = val;
-        else if ( !cmdline_strcmp(s, "qinval") )
+        else if ( (val = parse_boolean("qinval", s, ss)) >= 0 )
             iommu_qinval = val;
-        else if ( !cmdline_strcmp(s, "intremap") )
+        else if ( (val = parse_boolean("intremap", s, ss)) >= 0 )
             iommu_intremap = val;
-        else if ( !cmdline_strcmp(s, "intpost") )
+        else if ( (val = parse_boolean("intpost", s, ss)) >= 0 )
             iommu_intpost = val;
-        else if ( !cmdline_strcmp(s, "debug") )
+        else if ( (val = parse_boolean("debug", s, ss)) >= 0 )
         {
             iommu_debug = val;
             if ( val )
                 iommu_verbose = 1;
         }
-        else if ( !cmdline_strcmp(s, "amd-iommu-perdev-intremap") )
+        else if ( (val = parse_boolean("amd-iommu-perdev-intremap", s, ss)) >= 0 )
             amd_iommu_perdev_intremap = val;
-        else if ( !cmdline_strcmp(s, "dom0-passthrough") )
+        else if ( (val = parse_boolean("dom0-passthrough", s, ss)) >= 0 )
             iommu_hwdom_passthrough = val;
-        else if ( !cmdline_strcmp(s, "dom0-strict") )
+        else if ( (val = parse_boolean("dom0-strict", s, ss)) >= 0 )
             iommu_hwdom_strict = val;
-        else if ( !cmdline_strcmp(s, "sharept") )
+        else if ( (val = parse_boolean("sharept", s, ss)) >= 0 )
             iommu_hap_pt_share = val;
         else
             rc = -EINVAL;
@@ -137,6 +109,7 @@ static int __init parse_iommu_param(const char *s)
 
     return rc;
 }
+custom_param("iommu", parse_iommu_param);
 
 static int __init parse_dom0_iommu_param(const char *s)
 {
-- 
2.1.4
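
For readers who want to experiment with the parsing convention outside the hypervisor, the loop in parse_iommu_param() can be approximated in plain C. This is a minimal sketch under stated assumptions, not Xen's implementation: sketch_parse_bool(), sketch_parse_boolean(), sketch_parse_iommu(), slice_eq() and struct sketch_opts are hypothetical names, the list of accepted boolean words is an assumption, and rejecting a "no-" prefix combined with "=<bool>" is a choice made for this sketch only.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: does the slice [s, e) equal the C string lit? */
static int slice_eq(const char *s, const char *e, const char *lit)
{
    size_t n = (size_t)(e - s);

    return strlen(lit) == n && strncmp(s, lit, n) == 0;
}

/* Stand-in for a bare boolean: 1, 0, or -1 if not recognised.
 * The word lists are an assumption made for this sketch. */
static int sketch_parse_bool(const char *s, const char *e)
{
    static const char *const yes[] = { "yes", "on", "true", "enable", "1" };
    static const char *const no[] = { "no", "off", "false", "disable", "0" };
    size_t i;

    for (i = 0; i < sizeof(yes) / sizeof(yes[0]); i++)
        if (slice_eq(s, e, yes[i]))
            return 1;
    for (i = 0; i < sizeof(no) / sizeof(no[0]); i++)
        if (slice_eq(s, e, no[i]))
            return 0;
    return -1;
}

/* Stand-in for a named boolean over the slice [s, e).
 * Accepts "name", "no-name" and "name=<bool>"; returns 1, 0 or -1. */
static int sketch_parse_boolean(const char *name, const char *s, const char *e)
{
    size_t nlen = strlen(name);
    int negated = 0;

    if ((size_t)(e - s) >= 3 && strncmp(s, "no-", 3) == 0) {
        negated = 1;
        s += 3;
    }

    if ((size_t)(e - s) < nlen || strncmp(s, name, nlen) != 0)
        return -1;
    s += nlen;

    if (s == e)                  /* "name" or "no-name" */
        return !negated;
    if (*s != '=' || negated)    /* "namex", or "no-name=..." (sketch choice) */
        return -1;
    return sketch_parse_bool(s + 1, e);
}

struct sketch_opts { int enable, verbose, debug; };

/* Same shape as the patched loop: split on ',' and try each option. */
static int sketch_parse_iommu(const char *s, struct sketch_opts *o)
{
    const char *ss;
    int val, rc = 0;

    do {
        ss = strchr(s, ',');
        if (!ss)
            ss = strchr(s, '\0');

        if ((val = sketch_parse_bool(s, ss)) >= 0)
            o->enable = val;
        else if ((val = sketch_parse_boolean("verbose", s, ss)) >= 0)
            o->verbose = val;
        else if ((val = sketch_parse_boolean("debug", s, ss)) >= 0) {
            o->debug = val;
            if (val)
                o->verbose = 1;  /* debug implies verbose */
        } else
            rc = -1;             /* unknown token; keep parsing the rest */

        s = ss + 1;
    } while (*ss);

    return rc;
}
```

Calling sketch_parse_iommu("no-debug,verbose", &opts) on a zero-initialised struct sketch_opts leaves debug clear and verbose set. An unrecognised token makes the function return -1 while the remaining tokens are still applied.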

