[Xen-devel] [PATCH v7 6/6] x86/iommu: add map-reserved dom0-iommu option to map reserved memory ranges
Several people have reported hardware issues (malfunctioning USB
controllers) due to iommu page faults on Intel hardware. Those faults
are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
be worked around on VTd hardware by manually adding RMRR entries on the
command line; this is however limited to Intel hardware and quite
cumbersome to do.

In order to solve those issues add a new dom0-iommu=map-reserved option
that identity maps all regions marked as reserved in the memory map.
Note that regions used by devices emulated by Xen (LAPIC, IO-APIC or
PCIe MCFG regions) are specifically avoided. Note that this option is
available to all Dom0 modes (as opposed to the inclusive option which
only works for PV Dom0).

Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx>
---
Changes since v6:
 - Reword the map-reserved help to make it clear it's available to both
   PV and PVH Dom0.
 - Assign type inside of the switch expression.
 - Remove the comment about IO-APIC MMIO relocation, this is not
   supported ATM.

Changes since v5:
 - Merge with the vpci MMCFG helper patch.
 - Add a TODO item about the issues with relocating the LAPIC or IOAPIC
   MMIO regions.
 - Use the newly introduced page_get_ram_type that returns all the
   types that fall between a page.
 - Use paging_mode_translate instead of iommu_use_hap_pt when deciding
   whether to use set_identity_p2m_entry or iommu_map_page.

Changes since v4:
 - Use pfn_to_paddr.
 - Rebase on top of previous changes.
 - Change the default option setting to use if instead of a ternary
   operator.
 - Rename to map-reserved.

Changes since v3:
 - Add mappings if the iommu page tables are shared.

Changes since v2:
 - Fix comment regarding dom0-strict.
 - Change documentation style of xen command line.
 - Rename iommu_map to hwdom_iommu_map.
 - Move all the checks to hwdom_iommu_map.

Changes since v1:
 - Introduce a new reserved option instead of abusing the inclusive
   option.
 - Use the same helper function for PV and PVH in order to decide if a
   page should be added to the domain page tables.
 - Use the data inside of the domain struct to detect overlaps with
   emulated MMIO regions.
---
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
Cc: Jan Beulich <jbeulich@xxxxxxxx>
Cc: Julien Grall <julien.grall@xxxxxxx>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Cc: Tim Deegan <tim@xxxxxxx>
Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
Cc: Paul Durrant <paul.durrant@xxxxxxxxxx>
Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx>
Cc: Brian Woods <brian.woods@xxxxxxx>
Cc: Kevin Tian <kevin.tian@xxxxxxxxx>
---
 docs/misc/xen-command-line.markdown         |  9 ++++
 xen/arch/x86/hvm/io.c                       |  5 ++
 xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 ++
 xen/drivers/passthrough/arm/smmu.c          |  1 +
 xen/drivers/passthrough/iommu.c             |  3 ++
 xen/drivers/passthrough/vtd/iommu.c         |  3 ++
 xen/drivers/passthrough/x86/iommu.c         | 53 ++++++++++++++++++---
 xen/include/asm-x86/hvm/io.h                |  3 ++
 xen/include/xen/iommu.h                     |  2 +-
 9 files changed, 75 insertions(+), 7 deletions(-)

diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
index 98f0f3b68b..1ffd586224 100644
--- a/docs/misc/xen-command-line.markdown
+++ b/docs/misc/xen-command-line.markdown
@@ -704,6 +704,15 @@ This list of booleans controls the iommu usage by Dom0:
     option is only applicable to a PV Dom0 and is enabled by default on Intel
     hardware.
 
+*   `map-reserved`: sets up DMA remapping for all the reserved regions in the
+    memory map for Dom0. Use this to work around firmware issues providing
+    incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
+    accesses for Dom0, all memory regions marked as reserved in the memory map
+    that don't overlap with any MMIO region from emulated devices will be
+    identity mapped. This option maps a subset of the memory that would be
+    mapped when using the `map-inclusive` option. This option is available to all
+    Dom0 modes and is enabled by default on Intel hardware.
+
 ### dom0\_ioports\_disable (x86)
 > `= List of <hex>-<hex>`
 
diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
index bf4d8748d3..1f8fe36168 100644
--- a/xen/arch/x86/hvm/io.c
+++ b/xen/arch/x86/hvm/io.c
@@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
     return NULL;
 }
 
+bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr)
+{
+    return vpci_mmcfg_find(d, addr);
+}
+
 static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
                                            paddr_t addr, pci_sbdf_t *sbdf)
 {
diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c b/xen/drivers/passthrough/amd/pci_amd_iommu.c
index 27eb49619d..49d934e1ac 100644
--- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
+++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
@@ -256,6 +256,9 @@ static void __hwdom_init amd_iommu_hwdom_init(struct domain *d)
     /* Inclusive IOMMU mappings are disabled by default on AMD hardware. */
     if ( iommu_hwdom_inclusive == -1 )
         iommu_hwdom_inclusive = false;
+    /* Reserved IOMMU mappings are disabled by default on AMD hardware. */
+    if ( iommu_hwdom_reserved == -1 )
+        iommu_hwdom_reserved = false;
 
     if ( allocate_domain_resources(dom_iommu(d)) )
         BUG();
diff --git a/xen/drivers/passthrough/arm/smmu.c b/xen/drivers/passthrough/arm/smmu.c
index b142677b8c..8ea39659d1 100644
--- a/xen/drivers/passthrough/arm/smmu.c
+++ b/xen/drivers/passthrough/arm/smmu.c
@@ -2729,6 +2729,7 @@ static void __hwdom_init arm_smmu_iommu_hwdom_init(struct domain *d)
 {
 	/* Set to false options not supported on ARM. */
 	iommu_hwdom_inclusive = false;
+	iommu_hwdom_reserved = false;
 }
 
 static void arm_smmu_iommu_domain_teardown(struct domain *d)
diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
index 5798142730..95a471aa89 100644
--- a/xen/drivers/passthrough/iommu.c
+++ b/xen/drivers/passthrough/iommu.c
@@ -62,6 +62,7 @@ bool_t __read_mostly iommu_intremap = 1;
 bool __hwdom_initdata iommu_hwdom_strict;
 bool __read_mostly iommu_hwdom_passthrough;
 int8_t __hwdom_initdata iommu_hwdom_inclusive = -1;
+int8_t __hwdom_initdata iommu_hwdom_reserved = -1;
 
 /*
  * In the current implementation of VT-d posted interrupts, in some extreme
@@ -155,6 +156,8 @@ static int __init parse_dom0_iommu_param(const char *s)
             iommu_hwdom_strict = val;
         else if ( (val = parse_boolean("map-inclusive", s, ss)) >= 0 )
             iommu_hwdom_inclusive = val;
+        else if ( (val = parse_boolean("map-reserved", s, ss)) >= 0 )
+            iommu_hwdom_reserved = val;
         else
             rc = -EINVAL;
 
diff --git a/xen/drivers/passthrough/vtd/iommu.c b/xen/drivers/passthrough/vtd/iommu.c
index a09e02c8db..4152c59713 100644
--- a/xen/drivers/passthrough/vtd/iommu.c
+++ b/xen/drivers/passthrough/vtd/iommu.c
@@ -1307,6 +1307,9 @@ static void __hwdom_init intel_iommu_hwdom_init(struct domain *d)
     /* Inclusive mappings are enabled by default on Intel hardware for PV. */
     if ( iommu_hwdom_inclusive == -1 )
         iommu_hwdom_inclusive = is_pv_domain(d);
+    /* Reserved IOMMU mappings are enabled by default on Intel hardware. */
+    if ( iommu_hwdom_reserved == -1 )
+        iommu_hwdom_reserved = true;
 
     setup_hwdom_pci_devices(d, setup_hwdom_device);
     setup_hwdom_rmrr(d);
diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
index b947ed6043..c85bf0bcd7 100644
--- a/xen/drivers/passthrough/x86/iommu.c
+++ b/xen/drivers/passthrough/x86/iommu.c
@@ -20,6 +20,7 @@
 #include <xen/softirq.h>
 #include <xsm/xsm.h>
 
+#include <asm/hvm/io.h>
 #include <asm/setup.h>
 
 void iommu_update_ire_from_apic(
@@ -138,16 +139,23 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
                                          unsigned long pfn,
                                          unsigned long max_pfn)
 {
+    unsigned int i, type;
+
     /*
      * Set up 1:1 mapping for dom0. Default to include only conventional RAM
      * areas and let RMRRs include needed reserved regions. When set, the
      * inclusive mapping additionally maps in every pfn up to 4GB except those
-     * that fall in unusable ranges.
+     * that fall in unusable ranges for PV Dom0.
      */
-    if ( (pfn > max_pfn && !mfn_valid(_mfn(pfn))) || xen_in_range(pfn) )
+    if ( (pfn > max_pfn && !mfn_valid(_mfn(pfn))) || xen_in_range(pfn) ||
+         /*
+          * Ignore any address below 1MB, that's already identity mapped by the
+          * Dom0 builder for HVM.
+          */
+         (!d->domain_id && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) )
         return false;
 
-    switch ( page_get_ram_type(pfn) )
+    switch ( type = page_get_ram_type(pfn) )
     {
     case RAM_TYPE_UNUSABLE:
         return false;
@@ -158,10 +166,40 @@ static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
         break;
 
     default:
-        if ( !iommu_hwdom_inclusive || pfn > max_pfn )
+        if ( type & RAM_TYPE_RESERVED )
+        {
+            if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved )
+                return false;
+        }
+        else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > max_pfn )
             return false;
     }
 
+    /*
+     * Check that it doesn't overlap with the LAPIC
+     * TODO: if the guest relocates the MMIO area of the LAPIC Xen should make
+     * sure there's nothing in the new address that would prevent trapping.
+     */
+    if ( has_vlapic(d) )
+    {
+        const struct vcpu *v;
+
+        for_each_vcpu(d, v)
+            if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
+                return false;
+    }
+    /* ... or the IO-APIC */
+    for ( i = 0; has_vioapic(d) && i < d->arch.hvm_domain.nr_vioapics; i++ )
+        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
+            return false;
+    /*
+     * ... or the PCIe MCFG regions.
+     * TODO: runtime added MMCFG regions are not checked to make sure they
+     * don't overlap with already mapped regions, thus preventing trapping.
+     */
+    if ( has_vpci(d) && vpci_is_mmcfg_address(d, pfn_to_paddr(pfn)) )
+        return false;
+
     return true;
 }
 
@@ -171,7 +209,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
 
     BUG_ON(!is_hardware_domain(d));
 
-    if ( iommu_hwdom_passthrough || !is_pv_domain(d) )
+    if ( iommu_hwdom_passthrough )
         return;
 
     max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
@@ -185,7 +223,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
         if ( !hwdom_iommu_map(d, pfn, max_pfn) )
             continue;
 
-        rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
+        if ( paging_mode_translate(d) )
+            rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0);
+        else
+            rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
         if ( rc )
             printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n",
                    d->domain_id, rc);
diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
index e6b6ed0b92..83431b44f2 100644
--- a/xen/include/asm-x86/hvm/io.h
+++ b/xen/include/asm-x86/hvm/io.h
@@ -180,6 +180,9 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
 /* Destroy tracked MMCFG areas. */
 void destroy_vpci_mmcfg(struct domain *d);
 
+/* Check if an address is between a MMCFG region for a domain. */
+bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr);
+
 #endif /* __ASM_X86_HVM_IO_H__ */
 
 
diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
index 89c6830689..57c4e81ec6 100644
--- a/xen/include/xen/iommu.h
+++ b/xen/include/xen/iommu.h
@@ -37,7 +37,7 @@ extern bool_t iommu_debug;
 extern bool_t amd_iommu_perdev_intremap;
 
 extern bool iommu_hwdom_strict, iommu_hwdom_passthrough;
-extern int8_t iommu_hwdom_inclusive;
+extern int8_t iommu_hwdom_inclusive, iommu_hwdom_reserved;
 
 extern unsigned int iommu_dev_iotlb_timeout;
 
-- 
2.18.0

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel
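Editorial note: the per-page decision that hwdom_iommu_map makes in the patch above, together with the tri-state defaults set in the vendor init hooks, can be modelled outside the hypervisor. The following is only a rough standalone sketch, not Xen code. All sk_* names, the SK_RAM_* flag values and the 4KB page size are assumptions made for illustration. The mfn_valid/xen_in_range checks are left out, and conventional RAM is simply treated as mapped because the hunk does not show that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative stand-ins for Xen's RAM_TYPE_* bit flags (values assumed). */
#define SK_RAM_CONVENTIONAL 0x1u
#define SK_RAM_RESERVED     0x2u
#define SK_RAM_UNUSABLE     0x4u

/* Tri-state options: -1 means "not set on the command line". */
struct sk_opts {
    int8_t inclusive;   /* models dom0-iommu=map-inclusive */
    int8_t reserved;    /* models dom0-iommu=map-reserved */
};

/* Hypothetical description of one page frame under consideration. */
struct sk_page {
    unsigned long pfn;
    unsigned int type;      /* OR of the SK_RAM_* flags covering the page */
    bool emulated_mmio;     /* overlaps an emulated LAPIC, IO-APIC or MMCFG */
};

/* Resolve unset options the way the vendor init hooks in the patch do. */
void sk_apply_defaults(struct sk_opts *o, bool intel, bool pv_dom0)
{
    assert(o != NULL);
    if (o->inclusive == -1)
        o->inclusive = intel && pv_dom0;  /* Intel PV only */
    if (o->reserved == -1)
        o->reserved = intel;              /* Intel: on, AMD: off */
}

/* Return true if the page would be identity mapped for Dom0. */
bool sk_should_map(const struct sk_opts *o, bool hvm_dom0,
                   const struct sk_page *p, unsigned long max_pfn)
{
    /* The low 1MB is already identity mapped by the HVM Dom0 builder. */
    if (hvm_dom0 && p->pfn < (0x100000ul >> 12))
        return false;

    switch (p->type) {
    case SK_RAM_UNUSABLE:
        return false;
    case SK_RAM_CONVENTIONAL:
        break;  /* assumption: plain RAM is always mapped in this sketch */
    default:
        if (p->type & SK_RAM_RESERVED) {
            /* Reserved pages need either option enabled. */
            if (!o->inclusive && !o->reserved)
                return false;
        } else if (hvm_dom0 || !o->inclusive || p->pfn > max_pfn) {
            /* Other ranges are only mapped by map-inclusive on PV. */
            return false;
        }
    }

    /* Never map over MMIO that must keep trapping into the hypervisor. */
    return !p->emulated_mmio;
}
```

Under these assumptions, a caller would resolve the options once with sk_apply_defaults and then query sk_should_map for each page frame up to the end of the memory map, mirroring the loop in arch_iommu_hwdom_init.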