Re: [Xen-devel] [PATCH v8 6/6] x86/iommu: add map-reserved dom0-iommu option to map reserved memory ranges
> -----Original Message----- > From: Roger Pau Monne [mailto:roger.pau@xxxxxxxxxx] > Sent: 07 September 2018 10:08 > To: xen-devel@xxxxxxxxxxxxxxxxxxxx > Cc: Roger Pau Monne <roger.pau@xxxxxxxxxx>; Andrew Cooper > <Andrew.Cooper3@xxxxxxxxxx>; George Dunlap > <George.Dunlap@xxxxxxxxxx>; Ian Jackson <Ian.Jackson@xxxxxxxxxx>; Jan > Beulich <jbeulich@xxxxxxxx>; Julien Grall <julien.grall@xxxxxxx>; Konrad > Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>; Stefano Stabellini > <sstabellini@xxxxxxxxxx>; Tim (Xen.org) <tim@xxxxxxx>; Wei Liu > <wei.liu2@xxxxxxxxxx>; Paul Durrant <Paul.Durrant@xxxxxxxxxx>; Suravee > Suthikulpanit <suravee.suthikulpanit@xxxxxxx>; Brian Woods > <brian.woods@xxxxxxx>; Kevin Tian <kevin.tian@xxxxxxxxx> > Subject: [PATCH v8 6/6] x86/iommu: add map-reserved dom0-iommu option > to map reserved memory ranges > > Several people have reported hardware issues (malfunctioning USB > controllers) due to iommu page faults on Intel hardware. Those faults > are caused by missing RMRR (VTd) entries in the ACPI tables. Those can > be worked around on VTd hardware by manually adding RMRR entries on > the command line, this is however limited to Intel hardware and quite > cumbersome to do. > > In order to solve those issues add a new dom0-iommu=map-reserved > option that identity maps all regions marked as reserved in the memory > map. Note that regions used by devices emulated by Xen (LAPIC, IO-APIC > or PCIe MCFG regions) are specifically avoided. Note that this option > is available to all Dom0 modes (as opposed to the inclusive option > which only works for PV Dom0). > > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx> > Reviewed-by: Kevin Tian <kevin.tian@xxxxxxxxx> > Reviewed-by: Wei Liu <wei.liu2@xxxxxxxxxx> > Acked-by: Jan Beulich <jbeulich@xxxxxxxx> Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx> > --- > Changes since v7: > - Don't use true/false with int8_t. > - Print a warning message if map-reserved is set on ARM. 
> > Changes since v6: > - Reword the map-reserved help to make it clear it's available to > both PV and PVH Dom0. > - Assign type inside of the switch expression. > - Remove the comment about IO-APIC MMIO relocation, this is not > supported ATM. > > Changes since v5: > - Merge with the vpci MMCFG helper patch. > - Add a TODO item about the issues with relocating the LAPIC or > IOAPIC MMIO regions. > - Use the newly introduced page_get_ram_type that returns all the > types that fall between a page. > - Use paging_mode_translate instead of iommu_use_hap_pt when deciding > whether to use set_identity_p2m_entry or iommu_map_page. > > Changes since v4: > - Use pfn_to_paddr. > - Rebase on top of previous changes. > - Change the default option setting to use if instead of a ternary > operator. > - Rename to map-reserved. > > Changes since v3: > - Add mappings if the iommu page tables are shared. > > Changes since v2: > - Fix comment regarding dom0-strict. > - Change documentation style of xen command line. > - Rename iommu_map to hwdom_iommu_map. > - Move all the checks to hwdom_iommu_map. > > Changes since v1: > - Introduce a new reserved option instead of abusing the inclusive > option. > - Use the same helper function for PV and PVH in order to decide if a > page should be added to the domain page tables. > - Use the data inside of the domain struct to detect overlaps with > emulated MMIO regions. 
> --- > Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx> > Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx> > Cc: Jan Beulich <jbeulich@xxxxxxxx> > Cc: Julien Grall <julien.grall@xxxxxxx> > Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> > Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx> > Cc: Tim Deegan <tim@xxxxxxx> > Cc: Wei Liu <wei.liu2@xxxxxxxxxx> > Cc: Paul Durrant <paul.durrant@xxxxxxxxxx> > Cc: Suravee Suthikulpanit <suravee.suthikulpanit@xxxxxxx> > Cc: Brian Woods <brian.woods@xxxxxxx> > Cc: Kevin Tian <kevin.tian@xxxxxxxxx> > --- > docs/misc/xen-command-line.markdown | 9 ++++ > xen/arch/x86/hvm/io.c | 5 ++ > xen/drivers/passthrough/amd/pci_amd_iommu.c | 3 ++ > xen/drivers/passthrough/arm/smmu.c | 4 ++ > xen/drivers/passthrough/iommu.c | 5 +- > xen/drivers/passthrough/vtd/iommu.c | 3 ++ > xen/drivers/passthrough/x86/iommu.c | 52 ++++++++++++++++++--- > xen/include/asm-x86/hvm/io.h | 3 ++ > xen/include/xen/iommu.h | 2 +- > 9 files changed, 78 insertions(+), 8 deletions(-) > > diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen- > command-line.markdown > index 98f0f3b68b..1ffd586224 100644 > --- a/docs/misc/xen-command-line.markdown > +++ b/docs/misc/xen-command-line.markdown > @@ -704,6 +704,15 @@ This list of booleans controls the iommu usage by > Dom0: > option is only applicable to a PV Dom0 and is enabled by default on Intel > hardware. > > +* `map-reserved`: sets up DMA remapping for all the reserved regions in > the > + memory map for Dom0. Use this to work around firmware issues providing > + incorrect RMRR/IVMD entries. Rather than only mapping RAM pages for > IOMMU > + accesses for Dom0, all memory regions marked as reserved in the memory > map > + that don't overlap with any MMIO region from emulated devices will be > + identity mapped. This option maps a subset of the memory that would be > + mapped when using the `map-inclusive` option. 
This option is available to > all > + Dom0 modes and is enabled by default on Intel hardware. > + > ### dom0\_ioports\_disable (x86) > > `= List of <hex>-<hex>` > > diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c > index 47d6c850ca..a5b0a23f06 100644 > --- a/xen/arch/x86/hvm/io.c > +++ b/xen/arch/x86/hvm/io.c > @@ -404,6 +404,11 @@ static const struct hvm_mmcfg > *vpci_mmcfg_find(const struct domain *d, > return NULL; > } > > +bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr) > +{ > + return vpci_mmcfg_find(d, addr); > +} > + > static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg > *mmcfg, > paddr_t addr, pci_sbdf_t *sbdf) > { > diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c > b/xen/drivers/passthrough/amd/pci_amd_iommu.c > index 073d18bd10..330f9ce386 100644 > --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c > +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c > @@ -256,6 +256,9 @@ static void __hwdom_init > amd_iommu_hwdom_init(struct domain *d) > /* Inclusive IOMMU mappings are disabled by default on AMD hardware. > */ > if ( iommu_hwdom_inclusive == -1 ) > iommu_hwdom_inclusive = 0; > + /* Reserved IOMMU mappings are disabled by default on AMD > hardware. 
*/ > + if ( iommu_hwdom_reserved == -1 ) > + iommu_hwdom_reserved = 0; > > if ( allocate_domain_resources(dom_iommu(d)) ) > BUG(); > diff --git a/xen/drivers/passthrough/arm/smmu.c > b/xen/drivers/passthrough/arm/smmu.c > index a5158b0bdf..43ece42a50 100644 > --- a/xen/drivers/passthrough/arm/smmu.c > +++ b/xen/drivers/passthrough/arm/smmu.c > @@ -2732,6 +2732,10 @@ static void __hwdom_init > arm_smmu_iommu_hwdom_init(struct domain *d) > printk(XENLOG_WARNING > "map-inclusive dom0-iommu option is not supported on > ARM\n"); > iommu_hwdom_inclusive = 0; > + if ( iommu_hwdom_reserved == 1 ) > + printk(XENLOG_WARNING > + "map-reserved dom0-iommu option is not supported on > ARM\n"); > + iommu_hwdom_reserved = 0; > } > > static void arm_smmu_iommu_domain_teardown(struct domain *d) > diff --git a/xen/drivers/passthrough/iommu.c > b/xen/drivers/passthrough/iommu.c > index 9552464bdc..a29bc13f8a 100644 > --- a/xen/drivers/passthrough/iommu.c > +++ b/xen/drivers/passthrough/iommu.c > @@ -62,6 +62,7 @@ bool_t __read_mostly iommu_intremap = 1; > bool __hwdom_initdata iommu_hwdom_strict; > bool __read_mostly iommu_hwdom_passthrough; > int8_t __hwdom_initdata iommu_hwdom_inclusive = -1; > +int8_t __hwdom_initdata iommu_hwdom_reserved = -1; > > /* > * In the current implementation of VT-d posted interrupts, in some > extreme > @@ -155,6 +156,8 @@ static int __init parse_dom0_iommu_param(const > char *s) > iommu_hwdom_strict = val; > else if ( (val = parse_boolean("map-inclusive", s, ss)) >= 0 ) > iommu_hwdom_inclusive = val; > + else if ( (val = parse_boolean("map-reserved", s, ss)) >= 0 ) > + iommu_hwdom_inclusive = val; > else > rc = -EINVAL; > > @@ -236,7 +239,7 @@ void __hwdom_init iommu_hwdom_init(struct > domain *d) > > hd->platform_ops->hwdom_init(d); > > - ASSERT(iommu_hwdom_inclusive != -1); > + ASSERT(iommu_hwdom_inclusive != -1 && iommu_hwdom_inclusive != - > 1); > if ( iommu_hwdom_inclusive && !is_pv_domain(d) ) > { > printk(XENLOG_WARNING > diff --git 
a/xen/drivers/passthrough/vtd/iommu.c > b/xen/drivers/passthrough/vtd/iommu.c > index a09e02c8db..1121f5ff5b 100644 > --- a/xen/drivers/passthrough/vtd/iommu.c > +++ b/xen/drivers/passthrough/vtd/iommu.c > @@ -1307,6 +1307,9 @@ static void __hwdom_init > intel_iommu_hwdom_init(struct domain *d) > /* Inclusive mappings are enabled by default on Intel hardware for PV. */ > if ( iommu_hwdom_inclusive == -1 ) > iommu_hwdom_inclusive = is_pv_domain(d); > + /* Reserved IOMMU mappings are enabled by default on Intel hardware. > */ > + if ( iommu_hwdom_reserved == -1 ) > + iommu_hwdom_reserved = 1; > > setup_hwdom_pci_devices(d, setup_hwdom_device); > setup_hwdom_rmrr(d); > diff --git a/xen/drivers/passthrough/x86/iommu.c > b/xen/drivers/passthrough/x86/iommu.c > index 5809027573..47a078272a 100644 > --- a/xen/drivers/passthrough/x86/iommu.c > +++ b/xen/drivers/passthrough/x86/iommu.c > @@ -20,6 +20,7 @@ > #include <xen/softirq.h> > #include <xsm/xsm.h> > > +#include <asm/hvm/io.h> > #include <asm/setup.h> > > void iommu_update_ire_from_apic( > @@ -139,17 +140,23 @@ static bool __hwdom_init > hwdom_iommu_map(const struct domain *d, > unsigned long max_pfn) > { > mfn_t mfn = _mfn(pfn); > + unsigned int i, type; > > /* > * Set up 1:1 mapping for dom0. Default to include only conventional RAM > * areas and let RMRRs include needed reserved regions. When set, the > * inclusive mapping additionally maps in every pfn up to 4GB except > those > - * that fall in unusable ranges. > + * that fall in unusable ranges for PV Dom0. > */ > - if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) ) > + if ( (pfn > max_pfn && !mfn_valid(mfn)) || xen_in_range(pfn) || > + /* > + * Ignore any address below 1MB, that's already identity mapped by > the > + * Dom0 builder for HVM. 
> + */ > + (!d->domain_id && is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ) > return false; > > - switch ( page_get_ram_type(mfn) ) > + switch ( type = page_get_ram_type(mfn) ) > { > case RAM_TYPE_UNUSABLE: > return false; > @@ -160,10 +167,40 @@ static bool __hwdom_init > hwdom_iommu_map(const struct domain *d, > break; > > default: > - if ( !iommu_hwdom_inclusive || pfn > max_pfn ) > + if ( type & RAM_TYPE_RESERVED ) > + { > + if ( !iommu_hwdom_inclusive && !iommu_hwdom_reserved ) > + return false; > + } > + else if ( is_hvm_domain(d) || !iommu_hwdom_inclusive || pfn > > max_pfn ) > return false; > } > > + /* > + * Check that it doesn't overlap with the LAPIC > + * TODO: if the guest relocates the MMIO area of the LAPIC Xen should > make > + * sure there's nothing in the new address that would prevent trapping. > + */ > + if ( has_vlapic(d) ) > + { > + const struct vcpu *v; > + > + for_each_vcpu(d, v) > + if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) ) > + return false; > + } > + /* ... or the IO-APIC */ > + for ( i = 0; has_vioapic(d) && i < d->arch.hvm.nr_vioapics; i++ ) > + if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) ) > + return false; > + /* > + * ... or the PCIe MCFG regions. > + * TODO: runtime added MMCFG regions are not checked to make sure > they > + * don't overlap with already mapped regions, thus preventing trapping. 
> + */ > + if ( has_vpci(d) && vpci_is_mmcfg_address(d, pfn_to_paddr(pfn)) ) > + return false; > + > return true; > } > > @@ -173,7 +210,7 @@ void __hwdom_init arch_iommu_hwdom_init(struct > domain *d) > > BUG_ON(!is_hardware_domain(d)); > > - if ( iommu_hwdom_passthrough || !is_pv_domain(d) ) > + if ( iommu_hwdom_passthrough ) > return; > > max_pfn = (GB(4) >> PAGE_SHIFT) - 1; > @@ -187,7 +224,10 @@ void __hwdom_init arch_iommu_hwdom_init(struct > domain *d) > if ( !hwdom_iommu_map(d, pfn, max_pfn) ) > continue; > > - rc = iommu_map_page(d, pfn, pfn, > IOMMUF_readable|IOMMUF_writable); > + if ( paging_mode_translate(d) ) > + rc = set_identity_p2m_entry(d, pfn, p2m_access_rw, 0); > + else > + rc = iommu_map_page(d, pfn, pfn, > IOMMUF_readable|IOMMUF_writable); > if ( rc ) > printk(XENLOG_WARNING " d%d: IOMMU mapping failed: %d\n", > d->domain_id, rc); > diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h > index 8c83fd0c8b..7ceb119b64 100644 > --- a/xen/include/asm-x86/hvm/io.h > +++ b/xen/include/asm-x86/hvm/io.h > @@ -185,6 +185,9 @@ int register_vpci_mmcfg_handler(struct domain *d, > paddr_t addr, > /* Destroy tracked MMCFG areas. */ > void destroy_vpci_mmcfg(struct domain *d); > > +/* Check if an address is between a MMCFG region for a domain. 
*/ > +bool vpci_is_mmcfg_address(const struct domain *d, paddr_t addr); > + > #endif /* __ASM_X86_HVM_IO_H__ */ > > > diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h > index 89c6830689..57c4e81ec6 100644 > --- a/xen/include/xen/iommu.h > +++ b/xen/include/xen/iommu.h > @@ -37,7 +37,7 @@ extern bool_t iommu_debug; > extern bool_t amd_iommu_perdev_intremap; > > extern bool iommu_hwdom_strict, iommu_hwdom_passthrough; > -extern int8_t iommu_hwdom_inclusive; > +extern int8_t iommu_hwdom_inclusive, iommu_hwdom_reserved; > > extern unsigned int iommu_dev_iotlb_timeout; > > -- > 2.18.0 _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
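For readers following the tri-state handling in the patch above: the new option is stored as an int8_t where -1 means "not set on the command line", and each vendor driver resolves that to a default (enabled on Intel, disabled on AMD). A reserved page is then skipped only when neither map-inclusive nor map-reserved is in effect. Below is a minimal standalone sketch of that logic. The names `enum vendor`, `resolve_reserved_default` and `map_reserved_page` are hypothetical and exist only for illustration; they are not Xen identifiers, and the real code performs further checks (LAPIC, IO-APIC, MMCFG overlap) that are omitted here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for "which IOMMU driver is initialising Dom0". */
enum vendor { VENDOR_INTEL, VENDOR_AMD };

/*
 * Resolve the tri-state option: -1 = unset, 0 = disabled, 1 = enabled.
 * Mirrors the defaults chosen in the patch: on for Intel, off for AMD.
 * An explicit command line value is never overridden.
 */
static int8_t resolve_reserved_default(int8_t opt, enum vendor v)
{
    if ( opt == -1 )
        return v == VENDOR_INTEL ? 1 : 0;

    return opt;
}

/*
 * Decide whether a page of reserved type is a mapping candidate.
 * Both arguments are assumed to be already resolved (0 or 1).
 */
static bool map_reserved_page(int8_t inclusive, int8_t reserved)
{
    return inclusive || reserved;
}
```

On the Xen command line the option itself would be spelled `dom0-iommu=map-reserved`, as described in the documentation hunk of the patch.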