Re: [Xen-devel] [PATCH v3 4/4] x86/iommu: add reserved dom0-iommu option to map reserved memory ranges
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf
> Of Roger Pau Monne
> Sent: 07 August 2018 15:03
> To: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>; Wei Liu
> <wei.liu2@xxxxxxxxxx>; George Dunlap <George.Dunlap@xxxxxxxxxx>;
> Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; Ian Jackson
> <Ian.Jackson@xxxxxxxxxx>; Tim (Xen.org) <tim@xxxxxxx>; Julien Grall
> <julien.grall@xxxxxxx>; Jan Beulich <jbeulich@xxxxxxxx>; Roger Pau
> Monne <roger.pau@xxxxxxxxxx>
> Subject: [Xen-devel] [PATCH v3 4/4] x86/iommu: add reserved dom0-iommu
> option to map reserved memory ranges
>
> Several people have reported hardware issues (malfunctioning USB
> controllers) due to iommu page faults on Intel hardware. Those faults
> are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
> be worked around on VTd hardware by manually adding RMRR entries on
> the command line, this is however limited to Intel hardware and quite
> cumbersome to do.
>
> In order to solve those issues add a new dom0-iommu=reserved option
> that identity maps all regions marked as reserved in the memory map.
> Note that regions used by devices emulated by Xen (LAPIC, IO-APIC or
> PCIe MCFG regions) are specifically avoided. Note that this option is
> available to a PVH Dom0 (as opposed to the inclusive option which only
> works for PV Dom0).
>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Reviewed-by: Paul Durrant <paul.durrant@xxxxxxxxxx>

> ---
> Changes since v2:
> - Fix comment regarding dom0-strict.
> - Change documentation style of xen command line.
> - Rename iommu_map to hwdom_iommu_map.
> - Move all the checks to hwdom_iommu_map.
>
> Changes since v1:
> - Introduce a new reserved option instead of abusing the inclusive
> option.
> - Use the same helper function for PV and PVH in order to decide if a
> page should be added to the domain page tables.
> - Use the data inside of the domain struct to detect overlaps with
> emulated MMIO regions.
> ---
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> ---
>  docs/misc/xen-command-line.markdown         | 11 ++-
>  xen/arch/x86/hvm/io.c                       |  5 ++
>  xen/drivers/passthrough/amd/pci_amd_iommu.c |  3 +
>  xen/drivers/passthrough/iommu.c             |  3 +
>  xen/drivers/passthrough/vtd/iommu.c         |  3 +
>  xen/drivers/passthrough/x86/iommu.c         | 86 ++++++++++++++-------
>  xen/include/asm-x86/hvm/io.h                |  3 +
>  xen/include/xen/iommu.h                     |  2 +-
>  8 files changed, 85 insertions(+), 31 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-
> command-line.markdown
> index 90b32fe3f0..59ec2afc5d 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1205,7 +1205,7 @@ detection of systems known to misbehave upon
> accesses to that port.
>  >> Enable IOMMU debugging code (implies `verbose`).
>
>  ### dom0-iommu
> -> `= List of [ none | strict | relaxed | inclusive ]`
> +> `= List of [ none | strict | relaxed | inclusive | reserved ]`
>
>  * `none`: disables DMA remapping for Dom0.
>
> @@ -1233,6 +1233,15 @@ meaning:
>    option is only applicable to a PV Dom0 and is enabled by default on Intel
>    hardware.
>
> +* `reserved`: sets up DMA remapping for all the reserved regions in the
> memory
> +  map for Dom0. Use this to work around firmware issues providing incorrect
> +  RMRR/IVMD entries. Rather than only mapping RAM pages for IOMMU
> accesses
> +  for Dom0, all memory regions marked as reserved in the memory map that
> don't
> +  overlap with any MMIO region from emulated devices will be identity
> mapped.
> +  This option maps a subset of the memory that would be mapped when
> using the
> +  `inclusive` option. This option is available to a PVH Dom0 and is enabled
> by
> +  default on Intel hardware.
> +
>  ### iommu\_dev\_iotlb\_timeout
>  > `= <integer>`
>
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index bf4d8748d3..5e01c33890 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -404,6 +404,11 @@ static const struct hvm_mmcfg
> *vpci_mmcfg_find(const struct domain *d,
>      return NULL;
>  }
>
> +bool vpci_mmcfg_address(const struct domain *d, paddr_t addr)
> +{
> +    return vpci_mmcfg_find(d, addr);
> +}
> +
>  static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg
> *mmcfg,
>                                             paddr_t addr, pci_sbdf_t *sbdf)
>  {
> diff --git a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> index 0e0c99c942..2c2867d088 100644
> --- a/xen/drivers/passthrough/amd/pci_amd_iommu.c
> +++ b/xen/drivers/passthrough/amd/pci_amd_iommu.c
> @@ -256,6 +256,9 @@ static void __hwdom_init
> amd_iommu_hwdom_init(struct domain *d)
>      /* Inclusive IOMMU mappings are disabled by default on AMD hardware.
> */
>      iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ? false
>                                                        : iommu_dom0_inclusive;
> +    /* Reserved IOMMU mappings are disabled by default on AMD
> hardware. */
> +    iommu_dom0_reserved = iommu_dom0_reserved == -1 ? false
> +                                                    : iommu_dom0_reserved;
>
>      if ( allocate_domain_resources(dom_iommu(d)) )
>          BUG();
> diff --git a/xen/drivers/passthrough/iommu.c
> b/xen/drivers/passthrough/iommu.c
> index f15c94be42..9c991bd2cf 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -75,6 +75,7 @@ custom_param("dom0-iommu",
> parse_dom0_iommu_param);
>  bool __hwdom_initdata iommu_dom0_strict;
>  bool __read_mostly iommu_dom0_passthrough;
>  int8_t __hwdom_initdata iommu_dom0_inclusive = -1;
> +int8_t __hwdom_initdata iommu_dom0_reserved = -1;
>
>  DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
>
> @@ -162,6 +163,8 @@ static int __init parse_dom0_iommu_param(const
> char *s)
>              iommu_dom0_strict = false;
>          else if ( !strncmp(s, "inclusive", ss - s) )
>              iommu_dom0_inclusive = val;
> +        else if ( !strncmp(s, "reserved", ss - s) )
> +            iommu_dom0_reserved = val;
>          else
>              rc = -EINVAL;
>
> diff --git a/xen/drivers/passthrough/vtd/iommu.c
> b/xen/drivers/passthrough/vtd/iommu.c
> index 7c7e15755d..77a076215b 100644
> --- a/xen/drivers/passthrough/vtd/iommu.c
> +++ b/xen/drivers/passthrough/vtd/iommu.c
> @@ -1307,6 +1307,9 @@ static void __hwdom_init
> intel_iommu_hwdom_init(struct domain *d)
>      /* Inclusive mappings are enabled by default on Intel hardware for PV. */
>      iommu_dom0_inclusive = iommu_dom0_inclusive == -1 ?
> is_pv_domain(d)
>                                                        : iommu_dom0_inclusive;
> +    /* Reserved IOMMU mappings are enabled by default on Intel hardware.
> */
> +    iommu_dom0_reserved = iommu_dom0_reserved == -1 ? true
> +                                                    : iommu_dom0_reserved;
>
>      setup_hwdom_pci_devices(d, setup_hwdom_device);
>      setup_hwdom_rmrr(d);
> diff --git a/xen/drivers/passthrough/x86/iommu.c
> b/xen/drivers/passthrough/x86/iommu.c
> index 5a7a765e9d..6aec43ed1a 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -20,6 +20,7 @@
>  #include <xen/softirq.h>
>  #include <xsm/xsm.h>
>
> +#include <asm/hvm/io.h>
>  #include <asm/setup.h>
>
>  void iommu_update_ire_from_apic(
> @@ -134,13 +135,67 @@ void arch_iommu_domain_destroy(struct domain
> *d)
>  {
>  }
>
> +static bool __hwdom_init hwdom_iommu_map(const struct domain *d,
> unsigned long pfn,
> +                                         unsigned long max_pfn)
> +{
> +    unsigned int i;
> +
> +    /*
> +     * Ignore any address below 1MB, that's already identity mapped by the
> +     * domain builder for HVM.
> +     */
> +    if ( (is_hvm_domain(d) && pfn < PFN_DOWN(MB(1))) ||
> +         /* Exclude Xen bits. */
> +         xen_in_range(pfn) || (pfn > max_pfn && !mfn_valid(_mfn(pfn))) )
> +        return false;
> +
> +    /*
> +     * If dom0-strict mode is enabled or the guest type is PVH/HVM then
> exclude
> +     * conventional RAM and let the common code map dom0's pages.
> +     */
> +    if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> +         (iommu_dom0_strict || is_hvm_domain(d)) )
> +        return false;
> +    if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> +         !iommu_dom0_reserved && !iommu_dom0_inclusive )
> +        return false;
> +    if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
> +         !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> +         !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> +         (!iommu_dom0_inclusive || pfn > max_pfn) )
> +        return false;
> +
> +    /* Check that it doesn't overlap with the LAPIC */
> +    if ( has_vlapic(d) )
> +    {
> +        const struct vcpu *v;
> +
> +        for_each_vcpu(d, v)
> +            if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
> +                return false;
> +    }
> +    /* ... or the IO-APIC */
> +    for ( i = 0; has_vioapic(d) && i < d->arch.hvm_domain.nr_vioapics; i++ )
> +        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> +            return false;
> +    /*
> +     * ... or the PCIe MCFG regions.
> +     * TODO: runtime added MMCFG regions are not checked to make sure
> they
> +     * don't overlap with already mapped regions, thus preventing trapping.
> +     */
> +    if ( has_vpci(d) && vpci_mmcfg_address(d, pfn << PAGE_SHIFT) )
> +        return false;
> +
> +    return true;
> +}
> +
>  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>  {
>      unsigned long i, top, max_pfn;
>
>      BUG_ON(!is_hardware_domain(d));
>
> -    if ( iommu_dom0_passthrough || !is_pv_domain(d) )
> +    if ( iommu_dom0_passthrough )
>          return;
>
>      max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
> @@ -149,36 +204,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct
> domain *d)
>      for ( i = 0; i < top; i++ )
>      {
>          unsigned long pfn = pdx_to_pfn(i);
> -        bool map;
>          int rc;
>
> -        /*
> -         * Set up 1:1 mapping for dom0. Default to include only
> -         * conventional RAM areas and let RMRRs include needed reserved
> -         * regions. When set, the inclusive mapping additionally maps in
> -         * every pfn up to 4GB except those that fall in unusable ranges.
> -         */
> -        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
> -            continue;
> -
> -        if ( iommu_dom0_inclusive && pfn <= max_pfn )
> -            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
> -        else
> -            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
> -
> -        if ( !map )
> -            continue;
> -
> -        /* Exclude Xen bits */
> -        if ( xen_in_range(pfn) )
> -            continue;
> -
> -        /*
> -         * If dom0-strict mode is enabled then exclude conventional RAM
> -         * and let the common code map dom0's pages.
> -         */
> -        if ( iommu_dom0_strict &&
> -             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
> +        if ( !hwdom_iommu_map(d, pfn, max_pfn) )
>              continue;
>
>          rc = iommu_map_page(d, pfn, pfn,
> IOMMUF_readable|IOMMUF_writable);
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index e6b6ed0b92..8cca456b55 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -180,6 +180,9 @@ int register_vpci_mmcfg_handler(struct domain *d,
> paddr_t addr,
>  /* Destroy tracked MMCFG areas. */
>  void destroy_vpci_mmcfg(struct domain *d);
>
> +/* Check if an address is between a MMCFG region for a domain. */
> +bool vpci_mmcfg_address(const struct domain *d, paddr_t addr);
> +
>  #endif /* __ASM_X86_HVM_IO_H__ */
>
>
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 99e5b89c0f..fed1b1ea7a 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -37,7 +37,7 @@ extern bool_t iommu_debug;
>  extern bool_t amd_iommu_perdev_intremap;
>
>  extern bool iommu_dom0_strict, iommu_dom0_passthrough;
> -extern int8_t iommu_dom0_inclusive;
> +extern int8_t iommu_dom0_inclusive, iommu_dom0_reserved;
>
>  extern unsigned int iommu_dev_iotlb_timeout;
>
> --
> 2.18.0
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/xen-devel
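
For readers following the vendor hooks in the quoted patch: the `int8_t` flags are initialised to -1 so that "not given on the command line" can be told apart from an explicit enable or disable, and each vendor init function then fills in its own default. A minimal standalone sketch of that pattern follows. It is not code from the Xen tree; `OPT_UNSET`, `resolve_tristate`, `reserved_on_intel` and `reserved_on_amd` are hypothetical names used only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Tri-state option value, mirroring the int8_t flags in the patch:
 * -1 = not given on the command line, 0 = disabled, 1 = enabled.
 */
#define OPT_UNSET (-1)  /* hypothetical name, not from the Xen tree */

/* Hypothetical helper: apply the vendor default only if the option is unset. */
static inline int8_t resolve_tristate(int8_t opt, bool vendor_default)
{
    return opt == OPT_UNSET ? (int8_t)vendor_default : opt;
}

/* Defaults for "reserved" as stated in the patch comments. */
static inline int8_t reserved_on_intel(int8_t opt)
{
    return resolve_tristate(opt, true);   /* enabled by default on Intel */
}

static inline int8_t reserved_on_amd(int8_t opt)
{
    return resolve_tristate(opt, false);  /* disabled by default on AMD */
}
```

Under these assumptions an explicit user setting always wins over the vendor default, which matches the `== -1 ? ... : ...` expressions in the AMD and VT-d hunks.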