Xen project Mailing List

Re: [Xen-devel] [PATCH v2 5/5] x86/iommu: add PVH support to the inclusive options

To: Roger Pau Monne <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: "Tian, Kevin" <kevin.tian@xxxxxxxxx>

Date: Thu, 2 Aug 2018 08:03:53 +0000

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Ian Jackson <ian.jackson@xxxxxxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>

Delivery-date: Thu, 02 Aug 2018 08:04:13 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-topic: [Xen-devel] [PATCH v2 5/5] x86/iommu: add PVH support to the inclusive options

> From: Roger Pau Monne
> Sent: Wednesday, August 1, 2018 7:04 PM
>
> Several people have reported hardware issues (malfunctioning USB
> controllers) due to iommu page faults on Intel hardware. Those faults
> are caused by missing RMRR (VTd) entries in the ACPI tables. Those can
> be worked around on VTd hardware by manually adding RMRR entries on
> the command line, this is however limited to Intel hardware and quite
> cumbersome to do.
>
> In order to solve those issues add a new dom0-iommu=reserved option
> that identity maps all regions marked as reserved in the memory map.
> Note that regions used by devices emulated by Xen (LAPIC, IO-APIC or
> PCIe MCFG regions) are specifically avoided. Note that this option is
> available to a PVH Dom0 (as opposed to the inclusive option which only
> works for PV Dom0).
>
> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> ---
> Changes since v1:
>  - Introduce a new reserved option instead of abusing the inclusive
>    option.
>  - Use the same helper function for PV and PVH in order to decide if a
>    page should be added to the domain page tables.
>  - Use the data inside of the domain struct to detect overlaps with
>    emulated MMIO regions.
> ---
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
> ---
>  docs/misc/xen-command-line.markdown |  9 +++
>  xen/arch/x86/hvm/io.c               |  5 ++
>  xen/drivers/passthrough/iommu.c     |  3 +
>  xen/drivers/passthrough/x86/iommu.c | 88 +++++++++++++++++++++--------
>  xen/include/asm-x86/hvm/io.h        |  3 +
>  xen/include/xen/iommu.h             |  2 +-
>  6 files changed, 86 insertions(+), 24 deletions(-)
>
> diff --git a/docs/misc/xen-command-line.markdown b/docs/misc/xen-command-line.markdown
> index 30d970bc2e..526a96ffc5 100644
> --- a/docs/misc/xen-command-line.markdown
> +++ b/docs/misc/xen-command-line.markdown
> @@ -1241,6 +1241,15 @@ detection of systems known to misbehave upon accesses to that port.
>  >> applicable to a PV dom0. Also note that if `strict` mode is enabled
>  >> then conventional RAM pages not assigned to dom0 will not be mapped.
>
> +> `reserved`
> +
> +> Default: `true` on Intel hardware, `false` otherwise
> +
> +>> Use this to work around firmware issues providing incorrect RMRR or IVMD
> +>> entries. Rather than only mapping RAM pages for IOMMU accesses for Dom0,
> +>> all memory regions marked as reserved in the memory map that don't overlap
> +>> with any MMIO region from emulated devices will be identity mapped.
> +
>  ### iommu\_dev\_iotlb\_timeout
>  > `= <integer>`
>
> diff --git a/xen/arch/x86/hvm/io.c b/xen/arch/x86/hvm/io.c
> index bf4d8748d3..5e01c33890 100644
> --- a/xen/arch/x86/hvm/io.c
> +++ b/xen/arch/x86/hvm/io.c
> @@ -404,6 +404,11 @@ static const struct hvm_mmcfg *vpci_mmcfg_find(const struct domain *d,
>      return NULL;
>  }
>
> +bool vpci_mmcfg_address(const struct domain *d, paddr_t addr)
> +{
> +    return vpci_mmcfg_find(d, addr);
> +}
> +
>  static unsigned int vpci_mmcfg_decode_addr(const struct hvm_mmcfg *mmcfg,
>                                             paddr_t addr, pci_sbdf_t *sbdf)
>  {
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 6611e13cc2..a3eb7c5b7f 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -75,6 +75,7 @@ custom_param("dom0-iommu", parse_dom0_iommu_param);
>  bool __hwdom_initdata iommu_dom0_strict;
>  bool __read_mostly iommu_dom0_passthrough;
>  int8_t __hwdom_initdata iommu_dom0_inclusive = -1;
> +int8_t __hwdom_initdata iommu_dom0_reserved = -1;
>
>  DEFINE_PER_CPU(bool_t, iommu_dont_flush_iotlb);
>
> @@ -161,6 +162,8 @@ static int __init parse_dom0_iommu_param(const char *s)
>              iommu_dom0_strict = !val;
>          else if ( !strncmp(s, "inclusive", ss - s) )
>              iommu_dom0_inclusive = val;
> +        else if ( !strncmp(s, "reserved", ss - s) )
> +            iommu_dom0_reserved = val;
>          else
>              rc = -EINVAL;
>
> diff --git a/xen/drivers/passthrough/x86/iommu.c b/xen/drivers/passthrough/x86/iommu.c
> index bf6edf4c04..66c5cc28ed 100644
> --- a/xen/drivers/passthrough/x86/iommu.c
> +++ b/xen/drivers/passthrough/x86/iommu.c
> @@ -20,6 +20,7 @@
>  #include <xen/softirq.h>
>  #include <xsm/xsm.h>
>
> +#include <asm/hvm/io.h>
>  #include <asm/setup.h>
>
>  void iommu_update_ire_from_apic(
> @@ -134,15 +135,75 @@ void arch_iommu_domain_destroy(struct domain *d)
>  {
>  }
>
> +static bool __hwdom_init iommu_map(const struct domain *d, unsigned long pfn,
> +                                   unsigned long max_pfn)

Since the logic is limited to dom0, calling this "dom0_iommu_map" would be
clearer.

> +{
> +    unsigned int i;
> +
> +    /*
> +     * Ignore any address below 1MB, that's already identity mapped by the
> +     * domain builder for HVM.
> +     */
> +    if ( is_hvm_domain(d) && pfn < PFN_DOWN(MB(1)) )
> +        return false;
> +
> +    /*
> +     * If dom0-strict mode is enabled then exclude conventional RAM and let the
> +     * common code map dom0's pages.
> +     */
> +    if ( page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> +         (iommu_dom0_strict || is_hvm_domain(d)) )
> +        return false;
> +    if ( page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> +         (!iommu_dom0_reserved || !iommu_dom0_inclusive) )
> +        return false;
> +    if ( !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE) &&
> +         !page_is_ram_type(pfn, RAM_TYPE_RESERVED) &&
> +         !page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) &&
> +         (!iommu_dom0_inclusive || pfn > max_pfn) )
> +        return false;
> +
> +    /* Check that it doesn't overlap with the LAPIC */
> +    if ( has_vlapic(d) )
> +    {
> +        const struct vcpu *v;
> +
> +        for_each_vcpu(d, v)
> +            if ( pfn == PFN_DOWN(vlapic_base_address(vcpu_vlapic(v))) )
> +                return false;
> +    }
> +    /* ... or the IO-APIC */
> +    for ( i = 0; has_vioapic(d) && i < d->arch.hvm_domain.nr_vioapics; i++ )
> +        if ( pfn == PFN_DOWN(domain_vioapic(d, i)->base_address) )
> +            return false;
> +    /*
> +     * ... or the PCIe MCFG regions.
> +     * TODO: runtime added MMCFG regions are not checked to make sure they
> +     * don't overlap with already mapped regions, thus preventing trapping.
> +     */
> +    if ( has_vpci(d) && vpci_mmcfg_address(d, pfn << PAGE_SHIFT) )
> +        return false;
> +
> +    return true;
> +}
> +
>  void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>  {
>      unsigned long i, top, max_pfn;
>
> +    if ( iommu_dom0_passthrough )
> +        return;
> +
>      BUG_ON(!is_hardware_domain(d));
>
> -    /* Set the default value of inclusive depending on the hardware. */
> +    /*
> +     * Set the default value of inclusive and reserved depending on the
> +     * hardware.
> +     */
>      if ( iommu_dom0_inclusive == -1 )
>          iommu_dom0_inclusive = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;
> +    if ( iommu_dom0_reserved == -1 )
> +        iommu_dom0_reserved = boot_cpu_data.x86_vendor == X86_VENDOR_INTEL;

Same comment as for 4/5.

>
>      max_pfn = (GB(4) >> PAGE_SHIFT) - 1;
>      top = max(max_pdx, pfn_to_pdx(max_pfn) + 1);
> @@ -150,7 +211,6 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>      for ( i = 0; i < top; i++ )
>      {
>          unsigned long pfn = pdx_to_pfn(i);
> -        bool map;
>          int rc;
>
>          /*
> @@ -159,27 +219,9 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain *d)
>           * regions. When set, the inclusive mapping additionally maps in
>           * every pfn up to 4GB except those that fall in unusable ranges.
>           */
> -        if ( pfn > max_pfn && !mfn_valid(_mfn(pfn)) )
> -            continue;
> -
> -        if ( iommu_dom0_inclusive && pfn <= max_pfn )
> -            map = !page_is_ram_type(pfn, RAM_TYPE_UNUSABLE);
> -        else
> -            map = page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL);
> -
> -        if ( !map )
> -            continue;
> -
> -        /* Exclude Xen bits */
> -        if ( xen_in_range(pfn) )
> -            continue;
> -
> -        /*
> -         * If dom0-strict mode is enabled then exclude conventional RAM
> -         * and let the common code map dom0's pages.
> -         */
> -        if ( iommu_dom0_strict &&
> -             page_is_ram_type(pfn, RAM_TYPE_CONVENTIONAL) )
> +        if ( (pfn > max_pfn && !mfn_valid(_mfn(pfn))) ||
> +             /* Exclude Xen bits */
> +             xen_in_range(pfn) || !iommu_map(d, pfn, max_pfn) )
>              continue;
>
>          rc = iommu_map_page(d, pfn, pfn, IOMMUF_readable|IOMMUF_writable);
> diff --git a/xen/include/asm-x86/hvm/io.h b/xen/include/asm-x86/hvm/io.h
> index e6b6ed0b92..8cca456b55 100644
> --- a/xen/include/asm-x86/hvm/io.h
> +++ b/xen/include/asm-x86/hvm/io.h
> @@ -180,6 +180,9 @@ int register_vpci_mmcfg_handler(struct domain *d, paddr_t addr,
>  /* Destroy tracked MMCFG areas. */
>  void destroy_vpci_mmcfg(struct domain *d);
>
> +/* Check if an address is between a MMCFG region for a domain. */
> +bool vpci_mmcfg_address(const struct domain *d, paddr_t addr);
> +
>  #endif /* __ASM_X86_HVM_IO_H__ */
>
>
> diff --git a/xen/include/xen/iommu.h b/xen/include/xen/iommu.h
> index 99e5b89c0f..fed1b1ea7a 100644
> --- a/xen/include/xen/iommu.h
> +++ b/xen/include/xen/iommu.h
> @@ -37,7 +37,7 @@ extern bool_t iommu_debug;
>  extern bool_t amd_iommu_perdev_intremap;
>
>  extern bool iommu_dom0_strict, iommu_dom0_passthrough;
> -extern int8_t iommu_dom0_inclusive;
> +extern int8_t iommu_dom0_inclusive, iommu_dom0_reserved;
>
>  extern unsigned int iommu_dev_iotlb_timeout;
>
> --
> 2.18.0
>
>
> _______________________________________________
> Xen-devel mailing list
> Xen-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/xen-devel
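
For readers following the thread, the per-page decision made by the quoted
helper can be modelled outside Xen. The sketch below is only an illustration
under stated assumptions: the enum, the options struct, the macros, and the
name model_dom0_iommu_map are hypothetical stand-ins and not Xen code. It
mirrors the memory-type checks as they appear in the quoted v2 patch and
leaves out the LAPIC, IO-APIC and MCFG overlap checks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the memory-map page types (not Xen's types). */
enum page_type { PAGE_CONVENTIONAL, PAGE_RESERVED, PAGE_UNUSABLE, PAGE_OTHER };

/* Hypothetical bundle of the dom0-iommu booleans discussed in the patch. */
struct dom0_iommu_opts {
    bool strict;
    bool inclusive;
    bool reserved;
};

/* Assumes 4KiB pages, as on x86. */
#define PAGE_SHIFT  12
#define MB(x)       ((uint64_t)(x) << 20)
#define GB(x)       ((uint64_t)(x) << 30)
#define PFN_DOWN(a) ((a) >> PAGE_SHIFT)

/*
 * Simplified model of the quoted helper: returns true when the page would be
 * identity mapped. The conditions follow the v2 patch as quoted; the
 * emulated-device overlap checks are intentionally omitted.
 */
static inline bool model_dom0_iommu_map(const struct dom0_iommu_opts *o,
                                        bool is_hvm, enum page_type type,
                                        uint64_t pfn, uint64_t max_pfn)
{
    assert(o != NULL);

    /* The low 1MB is already identity mapped for an HVM-style (PVH) dom0. */
    if ( is_hvm && pfn < PFN_DOWN(MB(1)) )
        return false;

    /* Conventional RAM is left to common code in strict mode or for PVH. */
    if ( type == PAGE_CONVENTIONAL && (o->strict || is_hvm) )
        return false;

    /* As quoted, reserved pages are skipped unless both options are set. */
    if ( type == PAGE_RESERVED && (!o->reserved || !o->inclusive) )
        return false;

    /* Anything else (holes, MMIO) needs inclusive and must be below 4GB. */
    if ( type == PAGE_OTHER && (!o->inclusive || pfn > max_pfn) )
        return false;

    return true;
}
```

With 4KiB pages, max_pfn = PFN_DOWN(GB(4)) - 1 is 0xFFFFF and the 1MB cutoff
is pfn 256, so a reserved page at pfn 0x1000 is mapped for a PVH dom0 when
both reserved and inclusive are enabled, while conventional RAM at the same
pfn is not.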
