
Re: [Xen-devel] [PATCH v6 08/10] remove late (on-demand) construction of IOMMU page tables



On Fri, Aug 16, 2019 at 06:19:59PM +0100, Paul Durrant wrote:
> Now that there is a per-domain IOMMU enable flag, which should be enabled if
> any device is going to be passed through, stop deferring page table
> construction until the assignment is done. Also don't tear down the tables
> again when the last device is de-assigned; defer that task until domain
> destruction.
> 
> This allows the has_iommu_pt() helper and iommu_status enumeration to be
> removed. Calls to has_iommu_pt() are simply replaced by calls to
> is_iommu_enabled(). Remaining open-coded tests of iommu_hap_pt_share can also
> be replaced by calls to iommu_use_hap_pt().
> The arch_iommu_populate_page_table() and iommu_construct() functions become
> redundant, as does the 'strict mode' dom0 page_list mapping code in
> iommu_hwdom_init(), and iommu_teardown() can be made static as its only
> remaining caller, iommu_domain_destroy(), is within the same source
> module.
> 
> All in all, about 220 lines of code are removed.
> 
> NOTE: This patch will cause a small amount of extra resource to be used
>       to accommodate IOMMU page tables that may never be used, since the
>       per-domain IOMMU enable flag is currently set to the value
>       of the global iommu_enable flag. A subsequent patch will add an
>       option to the toolstack to allow it to be turned off if there is
>       no intention to assign passthrough hardware to the domain.
> 
> Signed-off-by: Paul Durrant <paul.durrant@xxxxxxxxxx>
> Reviewed-by: Alexandru Isaila <aisaila@xxxxxxxxxxxxxxx>
> ---
> Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
> Cc: Julien Grall <julien.grall@xxxxxxx>
> Cc: Volodymyr Babchuk <Volodymyr_Babchuk@xxxxxxxx>
> Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
> Cc: Jan Beulich <jbeulich@xxxxxxxx>
> Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
> Cc: Tim Deegan <tim@xxxxxxx>
> Cc: Wei Liu <wl@xxxxxxx>
> Cc: "Roger Pau Monné" <roger.pau@xxxxxxxxxx>
> Cc: Tamas K Lengyel <tamas@xxxxxxxxxxxxx>
> Cc: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> Cc: Razvan Cojocaru <rcojocaru@xxxxxxxxxxxxxxx>
> Cc: Petre Pircalabu <ppircalabu@xxxxxxxxxxxxxxx>
> 
> Previously part of series 
> https://lists.xenproject.org/archives/html/xen-devel/2019-07/msg02267.html
> 
> v5:
>  - Minor style fixes
> ---
>  xen/arch/arm/p2m.c                    |   2 +-
>  xen/arch/x86/dom0_build.c             |   2 +-
>  xen/arch/x86/hvm/mtrr.c               |   5 +-
>  xen/arch/x86/mm/mem_sharing.c         |   2 +-
>  xen/arch/x86/mm/paging.c              |   2 +-
>  xen/arch/x86/x86_64/mm.c              |   2 +-
>  xen/common/memory.c                   |   4 +-
>  xen/common/vm_event.c                 |   2 +-
>  xen/drivers/passthrough/device_tree.c |  11 ---
>  xen/drivers/passthrough/iommu.c       | 134 ++++++--------------------
>  xen/drivers/passthrough/pci.c         |  12 ---
>  xen/drivers/passthrough/vtd/iommu.c   |  10 +-
>  xen/drivers/passthrough/x86/iommu.c   |  95 ------------------
>  xen/include/asm-arm/iommu.h           |   2 +-
>  xen/include/asm-x86/iommu.h           |   2 +-
>  xen/include/xen/iommu.h               |  16 ---
>  xen/include/xen/sched.h               |   2 -
>  17 files changed, 42 insertions(+), 263 deletions(-)
> 
> diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
> index 7f1442932a..692565757e 100644
> --- a/xen/arch/arm/p2m.c
> +++ b/xen/arch/arm/p2m.c
> @@ -1056,7 +1056,7 @@ static int __p2m_set_entry(struct p2m_domain *p2m,
>           !mfn_eq(lpae_get_mfn(*entry), lpae_get_mfn(orig_pte)) )
>          p2m_free_entry(p2m, orig_pte, level);
>  
> -    if ( has_iommu_pt(p2m->domain) &&
> +    if ( is_iommu_enabled(p2m->domain) &&
>           (lpae_is_valid(orig_pte) || lpae_is_valid(*entry)) )
>      {
>          unsigned int flush_flags = 0;
> diff --git a/xen/arch/x86/dom0_build.c b/xen/arch/x86/dom0_build.c
> index d381784edd..7cfab2dc25 100644
> --- a/xen/arch/x86/dom0_build.c
> +++ b/xen/arch/x86/dom0_build.c
> @@ -365,7 +365,7 @@ unsigned long __init dom0_compute_nr_pages(
>      }
>  
>      need_paging = is_hvm_domain(d) &&
> -        (!iommu_hap_pt_share || !paging_mode_hap(d));
> +        (!iommu_use_hap_pt(d) || !paging_mode_hap(d));

I'm not sure this is correct: at the point where dom0_compute_nr_pages
gets called the IOMMU has not been initialized yet (the call to
iommu_hwdom_init is done afterwards), so the status field that
iommu_use_hap_pt relies on still holds its initial value.
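
For reference, a sketch of how I recall these helpers being defined
before this patch (approximate, not quoted verbatim from the tree):

    /* Both helpers are backed by the per-domain status field, which
     * only moves away from IOMMU_STATUS_disabled once the IOMMU page
     * tables have been constructed (iommu_hwdom_init for dom0). */
    #define has_iommu_pt(d) \
        (dom_iommu(d)->status != IOMMU_STATUS_disabled)
    #define iommu_use_hap_pt(d) \
        (hap_enabled(d) && has_iommu_pt(d) && iommu_hap_pt_share)

Hence my worry: the helper reads state that is still at its initial
value when dom0_compute_nr_pages runs.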

>      for ( ; ; need_paging = false )
>      {
>          nr_pages = get_memsize(&dom0_size, avail);
> diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
> index 7ccd85bcea..5ad15eafe0 100644
> --- a/xen/arch/x86/hvm/mtrr.c
> +++ b/xen/arch/x86/hvm/mtrr.c
> @@ -783,7 +783,8 @@ HVM_REGISTER_SAVE_RESTORE(MTRR, hvm_save_mtrr_msr, hvm_load_mtrr_msr, 1,
>  
>  void memory_type_changed(struct domain *d)
>  {
> -    if ( (has_iommu_pt(d) || cache_flush_permitted(d)) && d->vcpu && d->vcpu[0] )
> +    if ( (is_iommu_enabled(d) || cache_flush_permitted(d)) &&
> +         d->vcpu && d->vcpu[0] )
>      {
>          p2m_memory_type_changed(d);
>          flush_all(FLUSH_CACHE);
> @@ -831,7 +832,7 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
>          return MTRR_TYPE_UNCACHABLE;
>      }
>  
> -    if ( !has_iommu_pt(d) && !cache_flush_permitted(d) )
> +    if ( !is_iommu_enabled(d) && !cache_flush_permitted(d) )
>      {
>          *ipat = 1;
>          return MTRR_TYPE_WRBACK;
> diff --git a/xen/arch/x86/mm/mem_sharing.c b/xen/arch/x86/mm/mem_sharing.c
> index a5fe89e339..efb8821768 100644
> --- a/xen/arch/x86/mm/mem_sharing.c
> +++ b/xen/arch/x86/mm/mem_sharing.c
> @@ -1664,7 +1664,7 @@ int mem_sharing_domctl(struct domain *d, struct xen_domctl_mem_sharing_op *mec)
>          case XEN_DOMCTL_MEM_SHARING_CONTROL:
>          {
>              rc = 0;
> -            if ( unlikely(has_iommu_pt(d) && mec->u.enable) )
> +            if ( unlikely(is_iommu_enabled(d) && mec->u.enable) )
>                  rc = -EXDEV;
>              else
>                  d->arch.hvm.mem_sharing_enabled = mec->u.enable;
> diff --git a/xen/arch/x86/mm/paging.c b/xen/arch/x86/mm/paging.c
> index 69aa228e46..d9a52c4db4 100644
> --- a/xen/arch/x86/mm/paging.c
> +++ b/xen/arch/x86/mm/paging.c
> @@ -213,7 +213,7 @@ int paging_log_dirty_enable(struct domain *d, bool_t log_global)
>  {
>      int ret;
>  
> -    if ( has_iommu_pt(d) && log_global )
> +    if ( is_iommu_enabled(d) && log_global )
>      {
>          /*
>           * Refuse to turn on global log-dirty mode
> diff --git a/xen/arch/x86/x86_64/mm.c b/xen/arch/x86/x86_64/mm.c
> index 1919cae18b..827b3f5e27 100644
> --- a/xen/arch/x86/x86_64/mm.c
> +++ b/xen/arch/x86/x86_64/mm.c
> @@ -1433,7 +1433,7 @@ int memory_add(unsigned long spfn, unsigned long epfn, unsigned int pxm)
>       * shared or being kept in sync then newly added memory needs to be
>       * mapped here.
>       */
> -    if ( has_iommu_pt(hardware_domain) &&
> +    if ( is_iommu_enabled(hardware_domain) &&
>           !iommu_use_hap_pt(hardware_domain) &&
>           !need_iommu_pt_sync(hardware_domain) )
>      {
> diff --git a/xen/common/memory.c b/xen/common/memory.c
> index d9b35a608c..71445c2f53 100644
> --- a/xen/common/memory.c
> +++ b/xen/common/memory.c
> @@ -823,7 +823,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
>      xatp->gpfn += start;
>      xatp->size -= start;
>  
> -    if ( has_iommu_pt(d) )
> +    if ( is_iommu_enabled(d) )
>         this_cpu(iommu_dont_flush_iotlb) = 1;
>  
>      while ( xatp->size > done )
> @@ -844,7 +844,7 @@ int xenmem_add_to_physmap(struct domain *d, struct xen_add_to_physmap *xatp,
>          }
>      }
>  
> -    if ( has_iommu_pt(d) )
> +    if ( is_iommu_enabled(d) )
>      {
>          int ret;
>  
> diff --git a/xen/common/vm_event.c b/xen/common/vm_event.c
> index 2a1c87e44b..3b18195ebf 100644
> --- a/xen/common/vm_event.c
> +++ b/xen/common/vm_event.c
> @@ -630,7 +630,7 @@ int vm_event_domctl(struct domain *d, struct xen_domctl_vm_event_op *vec)
>  
>              /* No paging if iommu is used */
>              rc = -EMLINK;
> -            if ( unlikely(has_iommu_pt(d)) )
> +            if ( unlikely(is_iommu_enabled(d)) )
>                  break;
>  
>              rc = -EXDEV;
> diff --git a/xen/drivers/passthrough/device_tree.c b/xen/drivers/passthrough/device_tree.c
> index 12f2c4c3f2..ea9fd54e3b 100644
> --- a/xen/drivers/passthrough/device_tree.c
> +++ b/xen/drivers/passthrough/device_tree.c
> @@ -40,17 +40,6 @@ int iommu_assign_dt_device(struct domain *d, struct dt_device_node *dev)
>      if ( !list_empty(&dev->domain_list) )
>          goto fail;
>  
> -    /*
> -     * The hwdom is forced to use IOMMU for protecting assigned
> -     * device. Therefore the IOMMU data is already set up.
> -     */
> -    ASSERT(!is_hardware_domain(d) ||
> -           hd->status == IOMMU_STATUS_initialized);
> -
> -    rc = iommu_construct(d);
> -    if ( rc )
> -        goto fail;
> -
>      /* The flag field doesn't matter to DT device. */
>      rc = hd->platform_ops->assign_device(d, 0, dt_to_dev(dev), 0);
>  
> diff --git a/xen/drivers/passthrough/iommu.c b/xen/drivers/passthrough/iommu.c
> index 27c75e0786..dc7b75fab6 100644
> --- a/xen/drivers/passthrough/iommu.c
> +++ b/xen/drivers/passthrough/iommu.c
> @@ -146,6 +146,17 @@ static int __init parse_dom0_iommu_param(const char *s)
>  }
>  custom_param("dom0-iommu", parse_dom0_iommu_param);
>  
> +static void __hwdom_init check_hwdom_reqs(struct domain *d)
> +{
> +    if ( iommu_hwdom_none || !paging_mode_translate(d) )
> +        return;
> +
> +    arch_iommu_check_autotranslated_hwdom(d);
> +
> +    iommu_hwdom_passthrough = false;
> +    iommu_hwdom_strict = true;
> +}
> +
>  int iommu_domain_init(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> @@ -159,129 +170,44 @@ int iommu_domain_init(struct domain *d)
>          return ret;
>  
>      hd->platform_ops = iommu_get_ops();
> -    return hd->platform_ops->init(d);
> -}
> +    ret = hd->platform_ops->init(d);
> +    if ( ret )
> +        return ret;
>  
> -static void __hwdom_init check_hwdom_reqs(struct domain *d)
> -{
> -    if ( iommu_hwdom_none || !paging_mode_translate(d) )
> -        return;
> +    /*
> +     * NB: 'relaxed' h/w domains don't need the IOMMU mappings to be kept
> +     *     in-sync with their assigned pages because all host RAM will be
> +     *     mapped during hwdom_init().
> +     */
> +    if ( is_hardware_domain(d) )
> +        check_hwdom_reqs(d); /* may modify iommu_hwdom_strict */
>  
> -    arch_iommu_check_autotranslated_hwdom(d);
> +    if ( !is_hardware_domain(d) || iommu_hwdom_strict )
> +        hd->need_sync = !iommu_use_hap_pt(d);
>  
> -    iommu_hwdom_passthrough = false;
> -    iommu_hwdom_strict = true;
> +    return 0;
>  }
>  
>  void __hwdom_init iommu_hwdom_init(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
>  
> -    check_hwdom_reqs(d);
> -
>      if ( !is_iommu_enabled(d) )
>          return;
>  
>      register_keyhandler('o', &iommu_dump_p2m_table, "dump iommu p2m table", 0);
>  
> -    hd->status = IOMMU_STATUS_initializing;
> -    /*
> -     * NB: relaxed hw domains don't need sync because all ram is already
> -     * mapped in the iommu page tables.
> -     */
> -    hd->need_sync = iommu_hwdom_strict && !iommu_use_hap_pt(d);
> -    if ( need_iommu_pt_sync(d) )
> -    {
> -        struct page_info *page;
> -        unsigned int i = 0, flush_flags = 0;
> -        int rc = 0;
> -
> -        page_list_for_each ( page, &d->page_list )
> -        {
> -            unsigned long mfn = mfn_x(page_to_mfn(page));
> -            unsigned long dfn = mfn_to_gmfn(d, mfn);
> -            unsigned int mapping = IOMMUF_readable;
> -            int ret;
> -
> -            if ( ((page->u.inuse.type_info & PGT_count_mask) == 0) ||
> -                 ((page->u.inuse.type_info & PGT_type_mask)
> -                  == PGT_writable_page) )
> -                mapping |= IOMMUF_writable;
> -
> -            ret = iommu_map(d, _dfn(dfn), _mfn(mfn), 0, mapping,
> -                            &flush_flags);
> -
> -            if ( !rc )
> -                rc = ret;
> -
> -            if ( !(i++ & 0xfffff) )
> -                process_pending_softirqs();
> -        }

Don't you need to add the domain's pages to the IOMMU page-tables here?

iommu_hwdom_init is called after the memory for the domain has been
added. Maybe this is fine because the iommu is now enabled earlier, and
thus pages added to the domain would already have been added to the
iommu page-tables if required?
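
The path I have in mind is something like the below (a sketch from
memory of the common mapping paths, not quoted verbatim):

    /* Wherever a page is added to the domain and the IOMMU tables
     * need to be kept in sync, the mapping is installed right away,
     * matching the call in the loop removed above. */
    if ( need_iommu_pt_sync(d) )
        rc = iommu_map(d, _dfn(dfn), _mfn(mfn), 0,
                       IOMMUF_readable | IOMMUF_writable, &flush_flags);

With need_sync now being set from iommu_domain_init rather than from
iommu_construct, that check would already be true while dom0's memory
is being populated, which is what would make dropping the loop safe.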

Thanks, Roger.
