[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 5/5] x86 / iommu: create a dedicated pool of page-table pages



On 05.10.2020 11:49, Paul Durrant wrote:
> --- a/xen/arch/x86/domain.c
> +++ b/xen/arch/x86/domain.c
> @@ -2304,7 +2304,9 @@ int domain_relinquish_resources(struct domain *d)
>  
>      PROGRESS(iommu_pagetables):
>  
> -        ret = iommu_free_pgtables(d);
> +        iommu_free_pgtables(d);
> +
> +        ret = iommu_set_allocation(d, 0);
>          if ( ret )
>              return ret;

There doesn't look to be a need to call iommu_free_pgtables()
more than once - how about you move it immediately ahead of
the (extended) case label?

> +static int set_allocation(struct domain *d, unsigned int nr_pages,
> +                          bool allow_preempt)

Why the allow_preempt parameter when the sole caller passes
"true"?

> +/*
> + * Some IOMMU mappings are set up during domain_create() before the tool-
> + * stack has a chance to calculate and set the appropriate page-table
> + * allocation. A hard-coded initial allocation covers this gap.
> + */
> +#define INITIAL_ALLOCATION 256

How did you arrive at this number? IOW how many pages do we
need in reality, and how much leeway have you added in?

As to the tool stack - why would it "have a chance" to do the
necessary calculations only pretty late? I wonder whether the
intended allocation wouldn't better be part of struct
xen_domctl_createdomain, without the need for a new sub-op.

> @@ -265,38 +350,45 @@ void __hwdom_init arch_iommu_hwdom_init(struct domain 
> *d)
>          return;
>  }
>  
> -int iommu_free_pgtables(struct domain *d)
> +void iommu_free_pgtables(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> -    struct page_info *pg;
> -    unsigned int done = 0;
>  
> -    while ( (pg = page_list_remove_head(&hd->arch.pgtables.list)) )
> -    {
> -        free_domheap_page(pg);
> +    spin_lock(&hd->arch.pgtables.lock);
>  
> -        if ( !(++done & 0xff) && general_preempt_check() )
> -            return -ERESTART;
> -    }
> +    page_list_splice(&hd->arch.pgtables.list, &hd->arch.pgtables.free_list);
> +    INIT_PAGE_LIST_HEAD(&hd->arch.pgtables.list);
>  
> -    return 0;
> +    spin_unlock(&hd->arch.pgtables.lock);
>  }
>  
>  struct page_info *iommu_alloc_pgtable(struct domain *d)
>  {
>      struct domain_iommu *hd = dom_iommu(d);
> -    unsigned int memflags = 0;
>      struct page_info *pg;
>      void *p;
>  
> -#ifdef CONFIG_NUMA
> -    if ( hd->node != NUMA_NO_NODE )
> -        memflags = MEMF_node(hd->node);
> -#endif
> +    spin_lock(&hd->arch.pgtables.lock);
>  
> -    pg = alloc_domheap_page(NULL, memflags);
> + again:
> +    pg = page_list_remove_head(&hd->arch.pgtables.free_list);
>      if ( !pg )
> +    {
> +        /*
> +         * The hardware and quarantine domains are not subject to a quota
> +         * so create page-table pages on demand.
> +         */
> +        if ( is_hardware_domain(d) || d == dom_io )
> +        {
> +            int rc = create_pgtable(d);
> +
> +            if ( !rc )
> +                goto again;

This gives the appearance of a potentially infinite loop; it's
not because the lock is being held, but I still wonder whether
the impression this gives couldn't be avoided by a slightly
different code structure.

Also the downside of this is that the amount of pages used by
hwdom will now never shrink anymore.

> @@ -306,7 +398,6 @@ struct page_info *iommu_alloc_pgtable(struct domain *d)
>  
>      unmap_domain_page(p);
>  
> -    spin_lock(&hd->arch.pgtables.lock);
>      page_list_add(pg, &hd->arch.pgtables.list);
>      spin_unlock(&hd->arch.pgtables.lock);

You want to drop the lock before the map/clear/unmap, and then
re-acquire it. Or, on the assumption that putting it on the
list earlier is fine (which I think it is), move the other two
lines here up as well.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.