[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping



On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@xxxxxxxxx> wrote:
> When IOMMU mapping is failed, we issue a best effort rollback, stopping
> IOMMU mapping, unmapping the previous IOMMU maps and then reporting the
> error up to the call trees. When rollback is not feasible (in early
> initialization phase or trade-off of complexity) for the hardware domain,
> we do things on a best effort basis, only throwing out an error message.
>
> IOMMU unmapping should perhaps continue despite an error, in an attempt
> to do best effort cleanup.
>
> Signed-off-by: Quan Xu <quan.xu@xxxxxxxxx>
>
> CC: Keir Fraser <keir@xxxxxxx>
> CC: Jan Beulich <jbeulich@xxxxxxxx>
> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> CC: Jun Nakajima <jun.nakajima@xxxxxxxxx>
> CC: Kevin Tian <kevin.tian@xxxxxxxxx>
> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> ---
>  xen/arch/x86/mm.c               | 13 ++++++++-----
>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
>  5 files changed, 75 insertions(+), 14 deletions(-)
>
> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c
> index a42097f..427097d 100644
> --- a/xen/arch/x86/mm.c
> +++ b/xen/arch/x86/mm.c
> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info *page, 
> unsigned long type,
>                             int preemptible)
>  {
>      unsigned long nx, x, y = page->u.inuse.type_info;
> -    int rc = 0;
> +    int rc = 0, ret = 0;
>
>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
>
> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info *page, 
> unsigned long type,
>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
>          {
>              if ( (x & PGT_type_mask) == PGT_writable_page )
> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
>              else if ( type == PGT_writable_page )
> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> -                               page_to_mfn(page),
> -                               IOMMUF_readable|IOMMUF_writable);
> +                ret = iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> +                                     page_to_mfn(page),
> +                                     IOMMUF_readable|IOMMUF_writable);
>          }
>      }
>
> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info *page, 
> unsigned long type,
>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
>          put_page(page);
>
> +    if ( !rc )
> +        rc = ret;
> +
>      return rc;
>  }
>
> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> index 1ed5b47..df87944 100644
> --- a/xen/arch/x86/mm/p2m-ept.c
> +++ b/xen/arch/x86/mm/p2m-ept.c
> @@ -821,6 +821,8 @@ out:
>      if ( needs_sync )
>          ept_sync_domain(p2m);
>
> +    ret = 0;
> +
>      /* For host p2m, may need to change VT-d page table.*/
>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
>           need_modify_vtd_table )
> @@ -831,11 +833,29 @@ out:
>          {
>              if ( iommu_flags )
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, iommu_flags);
> +                {
> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, 
> iommu_flags);
> +
> +                    if ( !ret && unlikely(rc) )
> +                    {
> +                        while ( i-- )
> +                            iommu_unmap_page(d, gfn + i);
> +
> +                        ret = rc;
> +                        break;
> +                    }
> +                }
>              else
>                  for ( i = 0; i < (1 << order); i++ )
> -                    iommu_unmap_page(d, gfn + i);
> +                {
> +                    rc = iommu_unmap_page(d, gfn + i);
> +
> +                    if ( !ret && unlikely(rc) )
> +                        ret = rc;
> +                }
>          }
> +
> +        rc = 0;
>      }

So the reason for doing this thing with setting ret=0, then using rc,
then setting rc to zero, is to make sure that the change is propagated
to the altp2m if necessary?

I'm not a fan of this sort of "implied signalling".  The check at the
altp2m conditional is meant to say, "everything went fine, go ahead
and propagate the change".  But with your changes, that's not what we
want anymore -- we want, "If the change actually made it into the
hostp2m, propagate it to the altp2m."

As such, I think it would be a lot clearer to just make a new boolean
variable, maybe "entry_written", which we initialize to false and then
set to true when the entry is actually written; and then change the
condition re the altp2m to "if ( entry_written && p2m_is_hostp2m(..)
)".

Then I think you could make the ret / rc use mirror what you do
elsewhere, where ret is used to store the return value of the function
call, and it's propagated to rc only if rc is not already set.

> @@ -680,11 +680,27 @@ p2m_pt_set_entry(struct p2m_domain *p2m, unsigned long 
> gfn, mfn_t mfn,
>          }
>          else if ( iommu_pte_flags )
>              for ( i = 0; i < (1UL << page_order); i++ )
> -                iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> -                               iommu_pte_flags);
> +            {
> +                ret = iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i,
> +                                    iommu_pte_flags);
> +
> +                if ( !rc && unlikely(ret) )
> +                {
> +                    while ( i-- )
> +                        iommu_unmap_page(p2m->domain, gfn + i);
> +
> +                    rc = ret;
> +                    break;
> +                }

Hmm, is this conditional here right?  What the code actually says to
do is to always map pages, but to only unmap them on an error if there
have been no other errors in the function so far.

As it happens, at the moment rc cannot be non-zero here since any time
it's non-zero the code jumps down to the 'out' label, skipping this.
But if that ever changed, this would fail to unmap when it probably
should.

It seems like this be:

if ( unlikely(ret) )
{
  while ( i-- )
    iommu_unmap_page(p2m->domain, gfn + i);
  if ( !rc )
    rc = ret;
  break;
}

Or if you want to assume that rc will never be zero, then put an
ASSERT() just before the for loop here.

Everything else looks good to me.  Thanks again for your work on this.

 -George

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.