[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping



On May 05, 2016 7:03 PM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote:
> On 05/05/16 07:53, Xu, Quan wrote:
> > On May 04, 2016 9:49 PM, George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
> wrote:
> >> On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@xxxxxxxxx> wrote:
> >>> When IOMMU mapping is failed, we issue a best effort rollback,
> >>> stopping IOMMU mapping, unmapping the previous IOMMU maps and
> >> then
> >>> reporting the error up to the call trees. When rollback is not
> >>> feasible (in early initialization phase or trade-off of complexity)
> >>> for the hardware domain, we do things on a best effort basis, only
> >>> throwing
> >> out an error message.
> >>>
> >>> IOMMU unmapping should perhaps continue despite an error, in an
> >>> attempt to do best effort cleanup.
> >>>
> >>> Signed-off-by: Quan Xu <quan.xu@xxxxxxxxx>
> >>>
> >>> CC: Keir Fraser <keir@xxxxxxx>
> >>> CC: Jan Beulich <jbeulich@xxxxxxxx>
> >>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> >>> CC: Jun Nakajima <jun.nakajima@xxxxxxxxx>
> >>> CC: Kevin Tian <kevin.tian@xxxxxxxxx>
> >>> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx>
> >>> ---
> >>>  xen/arch/x86/mm.c               | 13 ++++++++-----
> >>>  xen/arch/x86/mm/p2m-ept.c       | 27 +++++++++++++++++++++++++--
> >>>  xen/arch/x86/mm/p2m-pt.c        | 24 ++++++++++++++++++++----
> >>>  xen/arch/x86/mm/p2m.c           | 11 +++++++++--
> >>>  xen/drivers/passthrough/iommu.c | 14 +++++++++++++-
> >>>  5 files changed, 75 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index
> >>> a42097f..427097d 100644
> >>> --- a/xen/arch/x86/mm.c
> >>> +++ b/xen/arch/x86/mm.c
> >>> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>                             int preemptible)  {
> >>>      unsigned long nx, x, y = page->u.inuse.type_info;
> >>> -    int rc = 0;
> >>> +    int rc = 0, ret = 0;
> >>>
> >>>      ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2)));
> >>>
> >>> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>          if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) )
> >>>          {
> >>>              if ( (x & PGT_type_mask) == PGT_writable_page )
> >>> -                iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page)));
> >>> +                ret = iommu_unmap_page(d, mfn_to_gmfn(d,
> >> page_to_mfn(page)));
> >>>              else if ( type == PGT_writable_page )
> >>> -                iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)),
> >>> -                               page_to_mfn(page),
> >>> -                               IOMMUF_readable|IOMMUF_writable);
> >>> +                ret = iommu_map_page(d, mfn_to_gmfn(d,
> page_to_mfn(page)),
> >>> +                                     page_to_mfn(page),
> >>> +
> >>> + IOMMUF_readable|IOMMUF_writable);
> >>>          }
> >>>      }
> >>>
> >>> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info
> >> *page, unsigned long type,
> >>>      if ( (x & PGT_partial) && !(nx & PGT_partial) )
> >>>          put_page(page);
> >>>
> >>> +    if ( !rc )
> >>> +        rc = ret;
> >>> +
> >>>      return rc;
> >>>  }
> >>>
> >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c
> >>> index 1ed5b47..df87944 100644
> >>> --- a/xen/arch/x86/mm/p2m-ept.c
> >>> +++ b/xen/arch/x86/mm/p2m-ept.c
> >>> @@ -821,6 +821,8 @@ out:
> >>>      if ( needs_sync )
> >>>          ept_sync_domain(p2m);
> >>>
> >>> +    ret = 0;
> >>> +
> >>>      /* For host p2m, may need to change VT-d page table.*/
> >>>      if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) &&
> >>>           need_modify_vtd_table )
> >>> @@ -831,11 +833,29 @@ out:
> >>>          {
> >>>              if ( iommu_flags )
> >>>                  for ( i = 0; i < (1 << order); i++ )
> >>> -                    iommu_map_page(d, gfn + i, mfn_x(mfn) + i, 
> >>> iommu_flags);
> >>> +                {
> >>> +                    rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i,
> >>> + iommu_flags);
> >>> +
> >>> +                    if ( !ret && unlikely(rc) )
> >>> +                    {
> >>> +                        while ( i-- )
> >>> +                            iommu_unmap_page(d, gfn + i);
> >>> +
> >>> +                        ret = rc;
> >>> +                        break;
> >>> +                    }
> >>> +                }
> >>>              else
> >>>                  for ( i = 0; i < (1 << order); i++ )
> >>> -                    iommu_unmap_page(d, gfn + i);
> >>> +                {
> >>> +                    rc = iommu_unmap_page(d, gfn + i);
> >>> +
> >>> +                    if ( !ret && unlikely(rc) )
> >>> +                        ret = rc;
> >>> +                }
> >>>          }
> >>> +
> >>> +        rc = 0;
> >>>      }
> >>
> >> So the reason for doing this thing with setting ret=0, then using rc,
> >> then setting rc to zero, is to make sure that the change is
> >> propagated to the altp2m if necessary?
> >>
> >> I'm not a fan of this sort of "implied signalling".  The check at the
> >> altp2m conditional is meant to say, "everything went fine, go ahead
> >> and propagate the change".  But with your changes, that's not what we
> >> want anymore -- we want, "If the change actually made it into the
> >> hostp2m, propagate it to the altp2m."
> >>
> >> As such, I think it would be a lot clearer to just make a new boolean
> >> variable, maybe "entry_written", which we initialize to false and
> >> then set to true when the entry is actually written; and then change
> >> the condition re the altp2m to "if ( entry_written &&
> >> p2m_is_hostp2m(..) )".
> >>
> >> Then I think you could make the ret / rc use mirror what you do
> >> elsewhere, where ret is used to store the return value of the
> >> function call, and it's propagated to rc only if rc is not already set.
> >>
> >
> > George,
> >
> > Thanks for your detailed explanation. This seems an another matter of
> preference.
> > Of course, I'd follow your suggestion in p2m  field, while I need to
> > hear the other maintainers' options as well (especially when it comes from
> Kevin/Jan, as they do spend a lot of time for me).
> >
> > A little bit of difference here, IMO, an int 'entry_written' is much
> > better, as we also need  to propagate the 'entry_written' up to callers, 
> > i.e. the
> errors returned from atomic_write_ept_entry() are '-EINVAL', '-ESRCH' and '-
> EBUSY'. Then, the follow is my modification (feel free to correct me):
> 
> But it doesn't look like you're actually taking advantage of the fact that
> entry_written is non-boolean -- other than immediately copying the value into
> rc, all the places you're only checking if it's zero or not (where "zero" in 
> this
> case actually means "the entry was written").
> 
> The code you had before was, as far as I can tell, functionally correct.
>  The point of introducing the extra variable is to decrease cognitive load on
> people coming along to modify the code later.  To do that, you want to
> maximize the things you do that are within expectation, and minimize the
> things you do outside of that.
> 
> The name "entry_written" sounds like a boolean; developers will expect
> "0 / false" to mean "not written" and "1/true" to mean "entry written".
>  Furthermore, rc looks like a return value; if coders see "rc =
> atomic_write_ept_entry()" they immediately have a framework for what's
> going on; whereas if they see "entry_written == [...]; if
> (entry_written) { ... rc= entry_written; }", it takes some thinking to figure 
> out.
> Not much, but every little bit of unnecessary thinking adds up.
> 
> My suggestion remains to:
> 1. Declare entry_written as a bool, initialized to false 2. In both
> atomic_write_ept_entry() calls, use rc to collect the return value 3. In the
> second one (the one which may fail), add an else { entry_written = true; } 4. 
> In
> the conditional deciding whether to update the iommu, use "entry_written", I
> think.  At this point rc == 0 and entry_written = true will be identical, but 
> if we
> ever insert something in between, we want the iommu update to be
> attempted any time the entry is successfully written.
> 5. Use "if ( entry_written && ...)" when deciding whether to propagate the
> change to the altp2m.
> 

George, 
Thanks. It is very clear now, and I will  follow it in next v4. 

Quan

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.