[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 03/10] IOMMU/MMU: enhance the call trees of IOMMU unmapping and mapping
On May 05, 2016 7:03 PM, George Dunlap <george.dunlap@xxxxxxxxxx> wrote: > On 05/05/16 07:53, Xu, Quan wrote: > > On May 04, 2016 9:49 PM, George Dunlap <George.Dunlap@xxxxxxxxxxxxx> > wrote: > >> On Fri, Apr 29, 2016 at 10:25 AM, Quan Xu <quan.xu@xxxxxxxxx> wrote: > >>> When IOMMU mapping is failed, we issue a best effort rollback, > >>> stopping IOMMU mapping, unmapping the previous IOMMU maps and > >> then > >>> reporting the error up to the call trees. When rollback is not > >>> feasible (in early initialization phase or trade-off of complexity) > >>> for the hardware domain, we do things on a best effort basis, only > >>> throwing > >> out an error message. > >>> > >>> IOMMU unmapping should perhaps continue despite an error, in an > >>> attempt to do best effort cleanup. > >>> > >>> Signed-off-by: Quan Xu <quan.xu@xxxxxxxxx> > >>> > >>> CC: Keir Fraser <keir@xxxxxxx> > >>> CC: Jan Beulich <jbeulich@xxxxxxxx> > >>> CC: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> > >>> CC: Jun Nakajima <jun.nakajima@xxxxxxxxx> > >>> CC: Kevin Tian <kevin.tian@xxxxxxxxx> > >>> CC: George Dunlap <george.dunlap@xxxxxxxxxxxxx> > >>> --- > >>> xen/arch/x86/mm.c | 13 ++++++++----- > >>> xen/arch/x86/mm/p2m-ept.c | 27 +++++++++++++++++++++++++-- > >>> xen/arch/x86/mm/p2m-pt.c | 24 ++++++++++++++++++++---- > >>> xen/arch/x86/mm/p2m.c | 11 +++++++++-- > >>> xen/drivers/passthrough/iommu.c | 14 +++++++++++++- > >>> 5 files changed, 75 insertions(+), 14 deletions(-) > >>> > >>> diff --git a/xen/arch/x86/mm.c b/xen/arch/x86/mm.c index > >>> a42097f..427097d 100644 > >>> --- a/xen/arch/x86/mm.c > >>> +++ b/xen/arch/x86/mm.c > >>> @@ -2467,7 +2467,7 @@ static int __get_page_type(struct page_info > >> *page, unsigned long type, > >>> int preemptible) { > >>> unsigned long nx, x, y = page->u.inuse.type_info; > >>> - int rc = 0; > >>> + int rc = 0, ret = 0; > >>> > >>> ASSERT(!(type & ~(PGT_type_mask | PGT_pae_xen_l2))); > >>> > >>> @@ -2578,11 +2578,11 @@ static int __get_page_type(struct page_info > >> *page, unsigned long type, > >>> if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > >>> { > >>> if ( (x & PGT_type_mask) == PGT_writable_page ) > >>> - iommu_unmap_page(d, mfn_to_gmfn(d, page_to_mfn(page))); > >>> + ret = iommu_unmap_page(d, mfn_to_gmfn(d, > >> page_to_mfn(page))); > >>> else if ( type == PGT_writable_page ) > >>> - iommu_map_page(d, mfn_to_gmfn(d, page_to_mfn(page)), > >>> - page_to_mfn(page), > >>> - IOMMUF_readable|IOMMUF_writable); > >>> + ret = iommu_map_page(d, mfn_to_gmfn(d, > page_to_mfn(page)), > >>> + page_to_mfn(page), > >>> + > >>> + IOMMUF_readable|IOMMUF_writable); > >>> } > >>> } > >>> > >>> @@ -2599,6 +2599,9 @@ static int __get_page_type(struct page_info > >> *page, unsigned long type, > >>> if ( (x & PGT_partial) && !(nx & PGT_partial) ) > >>> put_page(page); > >>> > >>> + if ( !rc ) > >>> + rc = ret; > >>> + > >>> return rc; > >>> } > >>> > >>> diff --git a/xen/arch/x86/mm/p2m-ept.c b/xen/arch/x86/mm/p2m-ept.c > >>> index 1ed5b47..df87944 100644 > >>> --- a/xen/arch/x86/mm/p2m-ept.c > >>> +++ b/xen/arch/x86/mm/p2m-ept.c > >>> @@ -821,6 +821,8 @@ out: > >>> if ( needs_sync ) > >>> ept_sync_domain(p2m); > >>> > >>> + ret = 0; > >>> + > >>> /* For host p2m, may need to change VT-d page table.*/ > >>> if ( rc == 0 && p2m_is_hostp2m(p2m) && need_iommu(d) && > >>> need_modify_vtd_table ) > >>> @@ -831,11 +833,29 @@ out: > >>> { > >>> if ( iommu_flags ) > >>> for ( i = 0; i < (1 << order); i++ ) > >>> - iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > >>> iommu_flags); > >>> + { > >>> + rc = iommu_map_page(d, gfn + i, mfn_x(mfn) + i, > >>> + iommu_flags); > >>> + > >>> + if ( !ret && unlikely(rc) ) > >>> + { > >>> + while ( i-- ) > >>> + iommu_unmap_page(d, gfn + i); > >>> + > >>> + ret = rc; > >>> + break; > >>> + } > >>> + } > >>> else > >>> for ( i = 0; i < (1 << order); i++ ) > >>> - iommu_unmap_page(d, gfn + i); > >>> + { > >>> + rc = iommu_unmap_page(d, gfn + i); > >>> + > >>> + if ( !ret && unlikely(rc) ) > >>> + ret = rc; > >>> + } > >>> } > >>> + > >>> + rc = 0; > >>> } > >> > >> So the reason for doing this thing with setting ret=0, then using rc, > >> then setting rc to zero, is to make sure that the change is > >> propagated to the altp2m if necessary? > >> > >> I'm not a fan of this sort of "implied signalling". The check at the > >> altp2m conditional is meant to say, "everything went fine, go ahead > >> and propagate the change". But with your changes, that's not what we > >> want anymore -- we want, "If the change actually made it into the > >> hostp2m, propagate it to the altp2m." > >> > >> As such, I think it would be a lot clearer to just make a new boolean > >> variable, maybe "entry_written", which we initialize to false and > >> then set to true when the entry is actually written; and then change > >> the condition re the altp2m to "if ( entry_written && > >> p2m_is_hostp2m(..) )". > >> > >> Then I think you could make the ret / rc use mirror what you do > >> elsewhere, where ret is used to store the return value of the > >> function call, and it's propagated to rc only if rc is not already set. > >> > > > > George, > > > > Thanks for your detailed explanation. This seems an another matter of > preference. > > Of course, I'd follow your suggestion in p2m field, while I need to > > hear the other maintainers' options as well (especially when it comes from > Kevin/Jan, as they do spend a lot of time for me). > > > > A little bit of difference here, IMO, an int 'entry_written' is much > > better, as we also need to propagate the 'entry_written' up to callers, > > i.e. the > errors returned from atomic_write_ept_entry() are '-EINVAL', '-ESRCH' and '- > EBUSY'. Then, the follow is my modification (feel free to correct me): > > But it doesn't look like you're actually taking advantage of the fact that > entry_written is non-boolean -- other than immediately copying the value into > rc, all the places you're only checking if it's zero or not (where "zero" in > this > case actually means "the entry was written"). > > The code you had before was, as far as I can tell, functionally correct. > The point of introducing the extra variable is to decrease cognitive load on > people coming along to modify the code later. To do that, you want to > maximize the things you do that are within expectation, and minimize the > things you do outside of that. > > The name "entry_written" sounds like a boolean; developers will expect > "0 / false" to mean "not written" and "1/true" to mean "entry written". > Furthermore, rc looks like a return value; if coders see "rc = > atomic_write_ept_entry()" they immediately have a framework for what's > going on; whereas if they see "entry_written == [...]; if > (entry_written) { ... rc= entry_written; }", it takes some thinking to figure > out. > Not much, but every little bit of unnecessary thinking adds up. > > My suggestion remains to: > 1. Declare entry_written as a bool, initialized to false 2. In both > atomic_write_ept_entry() calls, use rc to collect the return value 3. In the > second one (the one which may fail), add an else { entry_written = true; } 4. > In > the conditional deciding whether to update the iommu, use "entry_written", I > think. At this point rc == 0 and entry_written = true will be identical, but > if we > ever insert something in between, we want the iommu update to be > attempted any time the entry is successfully written. > 5. Use "if ( entry_written && ...)" when deciding whether to propagate the > change to the altp2m. > George, Thanks. It is very clear now, and I will follow it in next v4. Quan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |