[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] abstract model of IOMMU unmaping/mapping failures
On April 01, 2016 7:57pm, <JBeulich@xxxxxxxx> wrote: > >>> On 31.03.16 at 11:06, <quan.xu@xxxxxxxxx> wrote: > > 4. __gnttab_unmap_common():rollback (no change) > > > > (Existing code) > > >>... > > if ( !kind ) > > err = iommu_unmap_page(ld, op->frame); > > else if ( !(kind & MAPKIND_WRITE) ) > > err = iommu_map_page(ld, op->frame, op->frame, > > IOMMUF_readable); > > > > double_gt_unlock(lgt, rgt); > > > > if ( err ) > > rc = GNTST_general_error; > > <<... > > > > Similarly no change required, as error has been correctly handled. > > I wouldn't call this "correctly handled", but for the hardware domain it > should > be good enough, and by crashing DomU-s simply reporting the error up the call > tree is sufficient. One question though is whether the loops higher up the > call > tree in grant_table.c shouldn't be exited when noticing the domain has > crashed, > both to avoid unnecessary work and to reduce the risk of secondary problems. > For this point, I suppose that the domain structure is not destructed (we are safe to call domain's element, e.g. ld->grant_table) in do_grant_table_op hypercall cycle, even the domain is crashed down. I am not quite sure, whether it is correct or not, could you explain more? Then, it is unnecessary to flush tlb. The code is: ... fault: gnttab_flush_tlb(current->domain); ... If the domain is crashed down, the current->domain->is_shut_down is ture. So we can modify it as: ... fault: if ( current->domain->is_shut_down ) gnttab_flush_tlb(current->domain); ... The other code of the call trees is good to me. > > 7. set_identity_p2m_entry():rollback (no change). > > > > (Existing code) > > >>... > > if ( !paging_mode_translate(p2m->domain) ) > > { > > if ( !need_iommu(d) ) > > return 0; > > return iommu_map_page(d, gfn, gfn, > IOMMUF_readable|IOMMUF_writable); > > } > > <<... > > if ( is_hardware_domain(d) && !iommu_use_hap_pt(d) ) > > ret = iommu_map_page(d, gfn, gfn, > IOMMUF_readable|IOMMUF_writable); > > >> ... > > > > error handling and rollback already in-place. > > For the first portion of the quoted code you mean. I don't see any rollback in > the paging_mode_translate() case. > Does it refer to as the below call trees: set_identity_p2m_entry()--rmrr_identity_mapping()--intel_iommu_add_device() --... set_identity_p2m_entry()--rmrr_identity_mapping()-- intel_iommu_remove_device()--... my description is not accurate, but the current code is right, and I am no need to modify it. right? > > 8. p2m_remove_page():rollback one level(change required). > > > > (Existing code) > > >>... > > if ( !paging_mode_translate(p2m->domain) ) > > { > > if ( need_iommu(p2m->domain) ) > > for ( i = 0; i < (1 << page_order); i++ ) > > iommu_unmap_page(p2m->domain, mfn + i); > > return 0; > > } > > <<... > > > > There is no error checking of iommu_unmap_page. We need to add. > > However, there are several callers of this function which don't handle > > error at all. I'm not sure whether hardware domain will use this function. > > Since it is in core p2m logic (which I don't have much knowledge), I > > hope we can print a warning and simply return error here (given the > > fact that non-hardware domains are all crashed immediately within > > unmap call) > > Yes, at least error propagation needs to be added here. I don't think more is > required. (Be careful, btw, with adding warnings - you must not spam the log > with such.) > agreed. A quick question about error propagation: ... for ( i = 0; i < (1UL << page_order); i++ ) iommu_unmap_page(p2m->domain, gfn + i); ... As you mentioned, as a special case, unmapping should perhaps continue despite an error, in an attempt to do best effort cleanup. Then i could modify the code as: ... for ( i = 0; i < (1UL << page_order); i++ ) { rc = iommu_unmap_page(p2m->domain, gfn + i); if ( rc ) ret = rc; } .. return ret; ... It looks cumbersome to me, any suggestion? > > ---- > > 9. p2m_pt_set_entry(): open (propose no-rollback). > > > > (Existing code) > > >>... > > for ( i = 0; i < (1UL << page_order); i++ ) > > iommu_map_page(p2m->domain, gfn + i, mfn_x(mfn) + i, > > iommu_pte_flags); > > else > > for ( i = 0; i < (1UL << page_order); i++ ) > > iommu_unmap_page(p2m->domain, gfn + i); > > <<... > > > > Above is in the end of the func. > > > > I'm not sure whether it will be called for hardware domain ( PVH mode > > is one potential concern. as we know, PVH has been introduced on Xen > > 4.4 as a DomU, and on Xen 4.5 as a Dom0.). > > Indeed you can get here for Dom0 only when it's non-PV. > A quick question, is non-PV the same as PVH for hardware domain? > > If not, we can leave it as today. Otherwise, I want to hear your > > suggestion whether we can use no-rollback (just erring) for hardware > > domain here since this code path is very complex. > > Of course (and there's really little point in repeating the same over and over > again) - as long as no-one involved considers the general model (set forth at > the > top) problematic, it can be done the same way everywhere. Which still means > (as said before) that errors need to be propagated. > Agreed. > > 11. __get_page_type(): open (propose no-rollback). > > The following is in detail: > > >>... > > if ( d && is_pv_domain(d) && unlikely(need_iommu(d)) ) > > { > > if ( (x & PGT_type_mask) == PGT_writable_page ) > > iommu_unmap_page(d, mfn_to_gmfn(d, > page_to_mfn(page))); > > else if ( type == PGT_writable_page ) > > iommu_map_page(d, mfn_to_gmfn(d, > page_to_mfn(page)), > > page_to_mfn(page), > > > IOMMUF_readable|IOMMUF_writable); > > } > > <<... > > > > It's almost in the end of the func. > > As the limited set of cases where __get_page_type() calls > > iommu_{,un}map_page(), and a get_page_type() invocation that > > previously was known to never fail (or at least so we hope, based on > > the existing code) [I got it from Jan's reply.], > > Please don't mis-quote me: I said this for a specific caller of the function, > not > about the function itself. > Sorry for that. > > we can classify it as normal PV domain and be treated as a fatal error. > > So for I am inclined to leave it as today. > I think one level propagation, only propagated in __get_page_type(), is enough. > And again - at least errors need to be propagated. Sure. Quan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |