
Re: [Xen-devel] [PATCH v4] x86/p2m: use large pages for MMIO mappings



On Fri, 2016-01-22 at 08:42 -0700, Jan Beulich wrote:
> When mapping large BARs (e.g. the frame buffer of a graphics card) the
> overhead of establishing such mappings using only 4k pages has,
> particularly after the XSA-125 fix, become unacceptable. Alter the
> XEN_DOMCTL_memory_mapping semantics once again, so that there's no
> longer a fixed amount of guest frames that represents the upper limit
> of what a single invocation can map. Instead bound execution time by
> limiting the number of iterations (regardless of page size).
> 
> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> ---
> Open issues (perhaps for subsequent changes):
> - ARM side unimplemented (and hence libxc for now made cope with both
>   models), the main issue (besides my inability to test any change
>   there) being the many internal uses of map_mmio_regions())
> - iommu_{,un}map_page() interfaces don't support "order" (hence
>   mmio_order() for now returns zero when !iommu_hap_pt_share, which in
>   particular means the AMD side isn't being taken care of just yet, but
>   note that this also has the intended effect of suppressing non-zero
>   order mappings in the shadow mode case)
> ---
> v4: Move cleanup duty entirely to the caller of the hypercall. Move
>     return value description from commit message to domctl.h.
> v3: Re-base on top of "x86/hvm: fold opt_hap_{2mb,1gb} into
>     hap_capabilities". Extend description to spell out new return value
>     meaning. Add a couple of code comments. Use PAGE_ORDER_4K instead
>     of literal 0. Take into consideration r/o MMIO pages.
> v2: Produce valid entries for large p2m_mmio_direct mappings in
>     p2m_pt_set_entry(). Don't open code iommu_use_hap_pt() in
>     mmio_order(). Update function comment of set_typed_p2m_entry() and
>     clear_mmio_p2m_entry(). Use PRI_mfn. Add ASSERT()s to
>     {,un}map_mmio_regions() to detect otherwise endless loops.
> 
> --- a/tools/libxc/xc_domain.c
> +++ b/tools/libxc/xc_domain.c
> @@ -2174,7 +2174,7 @@ int xc_domain_memory_mapping(
>  {
>      DECLARE_DOMCTL;
>      xc_dominfo_t info;
> -    int ret = 0, err;
> +    int ret = 0, rc;
>      unsigned long done = 0, nr, max_batch_sz;
>
>      if ( xc_domain_getinfo(xch, domid, 1, &info) != 1 ||
> @@ -2199,19 +2199,24 @@ int xc_domain_memory_mapping(
>          domctl.u.memory_mapping.nr_mfns = nr;
>          domctl.u.memory_mapping.first_gfn = first_gfn + done;
>          domctl.u.memory_mapping.first_mfn = first_mfn + done;
> -        err = do_domctl(xch, &domctl);
> -        if ( err && errno == E2BIG )
> +        rc = do_domctl(xch, &domctl);
> +        if ( rc < 0 && errno == E2BIG )
>          {
>              if ( max_batch_sz <= 1 )
>                  break;
>              max_batch_sz >>= 1;
>              continue;
>          }
> +        if ( rc > 0 )
> +        {
> +            done += rc;
> +            continue;
> +        }
>          /* Save the first error... */
>          if ( !ret )
> -            ret = err;
> +            ret = rc;
>          /* .. and ignore the rest of them when removing. */
> -        if ( err && add_mapping != DPCI_REMOVE_MAPPING )
> +        if ( rc && add_mapping != DPCI_REMOVE_MAPPING )
>              break;

This all looks good to me, assuming I've interpreted the interface comment
correctly.
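
To make my reading of the interface concrete, here is a minimal sketch of the caller-side loop. It is not the libxc code: mock_memory_mapping() is a hypothetical stand-in for do_domctl(), and its limits (1024 frames before -E2BIG, 64 frames per call) are made up for illustration. map_range() and its calls counter are likewise hypothetical names.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical stand-in for do_domctl(): NOT the real hypercall. */
long mock_memory_mapping(unsigned long first_gfn, unsigned long first_mfn,
                         unsigned long nr)
{
    (void)first_gfn;
    (void)first_mfn;
    if (nr > 1024) {            /* assumed implementation limit */
        errno = E2BIG;
        return -1;
    }
    return nr <= 64 ? 0 : 64;   /* 0: everything done, 64: partial success */
}

/* Returns 0 once the whole range is mapped, -1 on error. */
int map_range(unsigned long first_gfn, unsigned long first_mfn,
              unsigned long nr_mfns, unsigned long *calls)
{
    unsigned long done = 0, max_batch_sz = nr_mfns;

    *calls = 0;
    while (done < nr_mfns) {
        unsigned long nr = nr_mfns - done;
        long rc;

        if (nr > max_batch_sz)
            nr = max_batch_sz;
        rc = mock_memory_mapping(first_gfn + done, first_mfn + done, nr);
        ++*calls;
        if (rc < 0 && errno == E2BIG) {
            /* Request too large: halve the batch and retry. */
            if (max_batch_sz <= 1)
                return -1;
            max_batch_sz >>= 1;
            continue;
        }
        if (rc < 0)
            return -1;          /* any other error */
        if (rc > 0) {
            /* Partial success: rc frames done, fewer than requested. */
            assert((unsigned long)rc < nr);
            done += (unsigned long)rc;
            continue;
        }
        done += nr;             /* zero: the whole batch was mapped */
    }
    return 0;
}
```

With this mock, a 4096-frame request takes 66 calls: two rejected with E2BIG, 63 partial successes of 64 frames each, and one final call that maps the last 64 frames.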

> +
> +#define MAP_MMIO_MAX_ITER 64 /* pretty arbitrary */
> +

I suppose no existing in-tree code exceeds that (or there'd be more patch
here).

64 seems like as good a number as anything (corresponds to 256K for 4K
mappings, which doesn't seem too low).
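
For reference, the arithmetic behind that figure, as a small sketch. It assumes one page is mapped per iteration and that the larger x86 page orders are 9 (2M) and 18 (1G); whether those orders are actually used depends on what mmio_order() returns. max_bytes_per_call() is a hypothetical helper, not something in the patch.

```c
#include <stdint.h>

#define MAP_MMIO_MAX_ITER 64 /* value taken from the patch */

/*
 * Upper bound on the bytes a single invocation can cover if every
 * iteration maps one page of the given order (page size is 4K << order).
 */
uint64_t max_bytes_per_call(unsigned int order)
{
    return (uint64_t)MAP_MMIO_MAX_ITER << (12 + order);
}
```

That gives 256K for 4K pages, 128M for 2M pages and 64G for 1G pages, so the iteration limit should only ever bite on ranges that have to be mapped with small pages.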

>          done += nr;
> --- a/xen/include/public/domctl.h
> +++ b/xen/include/public/domctl.h
> @@ -542,8 +542,14 @@ DEFINE_XEN_GUEST_HANDLE(xen_domctl_bind_
>
>
>  /* Bind machine I/O address range -> HVM address range. */
> -/* If this returns -E2BIG lower nr_mfns value. */
>  /* XEN_DOMCTL_memory_mapping */
> +/* Returns
> +   - zero     (success, everything done)
> +   - -E2BIG   (passed in nr_mfns value too large for the implementation)
> +   - positive (partial success, this many [less than nr_mfns] done,

Is the successful region contiguous, i.e. 0..return value, or does the caller
need to figure that out somehow? (Judging by the libxc changes I think it is
the former, but it should be spelt out here.)
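
A sketch of the interpretation I am assuming, where the positive return value is the length of a contiguous prefix starting at first_gfn. This is not the hypervisor code: pick_order() is a hypothetical simplification of mmio_order() that only looks at alignment and remaining length, and map_bounded() omits the actual p2m update.

```c
#define MAP_MMIO_MAX_ITER 64 /* value taken from the patch */

/* Hypothetical: largest of 1G/2M/4K allowed by alignment and length. */
unsigned int pick_order(unsigned long gfn, unsigned long mfn,
                        unsigned long nr)
{
    static const unsigned int orders[] = { 18, 9 };
    unsigned int i;

    for (i = 0; i < sizeof(orders) / sizeof(orders[0]); ++i) {
        unsigned long sz = 1UL << orders[i];

        if (!((gfn | mfn) & (sz - 1)) && nr >= sz)
            return orders[i];
    }
    return 0;
}

/*
 * Returns 0 if the whole range was handled, otherwise the number of
 * frames handled so far, always counted from the start of the range.
 */
unsigned long map_bounded(unsigned long gfn, unsigned long mfn,
                          unsigned long nr)
{
    unsigned long i = 0;
    unsigned int iter;

    for (iter = 0; i < nr && iter < MAP_MMIO_MAX_ITER; ++iter) {
        unsigned int order = pick_order(gfn + i, mfn + i, nr - i);

        /* The real code would install the p2m entry here. */
        i += 1UL << order;
    }
    return i == nr ? 0 : i;
}
```

Under that reading the caller simply adds the return value to first_gfn and first_mfn, subtracts it from nr_mfns and re-invokes, which is what the libxc hunk appears to do.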



> +               requiring re-invocation by the caller with updated inputs)
> +   - negative (error)

This is a more general case of -E2BIG; you might fix that by saying "other
error" or by making -E2BIG a subclause of it.




 

