Re: [Xen-devel] [PATCH v2] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
> -----Original Message-----
> From: Xen-devel [mailto:xen-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On Behalf Of Jan Beulich
> Sent: 25 October 2018 15:28
> To: Paul Durrant <Paul.Durrant@xxxxxxxxxx>
> Cc: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>; Wei Liu <wei.liu2@xxxxxxxxxx>; xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
> Subject: Re: [Xen-devel] [PATCH v2] x86/mm/p2m: don't needlessly limit MMIO mapping order to 4k
>
> >>> On 17.10.18 at 16:24, <paul.durrant@xxxxxxxxxx> wrote:
> > --- a/xen/arch/x86/mm/p2m.c
> > +++ b/xen/arch/x86/mm/p2m.c
> > @@ -2081,14 +2081,11 @@ static unsigned int mmio_order(const struct domain *d,
> >                                 unsigned long start_fn, unsigned long nr)
> >  {
> >      /*
> > -     * Note that the !iommu_use_hap_pt() here has three effects:
> > -     * - cover iommu_{,un}map_page() not having an "order" input yet,
> > -     * - exclude shadow mode (which doesn't support large MMIO mappings),
> > -     * - exclude PV guests, should execution reach this code for such.
> > -     * So be careful when altering this.
> > +     * PV guests or shadow-mode HVM guests must be restricted to 4k
> > +     * mappings.
>
> Since you've already posted a patch to add order parameters to
> IOMMU map/unmap, I'd prefer the respective part of the comment
> to go away only when the order value really can be passed through.
> This then requires permitting non-zero values only when the IOMMUs
> also allow for the respective page sizes.

Ok, I can do that. I'll have to provide some way of querying the IOMMU
implementation for the orders it supports in that case.

> I am in particular not convinced that the time needed to carry out
> the hypercall is going to be low enough even for 512 4k pages: You
> need to take into account flushes, including those potentially
> needed for ATS. You don't provide any proof that flushes are
> always delayed and batched, nor do I think this is uniformly the
> case.

In testing, the 512-iteration 4k loop does not seem to cause issues
with either VT-d or the AMD IOMMU. The AMD IOMMU case is still
significantly slower than VT-d, though, so I'm currently experimenting
with eliding the flushing, as well as removing the page-merging code,
to speed it up.

> > @@ -2096,8 +2093,12 @@ static unsigned int mmio_order(const struct domain *d,
> >       * set_typed_p2m_entry() when it needs to zap M2P entries
> >       * for a RAM range.
> >       */ &&
> > -         !(start_fn & ((1UL << PAGE_ORDER_1G) - 1)) && (nr >> PAGE_ORDER_1G) &&
> > -         hap_has_1gb )
> > +         !(start_fn & ((1UL << PAGE_ORDER_1G) - 1)) &&
> > +         (nr >> PAGE_ORDER_1G) &&
> > +         hap_has_1gb &&
> > +         /* disable 1G mappings if we need to keep the IOMMU in sync */
> > +         !need_iommu_pt_sync(d)
>
> I have to admit that I don't understand this v2 addition. How is
> keeping-in-sync relevant for 1G but not 2M pages?

I was attempting to make it obvious that, if the IOMMU mappings need
syncing (i.e. they are not shared), we would not handle a 1G mapping
(and would thus fall back to iterating over the 1G range in 4k chunks).

> > +          )
>
> Misplaced paren.

True.

  Paul

> Jan
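[Editorial note: to make the condition being debated concrete, below is a
minimal standalone C sketch (not Xen source, and not the author's patch) of
mmio_order()-style order selection. The iommu_order_mask capability bitmap is
hypothetical, standing in for the IOMMU-order query Paul says he will have to
provide; hap_has_1gb and need_iommu_pt_sync(d) are folded into the mask and a
plain parameter. The PAGE_ORDER_* values match Xen's (4k = 0, 2M = 9, 1G = 18).]

    /*
     * Standalone sketch of the order-selection logic under discussion:
     * pick the largest mapping order for a GFN range, but only permit a
     * non-zero order when the (hypothetical) IOMMU capability mask says
     * that page size is supported.
     */
    #include <stdio.h>

    #define PAGE_ORDER_4K 0
    #define PAGE_ORDER_2M 9
    #define PAGE_ORDER_1G 18

    /* Hypothetical capability mask: bit n set => order n supported. */
    static unsigned long iommu_order_mask =
        (1UL << PAGE_ORDER_4K) | (1UL << PAGE_ORDER_2M);

    static unsigned int mmio_order(unsigned long start_gfn, unsigned long nr,
                                   int need_iommu_sync)
    {
        /*
         * 1G: start must be 1G-aligned, the range must cover at least 1G,
         * the IOMMU must support the size, and (per the v2 patch) the
         * IOMMU must not need to be kept in sync by a smaller-chunk loop.
         */
        if ( !(start_gfn & ((1UL << PAGE_ORDER_1G) - 1)) &&
             (nr >> PAGE_ORDER_1G) &&
             (iommu_order_mask & (1UL << PAGE_ORDER_1G)) &&
             !need_iommu_sync )
            return PAGE_ORDER_1G;

        /* 2M: same alignment/size checks, no sync restriction. */
        if ( !(start_gfn & ((1UL << PAGE_ORDER_2M) - 1)) &&
             (nr >> PAGE_ORDER_2M) &&
             (iommu_order_mask & (1UL << PAGE_ORDER_2M)) )
            return PAGE_ORDER_2M;

        return PAGE_ORDER_4K;
    }

    int main(void)
    {
        /* 2M-aligned start, 2M worth of pages, sync needed: expect 9. */
        printf("order = %u\n", mmio_order(0x200, 1UL << PAGE_ORDER_2M, 1));
        return 0;
    }

For scale: declining the 1G order and falling back to 4k chunks means
1UL << 18 = 262,144 individual map operations for the range, which is why
Jan's question about per-operation flush cost (including ATS) matters.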