Re: [Xen-devel] [PATCH] iommu/quirk: disable shared EPT for Sandybridge and earlier processors.



> From: Tian, Kevin
> Sent: Thursday, December 03, 2015 9:20 AM
> 
> > From: Andrew Cooper [mailto:andrew.cooper3@xxxxxxxxxx]
> > Sent: Thursday, November 26, 2015 9:56 PM
> >
> > On 26/11/15 13:48, Malcolm Crossley wrote:
> > > On 26/11/15 13:46, Jan Beulich wrote:
> > >>>>> On 25.11.15 at 11:28, <andrew.cooper3@xxxxxxxxxx> wrote:
> > >>> The problem is that SandyBridge IOMMUs advertise 2M support and do
> > >>> function with it, but cannot cache 2MB translations in the IOTLBs.
> > >>>
> > >>> As a result, attempting to use 2M translations causes substantially
> > >>> worse performance than 4K translations.
> > >> Btw - how does this get explained? At a first glance, even if 2Mb
> > >> translations don't get entered into the TLB, it should still be one
> > >> less page table level to walk for the IOMMU, and should hence
> > >> nevertheless be a benefit. Yet you even say _substantially_
> > >> worse performance results.
> > > There is an IOTLB for the 4K translation so if you only use 4K
> > > translations then you get to take advantage of the IOTLB.
> > >
> > > If you use the 2Mb translation then a page table walk has to be
> > > performed every time there's a DMA access to that region of the BFN
> > > address space.
> >
> > Also remember that a high level dma access (from the point of view of a
> > driver) will be fragmented at the PCIe max packet size, which is
> > typically 256 bytes.
> >
> > So by not caching the 2Mb translation, a dma access of 4k may undergo 16
> > pagetable walks, one for each PCIe packet.
> >
> > We observed that using 2Mb mappings results in a 40% overhead, compared
> > to using 4k mappings, from the point of view of a sample network workload.
> >
> > ~Andrew
> 
> One point of confusion here. The original patch just disables shared_ept,
> without changing the IOMMU to stop using 2MB mappings. Is something
> missing, or is there some other trick behind this?
> 
> When you say that using 4k mappings recovers the 40% overhead, is that
> with EPT shared or not?
> 

I just confirmed this internally with the HW team. On SNB the 4KB cache is
always used, regardless of whether the mapping is 4KB, 2MB or 1GB. So there
must be another reason for the observed 40% drop...
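
For reference, the fragmentation arithmetic quoted above (a 4k DMA access split at a 256-byte PCIe max packet size, giving 16 packets) can be sketched as below. This is only a minimal illustration: the function name `pcie_packets` is hypothetical, the 256-byte default is the "typical" value from the quoted mail, and the transfer is assumed to start on a packet boundary. It counts packets only; whether each packet really costs a page table walk is exactly the open question here.

```python
def pcie_packets(dma_bytes: int, max_payload_bytes: int = 256) -> int:
    """Return how many PCIe packets a DMA of dma_bytes is split into.

    Assumes the transfer starts on a packet boundary (an assumption made
    for this sketch, not something stated in the thread).
    """
    if dma_bytes < 0 or max_payload_bytes <= 0:
        raise ValueError("sizes must be non-negative / positive")
    # Ceiling division: a partial trailing packet still counts as one packet.
    return -(-dma_bytes // max_payload_bytes)


# A 4k DMA access at a 256-byte max packet size.
print(pcie_packets(4096))  # prints 16
```

If the translation for each packet missed the IOTLB, that would mean up to 16 walks per 4k access; if the 4KB cache is used as described above, only the first packet in a page should need a walk.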

Thanks
Kevin
