Re: [Xen-devel] [RFC PATCH] page_alloc: use first half of higher order chunks when halving

On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote:
> On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote:
> >On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote:
> >>On 03/26/14 08:15, Matt Wilson wrote:
> >>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote:
> >>>>Could you elaborate a bit more on the use-case please?
> >>>>My understanding is that most drivers use a scatter gather list - in which
> >>>>case it does not matter if the underlying MFNs in the PFN space are
> >>>>not contiguous.
> >>>>
> >>>>But I presume the issue you are hitting is with drivers doing dma_map_page
> >>>>and the page is not 4KB but rather large (compound page). Is that the
> >>>>problem you have observed?
> >>>Drivers are using very large size arguments to dma_alloc_coherent()
> >>>for things like RX and TX descriptor rings.
> >Large size like larger than 512kB? That would also cause problems
> >on baremetal then when swiotlb is activated I believe.
> I was looking at network IO performance so the buffers would not
> have been that large. I think large in this context is relative to
> the 4k page size and the odds of the buffer spanning a page
> boundary. For context I saw a ~5-10% increase in guest network
> throughput by avoiding bounce buffers and also saw dom0 TCP
> streaming performance go from ~6Gb/s to over 9Gb/s on my test setup
> with a 10Gb NIC.

OK, but that would not be the dma_alloc_coherent ones then? That sounds
more like the generic TCP mechanism allocating 64KB pages instead of 4KB
ones and using those.

Did you try looking at this hack that Ian proposed a long time ago
to verify that this is the problem?


> >
> >>>--msw
> >>It's the DMA streaming API I've noticed the problem with, so
> >>dma_map_single(). The applicable swiotlb code would be
> >>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes,
> >>for larger buffers it can cause bouncing.
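
For reference, the check being discussed can be sketched in userspace C
roughly as follows. This is not the kernel's code: the toy_* names, the
fixed 4KB page size and the two PFN->MFN mappings are made up for
illustration. The idea is that a buffer only needs bouncing when it
crosses a page boundary and the machine frames behind it are not
consecutive.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: 4KB pages, as in the discussion above. */
#define TOY_PAGE_SHIFT 12
#define TOY_PAGE_SIZE  ((uint64_t)1 << TOY_PAGE_SHIFT)

/* Hypothetical stand-in for the guest's PFN -> MFN lookup. */
typedef uint64_t (*toy_pfn_to_mfn_fn)(uint64_t pfn);

/*
 * Returns true when [paddr, paddr + size) crosses a page boundary AND
 * the machine frames backing it are not consecutive, i.e. the case in
 * which a streaming DMA mapping would have to use a bounce buffer.
 */
static bool toy_needs_bounce(uint64_t paddr, size_t size,
                             toy_pfn_to_mfn_fn pfn_to_mfn)
{
    uint64_t offset = paddr & (TOY_PAGE_SIZE - 1);

    /* Buffer fits inside a single page: machine layout is irrelevant. */
    if (size == 0 || offset + size <= TOY_PAGE_SIZE)
        return false;

    uint64_t first_pfn = paddr >> TOY_PAGE_SHIFT;
    uint64_t last_pfn  = (paddr + size - 1) >> TOY_PAGE_SHIFT;
    uint64_t expected  = pfn_to_mfn(first_pfn) + 1;

    /* Every following PFN must map to the next MFN in sequence. */
    for (uint64_t pfn = first_pfn + 1; pfn <= last_pfn; pfn++, expected++) {
        if (pfn_to_mfn(pfn) != expected)
            return true;
    }
    return false;
}

/* Toy mapping: consecutive PFNs get ascending, consecutive MFNs. */
static uint64_t toy_ascending(uint64_t pfn)  { return 0x1000 + pfn; }

/* Toy mapping: consecutive PFNs get descending MFNs. */
static uint64_t toy_descending(uint64_t pfn) { return 0x2000 - pfn; }
```

With the descending mapping, a 4KB buffer starting at 0x1800 spans two
pages whose MFNs are not consecutive, so it would bounce; under the
ascending mapping the same buffer would not.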

