
Re: [Xen-devel] [RFC PATCH] page_alloc: use first half of higher order chunks when halving

To: Matthew Rushton <mvrushton@xxxxxxxxx>
From: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Date: Fri, 28 Mar 2014 13:02:01 -0400
Cc: Keir Fraser <keir@xxxxxxx>, Matt Wilson <msw@xxxxxxxxxx>, Matt Wilson <msw@xxxxxxxxx>, Tim Deegan <tim@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
Delivery-date: Fri, 28 Mar 2014 17:02:28 +0000
List-id: Xen developer discussion <xen-devel.lists.xen.org>

On Wed, Mar 26, 2014 at 03:15:42PM -0700, Matthew Rushton wrote:
> On 03/26/14 10:56, Konrad Rzeszutek Wilk wrote:
> >On Wed, Mar 26, 2014 at 10:47:44AM -0700, Matthew Rushton wrote:
> >>On 03/26/14 09:36, Konrad Rzeszutek Wilk wrote:
> >>>On Wed, Mar 26, 2014 at 08:59:04AM -0700, Matthew Rushton wrote:
> >>>>On 03/26/14 08:15, Matt Wilson wrote:
> >>>>>On Wed, Mar 26, 2014 at 11:08:01AM -0400, Konrad Rzeszutek Wilk wrote:
> >>>>>>Could you elaborate a bit more on the use-case please?
> >>>>>>My understanding is that most drivers use a scatter gather list - in which
> >>>>>>case it does not matter if the underlying MFNs in the PFN space are
> >>>>>>not contiguous.
> >>>>>>
> >>>>>>But I presume the issue you are hitting is with drivers doing dma_map_page
> >>>>>>and the page is not 4KB but rather large (compound page). Is that the
> >>>>>>problem you have observed?
> >>>>>Drivers are using very large size arguments to dma_alloc_coherent()
> >>>>>for things like RX and TX descriptor rings.
> >>>Large size like larger than 512kB? That would also cause problems
> >>>on baremetal then when swiotlb is activated I believe.
> >>I was looking at network IO performance so the buffers would not
> >>have been that large. I think large in this context is relative to
> >>the 4k page size and the odds of the buffer spanning a page
> >>boundary. For context I saw ~5-10% performance increase with guest
> >>network throughput by avoiding bounce buffers and also saw dom0 tcp
> >>streaming performance go from ~6Gb/s to over 9Gb/s on my test setup
> >>with a 10Gb NIC.
> >OK, but that would not be the dma_alloc_coherent ones then? That sounds
> >more like the generic TCP mechanism allocated 64KB pages instead of 4KB
> >and used those.
> >
> >Did you try looking at this hack that Ian proposed a long time ago
> >to verify that it is said problem?
> >
> >https://lkml.org/lkml/2013/9/4/540
> >
> 
> Yes I had seen that and initially had the same reaction but the
> change was relatively recent and not relevant. I *think* all the
> coherent allocations are ok since the swiotlb makes them contiguous.
> The problem comes with the use of the streaming api. As one example
> with jumbo frames enabled a driver might use larger rx buffers which
> triggers the problem.
> 
> I think the right thing to do is to make the dma streaming api work
> better with larger buffers on dom0. That way it works across all
> drivers and device types regardless of how they were designed.

OK.

Can you point me to an example of the DMA streaming API?

I am not sure whether by 'streaming API' you mean scatter-gather
operations using the DMA API.

Is there a particularly easy way for me to reproduce this? I have
to say I haven't enabled jumbo frames on my box, since I am not even
sure the switch I have can do it. Is there an idiot's punch list
of how to reproduce this?

Thanks!
> 
> >>>>>--msw
> >>>>It's the dma streaming api I've noticed the problem with, so
> >>>>dma_map_single(). Applicable swiotlb code would be
> >>>>xen_swiotlb_map_page() and range_straddles_page_boundary(). So yes
> >>>>for larger buffers it can cause bouncing.
> 
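
To make the quoted point about dma_map_single() concrete, here is a
minimal user-space sketch of the kind of check that decides whether a
streaming mapping has to bounce. It is a simplified model, not the
kernel code: sketch_needs_bounce(), mfns_contiguous(), the toy p2m
array and the SKETCH_* constants are all hypothetical names made up
for illustration. The assumption is that a buffer contained in one
4KB page never bounces, while a buffer that crosses a page boundary
bounces unless the machine frames behind it are consecutive.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed 4KB pages, as in the thread. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  ((size_t)1 << SKETCH_PAGE_SHIFT)

/*
 * Toy p2m table (hypothetical): index is the guest PFN, value is the
 * machine MFN backing it. Returns 1 if every PFN in
 * [first_pfn, last_pfn] maps to consecutive MFNs.
 */
static int mfns_contiguous(const uint64_t *p2m, size_t first_pfn,
                           size_t last_pfn)
{
    for (size_t pfn = first_pfn; pfn < last_pfn; pfn++) {
        if (p2m[pfn + 1] != p2m[pfn] + 1)
            return 0;
    }
    return 1;
}

/*
 * Returns 1 if a buffer at pseudo-physical address paddr of the given
 * size would need a bounce buffer in this model, 0 otherwise.
 */
static int sketch_needs_bounce(const uint64_t *p2m, size_t p2m_len,
                               uint64_t paddr, size_t size)
{
    assert(size > 0);

    size_t first_pfn = (size_t)(paddr >> SKETCH_PAGE_SHIFT);
    size_t last_pfn = (size_t)((paddr + size - 1) >> SKETCH_PAGE_SHIFT);
    size_t offset = (size_t)(paddr & (SKETCH_PAGE_SIZE - 1));

    assert(last_pfn < p2m_len);

    /* Buffer fits inside a single page: nothing to straddle. */
    if (offset + size <= SKETCH_PAGE_SIZE)
        return 0;

    /* Crosses a page boundary: fine only if the MFNs are consecutive. */
    return !mfns_contiguous(p2m, first_pfn, last_pfn);
}
```

In this model a 9000-byte buffer (roughly a jumbo frame) starting
mid-page touches three pages, so it bounces whenever any two
neighbouring PFNs have non-adjacent MFNs. That is why the contiguity
of dom0's machine memory matters for larger streaming buffers.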

