Xen project Mailing List

Re: [Xen-devel] [RFC 21/23] net/xen-netback: Make it running on 64KB page granularity

Hi Wei, Thanks you for the review. On 15/05/15 03:35, Wei Liu wrote: > On Thu, May 14, 2015 at 06:01:01PM +0100, Julien Grall wrote: >> The PV network protocol is using 4KB page granularity. The goal of this >> patch is to allow a Linux using 64KB page granularity working as a >> network backend on a non-modified Xen. >> >> It's only necessary to adapt the ring size and break skb data in small >> chunk of 4KB. The rest of the code is relying on the grant table code. >> >> Although only simple workload is working (dhcp request, ping). If I try >> to use wget in the guest, it will stall until a tcpdump is started on >> the vif interface in DOM0. I wasn't able to find why. >> > > I think in wget workload you're more likely to break down 64K pages to > 4K pages. Some of your calculation of mfn, offset might be wrong. If so, why tcpdump on the vif interface would make wget suddenly working? Does it make netback use a different path? >> I have not modified XEN_NETBK_RX_SLOTS_MAX because I wasn't sure what >> it's used for (I have limited knowledge on the network driver). >> > > This is the maximum slots a guest packet can use. AIUI the protocol > still works on 4K granularity (you break 64K page to a bunch of 4K > pages), you don't need to change this. 1 slot = 1 grant right? If so, XEN_NETBK_RX_SLOTS_MAX is based on the number of Linux page. So we would have to get the number for Xen page. Although, I gave a try to multiple by XEN_PFN_PER_PAGE (4KB/64KB = 16) but it get stuck in the loop. >> --- >> drivers/net/xen-netback/common.h | 7 ++++--- >> drivers/net/xen-netback/netback.c | 27 ++++++++++++++------------- >> 2 files changed, 18 insertions(+), 16 deletions(-) >> >> diff --git a/drivers/net/xen-netback/common.h >> b/drivers/net/xen-netback/common.h >> index 8a495b3..0eda6e9 100644 >> --- a/drivers/net/xen-netback/common.h >> +++ b/drivers/net/xen-netback/common.h >> @@ -44,6 +44,7 @@ >> #include <xen/interface/grant_table.h> >> #include <xen/grant_table.h> >> #include <xen/xenbus.h> >> +#include <xen/page.h> >> #include <linux/debugfs.h> >> >> typedef unsigned int pending_ring_idx_t; >> @@ -64,8 +65,8 @@ struct pending_tx_info { >> struct ubuf_info callback_struct; >> }; >> >> -#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, PAGE_SIZE) >> -#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, PAGE_SIZE) >> +#define XEN_NETIF_TX_RING_SIZE __CONST_RING_SIZE(xen_netif_tx, >> XEN_PAGE_SIZE) >> +#define XEN_NETIF_RX_RING_SIZE __CONST_RING_SIZE(xen_netif_rx, >> XEN_PAGE_SIZE) >> >> struct xenvif_rx_meta { >> int id; >> @@ -80,7 +81,7 @@ struct xenvif_rx_meta { >> /* Discriminate from any valid pending_idx value. */ >> #define INVALID_PENDING_IDX 0xFFFF >> >> -#define MAX_BUFFER_OFFSET PAGE_SIZE >> +#define MAX_BUFFER_OFFSET XEN_PAGE_SIZE >> >> #define MAX_PENDING_REQS XEN_NETIF_TX_RING_SIZE >> >> diff --git a/drivers/net/xen-netback/netback.c >> b/drivers/net/xen-netback/netback.c >> index 9ae1d43..ea5ce84 100644 >> --- a/drivers/net/xen-netback/netback.c >> +++ b/drivers/net/xen-netback/netback.c >> @@ -274,7 +274,7 @@ static void xenvif_gop_frag_copy(struct xenvif_queue >> *queue, struct sk_buff *skb >> { >> struct gnttab_copy *copy_gop; >> struct xenvif_rx_meta *meta; >> - unsigned long bytes; >> + unsigned long bytes, off_grant; >> int gso_type = XEN_NETIF_GSO_TYPE_NONE; >> >> /* Data must not cross a page boundary. */ >> @@ -295,7 +295,8 @@ static void xenvif_gop_frag_copy(struct xenvif_queue >> *queue, struct sk_buff *skb >> if (npo->copy_off == MAX_BUFFER_OFFSET) >> meta = get_next_rx_buffer(queue, npo); >> >> - bytes = PAGE_SIZE - offset; >> + off_grant = offset & ~XEN_PAGE_MASK; >> + bytes = XEN_PAGE_SIZE - off_grant; >> if (bytes > size) >> bytes = size; >> >> @@ -314,9 +315,9 @@ static void xenvif_gop_frag_copy(struct xenvif_queue >> *queue, struct sk_buff *skb >> } else { >> copy_gop->source.domid = DOMID_SELF; >> copy_gop->source.u.gmfn = >> - virt_to_mfn(page_address(page)); >> + virt_to_mfn(page_address(page) + offset); >> } >> - copy_gop->source.offset = offset; >> + copy_gop->source.offset = off_grant; >> >> copy_gop->dest.domid = queue->vif->domid; >> copy_gop->dest.offset = npo->copy_off; >> @@ -747,7 +748,7 @@ static int xenvif_count_requests(struct xenvif_queue >> *queue, >> first->size -= txp->size; >> slots++; >> >> - if (unlikely((txp->offset + txp->size) > PAGE_SIZE)) { >> + if (unlikely((txp->offset + txp->size) > XEN_PAGE_SIZE)) { >> netdev_err(queue->vif->dev, "Cross page boundary, >> txp->offset: %x, size: %u\n", >> txp->offset, txp->size); >> xenvif_fatal_tx_err(queue->vif); >> @@ -1241,11 +1242,11 @@ static void xenvif_tx_build_gops(struct xenvif_queue >> *queue, >> } >> >> /* No crossing a page as the payload mustn't fragment. */ >> - if (unlikely((txreq.offset + txreq.size) > PAGE_SIZE)) { >> + if (unlikely((txreq.offset + txreq.size) > XEN_PAGE_SIZE)) { >> netdev_err(queue->vif->dev, >> "txreq.offset: %x, size: %u, end: %lu\n", >> txreq.offset, txreq.size, >> - (txreq.offset&~PAGE_MASK) + txreq.size); >> + (txreq.offset&~XEN_PAGE_MASK) + txreq.size); >> xenvif_fatal_tx_err(queue->vif); >> break; >> } >> @@ -1287,7 +1288,7 @@ static void xenvif_tx_build_gops(struct xenvif_queue >> *queue, >> virt_to_mfn(skb->data); > > You didn't change the calculation of MFN here. I think it returns the > MFN of the first 4K sub-page of that 64K page. Do I miss anything? There is no change required. On ARM virt_to_mfn is implemented with: pfn_to_mfn(virt_to_phys(v) >> XEN_PAGE_SHIFT) which will return a 4KB PFN (see patch #23). > >> queue->tx_copy_ops[*copy_ops].dest.domid = DOMID_SELF; >> queue->tx_copy_ops[*copy_ops].dest.offset = >> - offset_in_page(skb->data); >> + offset_in_page(skb->data) & ~XEN_PAGE_MASK; >> >> queue->tx_copy_ops[*copy_ops].len = data_len; >> queue->tx_copy_ops[*copy_ops].flags = GNTCOPY_source_gref; >> @@ -1366,8 +1367,8 @@ static int xenvif_handle_frag_list(struct xenvif_queue >> *queue, struct sk_buff *s > > This function is to coalesce frag_list to a new SKB. It's completely > fine to use the natural granularity of backend domain. The way you > modified it can lead to waste of memory, i.e. you only use first 4K of a > 64K page. Thanks for explaining. I wasn't sure how the function works so I change it for safety. I will redo the change. FWIW, I'm sure there is other place in netback where we waste memory with 64KB page granularity (such as grant table). I need to track them. Let me know if you have some place in mind where the memory usage can be improved. >> return -ENOMEM; >> } >> >> - if (offset + PAGE_SIZE < skb->len) >> - len = PAGE_SIZE; >> + if (offset + XEN_PAGE_SIZE < skb->len) >> + len = XEN_PAGE_SIZE; >> else >> len = skb->len - offset; >> if (skb_copy_bits(skb, offset, page_address(page), len)) >> @@ -1396,7 +1397,7 @@ static int xenvif_handle_frag_list(struct xenvif_queue >> *queue, struct sk_buff *s >> /* Fill the skb with the new (local) frags. */ >> memcpy(skb_shinfo(skb)->frags, frags, i * sizeof(skb_frag_t)); >> skb_shinfo(skb)->nr_frags = i; >> - skb->truesize += i * PAGE_SIZE; >> + skb->truesize += i * XEN_PAGE_SIZE; > > The true size accounts for the actual memory occupied by this SKB. Since > the page is allocated with alloc_page, the granularity should be > PAGE_SIZE not XEN_PAGE_SIZE. Ok. I will replace with PAGE_SIZE. Regards, -- Julien Grall _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.