
Re: [win-pv-devel] poor winpv performance with iommu=on + disk driver domain



> -----Original Message-----
> From: win-pv-devel [mailto:win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx] On
> Behalf Of James Dingwall
> Sent: 22 May 2018 12:13
> To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> Subject: [win-pv-devel] poor winpv performance with iommu=on + disk
> driver domain
> 
> Hi,
> 
> I raised this issue with some non-specific details before on xen-devel:
> https://lists.xenproject.org/archives/html/xen-devel/2017-12/msg00262.html
> 
> In summary, when enabling the iommu we note a significant drop in the
> disk performance of Windows guests running with WinPV drivers.
> 
> Since the original report we have done some more investigation and now
> seem to have isolated the performance issue noted in our HVM guests as
> an adverse interaction between the WinPV drivers and the blkback driver
> in the disk driver domain.  This is observed as slow disk speed in the
> guest and high cpu usage of the corresponding blkback.<domid>.<disk>
> process in the backend.  If the HVM is configured to network boot a PXE
> Linux image there is no disk performance issue.
> 
> We have updated our stack to be based on Xen 4.8.3, a custom compile of
> the Ubuntu kernel tag Ubuntu-lts-4.4.0-127.153_14.04.1 and ZFS 0.7.9.
> We have our own build of the WinPV drivers too but we see the same when
> using the 8.2.1 release.
> 
> I did some tracing in the disk driver domain for the block back driver:
> 
> cd /sys/kernel/debug/tracing
> echo function_graph > current_tracer
> echo ':mod:xen_blkback' > set_ftrace_filter
> echo pid_of_blkback_worker > set_ftrace_pid
> 
> For WinPV there are some very long periods at the end of some
> operations which are not seen when tracing with a Linux guest, some
> parts of the captured traces below.  I assume that something extra is
> happening with WinPV due to the extra {} set after xen_blkbk_map
> [xen_blkback]().
> 
> If there are any suggestions on how to further debug this, options to
> try or if more information about the environment is required I'd be
> happy to give anything a go.
>

Hi James,

  I have a hunch that you may be suffering from a Xen 'feature'...

  When there is no hardware passed through to an HVM domain, Xen overrides the 
memory cacheability of MMIO ranges... because it 'knows' that all h/w is 
emulated and therefore anything mapped in the MMIO ranges is only ever going to 
be RAM, so accesses to it can be cached.
  However, as soon as you pass through h/w to a domain then Xen can no longer 
use this 'optimization'. Unfortunately this means that accesses to the Xen 
platform device's MMIO BAR are no longer cached and it just happens that PV 
drivers (both Linux and Windows) use this BAR to map the guest grant tables. 
Every access the guest PV drivers make to the grant table is therefore 
uncached, and you will suffer a substantial slow-down in the domain.

  Your results with persistent grants are consistent with my hunch because, of 
course, using persistent grants means substantially fewer accesses to grant 
tables in both the frontend and the backend.

  I think that there have been recent patches to the Linux PV code to use 
ballooned memory to host the grant tables instead (but I could be wrong) and I 
have done something similar in the latest Windows xenbus driver. You may also 
be able to work around the problem by adding this little hack to your Xen:

diff --git a/xen/arch/x86/hvm/mtrr.c b/xen/arch/x86/hvm/mtrr.c
index 3c51244..d0126b4 100644
--- a/xen/arch/x86/hvm/mtrr.c
+++ b/xen/arch/x86/hvm/mtrr.c
@@ -811,7 +811,8 @@ int epte_get_entry_emt(struct domain *d, unsigned long gfn, mfn_t mfn,
         return MTRR_TYPE_UNCACHABLE;
     }
 
-    if ( !need_iommu(d) && !cache_flush_permitted(d) )
+    if ( (!need_iommu(d) && !cache_flush_permitted(d)) ||
+         is_xen_heap_mfn(mfn_x(mfn)) )
     {
         *ipat = 1;
         return MTRR_TYPE_WRBACK; 
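
  To see what the hack changes without reading mtrr.c, here is a small 
standalone C model of the patched condition. It is not Xen code: struct 
model_domain and force_wrback() are made-up names for illustration, the two 
flags stand in for need_iommu(d) and cache_flush_permitted(d), and 
is_xen_heap_page stands in for is_xen_heap_mfn(). It assumes that the grant 
table frames are Xen heap pages, which is what the hack relies on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the per-domain state the real check consults. */
struct model_domain {
    bool need_iommu;            /* models need_iommu(d)            */
    bool cache_flush_permitted; /* models cache_flush_permitted(d) */
};

/*
 * Model of the patched condition. Returns true and sets *ipat to 1 when the
 * page is forced to write-back regardless of guest settings. Returns false
 * (with *ipat set to 0) when the decision falls through to the later checks.
 */
bool force_wrback(const struct model_domain *d, bool is_xen_heap_page,
                  int *ipat)
{
    assert(ipat != NULL);
    *ipat = 0;

    if ((!d->need_iommu && !d->cache_flush_permitted) || is_xen_heap_page) {
        *ipat = 1;
        return true;
    }
    return false;
}
```

  With pass-through enabled, force_wrback() only returns true when 
is_xen_heap_page is set, which is the case the hack adds. For every other page 
the decision still falls through as before.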
 
  Cheers,

    Paul

> Thanks,
> James
> 
> 
> WinPV trace sample (iommu=on):
> 
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)               |      xen_blkbk_map [xen_blkback]() {
>  2) ! 208.229 us  |      }
>  2)               |      xen_blkbk_unmap [xen_blkback]() {
>  2)   0.208 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  2) ! 168.404 us  |      }
>  2)   0.478 us    |      xen_vbd_translate [xen_blkback]();
>  2)               |      xen_blkbk_map [xen_blkback]() {
>  2) * 26306.86 us |      }
>  2) * 26773.76 us |    }
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2) + 90.297 us   |      xen_blkbk_map [xen_blkback]();
>  2)               |      xen_blkbk_unmap [xen_blkback]() {
>  2)   0.196 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  2) + 90.160 us   |      }
>  2)   0.286 us    |      xen_vbd_translate [xen_blkback]();
>  2)               |      xen_blkbk_map [xen_blkback]() {
>  2) * 21062.59 us |      }
>  2) * 21314.26 us |    }
>  3) @ 203846.2 us |  } /* __do_block_io_op [xen_blkback] */
>  3)               |  __do_block_io_op [xen_blkback]() {
>  3)               |    dispatch_rw_block_io [xen_blkback]() {
>  3) ! 169.541 us  |      xen_blkbk_map [xen_blkback]();
>  3)               |      xen_blkbk_unmap [xen_blkback]() {
>  3)   0.173 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  3) ! 221.701 us  |      }
>  3)   0.471 us    |      xen_vbd_translate [xen_blkback]();
>  3)               |      xen_blkbk_map [xen_blkback]() {
>  3) + 15.050 us   |        xen_blkif_be_int [xen_blkback]();
>  3) * 25467.70 us |      }
>  3) * 25922.68 us |    }
>  3)               |    dispatch_rw_block_io [xen_blkback]() {
>  3)   0.252 us    |      xen_vbd_translate [xen_blkback]();
>  3)               |      xen_blkbk_map [xen_blkback]() {
>  3) ! 533.866 us  |      }
>  3) ! 556.344 us  |    }
>  3)               |    dispatch_rw_block_io [xen_blkback]() {
>  3) + 84.407 us   |      xen_blkbk_map [xen_blkback]();
>  3)               |      xen_blkbk_unmap [xen_blkback]() {
>  3)   0.239 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  3) ! 135.869 us  |      }
>  3)   0.310 us    |      xen_vbd_translate [xen_blkback]();
>  3)               |      xen_blkbk_map [xen_blkback]() {
>  3) # 2886.270 us |      }
>  3) # 3146.883 us |    }
> 
> 
> 
> Linux trace sample (iommu=on):
> 
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   1.288 us    |      xen_blkbk_map [xen_blkback]();
>  2)               |      xen_blkbk_unmap [xen_blkback]() {
>  2)   0.108 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  2)   0.770 us    |      }
>  2)   0.117 us    |      xen_vbd_translate [xen_blkback]();
>  2) + 13.500 us   |      xen_blkbk_map [xen_blkback]();
>  2) + 23.938 us   |    }
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   1.362 us    |      xen_blkbk_map [xen_blkback]();
>  2)               |      xen_blkbk_unmap [xen_blkback]() {
>  2)   0.131 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  2)   0.888 us    |      }
>  2)   0.120 us    |      xen_vbd_translate [xen_blkback]();
>  2) + 11.492 us   |      xen_blkbk_map [xen_blkback]();
>  2) + 24.236 us   |    }
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   0.439 us    |      xen_blkbk_map [xen_blkback]();
>  2)               |      xen_blkbk_unmap [xen_blkback]() {
>  2)   0.117 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  2)   1.273 us    |      }
>  2)   0.226 us    |      xen_vbd_translate [xen_blkback]();
>  2)   7.867 us    |      xen_blkbk_map [xen_blkback]();
>  2) + 17.328 us   |    }
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   0.521 us    |      xen_blkbk_map [xen_blkback]();
>  2)               |      xen_blkbk_unmap [xen_blkback]() {
>  2)   0.111 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  2)   0.620 us    |      }
>  2)   0.093 us    |      xen_vbd_translate [xen_blkback]();
>  2)   7.689 us    |      xen_blkbk_map [xen_blkback]();
>  2) + 14.368 us   |    }
> 
> 
> 
> WinPV trace sample (iommu=off):
> 
>  1)               |    dispatch_rw_block_io [xen_blkback]() {
>  1)   0.308 us    |      xen_vbd_translate [xen_blkback]();
>  1)   4.855 us    |      xen_blkbk_map [xen_blkback]();
>  1) + 15.915 us   |    }
>  1)               |    dispatch_rw_block_io [xen_blkback]() {
>  1)   0.113 us    |      xen_vbd_translate [xen_blkback]();
>  1)   1.902 us    |      xen_blkbk_map [xen_blkback]();
>  1)   7.278 us    |    }
>  1) ! 283.801 us  |  }
>  1)               |  __do_block_io_op [xen_blkback]() {
>  1)               |    dispatch_rw_block_io [xen_blkback]() {
>  1)   0.167 us    |      xen_vbd_translate [xen_blkback]();
>  1)   2.786 us    |      xen_blkbk_map [xen_blkback]();
>  1)   9.637 us    |    }
>  1) + 41.691 us   |  }
>  1)               |  __do_block_io_op [xen_blkback]() {
>  1)               |    dispatch_rw_block_io [xen_blkback]() {
>  1) + 10.566 us   |      xen_blkbk_map [xen_blkback]();
>  1)               |      xen_blkbk_unmap [xen_blkback]() {
>  1)   0.195 us    |        xen_blkbk_unmap_prepare [xen_blkback]();
>  1)   4.691 us    |      }
>  1)   0.736 us    |      xen_vbd_translate [xen_blkback]();
>  1) + 57.897 us   |      xen_blkbk_map [xen_blkback]();
>  1) + 94.186 us   |    }
>  1) + 96.658 us   |  }
>  2)               |  __do_block_io_op [xen_blkback]() {
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   0.441 us    |      xen_vbd_translate [xen_blkback]();
>  2)   8.656 us    |      xen_blkbk_map [xen_blkback]();
>  2) + 18.881 us   |    }
>  2) + 46.961 us   |  }
>  2)               |  __do_block_io_op [xen_blkback]() {
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   0.495 us    |      xen_vbd_translate [xen_blkback]();
>  2)   7.614 us    |      xen_blkbk_map [xen_blkback]();
>  2) + 23.518 us   |    }
>  2) + 25.980 us   |  }
>  2)               |  __do_block_io_op [xen_blkback]() {
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   0.479 us    |      xen_vbd_translate [xen_blkback]();
>  2)   5.872 us    |      xen_blkbk_map [xen_blkback]();
>  2) + 15.831 us   |    }
>  2) + 18.521 us   |  }
>  2)               |  __do_block_io_op [xen_blkback]() {
>  2)               |    dispatch_rw_block_io [xen_blkback]() {
>  2)   0.473 us    |      xen_vbd_translate [xen_blkback]();
>  2)   8.724 us    |      xen_blkbk_map [xen_blkback]();
>  2) + 23.275 us   |    }
>  2) + 29.333 us   |  }
> 
> 
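  When comparing traces like the ones above it can help to pull the durations 
out mechanically. Below is a rough sketch in C, assuming the function_graph 
line layout shown in these samples (CPU column, optional overhead marker, 
duration in us, then '|'). graph_line_duration() is a name invented here, not 
part of any existing tool.

```c
#include <stdio.h>
#include <string.h>

/*
 * Extract the duration in microseconds from one function_graph line.
 * Returns 1 and stores the value in *us if the line carries a duration,
 * 0 for entry lines such as "func() {" that have none.
 * A leading "> " quote prefix is tolerated because parsing starts at the
 * first ')' (the end of the CPU column).
 */
int graph_line_duration(const char *line, double *us)
{
    const char *bar = strchr(line, '|');
    const char *p = strchr(line, ')');

    if (bar == NULL || p == NULL || p > bar)
        return 0;
    p++;

    /* Skip padding and ftrace's overhead markers (+ ! # * @ $). */
    while (p < bar && (*p == ' ' || strchr("+!#*@$", *p) != NULL))
        p++;
    if (p >= bar)
        return 0;   /* nothing between the CPU column and '|' */

    return sscanf(p, "%lf", us) == 1;
}
```

  Feeding each captured line through this with fgets() and printing those 
above, say, 1000 us would pick out the long xen_blkbk_map closes in the WinPV 
iommu=on sample while printing nothing for the Linux sample.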
> _______________________________________________
> win-pv-devel mailing list
> win-pv-devel@xxxxxxxxxxxxxxxxxxxx
> https://lists.xenproject.org/mailman/listinfo/win-pv-devel