
[Xen-devel] [PATCH 0/4] mitigate the issue of an overly long per-pCPU blocking list



VT-d PI introduces a per-pCPU blocking list to track the blocked vCPUs
running on that pCPU. Theoretically, a single host can run up to 32K
domains with 128 vCPUs each. If all of those vCPUs were blocked on the
same pCPU, 4M vCPUs would sit in a single list, and traversing such a
list consumes too much time. We have discussed this issue in [1,2,3].

To mitigate this issue, we proposed the following two methods [3]:
1. Evenly distribute all the blocked vCPUs among all pCPUs.
2. Don't put blocked vCPUs that won't be woken by a wakeup
interrupt into the per-pCPU list.

Patch 1/4 adds a trace event for adding an entry to a PI blocking
list. With this patch, data can be collected to help validate the
following patches.
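
A minimal sketch of the kind of hook this adds (the event value and
the call site shown here are assumptions for illustration, not the
actual patch):

    /* xen/include/public/trace.h: new HVM trace event (value illustrative) */
    #define TRC_HVM_VT_D_PI_BLOCK   (TRC_HVM_HANDLER + 0x26)

    /* In the vCPU blocking path, xen/arch/x86/hvm/vmx/vmx.c */
    static void vmx_vcpu_block(struct vcpu *v)
    {
        /* ... existing blocking logic ... */

        /* Record that this vCPU is being added to a PI blocking list. */
        TRACE_1D(TRC_HVM_VT_D_PI_BLOCK, v->vcpu_id);
        list_add_tail(&v->arch.hvm_vmx.pi_blocking.list,
                      &per_cpu(vmx_pi_blocking, v->processor).list);
    }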

Patch 2/4 randomly distributes entries (vCPUs) among all online
pCPUs, which can theoretically decrease the maximum number of entries
in a list by a factor of N, where N is the number of pCPUs (see the
sketch below).
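
A sketch of the idea, assuming Xen's cpumask_cycle() and get_random()
helpers (the function name and its placement are illustrative):

    /*
     * Pick a random online pCPU to host a blocking-list entry,
     * instead of always using v->processor.
     */
    static unsigned int pi_pick_blocking_cpu(void)
    {
        /* Map a random index onto the online-CPU mask. */
        return cpumask_cycle(get_random() % nr_cpu_ids, &cpu_online_map);
    }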

Patch 3/4 adds a reference count to a vCPU's pi_desc. When the
pi_desc is recorded in an IRTE the refcount is incremented, and when
the pi_desc is cleared from an IRTE the refcount is decremented.
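
Conceptually (the field placement and name below are assumptions for
illustration):

    /* In struct arch_vmx_struct: number of IRTEs referencing pi_desc. */
    atomic_t pi_refcnt;

    /* When an IRTE is programmed to point at this vCPU's pi_desc: */
    atomic_inc(&v->arch.hvm_vmx.pi_refcnt);

    /* When that IRTE is cleared or re-pointed elsewhere: */
    atomic_dec(&v->arch.hvm_vmx.pi_refcnt);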

In Patch 4/4, a vCPU is added to a PI blocking list only if its
pi_desc is referenced by at least one IRTE (see the sketch below).
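
In sketch form, reusing the assumed pi_refcnt field from above:

    /* In the blocking path: skip the list when no IRTE can wake us. */
    if ( !atomic_read(&v->arch.hvm_vmx.pi_refcnt) )
        return; /* no wakeup interrupt can target this pi_desc */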

I tested this series in the following scenario:
* one guest with 128 vCPUs and a NIC assigned to it
* all 128 vCPUs pinned to one pCPU
* xentrace used to collect events for 5 minutes (see the example
  below)
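
A trace collection along these lines would do it (the event mask and
the -T duration flag here are assumptions; adjust to your xentrace
version):

    # capture HVM-class trace records for 300 seconds
    xentrace -D -e 0x0008f000 -T 300 /tmp/pi-trace.bin
    # render human-readable output with the formats file from patch 1/4
    xentrace_format formats < /tmp/pi-trace.bin > /tmp/pi-trace.txt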

I compared the maximum number of entries in one list and the number
of events (additions to a PI blocking list) with and without the
latter three patches. Here is the result:
-----------------------------------------------------
|      Items      | Maximum of #entry |    #event   |
-----------------------------------------------------
| W/ the patches  |         6         |    22740    |
-----------------------------------------------------
| W/O the patches |        128        |    46481    |
-----------------------------------------------------

This is a very simple test, but with this patch series the maximum
number of entries in a list decreases greatly and the number of
additions to PI blocking lists is roughly halved. From this aspect,
the patch series is effective.

[1] https://lists.gt.net/xen/devel/422661?search_string=VT-d%20posted-interrupt%20core%20logic%20handling;#422661
[2] https://lists.gt.net/xen/devel/422567?search_string=%20The%20length%20of%20the%20list%20depends;#422567
[3] https://lists.gt.net/xen/devel/472749?search_string=enable%20vt-d%20pi%20by%20default;#472749

Chao Gao (4):
  xentrace: add TRC_HVM_VT_D_PI_BLOCK
  VT-d PI: Randomly Distribute entries to all online pCPUs' pi blocking
    list
  VT-d PI: Add reference count to pi_desc
  VT-d PI: Don't add vCPU to PI blocking list for a case

 tools/xentrace/formats                 |  1 +
 xen/arch/x86/hvm/vmx/vmx.c             | 90 +++++++++++++++++++++++++++++-----
 xen/drivers/passthrough/io.c           |  2 +-
 xen/drivers/passthrough/vtd/intremap.c | 60 ++++++++++++++++++++++-
 xen/include/asm-x86/hvm/domain.h       |  6 +++
 xen/include/asm-x86/hvm/trace.h        |  1 +
 xen/include/asm-x86/hvm/vmx/vmcs.h     |  3 ++
 xen/include/asm-x86/iommu.h            |  2 +-
 xen/include/asm-x86/msi.h              |  2 +-
 xen/include/public/trace.h             |  1 +
 10 files changed, 151 insertions(+), 17 deletions(-)

-- 
1.8.3.1

