
Re: [Xen-devel] [PATCH for-4.12 v2 17/17] xen/arm: Track page accessed between batch of Set/Way operations

Hi Stefano,

On 07/12/2018 21:43, Stefano Stabellini wrote:
On Tue, 4 Dec 2018, Julien Grall wrote:
At the moment, the implementation of Set/Way operations will go through
all the entries of the guest P2M and flush them. However, this is very
expensive and may render a guest OS using them unusable.

For instance, 32-bit Linux will use Set/Way operations during secondary
CPU bring-up. As the implementation is really expensive, the CPU
bring-up may hit its timeout.

To limit the impact of Set/Way operations, we track which pages of the
guest have been accessed between batches of Set/Way operations. This is
done using bit[0] (aka the valid bit) of the P2M entry.

This patch introduces a new per-arch helper to perform actions just
before the guest is first unpaused. This will be used to invalidate the
P2M so accesses can be tracked from the start of the guest.

Signed-off-by: Julien Grall <julien.grall@xxxxxxx>


While we could spread d->creation_finished all over the code, a per-arch
helper to perform actions just before the guest is first unpaused can
bring a lot of benefit to both architectures. For instance, on Arm, the
flush of the instruction cache could be delayed until the domain first
runs. This would greatly improve the performance of creating guests.

I am still running benchmarks to determine whether a command line option
is worth it. I will provide numbers as soon as I have them.

Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>
Cc: Julien Grall <julien.grall@xxxxxxx>
Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
Cc: George Dunlap <George.Dunlap@xxxxxxxxxxxxx>
Cc: Ian Jackson <ian.jackson@xxxxxxxxxxxxx>
Cc: Jan Beulich <jbeulich@xxxxxxxx>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>
Cc: Tim Deegan <tim@xxxxxxx>
Cc: Wei Liu <wei.liu2@xxxxxxxxxx>
  xen/arch/arm/domain.c     | 14 ++++++++++++++
  xen/arch/arm/p2m.c        | 30 ++++++++++++++++++++++++++++--
  xen/arch/x86/domain.c     |  4 ++++
  xen/common/domain.c       |  5 ++++-
  xen/include/asm-arm/p2m.h |  2 ++
  xen/include/xen/domain.h  |  2 ++
  6 files changed, 54 insertions(+), 3 deletions(-)

diff --git a/xen/arch/arm/domain.c b/xen/arch/arm/domain.c
index 1d926dcb29..41f101746e 100644
--- a/xen/arch/arm/domain.c
+++ b/xen/arch/arm/domain.c
@@ -767,6 +767,20 @@ int arch_domain_soft_reset(struct domain *d)
      return -ENOSYS;
  }
 
+void arch_domain_creation_finished(struct domain *d)
+{
+    /*
+     * To avoid flushing the whole guest RAM on the first Set/Way, we
+     * invalidate the P2M to track what has been accessed.
+     *
+     * This is only done when the IOMMU is not used or the page-tables
+     * are not shared, because an entry with bit[0] (i.e. the valid bit)
+     * unset would result in an IOMMU fault that could not be fixed up.
+     */
+    if ( !iommu_use_hap_pt(d) )
+        p2m_invalidate_root(p2m_get_hostp2m(d));
+}
+
  static int is_guest_pv32_psr(uint32_t psr)
      switch (psr & PSR_MODE_MASK)
diff --git a/xen/arch/arm/p2m.c b/xen/arch/arm/p2m.c
index 8ee6ff7bd7..44ea3580cf 100644
--- a/xen/arch/arm/p2m.c
+++ b/xen/arch/arm/p2m.c
@@ -1079,6 +1079,22 @@ static void p2m_invalidate_table(struct p2m_domain *p2m, mfn_t mfn)
 
+/*
+ * Invalidate all entries in the root page-tables. This is
+ * useful to get a fault on the next access so that an action can be
+ * taken.
+ */
+void p2m_invalidate_root(struct p2m_domain *p2m)
+{
+    unsigned int i;
+
+    p2m_write_lock(p2m);
+
+    for ( i = 0; i < P2M_ROOT_LEVEL; i++ )
+        p2m_invalidate_table(p2m, page_to_mfn(p2m->root + i));
+
+    p2m_write_unlock(p2m);
+}
+
  /*
   * Resolve any translation fault due to change in the p2m. This
   * includes break-before-make and valid bit cleared.
   */
@@ -1587,15 +1603,18 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
          if ( gfn_eq(start, next_block_gfn) )
-            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, NULL);
+            bool valid;
+            mfn = p2m_get_entry(p2m, start, &t, NULL, &order, &valid);
              next_block_gfn = gfn_next_boundary(start, order);
               * The following regions can be skipped:
               *      - Hole
               *      - non-RAM
+             *      - block with valid bit (bit[0]) unset
-            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) )
+            if ( mfn_eq(mfn, INVALID_MFN) || !p2m_is_any_ram(t) || !valid )
                  start = next_block_gfn;
@@ -1629,6 +1648,7 @@ int p2m_cache_flush_range(struct domain *d, gfn_t *pstart, gfn_t end)
  void p2m_flush_vm(struct vcpu *v)
+    struct p2m_domain *p2m = p2m_get_hostp2m(v->domain);
      int rc;
      gfn_t start = _gfn(0);
@@ -1648,6 +1668,12 @@ void p2m_flush_vm(struct vcpu *v)
                  "P2M has not been correctly cleaned (rc = %d)\n",
+    /*
+     * Invalidate the p2m to track which pages were modified by the
+     * guest between calls of p2m_flush_vm().
+     */
+    p2m_invalidate_root(p2m);

Does this mean that we are invalidating the p2m once more than
necessary, when the caches are finally enabled in Linux? Could that be
avoided by passing an additional argument to p2m_flush_vm?

I don't think you can know when the guest has finally enabled the cache. A guest is free to disable the cache afterwards. This is actually what 32-bit Linux does: it decompresses itself with the cache enabled and then disables it afterwards.


Julien Grall

Xen-devel mailing list


