
Re: [XEN][PATCH] xen/x86: guest_access: optimize raw_x_guest() for PV and HVM combinations


  • To: Jason Andryuk <jason.andryuk@xxxxxxx>, Teddy Astie <teddy.astie@xxxxxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx, Jan Beulich <jbeulich@xxxxxxxx>
  • From: Grygorii Strashko <grygorii_strashko@xxxxxxxx>
  • Date: Fri, 7 Nov 2025 12:27:26 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Anthony PERARD <anthony.perard@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Delivery-date: Fri, 07 Nov 2025 10:27:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi Jason,

On 07.11.25 03:29, Jason Andryuk wrote:
On 2025-11-06 12:40, Grygorii Strashko wrote:


On 06.11.25 19:27, Teddy Astie wrote:
Le 06/11/2025 à 18:00, Jason Andryuk a écrit :
On 2025-11-06 11:33, Grygorii Strashko wrote:
Hi Teddy, Jan,

On 06.11.25 17:57, Teddy Astie wrote:
Le 31/10/2025 à 22:25, Grygorii Strashko a écrit :
Can try.

Yes, I was thinking something like Teddy suggested:

#define raw_copy_to_guest(dst, src, len)        \
          (is_hvm_vcpu(current) ? copy_to_user_hvm(dst, src, len) : \
           is_pv_vcpu(current) ? copy_to_guest_pv(dst, src, len) :  \
           fail_copy(dst, src, len))

But that made me think that is_{hvm,pv}_{vcpu,domain}() could be
optimized for when only one of HVM or PV is enabled.

Regards,
Jason

xen: Optimize is_hvm/pv_domain() for single domain type

is_hvm_domain() and is_pv_domain() hardcode the false conditions for
HVM=n and PV=n, but they still leave the XEN_DOMCTL_CDF_hvm flag
checking.  When only one of PV or HVM is set, the result can be hard
coded since the other is impossible.  Notably, this removes the
evaluate_nospec() lfences.

Signed-off-by: Jason Andryuk <jason.andryuk@xxxxxxx>
---
Untested.

HVM=y PV=n bloat-o-meter:

add/remove: 3/6 grow/shrink: 19/212 up/down: 3060/-60349 (-57289)

Full bloat-o-meter below.
---
   xen/include/xen/sched.h | 18 ++++++++++++++----
   1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/xen/include/xen/sched.h b/xen/include/xen/sched.h
index f680fb4fa1..12f10d9cc8 100644
--- a/xen/include/xen/sched.h
+++ b/xen/include/xen/sched.h
@@ -1176,8 +1176,13 @@ static always_inline bool is_hypercall_target(const struct domain *d)

   static always_inline bool is_pv_domain(const struct domain *d)
   {
-    return IS_ENABLED(CONFIG_PV) &&
-        evaluate_nospec(!(d->options & XEN_DOMCTL_CDF_hvm));
+    if ( !IS_ENABLED(CONFIG_PV) )
+        return false;
+
+    if ( IS_ENABLED(CONFIG_HVM) )
+        return evaluate_nospec(!(d->options & XEN_DOMCTL_CDF_hvm));
+
+    return true;
   }

   static always_inline bool is_pv_vcpu(const struct vcpu *v)
@@ -1218,8 +1223,13 @@ static always_inline bool is_pv_64bit_vcpu(const struct vcpu *v)

   static always_inline bool is_hvm_domain(const struct domain *d)
   {
-    return IS_ENABLED(CONFIG_HVM) &&
-        evaluate_nospec(d->options & XEN_DOMCTL_CDF_hvm);
+    if ( !IS_ENABLED(CONFIG_HVM) )
+        return false;
+
+    if ( IS_ENABLED(CONFIG_PV) )
+        return evaluate_nospec(d->options & XEN_DOMCTL_CDF_hvm);
+
+    return true;
   }

   static always_inline bool is_hvm_vcpu(const struct vcpu *v)

While I like the idea, it may slightly impact some logic as special
domains (dom_xen and dom_io) are now considered HVM domains (when !PV &&
HVM) instead of "neither PV nor HVM".
We want at least to make sure we're not silently breaking something
elsewhere.
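
To make the concern concrete, here is a minimal standalone sketch (not Xen code) of the old and proposed is_hvm_domain() logic for an HVM=y PV=n build. All MODEL_* names and struct model_domain are hypothetical stand-ins, and it assumes system domains are created without the HVM flag in their options:

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the Kconfig options; this models HVM=y PV=n. */
#define MODEL_CONFIG_PV  0
#define MODEL_CONFIG_HVM 1

/* Stand-in for XEN_DOMCTL_CDF_hvm; the real definition is in Xen's headers. */
#define MODEL_CDF_HVM (1u << 0)

/* Hypothetical reduced domain structure: only the creation flags matter here. */
struct model_domain {
    unsigned int options;
};

/* Old logic: the flag is consulted whenever HVM support is built in. */
static bool model_is_hvm_old(const struct model_domain *d)
{
    return MODEL_CONFIG_HVM && (d->options & MODEL_CDF_HVM) != 0;
}

/* Proposed logic: with PV compiled out, the flag is no longer consulted. */
static bool model_is_hvm_new(const struct model_domain *d)
{
    if (!MODEL_CONFIG_HVM)
        return false;

    if (MODEL_CONFIG_PV)
        return (d->options & MODEL_CDF_HVM) != 0;

    return true;
}
```

Under these assumptions, a domain with options == 0 (as a system domain would have) is "not HVM" with the old logic but "HVM" with the proposed one, while a domain created with the HVM flag gives the same answer in both.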

First of all, the idle domain: I've tried to constify is_hvm_domain() and even
made it work, but the diff is very fragile.

Interesting.  Yeah, I did not consider system domains.  It seems fragile today if 
sometimes !is_hvm_domain implies idle_domain.  :/

Diff below - just FYI.

--
Best regards,
-grygorii

Author: Grygorii Strashko <grygorii_strashko@xxxxxxxx>
Date:   Fri Oct 17 17:21:29 2025 +0300

     HACK: hvm only
     Signed-off-by: Grygorii Strashko <grygorii_strashko@xxxxxxxx>

diff --git a/xen/arch/x86/domain.c b/xen/arch/x86/domain.c
index d65c2bd3661f..2ea3d81670de 100644
--- a/xen/arch/x86/domain.c
+++ b/xen/arch/x86/domain.c
@@ -567,17 +567,17 @@ int arch_vcpu_create(struct vcpu *v)

      spin_lock_init(&v->arch.vpmu.vpmu_lock);

-    if ( is_hvm_domain(d) )
-        rc = hvm_vcpu_initialise(v);
-    else if ( !is_idle_domain(d) )
-        rc = pv_vcpu_initialise(v);
-    else
+    if ( is_idle_domain(d) )
      {
          /* Idle domain */
          v->arch.cr3 = __pa(idle_pg_table);
          rc = 0;
          v->arch.msrs = ZERO_BLOCK_PTR; /* Catch stray misuses */
      }
+    else if ( is_hvm_domain(d) )
+        rc = hvm_vcpu_initialise(v);
+    else
+        rc = pv_vcpu_initialise(v);

This looks like an improvement as it makes the idle domain case explicit.


      if ( rc )
          goto fail;
@@ -2123,7 +2123,7 @@ void context_switch(struct vcpu *prev, struct vcpu *next)
      vpmu_switch_from(prev);
      np2m_schedule(NP2M_SCHEDLE_OUT);

-    if ( is_hvm_domain(prevd) && !list_empty(&prev->arch.hvm.tm_list) )
+    if ( !is_idle_domain(prevd) && is_hvm_domain(prevd) && !list_empty(&prev->arch.hvm.tm_list) )

The idle domain's tm_list could be initialized.  It should remain empty and be 
equivalent without modifying this line.  Though maybe your way is better.

          pt_save_timer(prev);

      local_irq_disable();


diff --git a/xen/arch/x86/hvm/svm/vmcb.c b/xen/arch/x86/hvm/svm/vmcb.c
index 839d3ff91b5a..e3c9b4ffba52 100644
--- a/xen/arch/x86/hvm/svm/vmcb.c
+++ b/xen/arch/x86/hvm/svm/vmcb.c
@@ -236,7 +236,7 @@ static void cf_check vmcb_dump(unsigned char ch)

      for_each_domain ( d )
      {
-        if ( !is_hvm_domain(d) )
+        if ( is_idle_domain(d) || !is_hvm_domain(d) )

I don't think this should be needed, as the idle domain, and system domains in
general, are not added to domlist.  So for_each_domain() will only iterate over
user domains.

domain_create() has an early exit for system domains:
....
     /* DOMID_{XEN,IO,IDLE,etc} are sufficiently constructed. */
     if ( is_system_domain(d) )
         return d;

     arch_domain_create()
         paging_domain_init()
             p2m_init()

     domlist_insert()
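
A rough standalone model of that ordering (not Xen code) may help show why a list walk never sees a system domain. The model_* names are hypothetical, and the reserved domain ID values are assumed to match Xen's public headers:

```c
#include <stdbool.h>
#include <stddef.h>

/* Reserved domain ID range, assumed to match Xen's public headers. */
#define MODEL_DOMID_FIRST_RESERVED 0x7FF0u
#define MODEL_DOMID_IDLE           0x7FFFu

/* Hypothetical reduced domain structure with a singly linked list hook. */
struct model_domain {
    unsigned int domain_id;
    struct model_domain *next_in_list;
};

/* Head of the list that a for_each_domain()-style walk would traverse. */
static struct model_domain *model_domlist;

static bool model_is_system_domain(const struct model_domain *d)
{
    return d->domain_id >= MODEL_DOMID_FIRST_RESERVED;
}

/* Models the early exit: system domains return before p2m setup and insertion. */
static struct model_domain *model_domain_create(struct model_domain *d)
{
    if (model_is_system_domain(d))
        return d;

    /* arch_domain_create() -> paging_domain_init() -> p2m_init() would run here. */

    d->next_in_list = model_domlist;
    model_domlist = d;
    return d;
}

/* Counts the domains reachable from the list head. */
static size_t model_count_listed(void)
{
    size_t n = 0;

    for (const struct model_domain *d = model_domlist; d; d = d->next_in_list)
        n++;

    return n;
}
```

In this model, creating a domain with MODEL_DOMID_IDLE leaves the list empty, while creating a domain with ID 0 adds one entry, which is the property the extra is_idle_domain() checks would otherwise be guarding.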

              continue;
          printk("\n>>> Domain %d <<<\n", d->domain_id);
          for_each_vcpu ( d, v )
diff --git a/xen/arch/x86/mm/p2m-basic.c b/xen/arch/x86/mm/p2m-basic.c
index e126fda26760..c53269b3c06d 100644
--- a/xen/arch/x86/mm/p2m-basic.c
+++ b/xen/arch/x86/mm/p2m-basic.c
@@ -34,7 +34,7 @@ static int p2m_initialise(struct domain *d, struct p2m_domain *p2m)
      p2m->default_access = p2m_access_rwx;
      p2m->p2m_class = p2m_host;

-    if ( !is_hvm_domain(d) )
+    if ( is_idle_domain(d) || !is_hvm_domain(d) )
          return 0;

      p2m_pod_init(p2m);
@@ -113,7 +113,7 @@ int p2m_init(struct domain *d)
      int rc;

      rc = p2m_init_hostp2m(d);
-    if ( rc || !is_hvm_domain(d) )
+    if ( rc || is_idle_domain(d) || !is_hvm_domain(d) )

Given the snippet above, I think p2m functions can't be reached for system 
domains.

          return rc;

      /*
diff --git a/xen/arch/x86/mm/p2m-pod.c b/xen/arch/x86/mm/p2m-pod.c
index 05633fe2ac88..4e62d98861fe 100644
--- a/xen/arch/x86/mm/p2m-pod.c
+++ b/xen/arch/x86/mm/p2m-pod.c
@@ -1425,7 +1425,7 @@ bool p2m_pod_active(const struct domain *d)
      struct p2m_domain *p2m;
      bool res;

-    if ( !is_hvm_domain(d) )
+    if ( is_idle_domain(d) || !is_hvm_domain(d) )

accessed via:
     do_domctl()
     vm_event_domctl()
             p2m_pod_active()

The passed in d needs to be from domlist, so again a system domain cannot reach 
here.

          return false;

      p2m = p2m_get_hostp2m(d);
diff --git a/xen/arch/x86/setup.c b/xen/arch/x86/setup.c
index ccf8563e5a64..e1862c5085f5 100644
--- a/xen/arch/x86/setup.c
+++ b/xen/arch/x86/setup.c
@@ -2158,7 +2158,7 @@ static int __hwdom_init cf_check io_bitmap_cb(

  void __hwdom_init setup_io_bitmap(struct domain *d)
  {
-    if ( !is_hvm_domain(d) )
+    if ( is_idle_domain(d) || !is_hvm_domain(d) )

This looks like it is called for dom0 or late_hwdom, so only real domains.


Thank you for your comments. Unfortunately, I probably will not continue this
HVM_ONLY exercise in the near future :(, so if anyone is interested and wants to
pick it up, feel free.

--
Best regards,
-grygorii




 

