[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17 v2] hvm/apic: repurpose the reporting of the APIC assist options


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Mon, 14 Nov 2022 13:21:08 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=tNTcsBiFZbrLh133XwDwMIdoXwDbriQt8eK9G94FhYo=; b=G6qEqDcGWD1QE/BIyw0+EiGa1cyqVZL7Yk03DeqR4XRs0L1v2jLirm1VO9ve9aia/e455fEpBefSZ6eHgPbEYVo22D4kZxdhSHs1f7nSrgV3Uzs5jG+HYxxFCnsVv+kEwvDmtQgvIaip+7GJJoFMJzw7Kgzz741fwLf05iql0RNprt2D9cLPrAeEivLOZSbf/KxdyzocZcM4sOzDGAtQsh+cZKSia0En75ta2kR5iFraNKLfeEx1L6Zr204qSFqRlrYhLgqcFrLSscqbbwG+DlDAIEXG8dtg85jeIAMBAvTb99f2UNDmtGxxdzxsGNOET78ZlMx9yCGKASsMyFR++A==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Jiq5zdj9TXMp5mQGFuOAbY4VTu0hQE8Xc0XAENwJZxpXaXEXdlIGI16KfGFBrp8DMjvvNLN3EmTjaxKdAlIkEjOZSUOC04miBakMt/2Q5eVIoWe6msViO/p+hKAMF4kTx0wJRAxTz80bfcAfiIiLQwsKkHW1khT5qq6Iem6P8O/6GZ480Fr0JVYwSg8xODBjUbs9NTrqBf/Z+llk5bqzfOiryvQHAuxhgKZb+jPbtdysIw6Ab9qPgTKZcjr/SbPQ5G2hpRzfvzXr4ysrpt0tEaoHtbL9NoslwF8STvPHI+9/im3YRW7a+8YtzAe4u8EvB1Xz9GWFSZPjupMDPW8glA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Henry.Wang@xxxxxxx" <Henry.Wang@xxxxxxx>, Paul Durrant <paul@xxxxxxx>, Wei Liu <wl@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Jun Nakajima <jun.nakajima@xxxxxxxxx>, Kevin Tian <kevin.tian@xxxxxxxxx>
  • Delivery-date: Mon, 14 Nov 2022 12:21:33 +0000
  • Ironport-data: A9a23:eRtrRa0DsfT2G4GhxPbD5SRwkn2cJEfYwER7XKvMYLTBsI5bp2cPm mUeXW7QaKnYN2emedhzbYWz9EwF78OBm4cxSgVkpC1hF35El5HIVI+TRqvS04F+DeWYFR46s J9OAjXkBJppJpMJjk71atANlVEliefTAOK5ULSfUsxIbVcMYD87jh5+kPIOjIdtgNyoayuAo tq3qMDEULOf82cc3lk8tuTS9nuDgNyo4GlC5wVnOagS1LPjvyJ94Kw3dPnZw0TQGuG4LsbiL 87fwbew+H/u/htFIrtJRZ6iLyXm6paLVeS/oiI+t5qK23CulQRrukoPD9IOaF8/ttm8t4sZJ OOhF3CHYVxB0qXkwIzxWvTDes10FfUuFLTveRBTvSEPpqFvnrSFL/hGVSkL0YMkFulfGGht+ /wbLAo2Uk673euT5fWUVeRVmZF2RCXrFNt3VnBI6xj8VK5jbbWdBqLA6JlfwSs6gd1IEbDGf c0FZDFzbRPGJRpSJlMQD5F4l+Ct7pX9W2QA9BTJ+uxqsy6KlFQZPLvFabI5fvSQQspYhACAr 3/u9GXlGBAKcteYzFJp91r83b+VzHylCOr+EpWJ8PEyuRqpmVc/SzcwZWqk4ubohA2xDoc3x 0s8v3BGQbIJ3FymSJzxUgO1pFaAvwUAQJxAHusi8gaPx6HIpQGDCQAsbjNHcs1gi8YwShQjz FrPlNTsbRR3uaCRYWKQ8PGTtzzaETcRBX8PY2kDVwRty9vprZw3jxnPZs1+C6PzhdrwcRny3 jbMqiE9jrcSiMcj1qOn8FSBiDWpzrDKQxA04EPLX2ujxgJ/eIOhIYev7DDmAe1oKY+YShyNu SYCks3HtOQWV8jVzGqKXfkHG6yv67CdKjrAjFVzHp4nsTOw53qkeoMW6zZ7TKt0Dvs5lfbSS Be7kWtsCFV7ZRNGsYcfj1qNNvkX
  • Ironport-hdrordr: A9a23:00jV0652qNBYKYKXUgPXwa+CI+orL9Y04lQ7vn2ZFiY5TiXIra qTdaogviMc6Ax/ZJjvo6HkBEClewKlyXcT2/hrAV7CZniehILMFu1fBOTZowEIdxeOldK1kJ 0QCZSWa+eAcmSS7/yKhzVQeuxIqLfnzEnrv5a5854Ed3AXV0gK1XYcNu/0KDwVeOEQbqBJaa Z0q/A37gZJPh8sH7eGL0hAe9KGi8zAlZrgbxJDLxk76DOWhTftxK/mHwOe1hI+VSoK5bs562 DKnyHw+63m6piAu1Lh/l6Wy64TtMrqy9NFCsDJos8JKg/0ggLtSJV9V6aEtDUVpvjqzFoxit HDrzopIsw2wXLMeWOepwfrxmDboXwTwk6n7WXdrWrooMT/Sj5/I81dhbhBeh+cz0Y7ptlz3I 9Cwmrc7vNsfFj9tRW4w+KNewBhl0Kyr3ZnuekPj0ZHWY9bTLNKt4QQ8G5cDZ9FNiPn74IMFv VoEajnlb5rWGLfS0qcknhkwdSqUHh2NhCaQnIassjQ6DRSlGAR9Tpt+OUv2lM7sL4tQZhN4O rJdo5ykqtVc8MQZaVhQM8cXMqeEAX2MFPxGVPXBW6iOLAMOnrLpZKyyq4y/vuWdJsBy4Z3sI jdUWlfqXU5dyvVeIOzNaVwg1PwqViGLHbQIpk03ek9hlS8fsulDcS7ciFvryP6yM9vRvEyWJ 6ISedr6rHYXCzT8L1yrn3DsqlpWAcjufIuy6cGsnK107X2w97Rx5rmWceWAobROhAZfU66Kk c/fVHIVbd9BwaQKzPFvCQ=
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Nov 10, 2022 at 10:47:07PM +0000, Andrew Cooper wrote:
> On 04/11/2022 16:18, Roger Pau Monne wrote:
> > The current reporting of the hardware assisted APIC options is done by
> > checking "virtualize APIC accesses" which is not very helpful, as that
> > feature doesn't avoid a vmexit, instead it does provide some help in
> > order to detect APIC MMIO accesses in vmexit processing.
> >
> > Repurpose the current reporting of xAPIC assistance to instead report
> > such feature as present when there's support for "TPR shadow" and
> > "APIC register virtualization" because in that case some xAPIC MMIO
> > register accesses are handled directly by the hardware, without
> > requiring a vmexit.
> >
> > For symetry also change assisted x2APIC reporting to require
> > "virtualize x2APIC mode" and "APIC register virtualization", dropping
> > the option to also be reported when "virtual interrupt delivery" is
> > available.  Presence of the "virtual interrupt delivery" feature will
> > be reported using a different option.
> >
> > Fixes: 2ce11ce249 ('x86/HVM: allow per-domain usage of hardware virtualized 
> > APIC')
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > I find the logic in vmx_vlapic_msr_changed() hard to follow, but I
> > don't want to rewrite the function logic at this point.
> > ---
> > Changes since v1:
> >  - Fix Viridian MSR tip conditions.
> > ---
> >  xen/arch/x86/hvm/viridian/viridian.c |  2 +-
> >  xen/arch/x86/hvm/vmx/vmcs.c          |  8 ++++----
> >  xen/arch/x86/hvm/vmx/vmx.c           | 25 ++++++++++++++++++-------
> >  xen/arch/x86/traps.c                 |  4 +---
> >  4 files changed, 24 insertions(+), 15 deletions(-)
> >
> > diff --git a/xen/arch/x86/hvm/viridian/viridian.c 
> > b/xen/arch/x86/hvm/viridian/viridian.c
> > index 25dca93e8b..44eb3d0519 100644
> > --- a/xen/arch/x86/hvm/viridian/viridian.c
> > +++ b/xen/arch/x86/hvm/viridian/viridian.c
> > @@ -197,7 +197,7 @@ void cpuid_viridian_leaves(const struct vcpu *v, 
> > uint32_t leaf,
> >          res->a = CPUID4A_RELAX_TIMER_INT;
> >          if ( viridian_feature_mask(d) & HVMPV_hcall_remote_tlb_flush )
> >              res->a |= CPUID4A_HCALL_REMOTE_TLB_FLUSH;
> > -        if ( !cpu_has_vmx_apic_reg_virt )
> > +        if ( !has_assisted_xapic(d) )
> >              res->a |= CPUID4A_MSR_BASED_APIC;
> 
> This check is broken before and after.  It needs to be keyed on
> virtualised interrupt delivery, not register acceleration.
> 
> But doing this correctly needs a per-domain vintr setting, which we
> don't have yet.
> 
> It is marginally less broken with this change, than without, but that's
> not saying much.

I took a look at the HyperV TLFS spec but the table that lists the
CPUID bits don't explicitly name which registers are accessed using
MSRs rather than MMIO when the 'MSR_BASED_APIC' suggestion is set on
CPUID.

It's my understanding that the hint is not useful anymore, as Xen
exposes x2APIC by default, and that's what any new-ish version of
Windows should use, in which case all APIC registers are already
accessed using MSRs and the hint is moot.

> >          if ( viridian_feature_mask(d) & HVMPV_hcall_ipi )
> >              res->a |= CPUID4A_SYNTHETIC_CLUSTER_IPI;
> > diff --git a/xen/arch/x86/hvm/vmx/vmcs.c b/xen/arch/x86/hvm/vmx/vmcs.c
> > index a1aca1ec04..7bb96e1a8e 100644
> > --- a/xen/arch/x86/hvm/vmx/vmcs.c
> > +++ b/xen/arch/x86/hvm/vmx/vmcs.c
> > @@ -1136,7 +1136,7 @@ static int construct_vmcs(struct vcpu *v)
> >  
> >      if ( !has_assisted_xapic(d) )
> >          v->arch.hvm.vmx.secondary_exec_control &=
> > -            ~SECONDARY_EXEC_VIRTUALIZE_APIC_ACCESSES;
> > +            ~SECONDARY_EXEC_APIC_REGISTER_VIRT;
> >  
> >      if ( cpu_has_vmx_secondary_exec_control )
> >          __vmwrite(SECONDARY_VM_EXEC_CONTROL,
> > @@ -2156,10 +2156,10 @@ int __init vmx_vmcs_init(void)
> >      if ( !ret )
> >      {
> >          /* Check whether hardware supports accelerated xapic and x2apic. */
> > -        assisted_xapic_available = cpu_has_vmx_virtualize_apic_accesses;
> > +        assisted_xapic_available = cpu_has_vmx_tpr_shadow &&
> > +                                   cpu_has_vmx_apic_reg_virt;
> >          assisted_x2apic_available = cpu_has_vmx_virtualize_x2apic_mode &&
> > -                                    (cpu_has_vmx_apic_reg_virt ||
> > -                                     cpu_has_vmx_virtual_intr_delivery);
> > +                                    cpu_has_vmx_apic_reg_virt;
> 
> apic reg virt already depends on tpr shadow, so that part of the
> condition is redundant.
> 
> virtualise x2apic mode and apic reg virt aren't dependent, but they do
> only ever appear together in hardware.

I would keep those as-is for sanity, it's easier to spot the
dependencies between them.  And then it's also possible that we want
to introduce a control for tpr_shadow, and which point having the
conditional here is a good reminder that virtualize_apic_accesses
depends on that feature being available and enabled.

> Keeping the conditionals might be ok to combat a bad outer hypervisor,
> but ...
> 
> >          register_keyhandler('v', vmcs_dump, "dump VT-x VMCSs", 1);
> >      }
> >  
> > diff --git a/xen/arch/x86/hvm/vmx/vmx.c b/xen/arch/x86/hvm/vmx/vmx.c
> > index e624b415c9..bf0fe3355c 100644
> > --- a/xen/arch/x86/hvm/vmx/vmx.c
> > +++ b/xen/arch/x86/hvm/vmx/vmx.c
> > @@ -3405,25 +3405,29 @@ static void vmx_install_vlapic_mapping(struct vcpu 
> > *v)
> >  
> >  void vmx_vlapic_msr_changed(struct vcpu *v)
> >  {
> > +    bool virtualize_x2apic_mode = has_assisted_x2apic(v->domain) ||
> > +                                  (cpu_has_vmx_virtualize_x2apic_mode &&
> > +                                   cpu_has_vmx_virtual_intr_delivery);
> 
> ... this is still wrong, and ...
> 
> >      struct vlapic *vlapic = vcpu_vlapic(v);
> >      unsigned int msr;
> >  
> > -    if ( !has_assisted_xapic(v->domain) &&
> > -         !has_assisted_x2apic(v->domain) )
> > +    if ( !cpu_has_vmx_virtualize_apic_accesses &&
> > +         !virtualize_x2apic_mode )
> >          return;
> 
> ... this surely cannot be right.

It's indeed missing a has_assisted_xapic(v->domain), so it should be:

if ( !cpu_has_vmx_virtualize_apic_accesses &&
     !has_assisted_xapic(v->domain) &&
     !virtualize_x2apic_mode )
    return;

Logic in this function is not the best one IMO, as I've already
mentioned in a post-commit remark.

> 
> While attempting to figure ^ out, I've found yet another regression vs
> 4.16.  Because virt intr delivery is set in the execution controls
> system-wide and not controlled per domain, we'll take a vmentry failure
> on SKX/CLX/ICX when trying to build an HVM domain without xAPIC
> acceleration.

I've tried creating the following guest on one of our Ice Lake boxes
(with and without this patch applied):

type="pvh"
name = "test"
memory = 1024
vcpus = 1

kernel = "/root/vmlinuz-6.1.0-rc4+"
extra = "console=hvc0"

assisted_xapic=0

And it seem s to boot just fine, no vmentry failure.

> This, combined with the ABI errors (/misfeatures) that we really don't
> want to escape into the world but I haven't finished fixing yet, means
> that the only appropriate course of action is to revert.

Hm, I'm confused by this, as it is still not clear to me what's
wrong with the ABI.  Is it the way in which we expose the flags?  Or
is it the naming?

AFAIK we want to have a flag to toggle apic_reg_virt, which is the
proposal here.  Should it be named differently?

> I'd really hoped to avoid a full revert, but we've run out of time.

I've also hoped so.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.