[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17 v4 2/2] amd: remove VIRT_SC_MSR_HVM synthetic feature


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 15 Nov 2022 17:50:02 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=tPYNm5w9ovo03eANqvdErtCYKxnc9sm041giFK3W+/w=; b=Swe4JlTT2NAhggkvtgTpHVGKMaWGK3tA9m4/kf9VTLCKRykVfpNtdi1OhwARz2GHtGUKx4u82jercCiKuFyGyv6mHSy+NYnJGY/xP/U2rYWUT51iq3UAC6UjKpkz0NvpRck56hDqIOYGA5iIScp/KNG6Yr+BajzUnC3PZ3hcGvQ2h4fPVTR6TCmCePCP64rOc15p8ihrwUxYTUENBJbbx5Yo4vBRTFmMchBcHyQ7fqaNMTUkWc4JitjcBx10nXgCCWqlUDAmddNj9eRXY3KPzJiko6Rc7FUV5fc6LXPapZqGsp4a3JcM6l0N70L7D7gUBngLAAORHubv2dBuc9nAGw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=TDYfdX9K8q5j2U9REtwcCCVA6zqlJCqSYH+YHFeZbGXfANd1WrO8cQr4YdEKNgRk2Dof1o510dkjn3Nxi+pGhuJ6McEuIwNWvMntwYkAGuDIg7EwKHfDh9PhWZdKnU0MO+dt26O+aPNt+iJpM3plwNKIBg/6XIi1L57lS9KIL0u2Yq3DoJjklXCfHP4uAJhGuV7WUTJ6KQgURqW6dVhLBjr8U7OUJNJo5VlDoGzfsh7oa/a9Q5ZjqzSIB820aQXoBnD2uKkkCoX6jj/70CZB4uDc02PKq+1uXf52b14DL0q3adrXEik9Rooaml5+LbQx3a/MLhE8iscqtHK/m2RykA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, "Henry.Wang@xxxxxxx" <Henry.Wang@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Tue, 15 Nov 2022 16:50:31 +0000
  • Ironport-data: A9a23:4JHRTaM/AxFo7L3vrR2WlsFynXyQoLVcMsEvi/4bfWQNrUok3mNTy 2EaDTuAPviJNmTzKY1zPoWxph8A6MSEm9RrGQto+SlhQUwRpJueD7x1DKtS0wC6dZSfER09v 63yTvGacajYm1eF/k/F3oDJ9CU6jufQA+KmU4YoAwgpLSd8UiAtlBl/rOAwh49skLCRDhiE/ Nj/uKUzAnf8s9JPGj9SuvzrRC9H5qyo4mpB5wFmP5ingXeF/5UrJMNHTU2OByOQrrl8RoaSW +vFxbelyWLVlz9F5gSNy+uTnuUiG9Y+DCDW4pZkc/HKbitq/0Te5p0TJvsEAXq7vh3S9zxHJ HehgrTrIeshFvWkdO3wyHC0GQkmVUFN0OevzXRSLaV/ZqAJGpfh66wGMa04AWEX0rxlXF9wy MEeFBdOMReSpNKW4e2Ca9A506zPLOGzVG8ekldJ6GiASNwAEdXESaiM4sJE1jAtgMwIBezZe 8cSdTtoalLHfgFLPVAUTpk5mY9EhFGmK2Ee9A3T+PpxujaCpOBy+OGF3N79YNuFSN8Thk+Fj mnH4374ElcRM9n3JT+toin93b+RzHyTtIQ6FJekyq54sUOo9GERMRlIa2XiuPSClRvrMz5YA wlOksY0loAi+UruQtTjUhmQpH+fogVaS9dWC/c96gyG1uzT+QnxLkgJSCRQLuMvssAeTCYvk FSOmrvBBzZirbmUQnK17aqPoHW5Pi19BXAGTT8JS00C+daLnW0ophfGT9ImGqjliNTwQGj02 2rT8nl4gKgPh8kW0an95UrAnz+nupnOSEgy+xnTWWWmqAh+YeZJerCV1LQS1t4YRK7xc7VLl CJsdxS2hAzWMaywqQ==
  • Ironport-hdrordr: A9a23:EfuVp64Mlw6XZQrjrgPXwaiCI+orL9Y04lQ7vn2ZFiY5TiXIra qTdaogviMc6Ax/ZJjvo6HkBEClewKlyXcT2/hrAV7CZniehILMFu1fBOTZowEIdxeOldK1kJ 0QCZSWa+eAcmSS7/yKhzVQeuxIqLfnzEnrv5a5854Ed3AXV0gK1XYcNu/0KDwVeOEQbqBJaa Z0q/A30QaISDAyVICWF3MFV+/Mq5nik4/nWwcPA1oK+RSDljSh7Z/9Cly90g0FWz1C7L8++S yd+jaJrJmLgrWe8FvxxmXT55NZlJ/IzcZCPtWFjow4OyjhkQGhYaVmQvmnsCouqO+ixV42mJ 3nogsmPe5093TNF1vF6ifF6k3F6nID+nXiwViXjT/KptH4fiszD45kiZhCehXUxkI8tJUkuZ g7l16xht5yN1ftjS7979/HW1VDkVe1m2Mrlao2g2ZEWYUTRbdNpcg0/V9TEr0HACXmgbpXWd VGPYX53rJ7YFmaZ3fWsi1Gx8GtZG06GlO8Tk0LqqWuok1rtUE863Fd6N0Un38G+p54YYJD/f 74PqNhk6wLZtMKbIpmbd1xD/efOyjoe1bhIWiSKVPoGOUsIHTWsaP6570z+aWDZIEI9p0vg5 7MOWko+lLaQ3ieSfFm4ac7sSwkGA6GLHbQI4BlltREU4THNfvW2XbpciFqryOiy89vcPEzFc zDfK6+OMWTXVcGKbw5oTEWZKMiWEX2cPdlzurTCGj+1f7jG8nNitHxVsr1Cf7ELQsEM1mPcU frGgKDafl90g==
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, Nov 15, 2022 at 04:21:07PM +0000, Andrew Cooper wrote:
> On 15/11/2022 13:26, Roger Pau Monne wrote:
> > Since the VIRT_SPEC_CTRL.SSBD selection is no longer context switched
> > on vm{entry,exit} there's no need to use a synthetic feature bit for
> > it anymore.
> >
> > Remove the bit and instead use a global variable.
> >
> > No functional change intended.
> >
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
> > Release-acked-by: Henry Wang <Henry.Wang@xxxxxxx>
> 
> This is definitely not appropriate for 4.17, but it's a performance
> regression in general, hence my firm and repeated objection to this
> style of patch.

While I don't have any objections in deferring this past 4.17, none of
the modified paths are performance sensitive AFAICT.

> General synthetic bits have existed for several decades longer than
> alternatives.  It has never ever been a rule, or even a recommendation,
> to aggressively purge the non-alternative bits, because it's a provably
> bad thing to do.
> 
> 
> You are attempting a micro-optimisation, that won't produce any
> improvement at all in centuries, while...

Oh, I wasn't attempting any micro-optimizations TBH, just didn't see
the need to keep this as a synthetic feature, and generally consider
better to use a global variable because it's IMO easier to follow.

> > diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
> > index a332087604..9e3b9094d3 100644
> > --- a/xen/arch/x86/cpu/amd.c
> > +++ b/xen/arch/x86/cpu/amd.c
> > @@ -49,6 +49,7 @@ boolean_param("allow_unsafe", opt_allow_unsafe);
> >  /* Signal whether the ACPI C1E quirk is required. */
> >  bool __read_mostly amd_acpi_c1e_quirk;
> >  bool __ro_after_init amd_legacy_ssbd;
> > +bool __ro_after_init amd_virt_spec_ctrl;
> 
> ... actually expending .rodata with something 8 times less efficiently
> packed, and ...
> 
> >  
> >  static inline int rdmsr_amd_safe(unsigned int msr, unsigned int *lo,
> >                              unsigned int *hi)
> > diff --git a/xen/arch/x86/cpuid.c b/xen/arch/x86/cpuid.c
> > index 822f9ace10..acc2f606ce 100644
> > --- a/xen/arch/x86/cpuid.c
> > +++ b/xen/arch/x86/cpuid.c
> > @@ -3,6 +3,7 @@
> >  #include <xen/param.h>
> >  #include <xen/sched.h>
> >  #include <xen/nospec.h>
> > +#include <asm/amd.h>
> 
> ... (Specific to this instance) making life harder for the people trying
> to make CONFIG_AMD work, and ...

That's indeed a point, albeit I think adding a `#define
amd_virt_spec_ctrl false` won't be the bigger of the problems when
dealing with CONFIG_AMD, and will need to be done for other AMD
specific variables anyway.

> > diff --git a/xen/arch/x86/spec_ctrl.c b/xen/arch/x86/spec_ctrl.c
> > index 4e53056624..0b94af6b86 100644
> > --- a/xen/arch/x86/spec_ctrl.c
> > +++ b/xen/arch/x86/spec_ctrl.c
> > @@ -514,12 +514,12 @@ static void __init print_details(enum ind_thunk 
> > thunk, uint64_t caps)
> >             (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) ||
> >              boot_cpu_has(X86_FEATURE_SC_RSB_HVM) ||
> >              boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) ||
> > -            boot_cpu_has(X86_FEATURE_VIRT_SC_MSR_HVM) ||
> > +            amd_virt_spec_ctrl ||
> 
> ... breaking apart a single TEST instruction, which not only adds an
> extra conditional merge, but now hits an cold-ish cache line everywhere
> it's used.

Why does performance matter here?  It's an init function that prints
the speculation related settings to the screen, so that's likely to be
many times slower that accessing a cold cache line.

> Count how many synthetic feature bits it will actually take to change
> the per-cpu data size, and realise that, when it will take more than 200
> years at the current rate of accumulation, any believe that this is an
> improvement to be had disappears.
> 
> Yes, it is only a micro regression, but you need a far better
> justification than "there is a gain of 64 bytes per CPU which will be
> non-theoretical in more than 200 years" when traded off vs the actual
> 512 bytes, plus extra code bloat bloat, plus reduced locality of data
> that this "improvement" genuinely costs today.

I wasn't considering any of the above when proposing the change, my
only motivation was that global variables are clearer to use than
synthetic features, and I didn't see a need for a synthetic feature in
this case.  If we agree the above possible performance regressions
are worth it I'm fine keeping it as-is.

Now that I realize it amd_virt_spec_ctrl could even be plain __init.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.