[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.17 v4 2/2] amd: remove VIRT_SC_MSR_HVM synthetic feature


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 15 Nov 2022 17:44:56 +0100
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=wGhslgKI/dGUIazKz+iL2ZtC2TcXPMBaZeUDTEtHdOo=; b=fIUYDZtxVyCcqgWqzNj7xcRcxok54CnLwQov7Cxvb3aLQSXU6iVGV3Y+PeqgaqKkgtGYzjS68Isjv8YmQkSB4bE1fiHKylYMTs/nrF/oB8FosIaHw19NELSZWPcVflb4wSGUaTF6GVnDU+N4PHAfsO508P/oEfnC/IJX85mwNNQvPdFYZcTOLpTXDtdi4IvnWw9j0ehMbQq0D3gS7fErzZHCUi/Vq1ZJJMTJkguiqaNc/dGDiEnwNO+c4h/0YgBKjmYEC3Z647GlXM97TG4KQXobpubiPLNPcuEC6KYboKR4/etituhxh0ypyTMamVrtt1gErkZFdazbjG/9sONgAA==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=JO/O6lZlrfuDjIAm5UZSjvKP30SSgEi2Vw8xF1B3RmJSCv/p5/ll+ZYTJ8nyzg6hClwQ6R37choEjHQlRgNZHv1SAiyIv+7KTDkoUuALOEdeH4Xys918xY96nyZEEcuWzQTEWxmfSm4JAC03jxhaLmz431G8VQZnLf7Sn4H2Ql0rKSEXxrAsU/ztDX3Px8mJWS+Wkvd9fhhroHgY+vPp3VbK44jC5A2xjXUMtCmQnIB5ymS2eAhb0xjwPGzMIajvpkcC+tpITe/KrIIWuhx7K6i/EJ17+A2GgpJsR7q9OUtUjUHPWatVd8Slq3qKP0D962BicSRYwYwsJppw9SE6BQ==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;
  • Cc: "Henry.Wang@xxxxxxx" <Henry.Wang@xxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 15 Nov 2022 16:45:09 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.11.2022 17:21, Andrew Cooper wrote:
> On 15/11/2022 13:26, Roger Pau Monne wrote:
>> Since the VIRT_SPEC_CTRL.SSBD selection is no longer context switched
>> on vm{entry,exit} there's no need to use a synthetic feature bit for
>> it anymore.
>>
>> Remove the bit and instead use a global variable.
>>
>> No functional change intended.
>>
>> Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
>> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>
>> Release-acked-by: Henry Wang <Henry.Wang@xxxxxxx>
> 
> This is definitely not appropriate for 4.17, but it's a performance
> regression in general, hence my firm and repeated objection to this
> style of patch.
> 
> General synthetic bits have existed for several decades longer than
> alternatives.  It has never ever been a rule, or even a recommendation,
> to aggressively purge the non-alternative bits, because it's a provably
> bad thing to do.

There we are again - you state something as bad without really saying
why it is bad. My view is that synthetic bits were wrong to introduce
when they don't stand a chance of being used in an alternative.

I agree though that there's no strong need for this to make 4.17. It
may end up making backports slightly easier, as no such bit existed
in 4.16.

> You are attempting a micro-optimisation, that won't produce any
> improvement at all in centuries, while...
> 
>> diff --git a/xen/arch/x86/cpu/amd.c b/xen/arch/x86/cpu/amd.c
>> index a332087604..9e3b9094d3 100644
>> --- a/xen/arch/x86/cpu/amd.c
>> +++ b/xen/arch/x86/cpu/amd.c
>> @@ -49,6 +49,7 @@ boolean_param("allow_unsafe", opt_allow_unsafe);
>>  /* Signal whether the ACPI C1E quirk is required. */
>>  bool __read_mostly amd_acpi_c1e_quirk;
>>  bool __ro_after_init amd_legacy_ssbd;
>> +bool __ro_after_init amd_virt_spec_ctrl;
> 
> ... actually expending .rodata with something 8 times less efficiently
> packed, and ...

... as long as you're talking of just a single CPU. The break-even is
at 8 CPUs (8 bits used either way).

>> --- a/xen/arch/x86/spec_ctrl.c
>> +++ b/xen/arch/x86/spec_ctrl.c
>> @@ -514,12 +514,12 @@ static void __init print_details(enum ind_thunk thunk, 
>> uint64_t caps)
>>             (boot_cpu_has(X86_FEATURE_SC_MSR_HVM) ||
>>              boot_cpu_has(X86_FEATURE_SC_RSB_HVM) ||
>>              boot_cpu_has(X86_FEATURE_IBPB_ENTRY_HVM) ||
>> -            boot_cpu_has(X86_FEATURE_VIRT_SC_MSR_HVM) ||
>> +            amd_virt_spec_ctrl ||
> 
> ... breaking apart a single TEST instruction, which not only adds an
> extra conditional merge, but now hits an cold-ish cache line everywhere
> it's used.
> 
> Count how many synthetic feature bits it will actually take to change
> the per-cpu data size, and realise that, when it will take more than 200
> years at the current rate of accumulation, any believe that this is an
> improvement to be had disappears.
> 
> Yes, it is only a micro regression, but you need a far better
> justification than "there is a gain of 64 bytes per CPU which will be
> non-theoretical in more than 200 years" when traded off vs the actual
> 512 bytes, plus extra code bloat bloat, plus reduced locality of data
> that this "improvement" genuinely costs today.

I don't see Roger stating anything like this.

I think we need to settle on at least halfway firm rules on when to use
synthetic feature bits and when to use plain global booleans. Without
that the tastes of the three of us are going to collide again every once
in a while.

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.