
Re: [PATCH 07/12] x86: Have x86_emulate/ implement the single-vendor optimisation


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Date: Thu, 12 Feb 2026 16:29:16 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 12 Feb 2026 15:29:38 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu Feb 12, 2026 at 12:26 PM CET, Jan Beulich wrote:
> On 06.02.2026 17:15, Alejandro Vallejo wrote:
>> Open code the vendor check through the policy as a one-off. The emulator
>> embeds amd_like() in macros and is called in MANY places. Using a
>> local variable (cp->x86_vendor) makes it a lot smaller (300-400 bytes
>> smaller). So treat this as the exception it is and let it use the policy
>> rather than boot_cpu_data.
>
> As elsewhere you mainly discuss benefits for the single-vendor case, is the
> above about the opposite situation? Else why would codegen suffer this much
> here?

In single-vendor builds it doesn't matter: a constant is a constant. It matters
in multi-vendor builds, and it matters for two reasons.

  1. The x86_emulate() function is HUGE.
  2. The x86_emulate() function has LOTS of amd_like() invocations.

When amd_like() uses the policy it has an advantage over using the global.
Namely, the policy pointer is already cached in a register, so codegen simply
has to load the vendor from an offset off that register. When we go for a global
variable we need to reach out and load the variable from its 64-bit address
(because we compile with model=large). That normally evens out with the codegen
reductions cpu_vendor() encourages you to have, but here it just doesn't. There
are way too many accesses to global state and .text suffers.

The fix for this is caching the vendor somewhere else. A "bool amd_like" in a
local variable would shrink code substantially (by having rsp-relative access to
the answer to the question amd_like() asks, and by avoiding the masking). This
optimisation is worth doing with or without this patch in place.

Alas, I didn't test that, because this series was sufficiently complicated
as-is.
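
For illustration only, the local-variable caching idea could look roughly like
the sketch below. It is not taken from the series and is untested against Xen
itself: the vendor bit values, struct policy, policy_amd_like() and
emulate_ops() are all hypothetical stand-ins made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical vendor bits, for illustration only (not Xen's definitions). */
#define VENDOR_INTEL (1u << 0)
#define VENDOR_AMD   (1u << 1)
#define VENDOR_HYGON (1u << 2)

/* Hypothetical stand-in for the CPU policy the emulator is handed. */
struct policy {
    unsigned int x86_vendor;
};

/* Stand-in for the question amd_like() asks: load the vendor and mask it. */
static bool policy_amd_like(const struct policy *cp)
{
    return cp->x86_vendor & (VENDOR_AMD | VENDOR_HYGON);
}

/*
 * Toy "emulator" loop. The vendor question is answered once and kept in a
 * local bool, so every later use is a test of a local rather than a fresh
 * load-and-mask. Returns how many ops took a vendor-specific path.
 */
static unsigned int emulate_ops(const struct policy *cp,
                                const unsigned char *ops, size_t n)
{
    assert(cp != NULL);

    const bool amd_like = policy_amd_like(cp); /* cached once */
    unsigned int taken = 0;

    for ( size_t i = 0; i < n; i++ )
    {
        switch ( ops[i] )
        {
        case 0: /* op with AMD-like-only behaviour */
            if ( amd_like )
                taken++;
            break;

        case 1: /* op with non-AMD-like-only behaviour */
            if ( !amd_like )
                taken++;
            break;

        default: /* vendor-neutral op */
            break;
        }
    }

    return taken;
}
```

Whether the compiler keeps the bool in a register or spills it to the stack
depends on register pressure; either way no global needs to be reached for.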

>
> Using cp also is preferable for test and fuzzing harnesses, which don't
> even know boot_cpu_data.

They don't now, which is why I made the x86emul_cpu() macro. New subsystems
added to a user-level testing ground could simply have a boot_cpu_data with the
desired policy as part of their harness's global state.

>
>> @@ -30,8 +31,15 @@ void BUG(void);
>>  #  define X86EMUL_NO_SIMD
>>  # endif
>>  
>> +/* intentionally avoid cpu_vendor(), as it produces much worse codegen */
>
> Nit (style): Capital letter wanted at the start.
>
>> +# define x86emul_cpu(cp) ((X86_ENABLED_VENDORS ==            \
>> +                           ISOLATE_LSB(X86_ENABLED_VENDORS)) \
>> +                               ? X86_ENABLED_VENDORS         \
>> +                               : ((cp)->x86_vendor & X86_ENABLED_VENDORS))
>
> Nit: Indentation. The ? and : want to align with the controlling expression.

Sure.

>
> Further, is this a good name, without "vendor" in it?

x86emul_cpu_vendor() is fine with me too.

>
> And then I'm of two minds here as to the use of the macro parameter: On one
> hand we can be pretty certain what is passed in won't have side effects.
> Otoh in a hypothetical odd case (seeing that this lives in a header file,
> not local to an isolated piece of code) where there would be one, the
> argument being evaluated unreliably could cause an unpleasant surprise.
> The more ...
>
>>  #else /* !__XEN__ */
>>  # include "x86-emulate.h"
>> +# define x86emul_cpu(cp) ((cp)->x86_vendor)
>
> ... that the same wouldn't be observable in the fuzzing or test harnesses.

I can turn this into a static inline to avoid such worries.
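
As a rough sketch of what that could look like (untested, and with placeholder
definitions so it stands alone): the vendor bit values, the two-vendor
X86_ENABLED_VENDORS setting and the one-field struct cpu_policy below are
assumptions for the example, not the real Xen definitions. The function uses
the x86emul_cpu_vendor() name suggested above.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder vendor bits and build-time mask (assumed values). */
#define X86_VENDOR_INTEL (1u << 0)
#define X86_VENDOR_AMD   (1u << 1)
#define X86_ENABLED_VENDORS (X86_VENDOR_INTEL | X86_VENDOR_AMD)

/* Lowest set bit of x; equals x only when exactly one bit is set. */
#define ISOLATE_LSB(x) ((x) & -(x))

/* Placeholder policy type with just the field the helper reads. */
struct cpu_policy {
    unsigned int x86_vendor;
};

/*
 * Function form of the macro. The argument is evaluated exactly once at the
 * call site in both the single-vendor and the multi-vendor configuration,
 * so a side effect in the argument expression behaves the same everywhere.
 */
static inline unsigned int x86emul_cpu_vendor(const struct cpu_policy *cp)
{
    assert(cp != NULL);

    /* Single vendor enabled: the answer is a compile-time constant. */
    if ( X86_ENABLED_VENDORS == ISOLATE_LSB(X86_ENABLED_VENDORS) )
        return X86_ENABLED_VENDORS;

    /* Multiple vendors enabled: read the policy and mask to enabled ones. */
    return cp->x86_vendor & X86_ENABLED_VENDORS;
}
```

With a single vendor enabled the compiler should still fold the whole thing
to a constant, as the condition is a constant expression.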

Cheers,
Alejandro



 

