
Re: [PATCH 03/12] x86: Add cpu_vendor() as a wrapper for the host's CPU vendor


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Alejandro Vallejo <alejandro.garciavallejo@xxxxxxx>
  • Date: Thu, 12 Feb 2026 15:36:05 +0100
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Thu, 12 Feb 2026 14:36:22 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu Feb 12, 2026 at 11:52 AM CET, Jan Beulich wrote:
> On 06.02.2026 17:15, Alejandro Vallejo wrote:
>> Introduces various optimisations that rely on constant folding, Value
>> Range Propagation (VRP), and Dead Code Elimination (DCE) to aggressively
>> eliminate code surrounding the uses of the function.
>> 
>>   * For single-vendor+no-unknown-vendor builds returns a compile-time
>>     constant.
>>   * For all other cases it ANDs the result with the mask of compiled
>>     vendors, with the effect of performing DCE in switch cases, removing
>>     dead conditionals, etc.
>> 
>> It's difficult to reason about codegen in general in a project this big,
>> but in this case the ANDed constant combines with the values typically
>> checked against, folding into a comparison against zero. Thus, it's better
>> for codegen to AND its result with the desired compared-against vendor,
>> rather than using (in)equality operators. That way the comparison is
>> always against zero.
>> 
>>   "cpu_vendor() & (X86_VENDOR_AMD | X86_VENDOR_HYGON)"
>> 
>> turns into (cpu_vendor() & X86_VENDOR_AMD) in AMD-only builds (AND +
>> cmp with zero). Whereas this...
>> 
>>   "cpu_vendor() == X86_VENDOR_AMD"
>> 
>> forces cpu_vendor() to be ANDed and then compared to a non-zero value.
>
> Coming back to this: How does the value compared against being zero or
> non-zero matter here? As long as cpu_vendor() yields a compile-time
> constant, the compiler should be able to leverage that for DCE? And

Yes, for true single-vendor cases it doesn't matter. It matters in multivendor
cases where some vendors are off and cpu_vendor() is not a constant.

> even if it's not a compile time constant, bits masked off in principle
> allow the compiler to leverage that, too. It may of course be that
> even up-to-date compilers fall short of doing so.

There might be some of that, but there's also an unavoidable codegen hurdle
unless you can tell the compiler that your variable is zero or a power of 2 in
multivendor cases.

cpu_vendor() == X86_VENDOR_AMD, which expands to
(boot_cpu_data.vendor & X86_ENABLED_VENDORS) == X86_VENDOR_AMD, which expands to
(boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_INTEL)) == X86_VENDOR_AMD

which produces a lot worse codegen, because now the compiler must AND the
variable with the AMD|INTEL mask, and then compare to the AMD mask, whereas
having & instead of == means the compiler can simply do a comparison with zero
and call it a day (due to the masks being folded together).
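
To make the two shapes concrete, here is a minimal standalone sketch. It is not
the Xen code: the bit values, the boot_vendor variable standing in for
boot_cpu_data.vendor, and the helper names is_amd_eq()/is_amd_like_and() are
all assumptions made up for illustration, as is the choice of an AMD+Intel
build with Hygon compiled out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define X86_VENDOR_INTEL  (1u << 0)
#define X86_VENDOR_AMD    (1u << 1)
#define X86_VENDOR_HYGON  (1u << 2)

/* Assumed build: AMD and Intel compiled in, Hygon compiled out. */
#define X86_ENABLED_VENDORS (X86_VENDOR_AMD | X86_VENDOR_INTEL)

/* Stand-in for boot_cpu_data.vendor: zero (unknown) or exactly one bit. */
static uint8_t boot_vendor;

static inline unsigned int cpu_vendor(void)
{
    /* Invariant known to us, not to the compiler: at most one bit set. */
    assert((boot_vendor & (boot_vendor - 1)) == 0);

    return boot_vendor & X86_ENABLED_VENDORS;
}

/* Equality form: mask with AMD|INTEL, then compare against AMD. */
static bool is_amd_eq(void)
{
    return cpu_vendor() == X86_VENDOR_AMD;
}

/*
 * AND form: ENABLED_VENDORS & (AMD | HYGON) is just the AMD bit here, so
 * the two masks can fold into one and the result is tested against zero.
 */
static bool is_amd_like_and(void)
{
    return cpu_vendor() & (X86_VENDOR_AMD | X86_VENDOR_HYGON);
}
```

Both helpers agree for every value that respects the one-bit invariant; the
difference is only in what the compiler has to emit for each.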

I tried creating unreachable paths in cpu_vendor() to assist the VRP pass
in noticing the variable invariant, but it just doesn't. It doesn't seem to
be sufficiently aggressive in range tracking. Bummer, because VRP is a very
unobtrusive technique for DCE that could be great if we could reliably teach it
invariants we know are held by certain variables as part of their accessors.
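
For reference, the kind of hint I mean looks roughly like the sketch below.
This is an illustrative reconstruction rather than the exact code I tried: the
bit values, boot_vendor_hint and the *_hinted() names are hypothetical, and
__builtin_unreachable() is the GCC/Clang builtin.

```c
#include <stdbool.h>

/* Hypothetical bit values, for illustration only. */
#define X86_VENDOR_INTEL  (1u << 0)
#define X86_VENDOR_AMD    (1u << 1)

/* Assumed build: AMD and Intel compiled in. */
#define X86_ENABLED_VENDORS (X86_VENDOR_AMD | X86_VENDOR_INTEL)

/* Stand-in for boot_cpu_data.vendor. */
static unsigned int boot_vendor_hint;

static inline unsigned int cpu_vendor_hinted(void)
{
    unsigned int v = boot_vendor_hint & X86_ENABLED_VENDORS;

    /*
     * State the invariant "zero or a power of 2" by marking every other
     * value as unreachable. Reaching this with two bits set would be
     * undefined behaviour, so callers must uphold the invariant.
     */
    if ( v & (v - 1) )
        __builtin_unreachable();

    return v;
}

/* The hope is that this could then be reduced to a single bit test. */
static bool is_amd_hinted(void)
{
    return cpu_vendor_hinted() == X86_VENDOR_AMD;
}
```

The code is functionally fine; the point is only that the hint did not buy the
codegen improvement I was after.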

Cheers,
Alejandro



 

