Re: [PATCH 03/12] x86: Add cpu_vendor() as a wrapper for the host's CPU vendor
On Thu Feb 12, 2026 at 11:52 AM CET, Jan Beulich wrote:
> On 06.02.2026 17:15, Alejandro Vallejo wrote:
>> Introduces various optimisations that rely on constant folding, Value
>> Range Propagation (VRP), and Dead Code Elimination (DCE) to aggressively
>> eliminate code surrounding the uses of the function.
>>
>>  * For single-vendor+no-unknown-vendor builds returns a compile-time
>>    constant.
>>  * For all other cases it ANDs the result with the mask of compiled
>>    vendors, with the effect of performing DCE in switch cases, removing
>>    dead conditionals, etc.
>>
>> It's difficult to reason about codegen in general in a project this big,
>> but in this case the ANDed constant combines with the values typically
>> checked against, folding into a comparison against zero. Thus, it's better
>> for codegen to AND its result with the desired compared-against vendor,
>> rather than using (in)equality operators. That way the comparison is
>> always against zero.
>>
>>   "cpu_vendor() & (X86_VENDOR_AMD | X86_VENDOR_HYGON)"
>>
>> turns into (cpu_vendor() & X86_VENDOR_AMD) in AMD-only builds (AND +
>> cmp with zero). Whereas this...
>>
>>   "cpu_vendor() == X86_VENDOR_AMD"
>>
>> forces cpu_vendor() to be ANDed and then compared to a non-zero value.
>
> Coming back to this: How does the value compared against being zero or
> non-zero matter here? As long as cpu_vendor() yields a compile-time
> constant, the compiler should be able to leverage that for DCE? And

Yes, for true single-vendor cases it doesn't matter. It matters on
multivendor cases where some vendors are off and cpu_vendor() is not a
constant.

> even if it's not a compile time constant, bits masked off in principle
> allow the compiler to leverage that, too. It may of course be that
> even up-to-date compilers fall short of doing so.

There might be some of that, but there's also a non-avoidable codegen hurdle
unless you can tell the compiler your variable is a power of 2 or 0 in
multivendor cases.

  cpu_vendor() == X86_VENDOR_AMD,

which expands to

  (boot_cpu_data.vendor & X86_ENABLED_VENDORS) == X86_VENDOR_AMD,

which expands to

  (boot_cpu_data.vendor & (X86_VENDOR_AMD | X86_VENDOR_INTEL)) == X86_VENDOR_AMD

which produces a lot worse codegen, because now the compiler must AND the
variable with the AMD|INTEL mask, and then compare to the AMD mask, whereas
having & instead of == means the compiler can simply do a comparison with
zero and call it a day (due to the masks being folded together).

I tried creating unreachable paths in cpu_vendor() to assist the VRP pass in
noticing the variable invariant, but it just doesn't. It doesn't seem to be
sufficiently aggressive in range tracking. Bummer, because VRP is a very
unobtrusive technique for DCE that could be great if we could reliably teach
it invariants we know are held by certain variables as part of their
accessors.

Cheers,
Alejandro
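
For readers following the thread, here is a minimal standalone sketch of the two comparison forms discussed above. It is not the patch's code: the vendor bit values, the `boot_vendor` variable (standing in for boot_cpu_data.vendor), the AMD+Intel-only build, and the helper names `is_amd_eq()`, `is_amd_mask()` and `cpu_vendor_hinted()` are all assumptions made for illustration. The hinted variant is only a guess at what an "unreachable path" attempt might look like, using the GCC/Clang `__builtin_unreachable()` builtin.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical one-hot vendor bits; the real values live in Xen's headers. */
#define X86_VENDOR_INTEL (1u << 0)
#define X86_VENDOR_AMD   (1u << 1)
#define X86_VENDOR_HYGON (1u << 2)

/* Assumed build configuration: only AMD and Intel support compiled in. */
#define X86_ENABLED_VENDORS (X86_VENDOR_AMD | X86_VENDOR_INTEL)

/* Stand-in for boot_cpu_data.vendor (assumption, set at "boot"). */
static unsigned int boot_vendor;

/* Wrapper in the spirit of the patch: mask off vendors not compiled in. */
static inline unsigned int cpu_vendor(void)
{
    return boot_vendor & X86_ENABLED_VENDORS;
}

/*
 * Equality form: the value is ANDed with the AMD|INTEL mask and then
 * compared against the non-zero AMD constant.
 */
static bool is_amd_eq(void)
{
    return cpu_vendor() == X86_VENDOR_AMD;
}

/*
 * Mask form: (v & (AMD|INTEL)) & AMD folds to v & AMD, leaving a single
 * AND followed by a test against zero.
 */
static bool is_amd_mask(void)
{
    return (cpu_vendor() & X86_VENDOR_AMD) != 0;
}

/*
 * Guess at an "unreachable path" hint: declare any value with more than
 * one bit set impossible. Calling this with such a value is undefined
 * behaviour, so it is only valid while the invariant really holds.
 */
static inline unsigned int cpu_vendor_hinted(void)
{
    unsigned int v = boot_vendor & X86_ENABLED_VENDORS;

    if ( v & (v - 1) )          /* more than one bit set */
        __builtin_unreachable();

    return v;
}
```

The two predicates agree only while the vendor value is zero or a single bit. For a value such as AMD|INTEL they disagree, which is why a compiler cannot rewrite the equality form into the mask form unless it can prove that invariant. Comparing the generated assembly for both predicates with your own compiler and flags is the way to check whether the hinted variant changes anything.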