[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH v2 11/11] x86/bitops: Use the POPCNT instruction when available
On 29/08/2024 3:36 pm, Jan Beulich wrote: > On 29.08.2024 00:03, Andrew Cooper wrote: >> It has existed in x86 CPUs since 2008, so we're only 16 years late adding >> support. With all the other scafolding in place, implement arch_hweightl() >> for x86. >> >> The only complication is that the call to arch_generic_hweightl() is behind >> the compilers back. Address this by writing it in ASM and ensure that it >> preserves all registers. >> >> Copy the code generation from generic_hweightl(). It's not a complicated >> algorithm, and is easy to regenerate if needs be, but cover it with the same >> unit tests as test_generic_hweightl() just for piece of mind. >> >> Signed-off-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx> >> --- >> CC: Jan Beulich <JBeulich@xxxxxxxx> >> CC: Roger Pau Monné <roger.pau@xxxxxxxxxx> >> >> v2: >> * Fix MISRA 8.2 (parameter name) and 8.5 (single declaration) regressions. >> * Rename {arch->x86}-generic-hweightl.{S->c} >> * Adjust ASM formating >> >> The __constructor trick to cause any reference to $foo() to pull in >> test_$foo() only works when both are in the same TU. i.e. what I did in >> v1 (putting test_arch_generic_hweightl() in the regular generic-hweightl.c) >> didn't work. > I'm afraid I don't understand this. What exactly didn't work, breaking in > which > way? Presumably as much as you, I don't really like the global asm() in a C > file, when ideally the same could be written with less clutter in an assembly > one. We have lib-y because we wish to not include arch_generic_hweightl() if it isn't referenced. But, test_arch_generic_hweightl() unconditionally references arch_generic_hweightl() (in CONFIG_SELF_TESTS builds). So, the trick is to have both {test_,}arch_generic_hweightl() together in the same object file, with test_* being entirely self contained as a constructor. That way, if and only if something else references arch_generic_hweightl() do we pull in this object file, and pick up the tests as side effect. v1 of this patch had the test in a separate object file, causing arch_generic_hweightl() to be referenced based on the inclusion criteria for regular generic-hweightl.c This patch causes the x86 build of Xen to no longer reference generic_hweightl(), so it was excluded along with its own tests, and test_arch_generic_hweightl(). Therefore, we had arch_generic_hweightl() in use, but the selftests not included. Both {test_,}arch_generic_hweightl() do need to be in the same TU for this to work (nicely), and I'm very uninterested in writing test_arch_generic_hweightl() in asm. > >> This in turn means that arch_generic_hweightl() needs writing in a global asm >> block, and also that we can't use FUNC()/END(). While we could adjust it to >> work for GCC/binutils, we can't have CPP macros in Clang-IAS strings. > What does Clang different from gcc there? I was hoping that at least their > pre- > processors would work in (sufficiently) similar ways. It's inside a string, not outside, so CPP on the C file does nothing. Passing CPP over the intermediate .S would work, but we don't have an intermediate .S with Clang-IAS. I'm also not particularly interested in doubling up xen/linkage.h with different quoting in !__ASSEMBLY__ case. One other option which comes to mind is to have: .macro func name FUNC(\name) .endm which (I think) could be used as: asm ( "FUNC foo\n\t" ... "ret\n\t" "END foo" ); It reads almost the same, and avoids opencoding contents of FUNC, but even this requires changing the __ASSEMBLY__-ness of linkage.h and I can't see a nice way of making it happen. Or we finally bump our minimum toolchain version... GCC 7 is already old enough to be out of the recent LTS's... ~Andrew
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |