Re: [PATCH] bitops/32: Convert variable_ffs() and fls() zero-case handling to C
On Tue, 29 Apr 2025 at 14:24, H. Peter Anvin <hpa@xxxxxxxxx> wrote:
>
> Could you file a gcc bug? Gcc shouldn't generate worse code than inline asm
> ...

Honestly, I've given up on that idea.

That's basically what the bug report I reported about clts was - gcc
generating worse code than inline asm.

But why would we use gcc builtins at all if we have inline asm that is
better than what some versions of gcc generate? The inline asm works
for all versions.

Anyway, attached is a test file that seems to generate basically
"optimal" code. It's not a kernel patch, but a standalone C file for
testing with a couple of stupid test-cases to make sure that it gets
the constant argument case right, and that it optimizes the "I know
it's not zero" case.

That is why it also uses that "#if __SIZEOF_LONG__ == 4" instead of
something like CONFIG_64BIT.

So it's a proof-of-concept thing, and it does seem to be fairly simple.

The "important" parts are just the

    #define variable_ffs(x) \
        (statically_true((x)!=0) ? __ffs(x)+1 : do_variable_ffs(x))
    #define constant_ffs(x) __builtin_ffs(x)
    #define ffs(x) \
        (__builtin_constant_p(x) ? constant_ffs(x) : variable_ffs(x))

lines which end up picking the right choice. The rest is either the
"reimplement the kernel infrastructure for testing" or the trivial
do_variable_ffs() thing.

I just did

    gcc -m32 -O2 -S -fomit-frame-pointer t.c

(with, and without that -m32) and looked at the result to see that it
looks sane. No *actual* testing.

            Linus

Attachment:
t.c
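
The attachment is not reproduced in this archive. As a rough illustration of the three-way dispatch described in the message (constant argument, argument known to be nonzero, everything else), a minimal sketch might look like the following. This is not the attached t.c: every helper body here is an assumption, the names carry a hypothetical `t_` prefix so they do not clash with libc's ffs(), and portable gcc builtins stand in for the inline asm that the thread is actually about.

```c
#include <assert.h>

/*
 * Assumption: "statically true" means the compiler can prove the
 * condition at compile time. This mirrors the idea named in the
 * message, not necessarily the definition used in t.c.
 */
#define t_statically_true(x) (__builtin_constant_p(x) && (x))

/*
 * Hypothetical stand-in for __ffs(): 0-based index of the lowest set
 * bit. The result is undefined for x == 0, so callers must rule that out.
 */
static inline unsigned int t_ffs_nonzero(unsigned int x)
{
	return (unsigned int)__builtin_ctz(x);
}

/*
 * Hypothetical stand-in for do_variable_ffs(): the zero case is
 * handled in plain C rather than in inline asm.
 */
static inline int t_do_variable_ffs(unsigned int x)
{
	return x ? (int)t_ffs_nonzero(x) + 1 : 0;
}

/* Same shape as the three macros quoted in the message. */
#define t_variable_ffs(x) \
	(t_statically_true((x) != 0) ? (int)t_ffs_nonzero(x) + 1 : t_do_variable_ffs(x))
#define t_constant_ffs(x) __builtin_ffs(x)
#define t_ffs(x) \
	(__builtin_constant_p(x) ? t_constant_ffs(x) : t_variable_ffs(x))

/* A few sanity checks; returns 0 if every assertion holds. */
static int t_ffs_selfcheck(void)
{
	/* volatile keeps the compiler from folding these into constants */
	volatile unsigned int vzero = 0, veight = 8, vtop = 0x80000000u;
	unsigned int zero = vzero, eight = veight, top = vtop;

	/* constant arguments take the __builtin_ffs() path */
	assert(t_ffs(0) == 0);
	assert(t_ffs(1) == 1);
	assert(t_ffs(0x10) == 5);

	/* runtime arguments, including the zero case */
	assert(t_ffs(zero) == 0);
	assert(t_ffs(eight) == 4);
	assert(t_ffs(top) == 32);

	/* the "I know it's not zero" case: the guard lets the compiler drop the zero test */
	if (eight != 0)
		assert(t_ffs(eight) == 4);

	return 0;
}
```

Whether the nonzero shortcut actually fires depends on the optimization level and compiler version; without optimization the macro simply falls back to t_do_variable_ffs(), which gives the same result.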