[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 1/4] bitops: speed up hweight<N>()



>>> On 31.05.19 at 21:40, <andrew.cooper3@xxxxxxxxxx> wrote:
> On 31/05/2019 02:51, Jan Beulich wrote:
>> Algorithmically this gets us in line with current Linux, where the same
>> change did happen about 13 years ago. See in particular Linux commits
>> f9b4192923 ("bitops: hweight() speedup") and 0136611c62 ("optimize
>> hweight64 for x86_64").
>>
>> Kconfig changes for actually setting HAVE_FAST_MULTIPLY will follow.
>>
>> Take the opportunity and change generic_hweight64()'s return type to
>> unsigned int.
>>
>> Suggested-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
> 
> Reviewed-by: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
> 
> The code in Linux could do with a bit of cleanup.  Do you have patches
> prepared?

Not yet, no. I'll try to do this eventually, but it doesn't have a priority
for me.

>> --- a/xen/include/xen/bitops.h
>> +++ b/xen/include/xen/bitops.h
>> @@ -153,41 +153,54 @@ static __inline__ int get_count_order(un
>>  
>>  static inline unsigned int generic_hweight32(unsigned int w)
>>  {
>> -    unsigned int res = (w & 0x55555555) + ((w >> 1) & 0x55555555);
>> -    res = (res & 0x33333333) + ((res >> 2) & 0x33333333);
>> -    res = (res & 0x0F0F0F0F) + ((res >> 4) & 0x0F0F0F0F);
>> -    res = (res & 0x00FF00FF) + ((res >> 8) & 0x00FF00FF);
>> -    return (res & 0x0000FFFF) + ((res >> 16) & 0x0000FFFF);
>> +    w -= (w >> 1) & 0x55555555;
>> +    w =  (w & 0x33333333) + ((w >> 2) & 0x33333333);
>> +    w =  (w + (w >> 4)) & 0x0f0f0f0f;
>> +
>> +#ifdef CONFIG_HAS_FAST_MULTIPLY
>> +    return (w * 0x01010101) >> 24;
>> +#else
>> +    w += w >> 8;
>> +
>> +    return (w + (w >> 16)) & 0xff;
>> +#endif
> 
> This would be slightly shorter, less liable to bitrot, and IMO cleaner,
> to do
> 
> if ( IS_ENABLED(CONFIG_HAS_FAST_MULTIPLY) )
>     w = w * 0x01010101) >> 24;
> else
>     w += w >> 8;
> 
> return w;
> 
> which removes the #ifdef-ary fully, and in particular, avoids having
> different return logic.

Hmm, yes, I can switch to such a model, albeit I think this will make
Coverity point out unreachable code.

Jan



_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.