
[PATCH 2/3] x86: use POPCNT for hweight<N>() when available


  • To: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Wed, 12 Jul 2023 14:34:31 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Delivery-date: Wed, 12 Jul 2023 12:34:48 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

This is faster than using the software implementation, and the insn is
available on all half-way recent hardware. Use the respective compiler
builtins when available.

Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
---
v3: Re-do in a much simpler manner, without alternatives patching.
v2: Also suppress UB sanitizer instrumentation. Reduce macroization in
    hweight.c. Exclude clang builds.

--- a/xen/arch/x86/include/asm/bitops.h
+++ b/xen/arch/x86/include/asm/bitops.h
@@ -475,9 +475,16 @@ static inline int fls(unsigned int x)
  *
  * The Hamming Weight of a number is the total number of bits set in it.
  */
+#ifdef __POPCNT__
+#define hweight64(x) __builtin_popcountll(x)
+#define hweight32(x) __builtin_popcount(x)
+#define hweight16(x) __builtin_popcount((uint16_t)(x))
+#define hweight8(x)  __builtin_popcount((uint8_t)(x))
+#else
 #define hweight64(x) generic_hweight64(x)
 #define hweight32(x) generic_hweight32(x)
 #define hweight16(x) generic_hweight16(x)
 #define hweight8(x) generic_hweight8(x)
+#endif
 
 #endif /* _X86_BITOPS_H */
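
For readers unfamiliar with the pattern, here is a minimal standalone
sketch of the same compile-time selection. It assumes GCC or Clang, both
of which predefine __POPCNT__ when POPCNT is enabled for the target (for
example via -mpopcnt). All sketch_* names are hypothetical, and the
fallback is the usual parallel bit-summing routine written for
illustration, not Xen's generic_hweight32().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical software fallback: classic parallel bit summing. */
static inline unsigned int sketch_generic_hweight32(uint32_t w)
{
    w -= (w >> 1) & 0x55555555u;                      /* 2-bit sums */
    w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u); /* 4-bit sums */
    w = (w + (w >> 4)) & 0x0f0f0f0fu;                 /* 8-bit sums */
    return (w * 0x01010101u) >> 24;                   /* add the four bytes */
}

#ifdef __POPCNT__
/* POPCNT enabled at build time: let the compiler emit the instruction. */
#define sketch_hweight32(x) ((unsigned int)__builtin_popcount(x))
#define sketch_hweight8(x)  ((unsigned int)__builtin_popcount((uint8_t)(x)))
#else
/* Otherwise use the software routine. */
#define sketch_hweight32(x) sketch_generic_hweight32(x)
#define sketch_hweight8(x)  sketch_generic_hweight32((uint8_t)(x))
#endif

/* Small self-check; the narrowing cast discards bits above the low byte. */
static inline void sketch_selfcheck(void)
{
    assert(sketch_hweight32(0x80000001u) == 2);
    assert(sketch_hweight8(0x1ff) == 8);
}
```

The narrowing casts mirror the ones in hweight16() and hweight8() above:
the builtin operates on a full unsigned int, so any set bits beyond the
intended width would otherwise be counted as well.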




 

