
Re: [PATCH v2] x86: use POPCNT for hweight<N>() when available


  • To: Jan Beulich <JBeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Fri, 17 Mar 2023 12:22:02 +0100
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>
  • Delivery-date: Fri, 17 Mar 2023 11:22:27 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Mon, Jul 15, 2019 at 02:39:04PM +0000, Jan Beulich wrote:
> This is faster than using the software implementation, and the insn is
> available on all half-way recent hardware. Therefore convert
> generic_hweight<N>() to out-of-line functions (without affecting Arm)
> and use alternatives patching to replace the function calls.
> 
> Note that the approach doesn't work for clang, due to it not recognizing
> -ffixed-*.

I've been giving this a look, and I wonder if it would be fine to
simply push and pop the scratch registers in the 'call' path of the
alternative, as that won't require any specific compiler option.

Thanks, Roger.



 

