Re: [PATCH for-4.21 v2] x86/AMD: avoid REP MOVSB for Zen3/4
- To: Jan Beulich <jbeulich@xxxxxxxx>
- From: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>
- Date: Wed, 7 Jan 2026 10:42:18 +0000
- Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Andrew Cooper <andrew.cooper@xxxxxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, Oleksii Kurochko <oleksii.kurochko@xxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
- Delivery-date: Wed, 07 Jan 2026 10:42:36 +0000
- List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
On 07/01/2026 7:22 am, Jan Beulich wrote:
> On 06.01.2026 22:07, Andrew Cooper wrote:
>> On 13/10/2025 2:06 pm, Jan Beulich wrote:
>>> Along with Zen2 (which doesn't expose ERMS), both families reportedly
>>> suffer from sub-optimal aliasing detection when deciding whether REP MOVSB
>>> can actually be carried out the accelerated way. Therefore we want to
>>> avoid its use in the common case of memcpy(); copy_page_hot() is fine, as
>>> its two pointers always have the same low 5 bits.
>> I think this could be a bit clearer. How about this:
>>
>> ---8<---
>> Zen2 (which doesn't expose ERMS) through Zen4 have sub-optimal aliasing
>> detection for REP MOVS, and fall back to a unit-at-a-time loop when the
>> two pointers have differing bottom 5 bits. While both forms are
>> affected, this makes REP MOVSB 8 times slower than REP MOVSQ.
>>
>> memcpy() has a high likelihood of encountering this slowpath, so avoid
>> using REP MOVSB. This undoes the ERMS optimisation added in commit
>> d6397bd0e11c which turns out to be an anti-optimisation on these
>> microarchitectures.
>>
>> However, retain the use of ERMS-based REP MOVSB in other cases such as
>> copy_page_hot() where the parameter alignment is known to avoid the
>> slowpath.
>> ---8<---
>>
>> ?
> Fine with me; changed. Do I take this as an okay-to-commit?
Yeah - with something to this effect, Reviewed-by: Andrew Cooper
<andrew.cooper3@xxxxxxxxxx>
Sorry it took so long.
~Andrew
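
For illustration only, the condition described in the proposed commit message can be sketched in C as below. This is not code from the patch: low5_match(), slowpath_iterations() and ALIAS_MASK are hypothetical names, and the 0x1f mask simply encodes the "bottom 5 bits" statement made in the thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption taken from the thread: the fast path needs matching low 5 bits. */
#define ALIAS_MASK ((uintptr_t)0x1f)

/* Hypothetical helper: true when dst and src agree in their low 5 bits. */
static bool low5_match(const void *dst, const void *src)
{
    return (((uintptr_t)dst ^ (uintptr_t)src) & ALIAS_MASK) == 0;
}

/*
 * Hypothetical helper: iterations a unit-at-a-time loop needs to copy
 * len bytes, rounding up (unit is 1 for MOVSB, 8 for MOVSQ).
 */
static size_t slowpath_iterations(size_t len, size_t unit)
{
    assert(unit != 0);
    return (len + unit - 1) / unit;
}
```

Two page-aligned pointers, as in copy_page_hot(), both have zero in their low 5 bits and so always satisfy low5_match(). On the slow path, a 4096-byte copy would take 4096 iterations with a 1-byte unit against 512 with an 8-byte unit, which is where the factor of 8 comes from.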