
Re: [PATCH 1/2][4.17] x86emul: further correct 64-bit mode zero count repeated string insn handling


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Tue, 11 Oct 2022 12:32:45 +0200
  • Cc: Wei Liu <wl@xxxxxxx>, Roger Pau Monne <roger.pau@xxxxxxxxxx>, Henry Wang <Henry.Wang@xxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 11 Oct 2022 10:32:51 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 10.10.2022 20:56, Andrew Cooper wrote:
> On 06/10/2022 14:11, Jan Beulich wrote:
>> In an entirely different context I came across Linux commit 428e3d08574b
>> ("KVM: x86: Fix zero iterations REP-string"), which points out that
>> we're still doing things wrong: For one, there's no zero-extension at
>> all on AMD. And then while RCX is zero-extended from 32 bits uniformly
>> for all string instructions on newer hardware, RSI/RDI are only for MOVS
>> and STOS on the systems I have access to. (On an old family 0xf system
>> I've further found that for REP LODS even RCX is not zero-extended.)
>>
>> Fixes: 79e996a89f69 ("x86emul: correct 64-bit mode repeated string insn handling with zero count")
>> Signed-off-by: Jan Beulich <jbeulich@xxxxxxxx>
>> ---
>> Partly RFC for none of this being documented anywhere (and it partly
>> being model specific); inquiry pending.
> 
> None of this surprises me.  The rep instructions have always been
> microcoded, and 0 reps is a special case which has been largely ignored
> until recently.
> 
> I wouldn't be surprised if the behaviour changes with
> MISC_ENABLE.FAST_STRINGS (given the KVM commit message) and I also
> wouldn't be surprised if it's different between Core and Atom too (given
> the Fam 0xf observation).
> 
> It's almost worth executing a zero-length rep stub, except that may
> potentially go very wrong in certain ecx/rcx cases.
> 
> I'm not sure how important these cases are to cover.  Given that they do
> differ between vendors and generation, and that their use in compiled
> code is not going to consider the registers live after use, is the
> complexity really worth it?

By "complexity", what do you mean? The patch doesn't add new complexity;
it only converts "true" to "false" in several places and updates a
comment. I don't think we can legitimately simplify things (by removing
logic), so the only thing I can think of is your suggestion of
executing a zero-length REP stub (which you say may be problematic in
certain cases). Patch 2 makes clear why this wouldn't be a good idea
for INS and OUTS. It also cannot possibly be got right when emulating
16-bit code (without switching to a 16-bit code segment), and it's
uncertain whether a 32-bit address size override would actually yield
the same behavior as a native address size operation in 32-bit code.
Of course, if this were limited (the way we currently do) to just 32-bit
addressing in 64-bit mode, then the stub ought to be representative (with
the INS/OUTS caveat remaining), but - as you say - it would add complexity
for likely little gain.

Jan
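
For illustration only, the zero-iteration behavior discussed in this
thread (32-bit address size in 64-bit mode, count of zero in ECX) might
be modeled roughly as below. This is a sketch, not the emulator's code:
the names `zero_count_rep`, `struct regs`, `enum vendor` and
`enum string_op` are hypothetical, the observations are undocumented and
model specific, and extending only RDI for STOS is an assumption based on
STOS using only RDI. The family 0xf LODS exception mentioned above is
not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical types for this sketch; not taken from the Xen sources. */
enum vendor { VENDOR_INTEL, VENDOR_AMD };
enum string_op { OP_MOVS, OP_STOS, OP_LODS, OP_CMPS, OP_SCAS, OP_INS, OP_OUTS };

struct regs {
    uint64_t rcx, rsi, rdi;
};

/*
 * Model a REP-prefixed string instruction executed in 64-bit mode with a
 * 32-bit address size. Returns true if the instruction performs zero
 * iterations (ECX == 0), after applying the register side effects that
 * were reported in the thread.
 */
static bool zero_count_rep(struct regs *r, enum vendor v, enum string_op op)
{
    /* With a 32-bit address size only the low half of RCX is the count. */
    if ((uint32_t)r->rcx != 0)
        return false;

    /* Reported for AMD: no zero-extension of any register at all. */
    if (v == VENDOR_AMD)
        return true;

    /* Reported for newer Intel hardware: RCX is always zero-extended. */
    r->rcx = (uint32_t)r->rcx;

    /* RSI/RDI were seen to be zero-extended only for MOVS and STOS. */
    if (op == OP_MOVS) {
        r->rsi = (uint32_t)r->rsi;
        r->rdi = (uint32_t)r->rdi;
    } else if (op == OP_STOS) {
        /* Assumption: STOS only touches RDI, so only RDI is extended. */
        r->rdi = (uint32_t)r->rdi;
    }

    return true;
}
```

A caller would typically skip all memory accesses when the function
returns true and only commit the updated register values.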



 

