Xen project Mailing List

Re: [PATCH v2 0/8] x86emul: a few small steps towards disintegration

From: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Thu, 30 Mar 2023 11:54:05 +0200

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Thu, 30 Mar 2023 09:55:03 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Thu, Mar 30, 2023 at 09:53:23AM +0200, Jan Beulich wrote:
> On 29.03.2023 16:17, Roger Pau Monné wrote:
> > On Tue, Mar 28, 2023 at 04:48:10PM +0200, Jan Beulich wrote:
> >> On 28.03.2023 16:19, Roger Pau Monné wrote:
> >>> On Wed, Jun 15, 2022 at 11:57:54AM +0200, Jan Beulich wrote:
> >>>> ... of the huge monolithic source file. The series is largely code
> >>>> movement and hence has the intention of not incurring any functional
> >>>> change.
> >>>
> >>> I take it the intention is to make code simpler and easier to follow by
> >>> splitting it up into smaller files?
> >>
> >> Well, I can't say yes or no to "simpler" or "easier to follow", but
> >> splitting is the goal, in the hope that these may end up as side
> >> effects. There's always the risk that scattering things around may
> >> also make things less obvious. My main motivation, however, is the
> >> observation that this huge source file alone consumes a fair part
> >> of (non-parallelizable) build time. To the degree that with older
> >> gcc building this one file takes ten (or so) times as long as the
> >
> > I wouldn't really trade compiler speed for clarity in a piece of code
> > like the x86 emulator, which is already very complex.
>
> Of course, and I specifically said "main" motivation. The hope is that
> by splitting things become less entangled / convoluted. I guess fpu.c
> is a good example where certain non-trivial macros have isolated use,
> and hence are no longer cluttering other parts of the emulator sources.
>
> > Do you have some figures of the performance difference this series
> > makes when building the emulator?
>
> No, I don't. And the difference isn't going to be significant, I expect,
> as the build being slow is - from all I can tell - directly connected to
> the huge switch() statement. Yet the number of cases there shrinks only
> marginally for now. The series is named "a few small steps" for this
> reason, along with others. See below for a first bigger step, which may
> then make a noticeable difference.
>
> > A couple of notes from someone that's not familiar with the x86
> > emulator. It would be clearer if the split files were prefixed with
> > opcode_ for those that deal with emulation (blk.c, 0f01.c, ...) IMO,
> > so that you clearly see this is an opcode family that has been split
> > into a separate file, or maybe insn_{opcode,blk,fpu,...}?
>
> Hmm. For one, "blk" isn't really dealing with any opcode family in
> particular. It contains a helper function for code using the emulator.
> So it falls more in the group of util*.c. For the others my main
> objection would be that I'd prefer to keep file names short. At least
> at this point of splitting I think file names are sufficiently
> descriptive. Nevertheless, insn-0f01.c or opc-0f01.c may be options, if
> we really think we want/need to group files visually. However, I don't
> expect there are going to be more files paralleling 0f01.c et al: The
> opcode groups split out are the ones that are large/heterogeneous
> enough to warrant doing it on this basis. Of course new such groups may
> appear in the ISA down the road.
>
> FPU is isolated functionally, and I'd expect a simd.c to appear once
> it becomes clear if/how to sensibly split out SIMD code. Unlike fpu.c
> I'd further expect this to (long term) consist of more than just a
> single function, hopefully replacing the massive use of "goto" within
> that big switch statement by function calls (but as said, plans here
> - if one can call it that way in the first place - are very vague at
> this point).
>
> > I've noticed in some of the newly introduced files the original
> > copyright notice from Keir is lost, I assume that's on purpose because
> > none of the code was contributed by Keir in that file? (see 0f01.c vs
> > 0fae.c for example).
>
> Right - 0fae.c contains only code which was added later (mostly by me),
> if I'm not mistaken.

OK, just wanted to make sure this wasn't an oversight.

> > Do we expect to be able to split other opcode handling into separate
> > files?
>
> As per above, "expect" is perhaps too optimistic. I'd say "hope", in
> particular for SIMD code (which I guess is now the main part of the
> ISA as well as the sources, at least number-of-insns-wise). One
> possible (likely intermediate) approach might be to move all SIMD code
> from the huge switch() statement to its own file/function, invoked
> from that huge switch()'s default: case. It may then still be a big
> switch() statement in (e.g.) simd.c, but we'd then at least have two
> of them, each about half the size of what we have right now. Plus it
> may allow some (most?) of the X86EMUL_NO_SIMD #ifdef-ary to be dropped,
> as the new file - like fpu.c - could then itself be built only
> conditionally.

I don't like the handling of SIMD from a default case in the global
switch much, as we then could end up chaining all kinds of random
handling in the default case. It's IMO clearer if we can detect and
forward insns to the SIMD code when we know it's a SIMD instruction.

I guess that's for another series anyway, so not really the point
here.

> > I wonder how tied together are the remaining cases, and whether we
> > will be able to split more of them into separate units?
>
> That's the big open question indeed. As per above - with some effort
> I expect all SIMD code could collectively be moved out; further
> splitting would likely end up more involved.
>
> > Overall I don't think the disintegration makes the code harder, as the split
> > cases seem to be fully isolated, I do however wonder whether further splits
> > might not be so isolated?
>
> And again - indeed. This series, while already a lot of code churn, is
> only collecting some of the very low hanging fruit. But at least I
> hope that the pieces now separated out won't need a lot of touching
> later on, except of course to add support for new insns.

OK, I'm a bit concerned that we end up growing duplicated switch
cases, in the sense that we will have the following:

    switch ( insn )
    {
    case 0x100:
    case 0x1f0:
    case 0x200:
        x86emul_foo();
        ...
    }

    x86emul_foo()
    {
        switch ( insn )
        {
        case 0x100:
            /* Handle. */
            break;

        case 0x1f0:
            /* Handle. */
            break;

        case 0x200:
            ...
        }
    }

So that we would end up having to add the opcodes twice, once in the
generic switch, and then again at the place where the insns are
actually handled.

So far the introduced splits seem fine in that they deal with a
contiguous range of opcodes.

For patches 1-7:

Acked-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Patch 8 I'm unsure, I guess it should be up to the user to decide
whether to use -Os or some other optimization? I expect introspection
users likely care way more about the speed rather than the size of the
generated x86 emulator code?

Thanks, Roger.
