Xen project Mailing List

Re: [PATCH v2 0/8] x86emul: a few small steps towards disintegration

To: Roger Pau Monné <roger.pau@xxxxxxxxxx>

Date: Thu, 30 Mar 2023 09:53:23 +0200

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=suse.com; dmarc=pass action=none header.from=suse.com; dkim=pass header.d=suse.com; arc=none

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=FmyaIWbGd1F1rkkKU61F9iEiMyNoCr2Iem4Xj6WltIQ=; b=A52+QWnQUlUrjjWfBzPZ7PwTYmUDnQ/LeAMEqX6ZjC2L2jPIaJR18BcCEGyYgo/K3mzxIXwh+6KvvYRAPygzcNq/rvTYXy5IgXisf+xqNoqcakg76JL23yi2xgsQKy1VNBHenuBzM0cc6Iilv4qeu8O7HVT7+Hnv/vyM+YiNv6xkl4AWGpQWIYn7vOQ2EjVRld3sSwXwSzOccYqDB8dt1sq1xfTHtt6iKZF0qgvXnWK9MIGkKfP+4lwZ5wwwPVNUNV877t4ZKHeByt916yylGqXRA5MFwG/OiMMX8u3WoJMchPBLiu2ZNCzP5xB1wHD6UJvVxD0HjlCeP3eRLRYlIg==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=YAtEcToWzKr9JuDmD075lsgn70h7pQsmIbt0tNOCyoLesJ+A+pnutwe2ebYmorxOWw7UfBBc14CgQO+4Gf1rdKSotKNbbQadcM9++/+MGWqs++Ap+9mO/LS3V9fYRtNRGf8iLXiCr7wdvr3uyraMJ1NwKVNAD/Ny6garjDwg3Ydl1jKtxuqAlrCW5GS3NGVYBCMNkT1vQFoXV0CAy6mChPNFSypgqVOYvsEUEAaKEGI4pLyMe1VzCp2g0NWIwHliSJolLnYIWyT5PUA7IvSb6QGwEh4ex9mchH1GGH3qkUWIGRT6hjn5LejwPkQZoG7nLZNr7NVPcFrpanwqyBRi8A==

Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=suse.com;

Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>

Delivery-date: Thu, 30 Mar 2023 07:53:46 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 29.03.2023 16:17, Roger Pau Monné wrote: > On Tue, Mar 28, 2023 at 04:48:10PM +0200, Jan Beulich wrote: >> On 28.03.2023 16:19, Roger Pau Monné wrote: >>> On Wed, Jun 15, 2022 at 11:57:54AM +0200, Jan Beulich wrote: >>>> ... of the huge monolithic source file. The series is largely code >>>> movement and hence has the intention of not incurring any functional >>>> change. >>> >>> I take the intention is to make code simpler and easier to follow by >>> splitting it up into smaller files? >> >> Well, I can't say yes or no to "simpler" or "easier to follow", but >> splitting is the goal, in the hope that these may end up as a side >> effects. There's always the risk that scattering things around may >> also make things less obvious. My main motivation, however, is the >> observation that this huge source file alone consumes a fair part >> of (non-parallelizable) build time. To the degree that with older >> gcc building this one file takes ten (or so) times as long as the > > I wouldn't really trade compiler speed for clarity in a piece of code > like the x86 emulator, which is already very complex. Of course, and I specifically said "main" motivation. The hope is that by splitting things become less entangled / convoluted. I guess fpu.c is a good example where certain non-trivial macros have isolated use, and hence are no longer cluttering other parts of the emulator sources. > Do you have some figures of the performance difference this series > makes when building the emulator? No, I don't. And the difference isn't going to be significant, I expect, as the build being slow is - from all I can tell - directly connected to the huge switch() statement. Yet the number of cases there shrinks only marginally for now. The series is named "a few small steps" for this reason, along with others. See below for a first bigger step, which may then make a noticeable difference. > A couple of notes from someone that's not familiar with the x86 > emulator. It would be clearer if the split files where prefixed with > opcode_ for those that deal with emulation (blk.c, 0f01.c, ...) IMO, > so that you clearly see this is an opcode family that has been split > into a separate file, or maybe insn_{opcode,blk,fpu,...}? Hmm. For one, "blk" isn't really dealing with any opcode family in particular. It contains a helper function for code using the emulator. So it falls more in the group of util*.c. For the others may main objection would be that I'd prefer to keep file names short. At least at this point of splitting I think file names are sufficiently descriptive. Nevertheless, insn-0f01.c or opc-0f01.c may be options, if we really think we want/need to group files visually. However, I don't expect there are going to be more files paralleling 0f01.c et al: The opcode groups split out are the ones that are large/heterogeneous enough to warrant doing it on this basis. Of course new such groups may appear in the ISA down the road. FPU is isolated functionally, and I'd expect a simd.c to appear once it becomes clear if/how to sensibly split out SIMD code. Unlike fpu.c I'd further expect this to (long term) consist of more than just a single function, hopefully replacing the massive use of "goto" within that big switch statement by function calls (but as said, plans here - if one can call it that way in the first place - are very vague at this point). > I've noticed in some of the newly introduced files the original > copyright notice from Keir is lost, I assume that's on purpose because > none of the code was contributed by Kier in that file? (see 0f01.c vs > 0fae.c for example). Right - 0fae.c contains only code which was added later (mostly by me), if I'm not mistaken. > Do we expect to be able to split other opcode handling into separate > files? As per above, "expect" is perhaps too optimistic. I'd say "hope", in particular for SIMD code (which I guess is now the main part of the ISA as well as the sources, at least number-of-insns-wise). One possible (likely intermediate) approach might be to move all SIMD code from the huge switch() statement to its own file/function, invoked from that huge switch()'s default: case. It may then still be a big switch() statement in (e.g.) simd.c, but we'd then at least have two of them, each about half the size of what we have right now. Plus it may allow some (most?) of the X86EMUL_NO_SIMD #ifdef-ary to be dropped, as the new file - like fpu.c - could then itself be built only conditionally. > I wonder how tied together are the remaining cases, and whether we > will be able to split more of them into separate units? That's the big open question indeed. As per above - with some effort I expect all SIMD code could collectively be moved out; further splitting would likely end up more involved. > Overall I don't think the disintegration makes the code harder, as the split > cases seems to be fully isolated, I do however wonder whether further splits > might not be so isolated? And again - indeed. This series, while already a lot of code churn, is only collecting some of the very low hanging fruit. But at least I hope that the pieces now separated out won't need a lot of touching later on, except of course to add support for new insns. Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.