Xen project Mailing List

Re: Hyperlaunch/dom0less code sharing

To: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: Alejandro Vallejo <agarciav@xxxxxxx>

Date: Thu, 22 May 2025 14:02:21 +0200

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=apertussolutions.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=jex+Qt+qWx5kNzevt+0dG6ZIVPt4PtXsw3oH3ptE2Aw=; b=CvzP/q374AIrxNy/F1gtda0NxoPSKSALkFc1E6lHPt3tWS0MXJFPyDyepWbqfzhwa6xpnutaCMfro7tH27iSJGPqGLVUlilpJu1+b1prdCO+0qp2GevFdvNGCO7rsa9UexL6JF4iPwiUQjol7h0nXx80V/yg100Sl9SrnGjZgmFj2oFqtlWZ3Ll7y1H0Cv7ga8WtGZbiqu/LLRnY8MQjKOfYJ+a2csp1r57VPLf8Ox4k8uL8v+BNTHKeD4uU3MPqJz26Y6RwBkj88HyzN5wpm7Ivgyiiwna49YV5HNRsmFys5M5HxyuJRBmQVDsMMKrFRj07gs/By5kly8dfmSTxTg==

Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=No9qwNZEfinG8VZTFYbVUqHk/fE1OEgqu8bNc/y5tuxD0ojCoc9zMrBRqeQt/1PUMFhJA6/8m7xT7pXQE3wTQgKnPqIaUDkiz4LIPlNro6z0yKFAeLhyRNZ3dkvp31U4N1wvDjdlefxYIAWZwH/Nk9YRAca0ZsjXpoB8PUkCCuSgztgq0mR/g2uYBNfN1+gyvx5lbfv/Hj0NAgoderZ9qvAYn4ovvCWQwAJnMwNIP++GEsnRhRavxJX7uuIg53AmuAcxEr7yvECTqooSTmwYaWpMY1Cpg0D4DuhAaPjuUCJWhGARTRdmzmrk0RUsoLKWs8kQiXQvJmXoE+C/oUEbIg==

Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "Xenia Ragiadakou" <xenia.ragiadakou@xxxxxxx>

Delivery-date: Thu, 22 May 2025 12:02:35 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi, On Wed May 21, 2025 at 5:31 PM CEST, Daniel P. Smith wrote: > On 5/21/25 10:35, Alejandro Vallejo wrote: >> Hi, >> >> (There's a TL;DR at the end) >> >> While working on preparing and reworking the hyperlaunch series for >> upstreaming it's slowly becoming apparent the degree of duplication with >> dom0less. > > Yes, this was by design so that when we got to the point of convergence > it would be apparent where the commonalities are and thus be collapsed. I'd say they are already fairly apparent. So long as we're all in agreement I'd like to collapse them sooner when the differences are small rather than later when it'll be a bigger PITA. I'll prepare and send what I have so far. It'll all be code motion and non-functional changes to remove struct duplication. > >> Oleksii's latest effort to move all that code into common[*] (see >> ad03faa942b9("xen/common: dom0less: make some...) makes this even clearer. >> There are 1:1 relationships for every key data strucutre, and by deviating >> we're overcomplicating the ability to share code. >> >> [*] >> https://lore.kernel.org/xen-devel/cover.1746468003.git.oleksii.kurochko@xxxxxxxxx/ >> >> dom0less Hyperlaunch >> ------------------------------ >> kernel_info boot_domain >> bootmodule boot_module >> bootmodule_kind bootmod_type >> membanks memmap >> bootinfo boot_info > > If you go back, you will see less of these differences. A lot of the > variations have been the results of reviewer's request. And that goes to > the fact that dom0less was developed with Arm mentality in terminology, > eg. memory banks verse memory map. bootinfo and banks/maps are more complicated to merge (I do want to merge them, though, but one elephant at a time). All the others can be merged without much hassle with minor code adjustments. > >> The difficulty in code sharing is not just unfortunate from a vague sense of >> elegance or neatness. Having different code paths parsing the same DT >> bindings >> means it's a matter of time before x86 and all other Xen ports have different >> bindings, which would would only worsen the situation wrt code sharing and >> user >> confusion. > > Only if we allowed it, Granted. But mistakes happen, and if detected late enough you end up with aberrations like x86's compat layer for hypercall parsing. I know I don't want to see that kind of thing ever again if I can help it. > but with that said there are subtleties between the arch that will > require variation. Such as mode, which is primarily an x86 specific. Yes. Common code can't mean exclusively common. Fortunately, dom0less in common gained provisions for per-ach bindings (see arch_create_domUs() in staging). With those hunks extracted (keeping the arch hook in them), x86 could use the same, keeping private bindings where appropriate. > >> I've been trying to get _somewhere_ in using parts of dom0less on x86 >> and develop a credible PoC that highlights the use of dom0less parsing >> code paths. The results are interesting. >> >> While I didn't get to a full integration in the hyperlaunch series as I hoped >> due to time constraints I did get far enough to: >> >> 1. Replace boot_module, boot_domain and bootmod_type with their dom0less >> counterparts (pending some cleanup). >> 2. Isolate binding parsing code in common/device-tree so it's dettached >> from >> the not-so-common dom0less domain building logic in dom0less-build.c >> 3. Do early module kind detection from x86 using code in (2). >> 4. Do late binding parsing also using code in (2) after unflattening the >> DTB. >> >> And it works well enough under QEMU to populate boot_info to a first >> approximation. It's missing hyperlaunch-specific bindings (like "domid" >> or "mode"), but that's just a matter of adding them to the already >> existing per-arch binding parser ("mode", maybe, would be a candidate >> for this) or retrofit them in dom0less ("domid" is a very clear >> candidate for this) and integrating in the larger series to be able to >> actually boot domains. >> >> The PoC does not go all the way due to time constraints, as I said, but >> I think it goes far enough to convince me it's both feasible and >> beneficial to go in this direction rather than that of a full >> reimplementation on x86, specially seeing how Oleksii already aims to >> have full code reuse on riscv. >> >> A brave new world would reuse all data (including bootinfo) and code >> (bootfdt.c/bootinfo.c), but I couldn't get that far, and I don't think >> I'll be able to in a timely manner. IOW: I compiled-in device-tree.c, >> but NOT bootfdt.c or bootinfo.c, or any of the others. I merely >> extracted from those files the binding parsing bits of interest. >> >> It'd be nice to use them, but the domain construction logic is just too >> different at present. As for the code I tested with, I need to do some >> serious >> cleanup before it's ready for sharing, and even moreso for review, but >> before I >> go through that I'd like to reach consensus on the following points before >> investing any more of my time working on the side. >> >> TL;DR: >> >> I think we should aim to share binding code wherever possible, using common >> datastructures (kernel_info and bootmodule) as dumping ground for the results >> of the binding parsing functions. I seek agreement on the following 3 points >> for the end goal of DTB multidomain boots on x86 before I start slicing >> my hacks into reasonable chunks. >> >> 1. We want common data structures, with arch-specific fields to hold >> information from xen,domain DT nodes >> 2. We want to have a single collection of DTB parsers in the code. >> 3. We want to operate on the unflattened DTB for the majority of parsing. >> (plus a minimal version on the FDT in order to bootstrap, also common) > > As for 3, I can repost my original analysis that went to the working > group on why using this is not the best idea. Doing 3 would require > doing at least two, if not three passes on the DTB for x86 with zero > benefit/need since, unlike Arm, we never have to read from it after > boot. The way the x86 parser is today, we are walking the DTB only once. Note that some libfdt wrappers are deceptive and have hideous time complexities. Doing one pass might not mean a monotonically increasing cursor. Regardless, DTBs are always quite small (particularly so in x86), so the time complexity of the parsing algorithm is of little consequence. We could do 512 passes and the time it would take would still be negligible. > > What I would suggest for 2 is that, perhaps, it might be an opportune > time to review the existing DTB parsing code. Providing a common > interface to parse standard/spec DTB structures regardless if it is > coming from an FTB or via the FTB index, aka "unflattened" DTB. Doing so > would enable a complete reuse within the DTB parsers and remove then > need for 3. The biggest reason for using the DTB unflattened is (imo) having arm and x86 parsing the same data structure. The unflattened tree is not merely an FDT with pointers. There's logic to merge nodes with the same names, remove duplicates, etc. A "quirky" DTB (e.g: with overlapping nodes), would thus behave differently on arm and x86 unless they both use the FDT or the unflattened DT. Furthermore, parsing strictly in the order you find properties in creates a danger to synthesise different systems when fed DTBs that have properties in different orders. imagebuilder could be instructed to reduce the danger of this by having a canonical order of properties, but I really dislike having room for diverging interpretations of identical source data. This isn't necessarily an issue with FDT parsing per se, but it is with the way it's been done in the existing series, whereas the code in dom0less looks up properties in a specific order. > >> (2) and (3) are tightly related. There's many reasons for wanting to use the >> unflattened blobs as much as possible. They range from quirks in specific >> "dtc" >> versions, to complexities parsing phandles, to corner cases involving >> duplicate >> properties (i.e: due to .dtsi files), etc. Unflattening an FDT brings a >> lots of "maybe-ok-after-sanitising" FDTs back into canonically correct DTs. >> >> I'll share the PoC code as soon as as it's in a presentable state. >> Hopefully by the end of the week. But I'm sending this ahead of time to >> start collecting thoughts right away. >> >> So. Thoughts? > > > v/r, > dps All this said, the analysis (thanks for sharing) seems still relevant and also seems like a reasonable goal for all architectures, but having everyone parsing the DTB in the exact same way leads to far better predictability. Once all the code is in common and in use, it'll be a lot easier to simply modify arm to use the FDT on the grounds of keeping arch-symmetry and the drop of code it'd mean for x86. For the time being, however, I'm more interested in getting staging to a working state where it can boot multiple domains off a DTB, without endangering arch-symmetry. Then the rest is refinement and optimisation. Cheers, Alejandro

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.