[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Hyperlaunch/dom0less code sharing


  • To: "Daniel P. Smith" <dpsmith@xxxxxxxxxxxxxxxxxxxx>, <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • From: Alejandro Vallejo <agarciav@xxxxxxx>
  • Date: Thu, 22 May 2025 14:02:21 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=apertussolutions.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none (0)
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector10001; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=jex+Qt+qWx5kNzevt+0dG6ZIVPt4PtXsw3oH3ptE2Aw=; b=CvzP/q374AIrxNy/F1gtda0NxoPSKSALkFc1E6lHPt3tWS0MXJFPyDyepWbqfzhwa6xpnutaCMfro7tH27iSJGPqGLVUlilpJu1+b1prdCO+0qp2GevFdvNGCO7rsa9UexL6JF4iPwiUQjol7h0nXx80V/yg100Sl9SrnGjZgmFj2oFqtlWZ3Ll7y1H0Cv7ga8WtGZbiqu/LLRnY8MQjKOfYJ+a2csp1r57VPLf8Ox4k8uL8v+BNTHKeD4uU3MPqJz26Y6RwBkj88HyzN5wpm7Ivgyiiwna49YV5HNRsmFys5M5HxyuJRBmQVDsMMKrFRj07gs/By5kly8dfmSTxTg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector10001; d=microsoft.com; cv=none; b=No9qwNZEfinG8VZTFYbVUqHk/fE1OEgqu8bNc/y5tuxD0ojCoc9zMrBRqeQt/1PUMFhJA6/8m7xT7pXQE3wTQgKnPqIaUDkiz4LIPlNro6z0yKFAeLhyRNZ3dkvp31U4N1wvDjdlefxYIAWZwH/Nk9YRAca0ZsjXpoB8PUkCCuSgztgq0mR/g2uYBNfN1+gyvx5lbfv/Hj0NAgoderZ9qvAYn4ovvCWQwAJnMwNIP++GEsnRhRavxJX7uuIg53AmuAcxEr7yvECTqooSTmwYaWpMY1Cpg0D4DuhAaPjuUCJWhGARTRdmzmrk0RUsoLKWs8kQiXQvJmXoE+C/oUEbIg==
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Michal Orzel <michal.orzel@xxxxxxx>, Jan Beulich <jbeulich@xxxxxxxx>, Jason Andryuk <jason.andryuk@xxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, "Xenia Ragiadakou" <xenia.ragiadakou@xxxxxxx>
  • Delivery-date: Thu, 22 May 2025 12:02:35 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi,

On Wed May 21, 2025 at 5:31 PM CEST, Daniel P. Smith wrote:
> On 5/21/25 10:35, Alejandro Vallejo wrote:
>> Hi,
>> 
>> (There's a TL;DR at the end)
>> 
>> While working on preparing and reworking the hyperlaunch series for
>> upstreaming it's slowly becoming apparent the degree of duplication with
>> dom0less.
>
> Yes, this was by design so that when we got to the point of convergence 
> it would be apparent where the commonalities are and thus be collapsed.

I'd say they are already fairly apparent. So long as we're all in
agreement I'd like to collapse them sooner when the differences are
small rather than later when it'll be a bigger PITA.

I'll prepare and send what I have so far. It'll all be code motion and
non-functional changes to remove struct duplication.

>
>> Oleksii's latest effort to move all that code into common[*] (see
>> ad03faa942b9("xen/common: dom0less: make some...) makes this even clearer.
>> There are 1:1 relationships for every key data strucutre, and by deviating
>> we're overcomplicating the ability to share code.
>> 
>>    [*] 
>> https://lore.kernel.org/xen-devel/cover.1746468003.git.oleksii.kurochko@xxxxxxxxx/
>> 
>>      dom0less           Hyperlaunch
>>      ------------------------------
>>      kernel_info        boot_domain
>>      bootmodule         boot_module
>>      bootmodule_kind   bootmod_type
>>      membanks                memmap
>>      bootinfo             boot_info
>
> If you go back, you will see less of these differences. A lot of the 
> variations have been the results of reviewer's request. And that goes to 
> the fact that dom0less was developed with Arm mentality in terminology, 
> eg. memory banks verse memory map.

bootinfo and banks/maps are more complicated to merge (I do want to
merge them, though, but one elephant at a time). All the others can be
merged without much hassle with minor code adjustments.

>
>> The difficulty in code sharing is not just unfortunate from a vague sense of
>> elegance or neatness. Having different code paths parsing the same DT 
>> bindings
>> means it's a matter of time before x86 and all other Xen ports have different
>> bindings, which would would only worsen the situation wrt code sharing and 
>> user
>> confusion.
>
> Only if we allowed it,

Granted. But mistakes happen, and if detected late enough you end up
with aberrations like x86's compat layer for hypercall parsing. I know I
don't want to see that kind of thing ever again if I can help it.

> but with that said there are subtleties between the arch that will
> require variation. Such as mode, which is primarily an x86 specific.

Yes. Common code can't mean exclusively common. Fortunately, dom0less in
common gained provisions for per-ach bindings (see arch_create_domUs()
in staging). With those hunks extracted (keeping the arch hook in them),
x86 could use the same, keeping private bindings where appropriate.

>
>> I've been trying to get _somewhere_ in using parts of dom0less on x86
>> and develop a credible PoC that highlights the use of dom0less parsing
>> code paths. The results are interesting.
>> 
>> While I didn't get to a full integration in the hyperlaunch series as I hoped
>> due to time constraints I did get far enough to:
>> 
>>    1. Replace boot_module, boot_domain and bootmod_type with their dom0less
>>       counterparts (pending some cleanup).
>>    2. Isolate binding parsing code in common/device-tree so it's dettached 
>> from
>>       the not-so-common dom0less domain building logic in dom0less-build.c
>>    3. Do early module kind detection from x86 using code in (2).
>>    4. Do late binding parsing also using code in (2) after unflattening the 
>> DTB.
>> 
>> And it works well enough under QEMU to populate boot_info to a first
>> approximation. It's missing hyperlaunch-specific bindings (like "domid"
>> or "mode"), but that's just a matter of adding them to the already
>> existing per-arch binding parser ("mode", maybe, would be a candidate
>> for this) or retrofit them in dom0less ("domid" is a very clear
>> candidate for this) and integrating in the larger series to be able to
>> actually boot domains.
>> 
>> The PoC does not go all the way due to time constraints, as I said, but
>> I think it goes far enough to convince me it's both feasible and
>> beneficial to go in this direction rather than that of a full
>> reimplementation on x86, specially seeing how Oleksii already aims to
>> have full code reuse on riscv.
>> 
>> A brave new world would reuse all data (including bootinfo) and code
>> (bootfdt.c/bootinfo.c), but I couldn't get that far, and I don't think
>> I'll be able to in a timely manner. IOW: I compiled-in device-tree.c,
>> but NOT bootfdt.c or bootinfo.c, or any of the others. I merely
>> extracted from those files the binding parsing bits of interest.
>> 
>> It'd be nice to use them, but the domain construction logic is just too
>> different at present. As for the code I tested with,  I need to do some 
>> serious
>> cleanup before it's ready for sharing, and even moreso for review, but 
>> before I
>> go through that I'd like to reach consensus on the following points before
>> investing any more of my time working on the side.
>> 
>> TL;DR:
>> 
>> I think we should aim to share binding code wherever possible, using common
>> datastructures (kernel_info and bootmodule) as dumping ground for the results
>> of the binding parsing functions. I seek agreement on the following 3 points
>> for the end goal of DTB multidomain boots on x86 before I start slicing
>> my hacks into reasonable chunks.
>> 
>>    1. We want common data structures, with arch-specific fields to hold
>>       information from xen,domain DT nodes
>>    2. We want to have a single collection of DTB parsers in the code.
>>    3. We want to operate on the unflattened DTB for the majority of parsing.
>>      (plus a minimal version on the FDT in order to bootstrap, also common)
>
> As for 3, I can repost my original analysis that went to the working 
> group on why using this is not the best idea. Doing 3 would require 
> doing at least two, if not three passes on the DTB for x86 with zero 
> benefit/need since, unlike Arm, we never have to read from it after 
> boot. The way the x86 parser is today, we are walking the DTB only once.

Note that some libfdt wrappers are deceptive and have hideous time
complexities. Doing one pass might not mean a monotonically increasing
cursor. Regardless, DTBs are always quite small (particularly so in
x86), so the time complexity of the parsing algorithm is of little
consequence. We could do 512 passes and the time it would take would
still be negligible.

>
> What I would suggest for 2 is that, perhaps, it might be an opportune 
> time to review the existing DTB parsing code. Providing a common 
> interface to parse standard/spec DTB structures regardless if it is 
> coming from an FTB or via the FTB index, aka "unflattened" DTB. Doing so 
> would enable a complete reuse within the DTB parsers and remove then 
> need for 3.

The biggest reason for using the DTB unflattened is (imo) having arm and
x86 parsing the same data structure. The unflattened tree is not merely
an FDT with pointers. There's logic to merge nodes with the same names,
remove duplicates, etc. A "quirky" DTB (e.g: with overlapping nodes),
would thus behave differently on arm and x86 unless they both use the
FDT or the unflattened DT.

Furthermore, parsing strictly in the order you find properties in
creates a danger to synthesise different systems when fed DTBs that have
properties in different orders. imagebuilder could be instructed to
reduce the danger of this by having a canonical order of properties, but
I really dislike having room for diverging interpretations of identical
source data. This isn't necessarily an issue with FDT parsing per se,
but it is with the way it's been done in the existing series, whereas
the code in dom0less looks up properties in a specific order.

>
>> (2) and (3) are tightly related. There's many reasons for wanting to use the
>> unflattened blobs as much as possible. They range from quirks in specific 
>> "dtc"
>> versions, to complexities parsing phandles, to corner cases involving 
>> duplicate
>> properties (i.e: due to .dtsi files), etc. Unflattening an FDT brings a
>> lots of "maybe-ok-after-sanitising" FDTs back into canonically correct DTs.
>> 
>> I'll share the PoC code as soon as as it's in a presentable state.
>> Hopefully by the end of the week. But I'm sending this ahead of time to
>> start collecting thoughts right away.
>> 
>> So. Thoughts?
>
>
> v/r,
> dps

All this said, the analysis (thanks for sharing) seems still relevant
and also seems like a reasonable goal for all architectures, but having
everyone parsing the DTB in the exact same way leads to far better
predictability. Once all the code is in common and in use, it'll be a
lot easier to simply modify arm to use the FDT on the grounds of keeping
arch-symmetry and the drop of code it'd mean for x86.

For the time being, however, I'm more interested in getting staging to a
working state where it can boot multiple domains off a DTB, without
endangering arch-symmetry. Then the rest is refinement and optimisation.

Cheers,
Alejandro



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.