[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [RFC PATCH 00/25] Introduce xenbindgen to autogen hypercall structs



On Thu Nov 21, 2024 at 5:47 PM GMT, Anthony PERARD wrote:

Hi,

> Hi Alejandro,
>
> Nice work :-).

Cheers! And thanks for having a look.

>
> On Fri, Nov 15, 2024 at 11:51:29AM +0000, Alejandro Vallejo wrote:
> > This series is the result of my "Interfacing Rust with Xen" talk in Xen 
> > Summit.
> > It adds a hypercall ABI IDL parser and generator to the xen tree, replaces a
> > couple of existing hypercalls, creates a Rust crate with autogenerated 
> > contents
> > an creates a CI job to ensure nothing goes out of sync.
> >
> > The changes are fairly invasive because the various autogenerated items 
> > appear
> > in many places (specially the domaincreate flags). However, the changes to 
> > the
> > hypervisor are all mechanical and not functional (not intentionally so, at
> > least).
>
> I tried to build QEMU with this series applied, and the build failed. In
> this case nothing important, the "autogen" directory just need to be
> installed. But I fear the changes introduce to the API (like change
> of case for the flags) will also be done to other API that project
> outside of the xen repo use, and thus introduce unneeded breakage.

That's bizarre, I run the series in CI and it came out green.

  https://gitlab.com/xen-project/people/agvallejo/xen/-/pipelines/1543402100

And I can do `make dist` without issues locally.

There might be some flaky dependency somewhere. I admit I'm not sure how the
headers are installed for the QEMU build.

> Should the changes also introduce a compatibility with the previous API?

Jan mentioned something to that effect when I first proposed the change to
grant_opts, but at the time the why was a bit lacking in substance. I then sent
this whole thing to show the why in context.

A more compatible alternative would be to retroactively widen the single
subfield inside *_opts to occupy the whole field. Then we can suitably extend
the masking macros to u32, keep them around for compatibility (outside the
autogenerated stuff; say, in xen.h) and the API would be preserved.

Does that sound like a better approach?

That said, I was under the impression the API to be maintained was in libxl and
everything else was fair game so long as libxc et al. were suitably updated.

What do we actually promise externally?

>
> > I've split the generator in reasonably small pieces, but it's still not a 
> > small
> > tool. The Rust backend remains monolithic in a single patch until the RFC 
> > goes
> > further. It mirrors the C backend for the most part.
> > 
> > The hypercall ABI is specified in a schema of TOML. Most of it should be 
> > fairly
> > obvious as to what it does and means, with the possible exception of the 
> > "typ"
> > field. That has the look of a dictionary because that helps the 
> > deserializer to
> > automatically resolve the typ to a convenient Rust enum type (Typ). In time,
> > that will become something nicer to write, but that's fairly far in my list 
> > of
> > priorities at the moment.
>
> Instead of creating your own IDL specification, did you look for
> existing project that would do just that? That is been able to describe
> the existing ABI in IDL and use an existing project to generate C and
> Rust headers.
>
> I kind of look into this, but there's quite a few project to explore and
> I didn't really spend enough time. Also, there's probably quite a lot
> that are for client-server interfaces rather than syscall/hypercalls, or
> they impose a data format.
>

I looked a fair bit. Alas, the biggest case for this is web microservices, so
the overwhelming majority of IDL projects focus on end-to-end RPC. That is,
given pairs of functions for producers/consumers and a byte-based comms channel
(typically a socket), they create their own ABI serialising on one side and
deserialising on the other. That's not adequate here because we care about the
precise semantics of the ABI at the hypercall boundary to avoid pushing a
deserialiser in the hypervisor.

Protocol buffers, flatbuffers and Cap'n Proto all fall in this category, and
gRPC is a higher level construct using protocol buffers or flatbuffers. So all
those are off the table, and virtually all others suffer from the same sin.

A notable exception is Kaitai Struct (https://kaitai.io/), because it was
designed to represent binary formats. I really wanted to use it, but Rust is
not officially supported and the last release dates from 2022. All in all, it
doesn't sound like something alive enough for use in a serious existing
project.

>
> Next, on the file format choice, is TOML the best for describing an ABI,
> or would other existing file format make it a bit easier to read, like
> JSON or YAML? (I quite like using YAML so I have a bias toward it :-),
> and that's the format used for the CI). I don't think it mater much for
> Serde which file format is used.

Sure. I don't really care which. I can use serde to convert anything to
anything else anyway. I happened to already have something set up for TOML, so
I shamelessly reused it. But I'm happy to use something else.

I'm halfway through formalising evtchn atm (with a few addition to the
generator), but I'll try migrating the specs to YAML and JSON to see how they
look like.

I'm only frontally opposed to XML :)

>
> > After the series sysctl::readconsole and domctl::createdomain are 
> > autogenerated
> > from their formalized forms. In the course of formalizing the ABI it became
> > apparent readconsole has a different ABI in 32 and 64 bits. While benign in
> > that particular case, it's yet one more reason to formalize the ABI in a
> > language agnostic way and have it machine-checked.
> > 
> > ======== The Plan ===========
> > 
> > So, the idea of the series is to adjust 2 meaningful hypercalls to 
> > TOML-based
> > specifications (sysctl::readconsole and domctl::createdomain). The series is
> > organised in the following chunks of work
> > 
> >   1. Sanitise domctl::createdomain to remove packed subfields.
> >   2. Introduce xenbindgen (IDL parser and generator for C).
> >   3. Specify hypercalls in TOML, and replace the hand-crafted public bits.
> >   4. Introduce Rust generator for xenbindgen.
> >   5. Introduce a xen-sys crate, with the autogenerated Rust constructs.
> >   6. Introduce CI checks for Rust linters, ABI validation and autogenerated
> >      file consistency.
> > 
> > Future work involves migrating more hypercalls, in the same way patch 12 
> > does.
> > Most hypercalls should not take the amount of churn createdomain did. With 
> > the
> > foundations laid down the involved work should be simple.
> > 
> > I have considered integrating the hypercall generation in the build process.
> > That forces the Rust toolchain to be in the list of build dependencies for
> > downstreams, which might be complicated or annoying. For the time being, I
> > think checking in the autogenerated files and confirming in CI that they are
> > in-sync is (imo) more than enough.
>
> Having the generated header files been committed sound like a good idea
> for now. For better or for worth we've got a few of those already, so
> it isn't a first.

So long as CI checks for consistency (and it does here), it shouldn't be a
problem and helps a lot with review. I have noticed a few times regressions
while developing merely because it became apparent in `git status`.

>
> But the way the different pieces are spread out in the repository in
> this patch series will make it difficult for future contributor to update
> the hypercall ABI. They'll be meet with an "autogenerated file, don't
> modify" with little clue as to how actually regenerate them. For that I
> think it would be better to have the IDL description (TOML files) in
> that "xen/public/include" directory or at the very least in "xen/".

I can move the specs to <root>/xen/abi, or something like that. Having it in
the include folder might risk installing them on the targets, and while that
shouldn't matter it's better if we only ship .h files there.

Regardless of this, I should add a bit more context to the message in the
headers they reference where the spec lives and some README.

> Second, with "xenbindgen" been in in "tools/", this introduce a soft
> dependency of "xen" on "tools", which should be avoided even if the
> build system of "xen/" doesn't call into xenbindgen today. So I think it
> would be better to have xenbindgen either live in "xen/" or in a
> different directory at the root of the repo. There's already Kconfig in
> "xen/" so xenbindgen isn't going to be the first parser/generator in
> "xen/" directory.

I don't disagree, but what do I do with xen-sys then? Should I put it with the
hypervisor somewhere in <root>/xen? <root>/xen/rust/xen-sys, maybe?

Otherwise the same coupling exists between xen and tools, except in the other
direction.

>
> > ======== Patch grouping ===========
> > 
> > Patches 1 and 2 remove packed subfields to allow encoding it in the TOML 
> > specs
> > (e.g: see patch 13, replace hand-crafted altp2m_mode). It's non-functional
> > changes aiming to reach simpler representability.
> > 
> >   Patch 1.  xen/domctl: Refine grant_opts into max_grant_version
> >   Patch 2.  xen/domctl: Replace altp2m_opts with altp2m_mode
> > 
> > Patches 3 to 10 are xenbindgen (with the C generator backend only). The
> > Cargo.lock file in patch 4 is required for build reproducibility and is
> > recommended to have checked in the repo.
> > 
> >   Patch 3.  tools/xenbindgen: Introduce a Xen hypercall IDL generator
> >   Patch 4.  tools/xenbindgen: Add a TOML spec reader
> >   Patch 5.  tools/xenbindgen: Add basic plumbing for the C backend
> >   Patch 6.  tools/xenbindgen: Add xenbindgen's Cargo.lock file
> >   Patch 7.  tools/xenbindgen: Add support for structs in TOML specs
> >   Patch 8.  tools/xenbindgen: Add support for enums in TOML specs
> >   Patch 9.  tools/xenbindgen: Add support for bitmaps in TOML specs
> >   Patch 10. tools/xenbindgen: Add support for includes in the TOML specs
> > 
> > Patch 11 goes a step beyond and validates the ABI has no implicit padding 
> > and
> > that all names are unique. In the future, when we define rules for stable 
> > ABIs,
> > all of those can be checked here, at generation time.
> > 
> >   Patch 11. tools/xenbindgen: Validate ABI rules at generation time
> > 
> > Patches 12 to 19 replace current items in the C headers with autogenerated
> > versions. They should all be mechanical translations.
> > 
> >   Patch 12. xen: Replace sysctl/readconsole with autogenerated version
> >   Patch 13. xen: Replace hand-crafted altp2m_mode descriptions with
> >             autogenerated ones
> >   Patch 14. xen: Replace common bitmaps in domctl.createdomain with
> >             autogenerated versions
> >   Patch 15. xen/arm: Replace hand-crafted xen_arch_domainconfig with
> >             autogenerated one
>
> I feel like you write "hand-crafted" in those patch description as if it
> is a bad thing. Yet, you replace the hand-crafted C headers by
> hand-crafted IDL. I think a better title could be "Translate
> xen_arch_domainconfig into IDL" to avoid what I feel like is a
> pejorative term.

Far from my intent. I merely meant "non-autogenerated". But point taken.

>
> Also, would it be possible to separate changes to the existing API from
> the patch that introduce the newly generated headers? I think it would
> be much easier to review that the generated headers don't introduce
> any changes over the current one.
>
> Cheers,

Sure, I can do that. createdomainflags is particularly nasty in that sense :/

Cheers,
Alejandro



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.