[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Summary: Re: [PATCH for-4.17 v3 00/15] OCaml fixes for Xen 4.17


  • To: Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx>
  • From: Edwin Torok <edvin.torok@xxxxxxxxxx>
  • Date: Sun, 13 Nov 2022 11:12:24 +0000
  • Accept-language: en-GB, en-US
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=citrix.com; dmarc=pass action=none header.from=citrix.com; dkim=pass header.d=citrix.com; arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=P6bmGn9NFYDHljzIZ3nIV4aKlkw1EmJnaglQNUbX9L4=; b=TYzsy10JPhCG6NQxLUuO1hp79PUrho1bGC29OamQJxYwTtYw6qZYzTqDWVZM4+tkPrR32ErnfSRurnCy3Pc/bsxGCKURpqHknhc2+Fj9upNjgV1hJQ4A7Yok/L9mz5apQc3Nql18E6gqVJ0Qs8zaNett7xS9gGbDUnV0CDhJBiKgxGGPxTxGfwTMBG+flSTKl1n6Szh9J7Bfo/Z7J+nIlCnYLgN1kvK/B54QIkEk4zD63Ssrl1778WvVww/+GlnNbghWNOA5hZ5bvju5qt3eJDRzDUpZ2uiAsslnlT5jyRDaBhIHHvOq0+Fijo98AWUwWqrJM8bXjGSkyb3995FiVg==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=Vdq787LwmqBRogh5Ou1WcUW/Uztfhwu6LwXjHBGVT0y+0px9fZj+lQ52FbAp7m7Ms0W5c5nchuPno0pzf/bl8pCuCuucp8muzllRYkhggw69yOdOKhYYdQAd11hcFhgIk/MbBL1Wb0siZU9BBeX8+Su1uFaGsjHXzrBQR8htWT7/xHiL5HTr/otdKt/YByve6mkwpBskpYpOTI/iU8/OjpSMR+OiPUM90MAzLPfFRzamcKfWov5ih+WaQZaaS2tkUmgc+X4CXRAyWjVJJQGo3CPNrHNpsL1gOreMvOBfTg3Y6cpmwhxrR5lmA7mI1XJhDyVLWC5OViS1EkI9cCvUhA==
  • Authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=citrix.com;
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>, Henry Wang <Henry.Wang@xxxxxxx>, Christian Lindig <christian.lindig@xxxxxxxxxx>, David Scott <dave@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, Anthony Perard <anthony.perard@xxxxxxxxxx>
  • Delivery-date: Sun, 13 Nov 2022 11:12:42 +0000
  • Ironport-data: A9a23:zfrzgaJ7+pzjVaaEFE+R4pQlxSXFcZb7ZxGr2PjKsXjdYENS0WBVz jMYDD+FPqqPZ2Ojeowkad/n8kxU7ZaGyoRnTANlqX01Q3x08seUXt7xwmUcnc+xBpaaEB84t ZV2hv3odp1coqr0/0/1WlTZhSAgk/rOHv+kUrWs1hlZHWdMUD0mhQ9oh9k3i4tphcnRKw6Ws Jb5rta31GWNglaYCUpJrfPdwP9TlK6q4mlB5wVlPakjUGL2zBH5MrpOfcldEFOgKmVkNrbSb /rOyri/4lTY838FYj9yuu+mGqGiaue60Tmm0hK6aYD76vRxjnVaPpIAHOgdcS9qZwChxLid/ jnvWauYEm/FNoWU8AgUvoIx/ytWZcWq85efSZSzXFD6I+QrvBIAzt03ZHzaM7H09c5vIGF0q fUdDgpOdwvY3tCNmJSeGsxV05FLwMnDZOvzu1lG5BSAV7MKZM6GRK/Ho9hFwD03m8ZCW+7EY NYUYiZuaxKGZABTPlAQC9Q1m+LAanvXKmUE7g7K4/dmpTGMlGSd05C0WDbRUtGGW8RT2Fqfv GXF12/4HgsbJJqUzj/tHneE1rOUxXOiBNN6+LuQzeU3vQah60kvKQA8SmC5p8iwsXCSRIcKQ 6AT0m90xUQoz2S7Q9+4UxCmrXqsuh8HR8EWA+A88BuKyKff/0CeHGdsZh5MbsY38vA/QzMC3 0WM2djuAFRHvLSLRFqH+7yTrDf0PjIaRUcdYQcUQA1D5MPsyLzflTrKR9dnVaKw0Nv8HGiqx yjQ9XdmwbIOkcQMyqO3u0jdhC6hrYTISQhz4RjLWmWi7UVyY4vNi5GU1GU3JM1odO6xJmRtd lBd8yRCxIji1a2wqRE=
  • Ironport-hdrordr: A9a23:d9GbTa3Kdz6/wCYKrUy9YAqjBZpxeYIsimQD101hICG9Lfb0qy n+pp4mPEHP4wr5AEtQ4uxpOMG7MBDhHQYc2/hdAV7QZnidhILOFvAv0WKC+UyrJ8SazIJgPM hbAs9D4bHLbGSSyPyKmDVQcOxQj+VvkprY49s2pk0FJW4FV0gj1XYBNu/xKDwVeOAyP+tcKH Pq3Lsjm9PPQxQqR/X+IkNAc/nIptXNmp6jSRkaByQ/4A3LoSK05KX8Gx242A5bdz9U278t/U XMjgS8v8yYwrCG4y6Z81WWw4VdmdPnxNcGLMuQivINIjGpphe0aJ9nU7iiuilwhO208l4lnP TFvh9lFcVu7HH6eH2zvHLWqkfd+Qdrz0Wn5U6TgHPlr8C8bik9EdB9iYVQdQacw1Y8vflnuZ g7nF6xht5yN1ftjS7979/HW1VBjUyvu0cvluYVkjh2TZYeUrlMtoYSlXklUqvoXRiKrbzPIt MeS/0018wmN29yqEqp51WH9ebcGkjb2C32GnTq9PbliAS+10oJsnfwjPZv4kvosqhNC6Wsrt 60TJhAhfVASNQbYrl6A/pEScyrCnbVSRaJK26KJ0/7fZt3ck4kO/bMkcoIDcyRCes1JaEJ6e L8eUIdsXR3d1PlCMWI0pEO+hfRQH+lVTCozs1F/ZB2trD1WbKuaES4ORsTutrlp+9aDtzQWv 61Np4TC/j/LXH2EYIM2wHlQZFdJXQXTcVQsNcmXFCFpN7NN+TRx6TmWeeWIKCoHScvW2v5DH dGVD/vJN9Y5kTuQXP8iAi5YQKYRqU+x+MELEH3xZlh9GFWDPw8juE8syXI2uibbTtfr6cxYE xyZLv6j6LTnxjFwVr1
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Thread-index: AQHY84ehGvE29j6O1UW/SkvcImf7eq46Nk6AgAECtwA=
  • Thread-topic: Summary: Re: [PATCH for-4.17 v3 00/15] OCaml fixes for Xen 4.17


> On 11 Nov 2022, at 20:46, Andrew Cooper <Andrew.Cooper3@xxxxxxxxxx> wrote:
> 
> Nothing here is critical enough to go into 4.17 at this juncture.

Agreed

> 
> Various notes/observations from having spent a day trying to untangle
> things.

Thanks.


> 
> 1) Patches 5/6 are a single bugfix and need merging.  Except there was
> also an error when taking feedback from the list, and the net result
> regresses the original optimisation.  I have a fix sorted in my local queue.
> 
> 2) The indentation fix (not attached to this series) should scope the
> logic, not delete a debug line which was presumably added for a good
> reason.  I've got a fix to this effect in my local queue, and we can
> discuss the pros/cons of the approach in due course.


I deleted the debug line to avoid reindenting code, which was frowned upon.
Either way it is fine for me.


> 
> 3) Patch 1, evtchn Ocaml 5.0 compat, is still missing some corrections
> which I gave on earlier postings.  I've fixed it up locally in my queue.
> 
> I also notice, while reviewing the whole, that stub_eventchn_init()
> passes NULL as a logger, which has the side effect of libxenevtchn
> instantiating a default logger which takes control of stdout/stderr. 
> Without starting the fight over toxic library behaviour yet again, it
> occurs to me in the context of Patch 13, uncaught exception handler,
> that in oxenstored, any logging from the C level needs to end up elsewhere.
> 
> While we do have ocaml bindings for xentoollog, nothing uses it, and
> none of the other libraries (save xl, which isn't used) have a way of
> passing the Ocaml Xentoollog down.  This probably wants rethinking, one
> way or another.

We should probably start by reviewing the xentoollog bindings, if they never 
got used they're probably buggy.
I think Pau might have some xentoollog bindings though.

> 
> 4) Patches 2/3.  All these libraries have far worse problems than
> evtchn, because they can easily use-after-free.  They all need to be
> Custom with a finaliser.

Indeed, the bindings aren't very safe, and that should be fixed separately.
I've got some patches somewhere to stop the mmap bindings from segfaulting on 
invalid data,
but I lost track whether that got commited or still in one of my local branches.

> 
> 5) Patch 4.  The commit message says "A better solution is being worked
> on for master", but this is master.  Also, it's not a prerequisite for a
> security fix; merely something to make a developers life easier.

It is to avoid having to add Makefile changes in each security patch commit 
(potentially).
Perhaps the commit message can be changed to say this is the minimal change to 
unbreak the build system,
and a more comprehensive solution is being worked on (using Dune, or dune 
generated makefiles).

> 
> 6) The re-indent patch.  Policies of when to do it aside, having tried
> using it, the format adjustment is incomplete (running ocp-indent gets
> me deltas in files I haven't touched),

Perhaps we need to use the strict_comments setting, I think it tries to leave 
comments untouched,
but that also means the outcome depends on the starting state.
And I hope it doesn't depend on ocp-indent version.

> and there needs to be some
> .gitignore changes.
> 
> That said, it is usually frowned upon to have logic depending on being
> in a git tree.  This was perhaps a bigger deal back when we used hg by
> default and mirrored into multiple SCMs, but it's still expected not to
> rely on this.

We could use 'find' instead.

> 
> 7) Patch 8, evtchn fdopen, is two separate patches.  One adding fdopen,
> and one adding a NOCLOEXEC argument to the existing init.
> 
> They want splitting in two.  fdopen() ought to pass flags so we don't
> have to break the ABI again

The ABI changes everytime a new variant is added (the interface hash will 
change, and you need to rebuild/relink),
so using a single flag variant doesn't avoid ABI changes like it would in C.

> when there is a flag needing passing, and
> cloexec probably shouldn't be a boolean.

The preferred way to bind CLOEXEC in OCaml is to use a boolean, see
Unix.html <https://v2.ocaml.org/api/Unix.html>, in particular
`val socket : ?cloexec:bool ->
socket_domain -> socket_type -> int -> file_descr`
Perhaps I should've spelled this out in the commit message.

>   We should either pass a raw
> int32, or a list-of-enums like we do in the xenctrl stubs.  Also, this
> patch has inherited errors from patch 1.
> 
> 9) Patches 8 thru 15 need to be the other side of the intent patch,
> because they need backporting to branches which will never get it.

This was done on purpose like this to ensure that indentation is backported in 
some way,
because the lack of indentation has previously broken backports/rebases (see 
the debug line introduced in the wrong place in the live update patch).
Without a comprehensive testsuite (which is being worked on, but not ready yet) 
it is then impossible to tell whether a backport is correct or not,
even if it compiles, it may have some things in the wrong place, and wrong 
indentation just makes reviewing those more difficult.

Otherwise I have to keep making changes with the wrong indentation or avoid 
indentation changes in patches, which has previously introduced bugs.
It is extra work, and all it does is decrease the quality of the end result and 
confuse both patch authors and reviewers.
Furthermore the indentation commit on its own is separate and can be proven to 
have no functional changes if you view the diff ignoring whitespace.

We first need to make oxenstored suitable to be developed on, which means 
starting at the basics, fixing up indentation, build systems
(all the bugs in the bindings you pointed out), etc.
I tried my best to avoid making changes like this within the XSA fix (which 
only contributed to lengthening the time to develop it),
but now that we're no longer constrained by XSA rules we should fix things the 
right way.

Keeping the status quo just for the sake of it only discourages contribution to 
oxenstored.

If it helps we can consider all past versions of oxenstored unsupported (by me) 
and support only master going forward, and once we have master in a reasonable 
shape
we can decide what and how to backport, and which versions to reinstate support 
on. That would avoid placing artificial constraints on master development.
I intend to change master substantially enough that most backports will be 
impossible unless you take an almost entirely new version of oxenstored.

In fact releases of oxenstored shouldn't be tied to a particular hypervisor 
version: there should be a single long term supported stable branch of 
oxenstored and a master branch.
(I understand that is not possible yet due to the use of the 2 unstable xenctrl 
APIs, one of which has an outstanding patch series to remove the dependency)
The current situation only creates the illusion of support for the backported 
versions, because there is no (comprehensive) testsuite to check that those 
backports work,
and XenServer only tests internally 4.13 and latest master, anything inbetween 
is technically untested, and may be more buggy than just running one version.

>   This
> is why bugfixes always go at the head of a patch series, and
> improvements at the tail.

I agree that is how it should generally be, but that means improvements can 
never be done because we always keep finding more and more bugs as we do 
improvements,
and then do a lot of risky rebases to get patches moved ahead, and without 
minimal things like automated tools to keep at least indentation consistent
the end result is a mess that cannot be trusted.


> 
> 10) Patch 12 talks about default log levels, but that's bogus
> reasoning.  The messages should be warnings because they non-fatal
> exceptional cases.
> 
> 11) Patch 14 talks about using caml_stat_strdup(), but doesn't.

The commit message for this is fixed on my github branch: 
https://github.com/edwintorok/xen/commit/dc893a079fd7cf2a9bb8ed03cca3d249a53e3c44
It initially had that function, but it is not available in 4.02.3 

Thanks for helping sort out the patch series.

Best regards,
--Edwin


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.