[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[MirageOS-devel] test, quality, tcpip
Hi folks,
Mort's writeup of notes from a conversation on testing raises some
questions on how to actually go about testing `tcpip`. I'm starting
a new thread to avoid potentially hijacking a more general
conversation on testing with this rather more specific proposal, but
most of what I'm discussing is quite related to problems and
solutions enumerated in "Notes on Testing" from a few hours ago, and
some item are reflected in ARs there. I thought I'd take a run at
writing down what I've gotten out of corridor and backchannel
conversations with many other Mirage-seekers around restructuring
the lump of code we currently keep in the `tcpip` repository
(probably the most-used set of `impl`s in Mirage).
TL;DR - We should have each protocol in `tcpip` expose serializing
and deserializing APIs for external use, in addition to including
its own test suite and having clearer documentation of expectations
around data ingress and egress (as included in Mort's notes). I
further propose that it may be useful to split each protocol into
its own package for use in situations other than "a full
ethernet/ip/tcp/udp/dhcp stack assembled by the mirage frontend
tool".
We should also 1) figure out which of the error exposing/handling
proposals won and 2) start using whichever one it is.
The long version:
What I've read, heard, and thought some reasonable goals are:
 * make it easier to test the code we have that does stuff with
packets (and if we do a good job, we might find a more generally
useful model)
 * write and automatically perform some tests on our
packet-handling code
 * make it clearer when errors occur at low levels so higher-level
applications can deal with them appropriately
 * make API contracts between layers clearer (what blocks?)
 * make it easier to reuse code across projects when that code
needs to know stuff about packets
Code reuse might be a surprising thing to see here mixed in with
quality and testing. Here's some stuff that's in at least two
places I know of for "a network stack", i.e.
/ethernet/arp-and-ipv4,6/icmp/udp/tcp:
* wire format definitions (i.e. Cstructs)
 - Wire_structs (from `tcpip`), the module for serializing and
deserializing wire formats of static size via the autogenerated
Cstruct interface, doesn't seem to be explicitly intended to be
exposed in `tcpip`.
 - ocaml-packet (all in `Packet.ml`)
 - ocaml-pcap (each format in its own .ml)
* serializing/deserializing code
 - all over `tcpip`, usually inaccessible or side-effecting
(notable exception: dhcp)
 - ocaml-packet . notably ocaml-packet currently depends on
`tcpip` for `tcpip_checksum.ml` and required externs
 - ocaml-pcap 0.4.0 also has serializing and deserializing code per
protocol, bundled with wire formats. ocaml-pcap doesn't currently
duplicate many application-layer protocols, although a more
featureful `ocaml-pcap` will likely want to either do that or make
common reference to parsing code.
 - mirage-nat and mirage-mimic, because I needed more than
Wire_structs but less than a full listener
Here's what *isn't* all over the place:
* state machines/side-effecting operations/protocol implementations
 - `tcpip` is the only library I know of that implements anything
beyond a parser/printer for a protocol below the application layer,
although I'm certainly one of the less knowledgeable people about
non-Mirage OCaml codebases on the list
* automated test code for any of the aforementioned
 * ocaml-packet checks that deserializing a type and then
serializing the result gets you the original type
 * `tcpip`: now with tests (TM) ...but it could use some more ;)
 * application-layer code is doing a better job, but that doesn't
get it much if the code beneath is unreliable
Things that might help with both reuse and accessibility for
testing:
* pull Wire_structs into a separate library and make its wider
usability clear and stable, where appropriate
 - currently some things are treated as constant-size cstructs that
shouldn't be, e.g. ipv4 packets which might have options and
therefore be larger than Wire_structs expects; exposing these widely
without richer options for serializing and deserializing might be
counterproductive. Libraries that duplicate Wire_structs seem to
all share this problem, so a common solution would likely be useful.
* pull per-protocol serializing and deserializing code into its own
modules
 - improve and formalize error communication for invalid data, and
express these in mirage-types module types
 - write tests for serializing and deserializing modules
* possibly pull the serializing/deserializing modules into their own
package (packages?) on which `tcpip` depends (or maybe a separate
library within `tcpip`?), for common use outside of `tcpip` itself
* attempt to improve and write tests on the code in `tcpip` that *isn't* digging around in or making
packets, which should be logic for acting on write and read requests
(at a low level)
 - improve and formalize (create?) error communication for bad
state transitions, syntactically OK but semantically bogus packets,
etc
 - possibly it makes sense to break this package up entirely into
separate protocol implementations and rely on the `mirage` frontend
tool (or user request, outside of the Mirage ecosystem) to manage
dependencies between them - see below
Most of the above is focused on splitting up the code in `tcpip`;
there are probably other ways to do it, and likely some of them make
more sense than what I'm proposing. It seems to me that there are
(at least) two reasonable ways to split up some of the code in
`tcpip`:
* horizontally layerwise, i.e. all code to do with ARP is in one
package (serializing/deserializing, state machine/side effects),
same for ipv4, ipv6, etc
* vertically concernwise, i.e. all serializing/deserializing code is
in the same place, all action-taking code (probably with some more
sensible subdivision) is in the same place, etc
* some other way I haven't thought of that makes way more sense
(your proposal here!)
A horizontal division is the only thing that really potentially
gives any benefit for users that I can see - linking in only the
layers we need makes for smaller binaries (at least until we have
dead code elimination). It also seems to have less potential for
external API breakage causing other repositories to become
unbuildable (thinking of the case where e.g. I want to support a new
DHCP option, which might require me to both expose a new variant
type for DHCP.Option.t and write some handling code in what's
currently dhcpv4_client.ml); in a regime where the
serializing/deserializing code and the actual implementations which
receive and send data are in separate packages, they need separate,
coordinated releases. As a bonus, we already have module types in
`mirage-types` for defining reasonable interfaces between these
layers.
Thanks for your consideration.
-Mindy
|
_______________________________________________
MirageOS-devel mailing list
MirageOS-devel@xxxxxxxxxxxxxxxxxxxx
http://lists.xenproject.org/cgi-bin/mailman/listinfo/mirageos-devel
|