Xen project Mailing List

Re: [Xen-devel] [RFC] Generating Go bindings for libxl

To: Nicholas Rosbrook <rosbrookn@xxxxxxxxxxxx>, "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>

From: George Dunlap <george.dunlap@xxxxxxxxxx>

Date: Wed, 31 Jul 2019 16:06:43 +0100

Authentication-results: esa6.hc3370-68.iphmx.com; dkim=none (message not signed) header.i=none; spf=None smtp.pra=george.dunlap@xxxxxxxxxx; spf=Pass smtp.mailfrom=George.Dunlap@xxxxxxxxxx; spf=None smtp.helo=postmaster@xxxxxxxxxxxxxxx

Autocrypt: addr=george.dunlap@xxxxxxxxxx; prefer-encrypt=mutual; keydata= mQINBFPqG+MBEACwPYTQpHepyshcufo0dVmqxDo917iWPslB8lauFxVf4WZtGvQSsKStHJSj 92Qkxp4CH2DwudI8qpVbnWCXsZxodDWac9c3PordLwz5/XL41LevEoM3NWRm5TNgJ3ckPA+J K5OfSK04QtmwSHFP3G/SXDJpGs+oDJgASta2AOl9vPV+t3xG6xyfa2NMGn9wmEvvVMD44Z7R W3RhZPn/NEZ5gaJhIUMgTChGwwWDOX0YPY19vcy5fT4bTIxvoZsLOkLSGoZb/jHIzkAAznug Q7PPeZJ1kXpbW9EHHaUHiCD9C87dMyty0N3TmWfp0VvBCaw32yFtM9jUgB7UVneoZUMUKeHA fgIXhJ7I7JFmw3J0PjGLxCLHf2Q5JOD8jeEXpdxugqF7B/fWYYmyIgwKutiGZeoPhl9c/7RE Bf6f9Qv4AtQoJwtLw6+5pDXsTD5q/GwhPjt7ohF7aQZTMMHhZuS52/izKhDzIufl6uiqUBge 0lqG+/ViLKwCkxHDREuSUTtfjRc9/AoAt2V2HOfgKORSCjFC1eI0+8UMxlfdq2z1AAchinU0 eSkRpX2An3CPEjgGFmu2Je4a/R/Kd6nGU8AFaE8ta0oq5BSFDRYdcKchw4TSxetkG6iUtqOO ZFS7VAdF00eqFJNQpi6IUQryhnrOByw+zSobqlOPUO7XC5fjnwARAQABtCRHZW9yZ2UgVy4g RHVubGFwIDxkdW5sYXBnQHVtaWNoLmVkdT6JAlcEEwEKAEECGwMFCwkIBwMFFQoJCAsFFgID AQACHgECF4ACGQEWIQTXqBy2bTNXPzpOYFimNjwxBZC0bQUCXEowWQUJDCJ7dgAKCRCmNjwx BZC0beKvEACJ75YlJXd7TnNHgFyiCJkm/qPeoQ3sFGSDZuZh7SKcdt9+3V2bFEb0Mii1hQaz 3hRqZb8sYPHJrGP0ljK09k3wf8k3OuNxziLQBJyzvn7WNlE4wBEcy/Ejo9TVBdA4ph5D0YaZ nqdsPmxe/xlTFuSkgu4ep1v9dfVP1TQR0e+JIBa/Ss+cKC5intKm+8JxpOploAHuzaPu0L/X FapzsIXqgT9eIQeBEgO2hge6h9Jov3WeED/vh8kA7f8c6zQ/gs5E7VGALwsiLrhr0LZFcKcw kI3oCCrB/C/wyPZv789Ra8EXbeRSJmTjcnBwHRPjnjwQmetRDD1t+VyrkC6uujT5jmgOBzaj KCqZ8PcMAssOzdzQtKmjUQ2b3ICPs2X13xZ5M5/OVs1W3TG5gkvMh4YoHi4ilFnOk+v3/j7q 65FG6N0JLb94Ndi80HkIOQQ1XVGTyu6bUPaBg3rWK91Csp1682kD/dNVF3FKHrRLmSVtmEQR 5rK0+VGc/FmR6vd4haKGWIRuPxzg+pBR77avIZpU7C7+UXGuZ5CbHwIdY8LojJg2TuUdqaVj yxmEZLOA8rVHipCGrslRNthVbJrGN/pqtKjCClFZHIAYJQ9EGLHXLG9Pj76opfjHij3MpR3o pCGAh6KsCrfrsvjnpDwqSbngGyEVH030irSk4SwIqZ7FwLkBDQRUWmc6AQgAzpc8Ng5Opbrh iZrn69Xr3js28p+b4a+0BOvC48NfrNovZw4eFeKIzmI/t6EkJkSqBIxobWRpBkwGweENsqnd 0qigmsDw4N7J9Xx0h9ARDqiWxX4jr7u9xauI+CRJ1rBNO3VV30QdACwQ4LqhR/WA+IjdhyMH wj3EJGE61NdP/h0zfaLYAbvEg47/TPThFsm4m8Rd6bX7RkrrOgBbL/AOnYOMEivyfZZKX1vv iEemAvLfdk2lZt7Vm6X/fbKbV8tPUuZELzNedJvTTBS3/l1FVz9OUcLDeWhGEdlxqXH0sYWh E9+PXTAfz5JxKH+LMetwEM8DbuOoDIpmIGZKrZ+2fQARAQABiQNbBBgBCgAmAhsCFiEE16gc tm0zVz86TmBYpjY8MQWQtG0FAlxKMJ4FCQnQ/OQBKcBdIAQZAQoABgUCVFpnOgAKCRCyFcen x4Qb7cXrCAC0qQeEWmLa9oEAPa+5U6wvG1t/mi22gZN6uzQXH1faIOoDehr7PPESE6tuR/vI CTTnaSrd4UDPNeqOqVF07YexWD1LDcQG6PnRqC5DIX1RGE3BaSaMl2pFJP8y+chews11yP8G DBbxaIsTcHZI1iVIC9XLhoeegWi84vYc8F4ziADVfowbmbvcVw11gE8tmALCwTeBeZVteXjh 0OELHwrc1/4j4yvENjIXRO+QLIgk43kB57Upr4tP2MEcs0odgPM+Q+oETOJ00xzLgkTnLPim C1FIW2bOZdTj+Uq6ezRS2LKsNmW+PRRvNyA5ojEbA/faxmAjMZtLdSSSeFK8y4SoCRCmNjwx BZC0bevWEACRu+GyQgrdGmorUptniIeO1jQlpTiP5WpVnk9Oe8SiLoXUhXXNj6EtzyLGpYmf kEAbki+S6WAKnzZd3shL58AuMyDxtFNNjNeKJOcl6FL7JPBIIgIp3wR401Ep+/s5pl3Nw8Ii 157f0T7o8CPb54w6S1WsMkU78WzTxIs/1lLblSMcvyz1Jq64g4OqiWI85JfkzPLlloVf1rzy ebIBLrrmjhCE2tL1RONpE/KRVb+Q+PIs5+YcZ+Q1e0vXWA7NhTWFbWx3+N6WW6gaGpbFbopo FkYRpj+2TA5cX5zW148/xU5/ATEb5vdUkFLUFVy5YNUSyeBHuaf6fGmBrDc47rQjAOt1rmyD 56MUBHpLUbvA6NkPezb7T6bQpupyzGRkMUmSwHiLyQNJQhVe+9NiJJvtEE3jol0JVJoQ9WVn FAzPNCgHQyvbsIF3gYkCYKI0w8EhEoH5FHYLoKS6Jg880IY5rXzoAEfPvLXegy6mhYl+mNVN QUBD4h9XtOvcdzR559lZuC0Ksy7Xqw3BMolmKsRO3gWKhXSna3zKl4UuheyZtubVWoNWP/bn vbyiYnLwuiKDfNAinEWERC8nPKlv3PkZw5d3t46F1Dx0TMf16NmP+azsRpnMZyzpY8BL2eur feSGAOB9qjZNyzbo5nEKHldKWCKE7Ye0EPEjECS1gjKDwbkBDQRUWrq9AQgA7aJ0i1pQSmUR 6ZXZD2YEDxia2ByR0uZoTS7N0NYv1OjU8v6p017u0Fco5+Qoju/fZ97ScHhp5xGVAk5kxZBF DT4ovJd0nIeSr3bbWwfNzGx1waztfdzXt6n3MBKr7AhioB1m+vuk31redUdnhbtvN7O40MC+ fgSk5/+jRGxY3IOVPooQKzUO7M51GoOg4wl9ia3H2EzOoGhN2vpTbT8qCcL92ZZZwkBRldoA Wn7c1hEKSTuT3f1VpSmhjnX0J4uvKZ1V2R7rooKJYFBcySC0wa8aTmAtAvLgfcpe+legOtgq DKzLuN45xzEjyjCiI521t8zxNMPJY9FiCPNv0sCkDwARAQABiQI8BBgBCgAmAhsMFiEE16gc tm0zVz86TmBYpjY8MQWQtG0FAlxKNJYFCQnQrVkACgkQpjY8MQWQtG2Xxg//RrRP+PFYuNXt 9C5hec/JoY24TkGPPd2tMC9usWZVImIk7VlHlAeqHeE0lWU0LRGIvOBITbS9izw6fOVQBvCA Fni56S12fKLusWgWhgu03toT9ZGxZ9W22yfw5uThSHQ4y09wRWAIYvhJsKnPGGC2KDxFvtz5 4pYYNe8Icy4bwsxcgbaSFaRh+mYtts6wE9VzyJvyfTqbe8VrvE+3InG5rrlNn51AO6M4Wv20 iFEgYanJXfhicl0WCQrHyTLfdB5p1w+072CL8uryHQVfD0FcDe+J/wl3bmYze+aD1SlPzFoI MaSIXKejC6oh6DAT4rvU8kMAbX90T834Mvbc3jplaWorNJEwjAH/r+v877AI9Vsmptis+rni JwUissjRbcdlkKBisoUZRPmxQeUifxUpqgulZcYwbEC/a49+WvbaYUriaDLHzg9xisijHwD2 yWV8igBeg+cmwnk0mPz8tIVvwi4lICAgXob7HZiaqKnwaDXs4LiS4vdG5s/ElnE3rIc87yru 24n3ypeDZ6f5LkdqL1UNp5/0Aqbr3EiN7/ina4YVyscy9754l944kyHnnMRLVykg0v+kakj0 h0RJ5LbfLAMM8M52KIA3y14g0Fb7kHLcOUMVcgfQ3PrN6chtC+5l6ouDIlSLR3toxH8Aam7E rIFfe2Dk+lD9A9BVd2rfoHA=

Cc: "anthony.perard@xxxxxxxxxx" <anthony.perard@xxxxxxxxxx>, "ian.jackson@xxxxxxxxxxxxx" <ian.jackson@xxxxxxxxxxxxx>, Brendan Kerrigan <kerriganb@xxxxxxxxxxxx>, Nicolas Belouin <nicolas.belouin@xxxxxxxxx>, "wl@xxxxxxx" <wl@xxxxxxx>

Delivery-date: Wed, 31 Jul 2019 15:06:52 +0000

Ironport-sdr: hlZJ3fN6TOEVDAY228+dQbZ7ZrT1ZK/m9EW4OTpr8cm43yz1UgOh+TBMGfdG0ZLKV8k6/GhCCl rHmsNlwZUIXd/Tm5xtxoL0s+5+qZlEcLyl4d3A2DHsStmwUzKsdcFnBBaQrwhHFtYW5fpZRxqC zIGHSaff+vUFQDlbSc9ZjGmdx+Um7K/5VlzgbthQXvd/wmAyWPnM1a7riflE2hVq96Q/Jjivda 3+Z8JyBuptFzku8miWMAvnA7iewQP0p3z6Om+lvzyVu5pCEhRKmcSS3ROkhe4Lw8TX4Dp6rqZS ezM=

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Openpgp: preference=signencrypt

On 7/30/19 10:52 PM, Nicholas Rosbrook wrote: >> All that said, the first question I think is, what the generated code >> needs to look like. Then, if c-for-go can be configured to do that, >> then we can consider it; otherwise, making our own generator from the >> IDL will be the only option. > > Writing a custom Go code generator means that all C symbols used need > to be known at generation time. E.g., the Go code generator needs to know > the signature of libxl_domain_create_new for C.libxl_domain_create_new(...) > to work. Currently, such knowledge is gained by parsing C code, which makes > sense given the nature of cgo. I return to the question I stated before. At the moment, your bindings have the following call chain: * DomainInfo(), hand-crafted. Calls domainInfo(). * domainInfo(), automaticall generated. Calls C.libxl_domain_info(). The in-tree bindings have the following call chain: * DomainInfo(), hand-crafted. Calls C.libxl_domain_info(). Since DomainInfo() is hand-crafted in both cases, what's the advantage of having domainInfo() at all? > AFAICT, the IDL describes how to generate C types > and boiler-plate functions like libxl_<type>_dispose(). How would the IDL > alone be able to > generate valid Go code without significant expansion? So just to clarify terminology: The IDL is the description language itself, which at the moment only contains information about the libxl structures. We have generators for various C bits of libxl which read the IDL and spit out boilerplate C. The idea would be that we write a new generator for Go which reads the IDL and spits out boilerplate Go. If you look at gentypes.py, you'll see that it's not doing anything fancy; it's very much just spitting out strings based on simple rules. I think a large amount of the boilerplate Go code (the C <-> Go marshalling code in particular) can be done in the same way. >> Out of curiosity, have you looked at the existing in-tree bindings? Any >> particular opinions? > > Yes, my process started by using the existing bindings. > > One thing is that they do not conform to basic go standards. For > example: naming conventions, naked returns are used everywhere, and I find > it strange that there is an exported Context variable. But, obviously those > are > very minor things and easy to change. See [1] for general information on this. I looked at the thing about naked returns, and didn't really understand it; but anyway I'm happy to have things modified to be more Go-like. I definitely "speak" Go with a funny accent. The exported Ctx variable is probably vestigial; I don't think I'd argue against removing it. Can I say -- I've been going open-source for so long, that I feel almost unsafe when nobody reviews my stuff. Most of this code was written by me and reviewed by nobody (since I was the only person interested); it's good to have someone else take a critical look at it. > I also thought it looked very tedious to do by hand, and would be hard to > extend > in the future. Hence the search for a cgo generator. Right; and the intention was always to begin to do it by hand to see how we wanted it to look, and then write a generator to do the tedious work by reading the IDL. >> There are two major differences I note. >> >> First, is that in your version, there seems to be two layers: libxl.go >> is generated by c-for-go, and contains simple function calls; e.g.: >> domainInfo(), which takes a *Ctx as an argument and calls >> C.libxl_domain_info. Then you have libxl_wrappers.go, which is written >> manually, defining DomainInfo as a method on Ctx, and calls domainInfo(). >> >> So you're writing the "idiomatic Go" part by hand anyway; I don't really >> see why having a hand-written Go function call an automatically >> generated Go function to call a C function is better than having a >> hand-written Go function call a C function directly. > > I'm sure you would agree that writing all of that cgo code by hand was a PITA. Writing the .toC() and .toGo() methods is certainly a pain, and wants a generator. I didn't find writing: func (Ctx *Context) DomainInfo(Id Domid) (di *Dominfo, err error) { err = Ctx.CheckOpen() if err != nil { return } var cdi C.libxl_dominfo C.libxl_dominfo_init(&cdi) defer C.libxl_dominfo_dispose(&cdi) ret := C.libxl_domain_info(Ctx.ctx, &cdi, C.uint32_t(Id)) if ret != 0 { err = Error(-ret) return } di = cdi.toGo() return } to be terribly much more work than writing something like: func (c *Context) DomainInfo(domid DomID) (*DomInfo, error) { di := DomInfo{} if ret := domainInfo(c.Ctx, &di, uint32(domid)); ret != 0 { return nil, fmt.Errorf("unable to retrieve domain info: %v", ret) } di.Deref() return &di, nil } If we "more discipline to have defer'd dispose/free calls", the code will probably look largely the same. And if we had a an IDL for the libxl functions, we could have it generate the code above for the vast majority of cases. >> In fact, there's a Go-like clone of libxl_domain_config, but none for >> the elements of it; DeviceDisk, for instance, is simply defined as >> C.libxl_device_disk, and config->disks simply copied to the Disks >> element of the struct. That's just all wrong -- it's actually a C >> array; Go can only access the first element of it. How are you supposed >> to create a domain with more than one disk? > > This is simply because my fork is still WIP. If you look at [2], I'm telling > c-for-go > to not generate a new Go type for libxl_device_disk, whereas at [3] I tell > c-for-go > how to create libxl_domain_config. > >> Furthermore, these pointers are not re-set to `nil` after; but the biggest >> argument against it is not using Go-native types. <type>.Free() >> is called. This just seems very dangerous: It would be way to easy to >> introduce a use-after-free bug. > > In [4], the C pointer is set to nil inside the call to DomInfo.Free(). I meant something like DomainConfig. DomainConfig.Deref() will set DomainConfig.Disks (a pointer) equal to domain_config.disks. But DomainConfig.Free() doesn't set DomainConfig.Disks back to nil. So suppose you had the following code: dc DomainConfig; [do something to fill dc.ref87e08e4d] dc.Deref(); // Sets dc.CInfo to non-nil. // Do something dc.Free(); // Sets dc.ref* to nil, but leaves dc.CInfo alone // forget you called Free if dc.Disks.blah { // ## Use-after-free } I realize that deference is a clear bug either way; but if Disks was nil, it would cause a program crash, which is much better than whatever random corruption might happen otherwise. And libxl_domain_info.disks, this brings be back to the point about the IDL having more information. libxl_domain_info.disks isn't a pointer to an individual struct, it's a pointer to an array; and libxl_domain_info.num_disks is the size of the array. But the IDL doesn't actually have num_disks; it has Array(libxl_device_disks, "disks"), and the C generator generates the num_disks elemet from that. If we wrote a generator from the IDL, we could make it smart enough to use []Disks as the type there, and make the marshallers know how to use num_disks to appropriately size the resulting slice and copy the right number of values across. To do that with c-for-go, we'd have to do a lot of work teaching it what to do, if that's even possible. >> The in-tree bindings generally only create C structures temporarily, and >> do a full marshal and unmarshall into and out of Go structures. This >> means a lot of copying on every function call. But it also means that >> the callers can generally treat the Go structures like normal Go >> structures -- they don't have to worry about keeping track of them and >> freeing them or cleaning them up; they can let the GC deal with it, just >> like they deal with everything else. > > AFAICT, the generated code provides the ability to do this. The wrappers > just need more discipline to have defer'd dispose/free calls. If the wrapper > hides the calls to freeing/disposing the C pointers, then the caller can still > rely on the garbage collector like normal. So you mean, for example, after DomainInfo() calls DomInfo.Deref(), it will then call libxl_dominfo_dispose() on the C struct? >> 1. Keep separate structures, and do a full "deep copy", as the in-tree >> bindings do. Advantage: Callers can use GC like normal Go functions. >> Structure elements are translated to go-native types. Disadvantage: >> Copying overhead on every function call. > > Personally, I'm not worried about optimizing just yet. > >> 2. Use C types; do explicit allocate / free. Advantage: No copying on >> every function call. Disadvantage: Needing to remember to clean up / no >> GC; can't use Go-native types. > > We definitely don't want to export the C types through the Go API. > >> 4. Attempt to use SetFinalizer() to automatically do frees / structure >> clean-up [1]. Advantage: No / less copying on every function call, but >> can still treat structures like they'll be GC'd. Disadvantage: Requires >> careful thinking; GC may not be as effective if C-allocated memory >> greatly exceeds Go-allocated memory; can't use Go-native types for elements. > > If we start looking to use the Go runtime, we've gone in the wrong direction. I'm mostly trying to be complete in my analysis. I agree relying on this sort of magic isn't very appealing; but the biggest argument against it is not using Go-native types. That makes it more or less a non-starter. >> c-for-go seems to take the worst bits of #1 and #2: It requires explicit >> allocate / free, but also actually does a full copy of each structure >> whenever one "half" of it changes. > > Either way, we need to worry about the allocate/free. But that will at least > be > hidden by the wrappers. Not sure how to get around that part in any case. Of course "we the library authors" have to worry about allocate / free; but it's a much nicer interface for "them the library consumers" not to have to do so. At the moment you're both, but I'm hoping we'll have lots of "library consumers" who are not "library authors". :-) > I agree that doing a full copy and allowing callers to simply rely on Go's > garbage collector is the best approach. I'm just not entirely convinced that > cannot be accomplished using c-for-go. Right, and personally I'm not in principle opposed to using c-for-go, if it can be made to generate code the way we like it. My main fear is that we'll spend a bunch of time tweaking c-for-go, and after going back and forth and investing a lot of time in it, we'll end up either 1) giving up and writing our own generator anyway, or 2) accepting something sub-optimal because it's the best thing we could make c-for-go do. -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.