[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [RFC] Generating Go bindings for libxl
On 7/30/19 10:52 PM, Nicholas Rosbrook wrote: >> All that said, the first question I think is, what the generated code >> needs to look like. Then, if c-for-go can be configured to do that, >> then we can consider it; otherwise, making our own generator from the >> IDL will be the only option. > > Writing a custom Go code generator means that all C symbols used need > to be known at generation time. E.g., the Go code generator needs to know > the signature of libxl_domain_create_new for C.libxl_domain_create_new(...) > to work. Currently, such knowledge is gained by parsing C code, which makes > sense given the nature of cgo. I return to the question I stated before. At the moment, your bindings have the following call chain: * DomainInfo(), hand-crafted. Calls domainInfo(). * domainInfo(), automaticall generated. Calls C.libxl_domain_info(). The in-tree bindings have the following call chain: * DomainInfo(), hand-crafted. Calls C.libxl_domain_info(). Since DomainInfo() is hand-crafted in both cases, what's the advantage of having domainInfo() at all? > AFAICT, the IDL describes how to generate C types > and boiler-plate functions like libxl_<type>_dispose(). How would the IDL > alone be able to > generate valid Go code without significant expansion? So just to clarify terminology: The IDL is the description language itself, which at the moment only contains information about the libxl structures. We have generators for various C bits of libxl which read the IDL and spit out boilerplate C. The idea would be that we write a new generator for Go which reads the IDL and spits out boilerplate Go. If you look at gentypes.py, you'll see that it's not doing anything fancy; it's very much just spitting out strings based on simple rules. I think a large amount of the boilerplate Go code (the C <-> Go marshalling code in particular) can be done in the same way. >> Out of curiosity, have you looked at the existing in-tree bindings? Any >> particular opinions? > > Yes, my process started by using the existing bindings. > > One thing is that they do not conform to basic go standards. For > example: naming conventions, naked returns are used everywhere, and I find > it strange that there is an exported Context variable. But, obviously those > are > very minor things and easy to change. See [1] for general information on this. I looked at the thing about naked returns, and didn't really understand it; but anyway I'm happy to have things modified to be more Go-like. I definitely "speak" Go with a funny accent. The exported Ctx variable is probably vestigial; I don't think I'd argue against removing it. Can I say -- I've been going open-source for so long, that I feel almost unsafe when nobody reviews my stuff. Most of this code was written by me and reviewed by nobody (since I was the only person interested); it's good to have someone else take a critical look at it. > I also thought it looked very tedious to do by hand, and would be hard to > extend > in the future. Hence the search for a cgo generator. Right; and the intention was always to begin to do it by hand to see how we wanted it to look, and then write a generator to do the tedious work by reading the IDL. >> There are two major differences I note. >> >> First, is that in your version, there seems to be two layers: libxl.go >> is generated by c-for-go, and contains simple function calls; e.g.: >> domainInfo(), which takes a *Ctx as an argument and calls >> C.libxl_domain_info. Then you have libxl_wrappers.go, which is written >> manually, defining DomainInfo as a method on Ctx, and calls domainInfo(). >> >> So you're writing the "idiomatic Go" part by hand anyway; I don't really >> see why having a hand-written Go function call an automatically >> generated Go function to call a C function is better than having a >> hand-written Go function call a C function directly. > > I'm sure you would agree that writing all of that cgo code by hand was a PITA. Writing the .toC() and .toGo() methods is certainly a pain, and wants a generator. I didn't find writing: func (Ctx *Context) DomainInfo(Id Domid) (di *Dominfo, err error) { err = Ctx.CheckOpen() if err != nil { return } var cdi C.libxl_dominfo C.libxl_dominfo_init(&cdi) defer C.libxl_dominfo_dispose(&cdi) ret := C.libxl_domain_info(Ctx.ctx, &cdi, C.uint32_t(Id)) if ret != 0 { err = Error(-ret) return } di = cdi.toGo() return } to be terribly much more work than writing something like: func (c *Context) DomainInfo(domid DomID) (*DomInfo, error) { di := DomInfo{} if ret := domainInfo(c.Ctx, &di, uint32(domid)); ret != 0 { return nil, fmt.Errorf("unable to retrieve domain info: %v", ret) } di.Deref() return &di, nil } If we "more discipline to have defer'd dispose/free calls", the code will probably look largely the same. And if we had a an IDL for the libxl functions, we could have it generate the code above for the vast majority of cases. >> In fact, there's a Go-like clone of libxl_domain_config, but none for >> the elements of it; DeviceDisk, for instance, is simply defined as >> C.libxl_device_disk, and config->disks simply copied to the Disks >> element of the struct. That's just all wrong -- it's actually a C >> array; Go can only access the first element of it. How are you supposed >> to create a domain with more than one disk? > > This is simply because my fork is still WIP. If you look at [2], I'm telling > c-for-go > to not generate a new Go type for libxl_device_disk, whereas at [3] I tell > c-for-go > how to create libxl_domain_config. > >> Furthermore, these pointers are not re-set to `nil` after; but the biggest >> argument against it is not using Go-native types. <type>.Free() >> is called. This just seems very dangerous: It would be way to easy to >> introduce a use-after-free bug. > > In [4], the C pointer is set to nil inside the call to DomInfo.Free(). I meant something like DomainConfig. DomainConfig.Deref() will set DomainConfig.Disks (a pointer) equal to domain_config.disks. But DomainConfig.Free() doesn't set DomainConfig.Disks back to nil. So suppose you had the following code: dc DomainConfig; [do something to fill dc.ref87e08e4d] dc.Deref(); // Sets dc.CInfo to non-nil. // Do something dc.Free(); // Sets dc.ref* to nil, but leaves dc.CInfo alone // forget you called Free if dc.Disks.blah { // ## Use-after-free } I realize that deference is a clear bug either way; but if Disks was nil, it would cause a program crash, which is much better than whatever random corruption might happen otherwise. And libxl_domain_info.disks, this brings be back to the point about the IDL having more information. libxl_domain_info.disks isn't a pointer to an individual struct, it's a pointer to an array; and libxl_domain_info.num_disks is the size of the array. But the IDL doesn't actually have num_disks; it has Array(libxl_device_disks, "disks"), and the C generator generates the num_disks elemet from that. If we wrote a generator from the IDL, we could make it smart enough to use []Disks as the type there, and make the marshallers know how to use num_disks to appropriately size the resulting slice and copy the right number of values across. To do that with c-for-go, we'd have to do a lot of work teaching it what to do, if that's even possible. >> The in-tree bindings generally only create C structures temporarily, and >> do a full marshal and unmarshall into and out of Go structures. This >> means a lot of copying on every function call. But it also means that >> the callers can generally treat the Go structures like normal Go >> structures -- they don't have to worry about keeping track of them and >> freeing them or cleaning them up; they can let the GC deal with it, just >> like they deal with everything else. > > AFAICT, the generated code provides the ability to do this. The wrappers > just need more discipline to have defer'd dispose/free calls. If the wrapper > hides the calls to freeing/disposing the C pointers, then the caller can still > rely on the garbage collector like normal. So you mean, for example, after DomainInfo() calls DomInfo.Deref(), it will then call libxl_dominfo_dispose() on the C struct? >> 1. Keep separate structures, and do a full "deep copy", as the in-tree >> bindings do. Advantage: Callers can use GC like normal Go functions. >> Structure elements are translated to go-native types. Disadvantage: >> Copying overhead on every function call. > > Personally, I'm not worried about optimizing just yet. > >> 2. Use C types; do explicit allocate / free. Advantage: No copying on >> every function call. Disadvantage: Needing to remember to clean up / no >> GC; can't use Go-native types. > > We definitely don't want to export the C types through the Go API. > >> 4. Attempt to use SetFinalizer() to automatically do frees / structure >> clean-up [1]. Advantage: No / less copying on every function call, but >> can still treat structures like they'll be GC'd. Disadvantage: Requires >> careful thinking; GC may not be as effective if C-allocated memory >> greatly exceeds Go-allocated memory; can't use Go-native types for elements. > > If we start looking to use the Go runtime, we've gone in the wrong direction. I'm mostly trying to be complete in my analysis. I agree relying on this sort of magic isn't very appealing; but the biggest argument against it is not using Go-native types. That makes it more or less a non-starter. >> c-for-go seems to take the worst bits of #1 and #2: It requires explicit >> allocate / free, but also actually does a full copy of each structure >> whenever one "half" of it changes. > > Either way, we need to worry about the allocate/free. But that will at least > be > hidden by the wrappers. Not sure how to get around that part in any case. Of course "we the library authors" have to worry about allocate / free; but it's a much nicer interface for "them the library consumers" not to have to do so. At the moment you're both, but I'm hoping we'll have lots of "library consumers" who are not "library authors". :-) > I agree that doing a full copy and allowing callers to simply rely on Go's > garbage collector is the best approach. I'm just not entirely convinced that > cannot be accomplished using c-for-go. Right, and personally I'm not in principle opposed to using c-for-go, if it can be made to generate code the way we like it. My main fear is that we'll spend a bunch of time tweaking c-for-go, and after going back and forth and investing a lot of time in it, we'll end up either 1) giving up and writing our own generator anyway, or 2) accepting something sub-optimal because it's the best thing we could make c-for-go do. -George _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |