Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.
On 22/06/16 14:29, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
>> On 22/06/16 12:21, Wei Liu wrote:
>>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
>>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
>>>>> system call is invoked. In mini-os the operation is yet not
>>>>> implemented. For other OSs there is a dummy implementation.
>>>> [...]
>>>>> --- a/tools/libs/gnttab/linux.c
>>>>> +++ b/tools/libs/gnttab/linux.c
>>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>>>>>      return 0;
>>>>>  }
>>>>>
>>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
>>>>> +                            uint32_t count,
>>>>> +                            xengnttab_grant_copy_segment_t *segs)
>>>>> +{
>>>>> +    int i, rc;
>>>>> +    int fd = xgt->fd;
>>>>> +    struct ioctl_gntdev_grant_copy copy;
>>>>> +
>>>>> +    copy.segments = calloc(count, sizeof(struct
>>>>> ioctl_gntdev_grant_copy_segment));
>>>>> +    copy.count = count;
>>>>> +    for (i = 0; i < count; i++)
>>>>> +    {
>>>>> +        copy.segments[i].flags = segs[i].flags;
>>>>> +        copy.segments[i].len = segs[i].len;
>>>>> +        if (segs[i].flags == GNTCOPY_dest_gref)
>>>>> +        {
>>>>> +            copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
>>>>> +            copy.segments[i].dest.foreign.domid =
>>>>> segs[i].dest.foreign.domid;
>>>>> +            copy.segments[i].dest.foreign.offset =
>>>>> segs[i].dest.foreign.offset;
>>>>> +            copy.segments[i].source.virt = segs[i].source.virt;
>>>>> +        }
>>>>> +        else
>>>>> +        {
>>>>> +            copy.segments[i].source.foreign.ref =
>>>>> segs[i].source.foreign.ref;
>>>>> +            copy.segments[i].source.foreign.domid =
>>>>> segs[i].source.foreign.domid;
>>>>> +            copy.segments[i].source.foreign.offset =
>>>>> segs[i].source.foreign.offset;
>>>>> +            copy.segments[i].dest.virt = segs[i].dest.virt;
>>>>> +        }
>>>>> +    }
>>>>> +
>>>>> +    rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, &copy);
>>>>> +    if (rc)
>>>>> +    {
>>>>> +        GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
>>>>> +    }
>>>>> +    else
>>>>> +    {
>>>>> +        for (i = 0; i < count; i++)
>>>>> +            segs[i].status = copy.segments[i].status;
>>>>> +    }
>>>>> +
>>>>> +    free(copy.segments);
>>>>> +    return rc;
>>>>> +}
>>>>
>>>> I know Wei asked for this but you've replaced what should be a single
>>>> pointer assignment with a memory allocation and two loops over all the
>>>> segments.
>>>>
>>>> This is a hot path and the two structures (the libxengnttab one and the
>>>> Linux kernel one) are both part of their respective ABIs and won't
>>>> change so Wei's concern that they might change in the future is unfounded.
>>>>
>>>
>>> The fundamental question is: will the ABI between the library and the
>>> kernel ever go mismatch?
>>>
>>> My answer is "maybe". My rationale is that everything goes across
>>> boundary of components need to be considered with caution. And I tend to
>>> assume the worst things will happen.
>>>
>>> To guarantee that they will never go mismatch is to have
>>>
>>>    typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
>>>
>>> But that's not how the code is written.
>>>
>>> I would like to hear a third opinion. Is my concern unfounded? Am I too
>>> cautious? Is there any compelling argument that I missed?
>>>
>>> Somewhat related, can we have some numbers please? It could well be the
>>> cost of the two loops is much cheaper than whatever is going on inside
>>> the kernel / hypervisor. And it could turn out that the numbers render
>>> this issue moot.
>>
>> I did some (very) adhoc measurements and with the worst case of single
>> short segments for each ioctl, the optimized version of
>> osdep_gnttab_grant_copy() looks to be ~5% faster.
>>
>> This is enough of a difference that we should use the optimized version.
>>
>> The unoptimized version also adds an additional failure path (the
>> calloc) which would be best avoided.
>>
>
> Your test case includes a lot of noise in libc allocator, so...
>
> Can you give try the following patch (apply on top of Paulina's patch)?
> The basic idea is to provide scratch space for the structures. Note, the
> patch is compile test only.

[...]

> +#define COPY_SEGMENT_CACHE_SIZE 1024

Arbitrary limit on number of segments.

> +    copy.segments = xgt->osdep_data;

Not thread safe.

I tried using alloca() which has <1% performance penalty but the failure
mode for alloca() is really bad so I would not recommend it.

I think the best solution is to allow the osdep code to provide the
implementation of xengnttab_grant_copy_segment_t, allowing the Linux
code to do:

   typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t

You should still provide the generic structure as well, for those
platforms that don't provide their own optimized version.

David

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
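
For illustration only, here is a minimal sketch of the typedef approach suggested above, written under some loud assumptions: the struct layout is a simplified stand-in rather than a copy of any real kernel or libxengnttab header, fake_grant_copy_ioctl() is a hypothetical placeholder for ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, ...), and osdep_gnttab_grant_copy_sketch() is an invented name. It only shows why the copy loops and the calloc() failure path disappear once the public segment type is an alias of the OS-specific one.

```c
#include <stdint.h>

/* ASSUMPTION: simplified stand-in for the Linux ioctl segment layout. */
struct ioctl_gntdev_grant_copy_segment {
    union {
        void *virt;                 /* local buffer */
        struct {
            uint32_t ref;           /* grant reference */
            uint16_t offset;
            uint16_t domid;
        } foreign;                  /* granted page in another domain */
    } source, dest;
    uint16_t len;
    uint16_t flags;
    int16_t  status;
};

/* ASSUMPTION: stand-in for the ioctl argument structure. */
struct ioctl_gntdev_grant_copy {
    unsigned int count;
    struct ioctl_gntdev_grant_copy_segment *segments;
};

/* The osdep layer supplies the public type as a plain alias, so the
 * library type and the ioctl type cannot drift apart. */
typedef struct ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;

/* HYPOTHETICAL placeholder for the real ioctl(); it just marks every
 * segment as successful (status 0) and reports success. */
static int fake_grant_copy_ioctl(struct ioctl_gntdev_grant_copy *copy)
{
    for (unsigned int i = 0; i < copy->count; i++)
        copy->segments[i].status = 0;
    return 0;
}

/* HYPOTHETICAL name. With the alias in place the hot path is a single
 * pointer assignment: no allocation, no per-segment copy in or out. */
static int osdep_gnttab_grant_copy_sketch(uint32_t count,
                                          xengnttab_grant_copy_segment_t *segs)
{
    struct ioctl_gntdev_grant_copy copy;

    copy.count = count;
    copy.segments = segs;           /* caller's array is used in place */
    return fake_grant_copy_ioctl(&copy);
}
```

Because the caller's array is handed to the (placeholder) ioctl directly, the status fields are written in place and there is no scratch buffer to size or to share between threads. Platforms without their own definition would keep the generic structure, as noted above.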