|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v3 1/2] Interface for grant copy operation in libs.
On 22/06/16 14:29, Wei Liu wrote:
> On Wed, Jun 22, 2016 at 01:37:50PM +0100, David Vrabel wrote:
>> On 22/06/16 12:21, Wei Liu wrote:
>>> On Wed, Jun 22, 2016 at 10:37:24AM +0100, David Vrabel wrote:
>>>> On 22/06/16 09:38, Paulina Szubarczyk wrote:
>>>>> In a linux part an ioctl(gntdev, IOCTL_GNTDEV_GRANT_COPY, ..)
>>>>> system call is invoked. In mini-os the operation is yet not
>>>>> implemented. For other OSs there is a dummy implementation.
>>>> [...]
>>>>> --- a/tools/libs/gnttab/linux.c
>>>>> +++ b/tools/libs/gnttab/linux.c
>>>>> @@ -235,6 +235,51 @@ int osdep_gnttab_unmap(xengnttab_handle *xgt,
>>>>> return 0;
>>>>> }
>>>>>
>>>>> +int osdep_gnttab_grant_copy(xengnttab_handle *xgt,
>>>>> + uint32_t count,
>>>>> + xengnttab_grant_copy_segment_t *segs)
>>>>> +{
>>>>> + int i, rc;
>>>>> + int fd = xgt->fd;
>>>>> + struct ioctl_gntdev_grant_copy copy;
>>>>> +
>>>>> + copy.segments = calloc(count, sizeof(struct
>>>>> ioctl_gntdev_grant_copy_segment));
>>>>> + copy.count = count;
>>>>> + for (i = 0; i < count; i++)
>>>>> + {
>>>>> + copy.segments[i].flags = segs[i].flags;
>>>>> + copy.segments[i].len = segs[i].len;
>>>>> + if (segs[i].flags == GNTCOPY_dest_gref)
>>>>> + {
>>>>> + copy.segments[i].dest.foreign.ref = segs[i].dest.foreign.ref;
>>>>> + copy.segments[i].dest.foreign.domid =
>>>>> segs[i].dest.foreign.domid;
>>>>> + copy.segments[i].dest.foreign.offset =
>>>>> segs[i].dest.foreign.offset;
>>>>> + copy.segments[i].source.virt = segs[i].source.virt;
>>>>> + }
>>>>> + else
>>>>> + {
>>>>> + copy.segments[i].source.foreign.ref =
>>>>> segs[i].source.foreign.ref;
>>>>> + copy.segments[i].source.foreign.domid =
>>>>> segs[i].source.foreign.domid;
>>>>> + copy.segments[i].source.foreign.offset =
>>>>> segs[i].source.foreign.offset;
>>>>> + copy.segments[i].dest.virt = segs[i].dest.virt;
>>>>> + }
>>>>> + }
>>>>> +
>>>>> + rc = ioctl(fd, IOCTL_GNTDEV_GRANT_COPY, ©);
>>>>> + if (rc)
>>>>> + {
>>>>> + GTERROR(xgt->logger, "ioctl GRANT COPY failed %d ", errno);
>>>>> + }
>>>>> + else
>>>>> + {
>>>>> + for (i = 0; i < count; i++)
>>>>> + segs[i].status = copy.segments[i].status;
>>>>> + }
>>>>> +
>>>>> + free(copy.segments);
>>>>> + return rc;
>>>>> +}
>>>>
>>>> I know Wei asked for this but you've replaced what should be a single
>>>> pointer assignment with a memory allocation and two loops over all the
>>>> segments.
>>>>
>>>> This is a hot path and the two structures (the libxengnttab one and the
>>>> Linux kernel one) are both part of their respective ABIs and won't
>>>> change so Wei's concern that they might change in the future is unfounded.
>>>>
>>>
>>> The fundamental question is: will the ABI between the library and the
>>> kernel ever go mismatch?
>>>
>>> My answer is "maybe". My rationale is that everything goes across
>>> boundary of components need to be considered with caution. And I tend to
>>> assume the worst things will happen.
>>>
>>> To guarantee that they will never go mismatch is to have
>>>
>>> typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t;
>>>
>>> But that's not how the code is written.
>>>
>>> I would like to hear a third opinion. Is my concern unfounded? Am I too
>>> cautious? Is there any compelling argument that I missed?
>>>
>>> Somewhat related, can we have some numbers please? It could well be the
>>> cost of the two loops is much cheaper than whatever is going on inside
>>> the kernel / hypervisor. And it could turn out that the numbers render
>>> this issue moot.
>>
>> I did some (very) adhoc measurements and with the worst case of single
>> short segments for each ioctl, the optimized version of
>> osdep_gnttab_grant_copy() looks to be ~5% faster.
>>
>> This is enough of a difference that we should use the optimized version.
>>
>> The unoptimized version also adds an additional failure path (the
>> calloc) which would be best avoided.
>>
>
> Your test case includes a lot of noise in libc allocator, so...
>
> Can you give try the following patch (apply on top of Paulina's patch)?
> The basic idea is to provide scratch space for the structures. Note, the
> patch is compile test only.
[...]
> +#define COPY_SEGMENT_CACHE_SIZE 1024
Arbitrary limit on number of segments.
> + copy.segments = xgt->osdep_data;
Not thread safe.
I tried using alloca() which has <1% performance penalty but the failure
mode for alloca() is really bad so I would not recommend it.
I think the best solution is to allow the osdep code to provide the
implementation of xengnttab_grant_copy_segment_t, allowing the Linux
code to do:
typedef ioctl_gntdev_grant_copy_segment xengnttab_grant_copy_segment_t
You should still provide the generic structure as well, for those
platforms that don't provide their own optimized version.
David
_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |