
Re: [Xen-devel] [PATCH 5/6] xen-gntalloc: Userspace grant allocation driver



On 12/15/2010 08:05 PM, Jeremy Fitzhardinge wrote:
> On 12/15/2010 06:18 AM, Daniel De Graaf wrote:
>> On 12/14/2010 05:40 PM, Jeremy Fitzhardinge wrote:
>>> On 12/14/2010 02:06 PM, Daniel De Graaf wrote:
>>>>>> +static int gntalloc_mmap(struct file *filp, struct vm_area_struct *vma)
>>>>>> +{
>>>>>> +        struct gntalloc_file_private_data *priv = filp->private_data;
>>>>>> +        struct gntalloc_gref *gref;
>>>>>> +
>>>>>> +        if (debug)
>>>>>> +                printk("%s: priv %p, page %lu\n", __func__,
>>>>>> +                       priv, vma->vm_pgoff);
>>>>>> +
>>>>>> +        /*
>>>>>> +         * There is a 1-to-1 correspondence of grant references to shared
>>>>>> +         * pages, so it only makes sense to map exactly one page per
>>>>>> +         * call to mmap().
>>>>>> +         */
>>>>> Single-page mmap makes sense if the only possible use-cases are for
>>>>> single-page mappings, but if you're talking about framebuffers and the
>>>>> like is seems like a very awkward way to use mmap.  It would be cleaner
>>>>> from an API perspective to have a user-mode defined flat address space
>>>>> indexed by pgoff which maps to an array of grefs, so you can sensibly do
>>>>> a multi-page mapping.
>>>>>
>>>>> It would also allow you to hide the grefs from usermode entirely.  Then
>>>>> its just up to usermode to choose suitable file offsets for itself.
>>>> I considered this, but wanted to keep userspace compatibility with the
>>>> previously created interface.
>>> Is that private to you, or something in broader use?
>> This module was used as part of Qubes (http://www.qubes-os.org). The device
>> path has changed (/dev/gntalloc to /dev/xen/gntalloc), and the API change
>> adds useful functionality, so I don't think we need to keep compatibility.
>> This will also allow cleaning up the interface to remove parameters that
>> make no sense (owner_domid, for example).
> 
> Ah, right.  Well that means it has at least been prototyped, but I don't
> think we should be constrained by the original ABI if we can make clear
> improvements.
> 
>>>>  If there's no reason to avoid doing so, I'll
>>>> change the ioctl interface to allocate an array of grants and calculate the
>>>> offset similar to how gntdev does currently (picks a suitable open slot).
>>> I guess there's three options: you could get the kernel to allocate
>>> extents, make usermode do it, or have one fd per extent and always start
>>> from offset 0.  I guess the last could get very messy if you want to
>>> have lots of mappings...  Making usermode define the offsets seems
>>> simplest and most flexible, because then they can stitch together the
>>> file-offset space in any way that's convenient to them (you just need to
>>> deal with overlaps in that space).
>> Would it be useful to also give userspace control over the offsets in gntdev?
>>
>> One argument for doing it in the kernel is to avoid needing to track what
>> offsets are already being used (and then having the kernel re-check that).
> 
> Hm, yeah, that could be a bit fiddly.  I guess you'd need to stick them
> into an rbtree or something.

Another option that provides more flexibility: add a flag to the create
operation, similar to MAP_FIXED in mmap(), that lets userspace mandate the
offset when it wants control, while defaulting to letting the kernel choose.
We already have a flags field for making the grant writable; this is just
another bit.

>> While this isn't hard, IOCTL_GNTDEV_GET_OFFSET_FOR_VADDR only exists in
>> order to relieve userspace of the need to track its mappings, so this
>> seems to have been a concern before.
> 
> It would be nice to have them symmetric.  However, its easy to implement
> GET_OFFSET_FOR_VADDR either way - given a vaddr, you can look up the vma
> and return its pgoff.
> 
> It looks like GET_OFFSET_FOR_VADDR is just used in xc_gnttab_munmap() so
> that libxc can recover the offset and the page count from the vaddr, so
> that it can pass them to IOCTL_GNTDEV_UNMAP_GRANT_REF.
> 
> Also, it seems to fail unmaps which don't exactly correspond to a
> MAP_GRANT_REF.  I guess that's OK, but it looks a bit strange.

So, implementing an IOCTL_GNTALLOC_GET_OFFSET_FOR_VADDR would be useful in
order to allow gntalloc munmap() to be similar to gnttab's. If we want to
allow a given offset to be mapped to multiple domains, we couldn't just
return the offset; it would have to be a list of grant references, and
the destroy ioctl would need to take the grant reference.

>> Another use case of gntalloc that may prove useful is to have more than
>> one application able to map the same grant within the kernel.
> 
> So you mean have gntalloc allocate one page and the allow multiple
> processes to map and use it?  In that case it would probably be best
> implemented as a filesystem, so you can give proper globally visible
> names to the granted regions, and mmap them as normal files, like shm.

That seems like a better way to expose this functionality. I didn't have a
use case for multiple processes mapping a grant; I just didn't want to rule
it out if supporting it later would be a trivial change. Since implementing
a filesystem is considerably more complex, I think someone needs to find a
use for it before it's written. The current code already lets you map the
areas from multiple processes if you pass the file descriptor around with
fork() or over Unix sockets; that seems sufficient to me.
 
>> Agreed; once mapped, the frame numbers (GFN & MFN) won't change until
>> they are unmapped, so pre-populating them will be better.
> 
> Unless of course you don't want to map the pages in dom0 at all; if you
> just want dom0 to be a facilitator for shared pages between two other
> domains.  Does Xen allow a page to be granted to more than one domain at
> once?

I think so; you'd have to use multiple grant table entries to do it, and it
might trigger hypervisor warnings when the shared page has an unexpectedly
high refcount at unmap time (HVM mappings already trigger such warnings, so
perhaps those warnings should be removed). I can't think of an immediate use
for a 3-domain shared page, but that doesn't mean one doesn't exist. Perhaps
some kind of shared page cache, exported using read-only grants by dom0?

Having dom0 create a pair of grant table entries to let two domUs
communicate seems like a hack to get around the lack of gntalloc in either
domU. This module works perfectly fine in a domU, and the use of a shared
page must be asymmetric no matter how you set it up, so you don't save all
that much code by making the setup identical.

Anyway, you don't have to call mmap() to let another domain access the shared
pages; they are mappable as soon as the ioctl() returns, and remain so
until you call the removal ioctl(). So if you do call mmap(), you probably
expect to use the mapping.

-- 
Daniel De Graaf
National Security Agency

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 

