[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH 3/3] Port of mmap_batch_v2 to support paging in Xen



Hi Konrad,

Thanks for the quick turnaround. I'll incorporate your feedback and
resend. Some NOTES are inline below.

On Sat, Dec 17, 2011 at 9:40 AM, Konrad Rzeszutek Wilk
<konrad@xxxxxxxxxx> wrote:
> On Fri, Dec 16, 2011 at 10:22:21PM -0500, Adin Scannell wrote:
>> This wasn't ported from any patch, but was rewritten based on the XCP 2.6.32
>> tree.  The code structure is significantly different and this patch mirrors 
>> the
>> existing Linux code.
>>
>> The primary reason for need the V2 interface is to support foreign mappings
>> (i.e. qemu) of paged-out pages.  The libxc code will already retry mappings
>> when an ENOENT is returned.  The V2 interface provides a richer error value,
>> so the user-space code is capable of handling these errors specifically.
>
> Can you give more details on how to use paged-out pages. Perhaps a
> pointer to the xen's docs?

Hrm, in usual fashion the docs are a bit lacking.

Simply: the kernel has to do nothing to support paging (other than the
backend drivers which need to handle the grant EAGAIN case -- other
patch).  The only reason the V2 interface is needed here is to pass
back full error codes from the mmu_update()'s. Xen and libxc have a
mutual understanding that when you receive an ENOENT error code, you
back off and try again.

>>
>> Signed-off-by: Adin Scannell <adin@xxxxxxxxxxx>
>>
>> Index: linux/drivers/xen/xenfs/privcmd.c
>> ===================================================================
>> ---
>>  drivers/xen/xenfs/privcmd.c |   90 
>> ++++++++++++++++++++++++++++++++++++++++++-
>
> So that file just moved to drivers/xen/privcmd.c

Of course it has :)

>>  include/xen/privcmd.h       |   10 +++++
>>  2 files changed, 99 insertions(+), 1 deletions(-)
>>
>> diff --git a/drivers/xen/xenfs/privcmd.c b/drivers/xen/xenfs/privcmd.c
>> index dbd3b16..21cbb5a 100644
>> --- a/drivers/xen/xenfs/privcmd.c
>> +++ b/drivers/xen/xenfs/privcmd.c
>> @@ -70,7 +70,7 @@ static void free_page_list(struct list_head *pages)
>>   */
>>  static int gather_array(struct list_head *pagelist,
>>                       unsigned nelem, size_t size,
>> -                     void __user *data)
>> +                     const void __user *data)
>>  {
>>       unsigned pageidx;
>>       void *pagedata;
>> @@ -245,6 +245,15 @@ struct mmap_batch_state {
>>       xen_pfn_t __user *user;
>>  };
>>
>> +struct mmap_batch_v2_state {
>> +     domid_t domain;
>> +     unsigned long va;
>> +     struct vm_area_struct *vma;
>> +     int paged_out;
>
> Should this be unsigned int?

Noted below.

>> +
>> +     int __user *err;
>> +};
>> +
>>  static int mmap_batch_fn(void *data, void *state)
>>  {
>>       xen_pfn_t *mfnp = data;
>> @@ -260,6 +269,20 @@ static int mmap_batch_fn(void *data, void *state)
>>       return 0;
>>  }
>>
>> +static int mmap_batch_v2_fn(void *data, void *state)
>> +{
>> +     xen_pfn_t *mfnp = data;
>> +     struct mmap_batch_v2_state *st = state;
>> +
>> +     int rc = xen_remap_domain_mfn_range(st->vma, st->va & PAGE_MASK, 
>> *mfnp, 1,
>> +                                    st->vma->vm_page_prot, st->domain);
>
> You don't want to check that st is not NULL?

This could be an assertion as it's only used in the
ioctl_mmap_batch_v2 function where it's guaranteed to pass in a
non-null state.

I'll add an additional patch to the queue that adds these assertions
to both mmap_batch_fn and mmap_batch_fn_v2.

>> +             st->paged_out++;
> Is it possible that this ends overflowing and hitting 0?

I don't think this is an issue practically, as the only way to trigger
this would be to be on a 64bit machine and map an ungodly number of
pages with a single mmap_batch (MAX_INT).  For correctness though, I
can update this variable and the err variable in mmap_batch_state
which suffers from the same problem -- turn them into unsigned long.
This will be included in the additional patch.

>> +                        m.arr);
>> +
>> +     if (ret || list_empty(&pagelist))
>> +             goto out;
>> +
>> +     down_write(&mm->mmap_sem);
>> +
>> +     vma = find_vma(mm, m.addr);
>> +     ret = -EINVAL;
>> +     /* We allow multiple shots here, because this interface
>> +      * is used by libxc and mappings for specific pages will
>> +      * be retried when pages are paged-out (ENOENT). */
>> +     if (!vma ||
>> +         vma->vm_ops != &privcmd_vm_ops ||
>> +         (m.addr < vma->vm_start) ||
>> +         ((m.addr + (nr_pages << PAGE_SHIFT)) > vma->vm_end)) {
>> +             up_write(&mm->mmap_sem);
>> +             goto out;
>> +     }
>> +
>> +     state.domain = m.dom;
>
> Should you check the m.dom for incorrect ones? Like -1? or DOMID_IO?

I followed the example for mmap_batch, which just tries the call and
returns whatever error Xen gives.  I think that's the right approach
for these.

Cheers!
-Adin

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-devel


 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.