
Re: [PATCH v4 06/10] memory: batch processing in acquire_resource()



On 03.07.2020 13:17, Julien Grall wrote:
> Hi,
> 
> On 03/07/2020 11:52, Paul Durrant wrote:
>>> -----Original Message-----
>>> From: Julien Grall <julien@xxxxxxx>
>>> Sent: 03 July 2020 11:36
>>> To: Michał Leszczyński <michal.leszczynski@xxxxxxx>; 
>>> xen-devel@xxxxxxxxxxxxxxxxxxxx
>>> Cc: luwei.kang@xxxxxxxxx; tamas.lengyel@xxxxxxxxx; Andrew Cooper 
>>> <andrew.cooper3@xxxxxxxxxx>; George
>>> Dunlap <george.dunlap@xxxxxxxxxx>; Ian Jackson <ian.jackson@xxxxxxxxxxxxx>; 
>>> Jan Beulich
>>> <jbeulich@xxxxxxxx>; Stefano Stabellini <sstabellini@xxxxxxxxxx>; Wei Liu 
>>> <wl@xxxxxxx>; paul@xxxxxxx
>>> Subject: Re: [PATCH v4 06/10] memory: batch processing in acquire_resource()
>>>
>>> (+ Paul as the author of XENMEM_acquire_resource)
>>>
>>> Hi,
>>>
>>> On 30/06/2020 13:33, Michał Leszczyński wrote:
>>>> From: Michal Leszczynski <michal.leszczynski@xxxxxxx>
>>>>
>>>> Allow acquiring large resources by letting acquire_resource()
>>>> process items in batches, using hypercall continuations.
>>>>
>>>> Signed-off-by: Michal Leszczynski <michal.leszczynski@xxxxxxx>
>>>> ---
>>>>    xen/common/memory.c | 32 +++++++++++++++++++++++++++++---
>>>>    1 file changed, 29 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/xen/common/memory.c b/xen/common/memory.c
>>>> index 714077c1e5..3ab06581a2 100644
>>>> --- a/xen/common/memory.c
>>>> +++ b/xen/common/memory.c
>>>> @@ -1046,10 +1046,12 @@ static int acquire_grant_table(struct domain *d, 
>>>> unsigned int id,
>>>>    }
>>>>
>>>>    static int acquire_resource(
>>>> -    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg)
>>>> +    XEN_GUEST_HANDLE_PARAM(xen_mem_acquire_resource_t) arg,
>>>> +    unsigned long *start_extent)
>>>>    {
>>>>        struct domain *d, *currd = current->domain;
>>>>        xen_mem_acquire_resource_t xmar;
>>>> +    uint32_t total_frames;
>>>>        /*
>>>>         * The mfn_list and gfn_list (below) arrays are ok on stack for the
>>>>         * moment since they are small, but if they need to grow in future
>>>> @@ -1077,8 +1079,17 @@ static int acquire_resource(
>>>>            return 0;
>>>>        }
>>>>
>>>> +    total_frames = xmar.nr_frames;
>>>
>>> On 32-bit, the start_extent would be 26 bits wide, which is not enough to
>>> cover all of xmar.nr_frames. Therefore, you want to check that it is
>>> possible to encode a continuation. Something like:
>>>
>>> /* Is the size too large for us to encode a continuation? */
>>> if ( unlikely(xmar.nr_frames > (UINT_MAX >> MEMOP_EXTENT_SHIFT)) )
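
(For reference, the 26-bit figure follows from the way do_memory_op() already
packs the sub-op and the continuation offset into a single cmd value; a rough
sketch of that decoding, paraphrasing xen/common/memory.c:

    long do_memory_op(unsigned long cmd, XEN_GUEST_HANDLE_PARAM(void) arg)
    {
        /* Upper bits of cmd: the extent at which to resume a continuation. */
        unsigned long start_extent = cmd >> MEMOP_EXTENT_SHIFT;
        /* Low MEMOP_EXTENT_SHIFT (6) bits of cmd: the actual sub-op. */
        int op = cmd & MEMOP_CMD_MASK;
        ...
    }

On a 32-bit unsigned long this leaves 32 - 6 = 26 bits for start_extent, so any
nr_frames value that cannot be re-encoded through this shift cannot be resumed.)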
>>>
>>>> +
>>>> +    if ( *start_extent )
>>>> +    {
>>>> +        xmar.frame += *start_extent;
>>>> +        xmar.nr_frames -= *start_extent;
>>>
>>> As start_extent is exposed to the guest, you want to check that it is not
>>> bigger than xmar.nr_frames.
>>>
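
(One way to address this, as a hypothetical sketch rather than part of the
posted patch, would be a bounds check before the adjustment above:

    if ( unlikely(*start_extent >= xmar.nr_frames) )
        return -EINVAL;

Without it, a guest-supplied start_extent larger than xmar.nr_frames would make
xmar.nr_frames wrap around in the subtraction above.)
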
>>>> +        guest_handle_add_offset(xmar.frame_list, *start_extent);
>>>> +    }
>>>> +
>>>>        if ( xmar.nr_frames > ARRAY_SIZE(mfn_list) )
>>>> -        return -E2BIG;
>>>> +        xmar.nr_frames = ARRAY_SIZE(mfn_list);
>>>
>>> The documentation of the hypercall suggests that if you pass NULL, then
>>> it will return the maximum value for nr_frames supported by the
>>> implementation. So technically a domain cannot use more than
>>> ARRAY_SIZE(mfn_list).
>>>
>>> However, your new addition conflicts with the documentation. Can you
>>> clarify how a domain will know that it can use more than
>>> ARRAY_SIZE(mfn_list)?
>>
>> The domain should not need to know. It should be told the maximum number of 
>> frames of the type it wants. If we have to carve that up into batches inside 
>> Xen then the caller should not need to care, right?
> 
> In the current implementation, we tell the guest how many frames it can 
> request in a batch. This number may be much smaller than the maximum 
> number of frames of the type.
> 
> Furthermore, this value is not tied to xmar.type. Therefore, it is 
> valid for a guest to call this hypercall only once at boot to figure out 
> the maximum batch.
> 
> So while the change you suggest looks like a good idea, I don't think it is 
> possible to do that with the current hypercall.

Doesn't the limit simply change to UINT_MAX >> MEMOP_EXTENT_SHIFT,
which then is what should be reported?
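
(If so, the existing NULL-frame_list query path in acquire_resource() could
report that encodable maximum instead of the array size; a hypothetical sketch,
not part of the posted patch:

    if ( guest_handle_is_null(xmar.frame_list) )
    {
        if ( xmar.nr_frames )
            return -EINVAL;

        /* Largest request that can still be resumed via a continuation. */
        xmar.nr_frames = UINT_MAX >> MEMOP_EXTENT_SHIFT;

        if ( __copy_field_to_guest(arg, &xmar, nr_frames) )
            return -EFAULT;

        return 0;
    }

with Xen then carving the request into ARRAY_SIZE(mfn_list)-sized batches
internally, as this patch already does.)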

>>>> @@ -1135,6 +1146,14 @@ static int acquire_resource(
>>>>            }
>>>>        }
>>>>
>>>> +    if ( !rc )
>>>> +    {
>>>> +        *start_extent += xmar.nr_frames;
>>>> +
>>>> +        if ( *start_extent != total_frames )
>>>> +            rc = -ERESTART;
>>>> +    }
>>>> +
>>>>     out:
>>>>        rcu_unlock_domain(d);
>>>>
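
(For context, the -ERESTART produced here is what do_memory_op() later turns
into an actual continuation, roughly along the lines of the existing tail of
that function, paraphrased:

    if ( rc == -ERESTART )
        rc = hypercall_create_continuation(
            __HYPERVISOR_memory_op, "lh",
            op | (start_extent << MEMOP_EXTENT_SHIFT), arg);

which is also why start_extent must remain small enough to survive being
shifted back into cmd.)
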
>>>> @@ -1600,7 +1619,14 @@ long do_memory_op(unsigned long cmd, 
>>>> XEN_GUEST_HANDLE_PARAM(void) arg)
>>>>
>>>>        case XENMEM_acquire_resource:
>>>>            rc = acquire_resource(
>>>> -            guest_handle_cast(arg, xen_mem_acquire_resource_t));
>>>> +            guest_handle_cast(arg, xen_mem_acquire_resource_t),
>>>> +            &start_extent);
>>>
>>> Hmmm... it looks like we forgot to check that start_extent is always 0
>>> when the hypercall was added.
>>>
>>> As this is exposed to the guest, it technically means that there is no
>>> guarantee that start_extent will always be 0.
>>>
>>
>> I don't follow. A start_extent != 0 means you are in a continuation. How can 
>> you check for 0 without breaking continuations?
> 
> I think you misunderstood my point. My point is that we never checked that 
> start_extent was 0. So a guest could validly pass a non-zero value as 
> start_extent and not break on older Xen releases.
> 
> Once this patch is merged, such a guest would behave differently. Or 
> did I miss any check/documentation for the start_extent value?

I think we may already have done the same in the past when enabling
sub-ops for use of continuations. A guest specifying a non-zero
start_extent itself is effectively requesting an undefined sub-op,
with, as a result, undefined behavior.

Jan



 

