[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH v9 02/11] xen/domain: Add vmtrace_size domain creation parameter



On 03.02.2021 17:04, Andrew Cooper wrote:
> On 02/02/2021 09:04, Jan Beulich wrote:
>> On 02.02.2021 00:26, Andrew Cooper wrote:
>>> --- a/xen/common/domain.c
>>> +++ b/xen/common/domain.c
>>> @@ -132,6 +132,56 @@ static void vcpu_info_reset(struct vcpu *v)
>>>      v->vcpu_info_mfn = INVALID_MFN;
>>>  }
>>>  
>>> +static void vmtrace_free_buffer(struct vcpu *v)
>>> +{
>>> +    const struct domain *d = v->domain;
>>> +    struct page_info *pg = v->vmtrace.pg;
>>> +    unsigned int i;
>>> +
>>> +    if ( !pg )
>>> +        return;
>>> +
>>> +    v->vmtrace.pg = NULL;
>>> +
>>> +    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
>>> +    {
>>> +        put_page_alloc_ref(&pg[i]);
>>> +        put_page_and_type(&pg[i]);
>>> +    }
>>> +}
>>> +
>>> +static int vmtrace_alloc_buffer(struct vcpu *v)
>>> +{
>>> +    struct domain *d = v->domain;
>>> +    struct page_info *pg;
>>> +    unsigned int i;
>>> +
>>> +    if ( !d->vmtrace_size )
>>> +        return 0;
>>> +
>>> +    pg = alloc_domheap_pages(d, get_order_from_bytes(d->vmtrace_size),
>>> +                             MEMF_no_refcount);
>>> +    if ( !pg )
>>> +        return -ENOMEM;
>>> +
>>> +    for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
>>> +        if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) )
>>> +            goto refcnt_err;
>>> +
>>> +    /*
>>> +     * We must only let vmtrace_free_buffer() take any action in the 
>>> success
>>> +     * case when we've taken all the refs it intends to drop.
>>> +     */
>>> +    v->vmtrace.pg = pg;
>>> +    return 0;
>>> +
>>> + refcnt_err:
>>> +    while ( i-- )
>>> +        put_page_and_type(&pg[i]);
>>> +
>>> +    return -ENODATA;
>> Would you mind at least logging how many pages may be leaked
>> here? I also don't understand why you don't call
>> put_page_alloc_ref() in the loop - that's fine to do prior to
>> the put_page_and_type(), and will at least limit the leak.
>> The buffer size here typically isn't insignificant, and it
>> may be helpful to not unnecessarily defer the freeing to
>> relinquish_resources() (assuming we will make that one also
>> traverse the list of "extra" pages, but I understand that's
>> not going to happen for 4.15 anymore anyway).
>>
>> Additionally, while I understand you're not in favor of the
>> comments we have on all three similar code paths, I think
>> replicating their comments here would help easily spotting
>> (by grep-ing e.g. for "fishy") all instances that will need
>> adjusting once we have figured a better overall solution.
> 
> How is:
> 
>     for ( i = 0; i < (d->vmtrace_size >> PAGE_SHIFT); i++ )
>         if ( unlikely(!get_page_and_type(&pg[i], d, PGT_writable_page)) )
>             /*
>              * The domain can't possibly know about this page yet, so
> failure
>              * here is a clear indication of something fishy going on.
>              */
>             goto refcnt_err;
> 
>     /*
>      * We must only let vmtrace_free_buffer() take any action in the success
>      * case when we've taken all the refs it intends to drop.
>      */
>     v->vmtrace.pg = pg;
>     return 0;
> 
>  refcnt_err:
>     /*
>      * We can theoretically reach this point if someone has taken 2^43
> refs on
>      * the frames in the time the above loop takes to execute, or
> someone has
>      * made a blind decrease reservation hypercall and managed to pick the
>      * right mfn.  Free the memory we safely can, and leak the rest.
>      */
>     while ( i-- )
>     {
>         put_page_alloc_ref(&pg[i]);
>         put_page_and_type(&pg[i]);
>     }
> 
>     return -ENODATA;
> 
> this?

Much better, thanks. Remains the question of logging the
suspected leak of perhaps many pages. But either way
Acked-by: Jan Beulich <jbeulich@xxxxxxxx>

Jan



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.