
Re: [Xen-devel] [PATCH v3 0/1] xen/blkback: Squeeze page pools if a memory pressure



On Mon, 9 Dec 2019 12:08:10 +0100 "Jürgen Groß" <jgross@xxxxxxxx> wrote:

>On 09.12.19 11:52, SeongJae Park wrote:
>> On Mon, 9 Dec 2019 11:15:22 +0100 "Jürgen Groß" <jgross@xxxxxxxx> wrote:
>>
>>> On 09.12.19 10:46, Durrant, Paul wrote:
>>>>> -----Original Message-----
>>>>> From: Jürgen Groß <jgross@xxxxxxxx>
>>>>> Sent: 09 December 2019 09:39
>>>>> To: Park, Seongjae <sjpark@xxxxxxxxxx>; axboe@xxxxxxxxx;
>>>>> konrad.wilk@xxxxxxxxxx; roger.pau@xxxxxxxxxx
>>>>> Cc: linux-block@xxxxxxxxxxxxxxx; linux-kernel@xxxxxxxxxxxxxxx; Durrant,
>>>>> Paul <pdurrant@xxxxxxxxxx>; sj38.park@xxxxxxxxx; xen-
>>>>> devel@xxxxxxxxxxxxxxxxxxxx
>>>>> Subject: Re: [PATCH v3 0/1] xen/blkback: Squeeze page pools if a memory
>>>>> pressure
>>>>>
>>>>> On 09.12.19 09:58, SeongJae Park wrote:
>>>>>> Each `blkif` has a free pages pool for grant mappings.  The size of
>>>>>> the pool starts at zero and is increased on demand while processing
>>>>>> I/O requests.  When the handling of the current I/O requests is
>>>>>> finished, or 100 milliseconds have passed since the last I/O request
>>>>>> handling, the pool is checked and shrunk so that it does not exceed
>>>>>> the size limit, `max_buffer_pages`.
>>>>>>
>>>>>> Therefore, guests running `blkfront` can create memory pressure in
>>>>>> the guest running `blkback` by attaching a large number of block
>>>>>> devices and inducing I/O on them.
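
To illustrate the pool behaviour described above, here is a simplified
sketch; the `demo_` names and details are illustrative only, not the actual
blkback code (which, among other things, frees the pages in batches):

#include <linux/list.h>
#include <linux/mm.h>
#include <linux/spinlock.h>
#include <xen/grant_table.h>

struct demo_ring {
        spinlock_t              free_pages_lock;
        struct list_head        free_pages;     /* pool of ballooned pages */
        int                     free_pages_num;
};

/* Trim the pool so that at most 'limit' pages stay cached. */
static void demo_shrink_free_pagepool(struct demo_ring *ring, int limit)
{
        struct page *page;
        unsigned long flags;

        spin_lock_irqsave(&ring->free_pages_lock, flags);
        while (ring->free_pages_num > limit) {
                page = list_first_entry(&ring->free_pages, struct page, lru);
                list_del(&page->lru);
                ring->free_pages_num--;
                /* freeing may sleep, so do it outside the lock */
                spin_unlock_irqrestore(&ring->free_pages_lock, flags);
                gnttab_free_pages(1, &page);
                spin_lock_irqsave(&ring->free_pages_lock, flags);
        }
        spin_unlock_irqrestore(&ring->free_pages_lock, flags);
}
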
>>>>>
>>>>> I'm having trouble understanding how a guest can attach a large number
>>>>> of block devices without those having been configured by the host admin
>>>>> beforehand.
>>>>>
>>>>> If those devices have been configured, dom0 should be ready for that
>>>>> number of devices, e.g. by having enough spare memory area for ballooned
>>>>> pages.
>>>>>
>>>>> So either I'm missing something here or your reasoning for the need of
>>>>> the patch is wrong.
>>>>>
>>>>
>>>> I think the underlying issue is that persistent grant support is hogging 
>>>> memory in the backends, thereby compromising scalability. IIUC this patch 
>>>> is essentially a band-aid to get back to the scalability that was possible 
>>>> before persistent grant support was added. Ultimately the right answer 
>>>> should be to get rid of persistent grants support and use grant copy, but 
>>>> such a change is clearly more invasive and would need far more testing.
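
For readers not familiar with the alternative mentioned here: below is a
hedged sketch of what a grant-copy based path could look like in a backend,
using only the public GNTTABOP_copy interface; the helper name and its
surroundings are made up for illustration:

#include <linux/errno.h>
#include <linux/mm.h>
#include <xen/grant_table.h>
#include <xen/page.h>
#include <xen/interface/xen.h>
#include <xen/interface/grant_table.h>

/* Copy 'len' bytes from a frontend grant into a local page without
 * mapping the grant, so no backend page stays pinned afterwards. */
static int demo_copy_from_frontend(domid_t otherend, grant_ref_t gref,
                                   struct page *dst, unsigned int len)
{
        struct gnttab_copy op = {
                .source.u.ref   = gref,
                .source.domid   = otherend,
                .source.offset  = 0,
                .dest.u.gmfn    = xen_page_to_gfn(dst),
                .dest.domid     = DOMID_SELF,
                .dest.offset    = 0,
                .len            = len,
                .flags          = GNTCOPY_source_gref,
        };

        gnttab_batch_copy(&op, 1);      /* issues GNTTABOP_copy */

        return op.status == GNTST_okay ? 0 : -EIO;
}
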
>>>
>>> Persistent grants hog ballooned pages, which are equivalent to memory
>>> only when the backend domain's memory is equal, or rather close, to its
>>> maximum memory size.
>>>
>>> So configuring the backend domain with enough spare area for ballooned
>>> pages should make this problem much less serious.
>>>
>>> Another problem in this area is the number of maptrack frames configured
>>> for a driver domain, which limits the number of concurrent foreign
>>> mappings of that domain.
>>
>> Right, similar problems from other backends are possible.
>>
>>>
>>> So instead of having a blkback specific solution I'd rather have a
>>> common callback for backends to release foreign mappings in order to
>>> enable a global resource management.
>>
>> This patch is also based on a common callback, namely the shrinker
>> callback system.  As the shrinker callback is designed for general memory
>> pressure handling, I thought it was the right one to use.  Other backends
>> with similar problems can use it in their own way.
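
For context, a minimal sketch of that mechanism, assuming the
register_shrinker() interface of this kernel generation; the `demo_` names
are placeholders, not the actual symbols used in the patch:

#include <linux/atomic.h>
#include <linux/init.h>
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/shrinker.h>

/* stand-in for "number of cached free pages across all rings" */
static atomic_t demo_cached_pages = ATOMIC_INIT(0);

static unsigned long demo_pool_count(struct shrinker *shrink,
                                     struct shrink_control *sc)
{
        /* report how many objects (pages) could be released */
        return atomic_read(&demo_cached_pages);
}

static unsigned long demo_pool_scan(struct shrinker *shrink,
                                    struct shrink_control *sc)
{
        /* a real backend would walk its rings and free cached pages here */
        unsigned long freed = min_t(unsigned long, sc->nr_to_scan,
                                    atomic_read(&demo_cached_pages));

        atomic_sub(freed, &demo_cached_pages);
        return freed;
}

static struct shrinker demo_pool_shrinker = {
        .count_objects  = demo_pool_count,
        .scan_objects   = demo_pool_scan,
        .seeks          = DEFAULT_SEEKS,
};

static int __init demo_init(void)
{
        /* the MM core will call the hooks above under memory pressure */
        return register_shrinker(&demo_pool_shrinker);
}
module_init(demo_init);
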
>
> But this is addressing memory shortage only and it is acting globally.
>
> What I'd like to have in some (maybe distant) future is a way to control
> resource usage per guest. Why would you want to throttle performance of
> all guests instead of only the one causing the pain by hogging lots of
> resources?

Good point.  I was also concerned about performance fairness at first, but
settled on this ugly-but-simple solution mainly because my worst-case
performance test (detailed in the 1st patch's commit message) shows no
visible performance degradation, though it is only a minimal test in my test
environment.

Anyway, I agree with your future direction.

>
> The new backend callback should (IMO) take a domid parameter specifying
> the guest whose resources should be taken away (including the possibility
> to select "any domain").
>
> It might be reasonable to have your shrinker hook in e.g. xenbus for
> calling the backend callbacks. And you could have another agent in the
> grant driver reacting to a shortage of available grant mappings.
>
> I don't expect you to implement all of that at once, but I think having
> that idea in mind when addressing current issues would be nice. So as a
> starting point you could move the shrinker hook to xenbus, add the
> generic callback to struct xenbus_driver, populate that callback in
> blkback and call it in the shrinker hook with "any domain". This would
> enable a future extension to other backends and a dynamic resource
> management in a natural way.
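
To make sure I understand the direction, a very rough sketch; the callback
name, its signature, and DOMID_INVALID standing for "any domain" are
placeholders, not an existing interface:

#include <xen/xenbus.h>
#include <xen/interface/xen.h>

/* hypothetical generic callback that could be added to struct
 * xenbus_driver and populated by blkback (and later other backends) */
struct demo_xenbus_driver_ext {
        /* release resources held on behalf of @domid;
         * DOMID_INVALID would mean "any domain" */
        unsigned long (*reclaim_memory)(struct xenbus_device *dev,
                                        domid_t domid);
};

/* The shrinker hook living in xenbus would then walk the registered
 * backend devices and, per device, do something like:
 *
 *      freed += ext->reclaim_memory(dev, DOMID_INVALID);
 *
 * while blkback's implementation squeezes its page pools (and possibly
 * its persistent grants) for the selected domain.
 */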

I appreciate this kind and detailed advice.  I will soon post the second
version, applying your comments.


Thanks,
SeongJae Park

>
>
>Juergen
>
