
Re: [PATCH] Rework CACHE to use a FreeList



On 22/08/2022 08:24, Owen Smith wrote:


-----Original Message-----
From: win-pv-devel <win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On Behalf Of 
Paul Durrant
Sent: 19 August 2022 17:23
To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] Rework CACHE to use a FreeList


On 19/08/2022 11:15, Owen Smith wrote:
The slab allocation method will allocate about a PAGE worth of
objects, and every object will be initialized. If the object's
initializer allocates any resources, this can result in resource
starvation. A particularly bad example of this is the grant table cache, where a 
page of gnttab objects is 253 objects.
This is highlighted by xenvif's queues, where the receiver requires
257 grant references (1 for the ring, and 256 for the ring slots)
which results in 2 slabs, or 506 gnttab objects, reserving 506 grant references.
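
The arithmetic above can be checked with a short sketch. The helper names are hypothetical and are not part of the CACHE or GNTTAB code; the figures 253 and 257 are taken from the text.

```c
#include <assert.h>

/* Hypothetical helper: slabs needed to hand out `needed` objects
 * when each slab holds `per_slab` objects (round up). */
static unsigned slabs_needed(unsigned needed, unsigned per_slab)
{
    assert(per_slab != 0);
    return (needed + per_slab - 1) / per_slab;
}

/* Objects constructed eagerly, i.e. one Ctor call (and here one
 * grant reference) for every object in every slab. */
static unsigned objects_constructed(unsigned needed, unsigned per_slab)
{
    return slabs_needed(needed, per_slab) * per_slab;
}
```

With 253 objects per slab and 257 references needed, slabs_needed() gives 2 and objects_constructed() gives 506, so 249 grant references are reserved but never used.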

Use a FreeList to contain individual objects that are not in use. This
trades an increase in the number of small allocations for a reduction in 
unused, already-constructed objects.
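
A minimal user-mode sketch of the free-list idea, assuming a hypothetical object layout and Ctor/Dtor pair (none of these names come from the patch itself): an object is only constructed when the free list is empty, so the number of held resources tracks actual demand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical object; `resource` stands in for e.g. a grant reference. */
typedef struct object {
    struct object *next;    /* link used only while on the free list */
    int resource;
} object_t;

typedef struct cache {
    object_t *free_list;    /* constructed objects not currently in use */
    unsigned constructed;   /* objects currently holding a resource */
} cache_t;

static int ctor(object_t *obj) { obj->resource = 1; return 0; }
static void dtor(object_t *obj) { obj->resource = 0; }

/* Reuse a free object if there is one; otherwise allocate and
 * construct exactly one new object. */
static object_t *cache_get(cache_t *cache)
{
    object_t *obj = cache->free_list;

    if (obj != NULL) {
        cache->free_list = obj->next;
        obj->next = NULL;
        return obj;
    }
    obj = calloc(1, sizeof(*obj));
    if (obj == NULL)
        return NULL;
    if (ctor(obj) != 0) {
        free(obj);
        return NULL;
    }
    cache->constructed++;
    return obj;
}

/* Return an object to the free list without destroying it. */
static void cache_put(cache_t *cache, object_t *obj)
{
    assert(obj != NULL);
    obj->next = cache->free_list;
    cache->free_list = obj;
}

/* Destroy everything on the free list, releasing the resources. */
static void cache_trim(cache_t *cache)
{
    while (cache->free_list != NULL) {
        object_t *obj = cache->free_list;
        cache->free_list = obj->next;
        dtor(obj);
        free(obj);
        cache->constructed--;
    }
}
```

The cost is one allocation per object rather than one per slab, which is the trade-off the commit message describes.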

Signed-off-by: Owen Smith <owen.smith@xxxxxxxxxx>

Sorry I didn't comment on the RFC; I was on PTO and then snowed under with mail 
etc.
I think this is a sledgehammer to crack a nut. I agree that XENVIF is being a 
grant ref hog... but the correct thing to do there is to re-work the grant 
table cache, not the underlying slab allocator, which I think is fine. The 
problem is (ab)using the slab allocator's Ctor to get the reference. So a free 
list implementation is fine... just in the gnttab code, rather than the cache 
code.

    Paul



This also hits any other use of the CACHE interface where the object's Ctor 
allocates a resource and there are many objects in a slab. XenVbd's segment 
cache allocates a page, which can lead to failures with a large number of VBDs 
(one of our automated test cases fails with a 0x4B NO_PAGES_AVAILABLE when run 
with 256 VBDs - I can't remember exactly how many VBDs are required).

The alternative here would be to rework the GNTTAB interface to use a free list 
rather than the CACHE interface, and rework any other uses where the Ctor could 
allocate significant resources (possibly using a lazy allocation of memory, but 
this could still Fill without enough Spill and lead to resource exhaustion).


Ok, so the problem is not with the use of slabs per se; it's with the eager 
calls to the Ctor. How about, rather than constructing the entire slab, we 
construct a batch and have separate 'constructed' and 'allocated' masks in the 
control structure to track what we've done. When we run out of constructed 
objects, we construct another batch. When we have more than two batches free, 
we free up one of them. How does that sound?

  Paul
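
The batched-construction proposal could be sketched roughly as below. This is an illustration under stated assumptions, not code from the driver: the slab size (32), the batch size (8), the counters standing in for real Ctor/Dtor calls, and every name here are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SLAB_OBJECTS 32u  /* hypothetical; a real slab holds about a page of objects */
#define BATCH_SIZE    8u  /* hypothetical batch size */
#define BATCHES      (SLAB_OBJECTS / BATCH_SIZE)

typedef struct slab {
    uint32_t constructed;  /* bit i set: Ctor has run for object i */
    uint32_t allocated;    /* bit i set: object i is handed out */
    unsigned ctor_calls;   /* counters stand in for real Ctor/Dtor work */
    unsigned dtor_calls;
} slab_t;

static uint32_t batch_mask(unsigned batch)
{
    return ((1u << BATCH_SIZE) - 1u) << (batch * BATCH_SIZE);
}

/* Returns an object index, or -1 if the slab is full. */
static int slab_get(slab_t *slab)
{
    uint32_t avail = slab->constructed & ~slab->allocated;
    unsigned b, i;

    if (avail == 0) {
        /* Out of constructed objects: construct one more batch. */
        for (b = 0; b < BATCHES; b++) {
            if ((slab->constructed & batch_mask(b)) == 0) {
                slab->constructed |= batch_mask(b);
                slab->ctor_calls += BATCH_SIZE;
                break;
            }
        }
        avail = slab->constructed & ~slab->allocated;
        if (avail == 0)
            return -1;
    }
    for (i = 0; i < SLAB_OBJECTS; i++)
        if (avail & (1u << i))
            break;
    slab->allocated |= 1u << i;
    return (int)i;
}

static void slab_put(slab_t *slab, unsigned index)
{
    unsigned b, free_batches = 0, victim = 0;

    assert(index < SLAB_OBJECTS);
    assert(slab->allocated & (1u << index));
    slab->allocated &= ~(1u << index);

    /* Count batches that are constructed but have nothing allocated. */
    for (b = 0; b < BATCHES; b++) {
        uint32_t m = batch_mask(b);
        if ((slab->constructed & m) == m && (slab->allocated & m) == 0) {
            free_batches++;
            victim = b;
        }
    }
    /* More than two free batches: destruct one of them. */
    if (free_batches > 2) {
        slab->constructed &= ~batch_mask(victim);
        slab->dtor_calls += BATCH_SIZE;
    }
}
```

In this sketch the first slab_get() runs the constructor for 8 objects instead of all 32, and releasing objects only triggers destruction once a third batch becomes completely free.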




 

