
Re: [PATCH] Rework CACHE to use a FreeList



On 22/08/2022 11:13, Owen Smith wrote:


-----Original Message-----
From: Paul Durrant <xadimgnik@xxxxxxxxx>
Sent: 22 August 2022 09:22
To: Owen Smith <owen.smith@xxxxxxxxxx>; win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] Rework CACHE to use a FreeList

[CAUTION - EXTERNAL EMAIL] DO NOT reply, click links, or open attachments 
unless you have verified the sender and know the content is safe.

On 22/08/2022 08:24, Owen Smith wrote:


-----Original Message-----
From: win-pv-devel <win-pv-devel-bounces@xxxxxxxxxxxxxxxxxxxx> On
Behalf Of Paul Durrant
Sent: 19 August 2022 17:23
To: win-pv-devel@xxxxxxxxxxxxxxxxxxxx
Subject: Re: [PATCH] Rework CACHE to use a FreeList


On 19/08/2022 11:15, Owen Smith wrote:
The slab allocation method will allocate about a PAGE worth of
objects, and every object will be initialized. If the object's
initializer allocates any resources, this can result in resource
starvation. A particularly bad example of this is the grant table cache, where a 
page of gnttab objects is 253 objects.
This is highlighted by xenvif's queues, where the receiver requires
257 grant references (1 for the ring, and 256 for the ring slots)
which results in 2 slabs, or 506 gnttab objects, reserving 506 grant references.
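
The arithmetic above can be checked with a small sketch. The helper names here are hypothetical and do not come from the driver source; the only inputs taken from the message are 253 objects per slab and 257 required references.

```c
#include <stddef.h>

/* Hypothetical helper: slabs needed to hold `objects` when each slab
 * holds `per_slab` objects (ceiling division). */
static size_t slabs_needed(size_t objects, size_t per_slab)
{
    return (objects + per_slab - 1) / per_slab;
}

/* Hypothetical helper: objects constructed when every object in every
 * slab is initialized eagerly, as described above. */
static size_t objects_constructed(size_t objects, size_t per_slab)
{
    return slabs_needed(objects, per_slab) * per_slab;
}
```

With 257 objects and 253 per slab this gives 2 slabs and 506 constructed objects, so 249 grant references are reserved without being used.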

Use a FreeList to contain individual objects that are not in use.
This trades a larger number of small allocations for fewer unused (and so 
wasted) objects.

Signed-off-by: Owen Smith <owen.smith@xxxxxxxxxx>
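
As a rough illustration of the free-list idea (not the patch itself), objects can be constructed one at a time on demand and parked on a list when released. All names below (`object_t`, `cache_get`, `cache_put`, `cache_drain`) are hypothetical, `calloc`/`free` stand in for the kernel allocator, and the `resource` field stands in for whatever the Ctor would reserve, such as a grant reference.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct object {
    struct object *next;     /* free-list link, meaningful only while free */
    int            resource; /* placeholder for a Ctor-acquired resource */
} object_t;

typedef struct cache {
    object_t *free_list;     /* constructed objects not currently in use */
    size_t    constructed;   /* objects currently holding a resource */
} cache_t;

/* Reuse a free object if one exists; otherwise allocate and construct
 * exactly one new object. */
static object_t *cache_get(cache_t *c)
{
    object_t *obj = c->free_list;

    if (obj != NULL) {
        c->free_list = obj->next;
        obj->next = NULL;
        return obj;
    }

    obj = calloc(1, sizeof(*obj));
    if (obj == NULL)
        return NULL;

    obj->resource = 1;       /* "Ctor": acquire the resource */
    c->constructed++;
    return obj;
}

/* Return an object to the free list without destroying it. */
static void cache_put(cache_t *c, object_t *obj)
{
    obj->next = c->free_list;
    c->free_list = obj;
}

/* Destroy every object on the free list ("Dtor" plus free). */
static void cache_drain(cache_t *c)
{
    while (c->free_list != NULL) {
        object_t *obj = c->free_list;

        c->free_list = obj->next;
        c->constructed--;
        free(obj);
    }
}
```

In this sketch the number of reserved resources tracks peak demand rather than a multiple of the slab size, at the cost of one allocation per object.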

Sorry I didn't comment on the RFC; I was on PTO and then snowed under with mail 
etc.
I think this is a sledgehammer to crack a nut. I agree that XENVIF is being a 
grant ref hog... but the correct thing to do there is to re-work the grant 
table cache, not the underlying slab allocator; which I think is fine. The 
problem is (ab)using the slab allocator's Ctor to get the reference. So a free 
list implementation is fine... just in the gnttab code, rather than the cache 
code.

     Paul



This also hits any other use of the CACHE interface where the object's Ctor 
allocates a resource and there are many objects in a slab. XenVbd's segment 
cache allocates a page, which can lead to failures with a large number of VBDs 
(one of our automated test cases fails with a 0x4D NO_PAGES_AVAILABLE when run 
with 256 VBDs - I can't remember exactly how many VBDs are required).

The alternative here would be to rework the GNTTAB interface to use a
free list rather than the CACHE interface, and rework any other uses
where the Ctor could allocate significant resources (possibly using a
lazy allocation of memory, but this could still Fill without enough
Spill, and so still lead to resource exhaustion).


Ok, so the problem is not with the use of slabs per se; it's with the eager 
calls to the Ctor. How about, rather than constructing the entire slab, we 
construct a batch and have separate 'constructed' and 'allocated' masks in 
the control structure to track what we've done?
When we run out of constructed objects, we construct another batch. When we 
have more than two batches free, we free up one of them. How does that sound?

    Paul
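
One possible reading of the batch proposal above, sketched for a single slab. This is an assumption-laden illustration, not the eventual implementation: the slab size (32), the batch size (8), the type and function names, and the choice of which idle batch to destroy are all hypothetical, and the Ctor/Dtor calls are only counted rather than performed.

```c
#include <stdint.h>

#define SLAB_OBJECTS 32u                    /* assumed objects per slab */
#define BATCH        8u                     /* assumed batch (step) size */
#define NBATCH       (SLAB_OBJECTS / BATCH)

typedef struct slab {
    uint32_t constructed; /* bit i set: Ctor has run for object i */
    uint32_t allocated;   /* bit i set: object i is handed out */
    unsigned ctor_calls;  /* counters standing in for real Ctor/Dtor work */
    unsigned dtor_calls;
} slab_t;

/* Bits covering batch number b. */
static uint32_t batch_mask(unsigned b)
{
    return ((1u << BATCH) - 1u) << (b * BATCH);
}

/* Returns an object index, or -1 if the slab is full. */
static int slab_get(slab_t *s)
{
    uint32_t avail = s->constructed & ~s->allocated;
    unsigned b, i;

    if (avail == 0) {
        /* Out of constructed objects: construct one more batch. */
        for (b = 0; b < NBATCH; b++)
            if ((s->constructed & batch_mask(b)) == 0)
                break;
        if (b == NBATCH)
            return -1;
        s->constructed |= batch_mask(b);
        s->ctor_calls += BATCH;
        avail = s->constructed & ~s->allocated;
    }

    for (i = 0; i < SLAB_OBJECTS; i++) {
        if (avail & (1u << i)) {
            s->allocated |= 1u << i;
            return (int)i;
        }
    }
    return -1; /* not reached: avail is non-zero here */
}

/* Releases object i; destroys one idle batch if more than two are idle. */
static void slab_put(slab_t *s, unsigned i)
{
    unsigned b, idle = 0, last = 0;

    s->allocated &= ~(1u << i);

    for (b = 0; b < NBATCH; b++) {
        uint32_t m = batch_mask(b);

        if ((s->constructed & m) == m && (s->allocated & m) == 0) {
            idle++;
            last = b;
        }
    }

    if (idle > 2) {
        s->constructed &= ~batch_mask(last);
        s->dtor_calls += BATCH;
    }
}
```

Here the first `slab_get` runs the Ctor for 8 objects rather than 32, and releasing everything leaves at most two idle batches constructed.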


Yes, if Ctors are too eager in allocating finite resources, they tend to exhaust 
those resources too quickly. I did look at lazy initialization in the Cache, but 
haven't got very far yet, and it will likely need a new Cache interface version 
to pass the step-size.


I can take a look over the next few days; shouldn't be too much churn in the code.

  Paul





 

