Re: [Xen-devel] [PATCH RESEND RFC 0/8] Memory scrubbing from idle loop
>> This series adds support for scrubbing released pages from the idle loop,
>> making guest destruction significantly faster. For example, destroying a
>> 1TB guest can now be completed in slightly over 1 minute as opposed to
>> about 9 minutes using the existing scrubbing algorithm.
> What is this 1 minute now? I'd have thought that freeing memory would
> become an O(1) splice from the domain pagelist onto the to-be-scrubbed
> list, although I see that isn't quite how you have implemented it.

I believe we are spending a lot of time in relinquish_memory() due to
hypercall restarts, and in general relinquish_memory() is linear in the
number of pages.

>
> As a couple of other thoughts, how about getting rid of the boot-time
> explicit scrub by initialising the entire heap as needing scrubbing?
> That way, scrubbing will start happening as soon as the APs hit their
> idle loops, rather than later in one explicit action once dom0 is
> constructed.

I can try this.

>
> How about marking pages as needing scrubbing at the end of alloc(),
> rather than in free()? That way you don't need to walk the pagelists
> just to set the pending_scrub flag.

Do we know that all allocations will need scrubbing at release time? We
have 'scrub = !!d->is_dying', so it's possible that not all freed pages
need to be scrubbed.

I was also thinking that we really only need to mark the head as needing
a scrub, so perhaps we don't need to walk the whole list setting the
dirty bit.

>
>> The downside of this series is that we sometimes fail to allocate
>> high-order sets of pages since dirty pages may not yet be merged into
>> higher-order sets while they are waiting to be scrubbed.
> I presume there is no sensible way to try and prioritise the pages which
> would have the greatest effect for recreating higher-order sets?

I am currently scrubbing as 'for ( order = MAX_ORDER; order >= 0; order-- )',
but I wonder whether I should do this in ascending order so that dirty
chunks are merged up sooner.

>
>> Briefly, the new algorithm places dirty pages at the end of the heap's
>> page list for each node/zone/order to avoid having to scan the full list
>> while searching for dirty pages. One processor from each node checks
>> whether the node has any dirty pages and, if such pages are found,
>> scrubs them. Scrubbing itself happens without holding the heap lock, so
>> other users may access the heap in the meantime. If, while the idle loop
>> is scrubbing a particular chunk of pages, this chunk is requested by the
>> heap allocator, scrubbing is immediately stopped.
> Why not maintain two lists? That way it is O(1) to find either a clean
> or dirty free page.

Since dirty pages are always at the tail of the page lists we are not
really searching the lists: as soon as a clean page is found (starting
from the tail) we can stop. (A rough sketch of this is appended at the
end of this mail.)

>
>> On the allocation side, alloc_heap_pages() first tries to satisfy the
>> allocation request using only clean pages. If this is not possible, the
>> search is repeated and dirty pages are scrubbed by the allocator.
>>
>> This series needs to be fixed up for ARM and may have a few optimizations
>> added (and possibly some cleanup). I also need to decide what to do about
>> nodes without processors.
>>
>> I originally intended to add per-node heap locks
> Is this why scrubbing is limited to a single processor per node? What
> kind of performance impact does this give?

My reasoning was that one processor would already keep the memory
controller busy enough that adding another core banging on it wouldn't
make things any better.
But I haven't made any comparisons with multiple cores doing this.

-boris
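
A minimal, illustrative sketch of the list handling described above (not the
code from this series): page_chunk, free_list, free_chunk(), scrub_free_list()
and alloc_chunk() below are simplified stand-ins for the real Xen page_info
and heap structures. It models dirty chunks kept as a suffix at the tail of
each node/zone/order list, the idle-loop scrubber walking in from the tail
and stopping at the first clean chunk (or when the allocator claims the chunk
being scrubbed), and the allocator preferring clean chunks but scrubbing a
dirty one itself as a fallback.

#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define PAGE_SIZE 4096u

struct page_chunk {
    struct page_chunk *prev, *next;
    unsigned int order;            /* chunk covers 2^order pages */
    bool need_scrub;               /* freed without scrubbing */
    bool claimed_by_allocator;     /* allocator took it mid-scrub */
    unsigned char *mem;            /* backing memory of the chunk */
};

struct free_list {                 /* one list per node/zone/order */
    struct page_chunk *head, *tail;
};

static void push_head(struct free_list *fl, struct page_chunk *pg)
{
    pg->prev = NULL;
    pg->next = fl->head;
    if ( fl->head ) fl->head->prev = pg; else fl->tail = pg;
    fl->head = pg;
}

static void push_tail(struct free_list *fl, struct page_chunk *pg)
{
    pg->next = NULL;
    pg->prev = fl->tail;
    if ( fl->tail ) fl->tail->next = pg; else fl->head = pg;
    fl->tail = pg;
}

static void unlink_chunk(struct free_list *fl, struct page_chunk *pg)
{
    if ( pg->prev ) pg->prev->next = pg->next; else fl->head = pg->next;
    if ( pg->next ) pg->next->prev = pg->prev; else fl->tail = pg->prev;
    pg->prev = pg->next = NULL;
}

/* Freeing: clean chunks go to the head of the list, dirty ones to the tail. */
static void free_chunk(struct free_list *fl, struct page_chunk *pg, bool dirty)
{
    pg->need_scrub = dirty;
    if ( dirty ) push_tail(fl, pg); else push_head(fl, pg);
}

/*
 * Idle-loop scrubber: dirty chunks form a suffix of the list, so start at
 * the tail and stop at the first clean chunk.  A scrubbed chunk is moved to
 * the head so the dirty suffix keeps shrinking.  The claimed_by_allocator
 * check stands in for "the allocator wants this chunk, stop immediately".
 */
static void scrub_free_list(struct free_list *fl)
{
    while ( fl->tail && fl->tail->need_scrub )
    {
        struct page_chunk *pg = fl->tail;

        for ( size_t i = 0; i < ((size_t)1 << pg->order); i++ )
        {
            if ( pg->claimed_by_allocator )
                return;
            memset(pg->mem + i * PAGE_SIZE, 0, PAGE_SIZE);
        }
        pg->need_scrub = false;
        unlink_chunk(fl, pg);
        push_head(fl, pg);
    }
}

/* Allocation: take a clean chunk from the head, else scrub a dirty one. */
static struct page_chunk *alloc_chunk(struct free_list *fl)
{
    struct page_chunk *pg = fl->head;

    if ( pg && !pg->need_scrub )
    {
        unlink_chunk(fl, pg);
        return pg;
    }
    if ( (pg = fl->tail) == NULL )
        return NULL;
    pg->claimed_by_allocator = true;   /* stops an in-progress idle scrub */
    unlink_chunk(fl, pg);
    for ( size_t i = 0; i < ((size_t)1 << pg->order); i++ )
        memset(pg->mem + i * PAGE_SIZE, 0, PAGE_SIZE);
    pg->need_scrub = false;
    return pg;
}

The sketch only models the list ordering and the early-exit conditions; in
the actual series the scrubbing happens without the heap lock held, and
merging scrubbed pages back into higher-order chunks is handled separately.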