Re: [Xen-devel] [PATCH v3 07/10] xen/arm: Add handling write fault for dirty-page tracing
On Thu, 2013-08-15 at 13:24 +0900, Jaeyong Yoo wrote:
> > Why don't we just context switch the slots for now, only for
> > domains where log dirty is enabled, and then we can measure and
> > see how bad it is etc.
> >
> Here are the measurement results:

Wow, that was quick, thanks.

> For a better understanding of the trade-off between vlpt and
> page-table walk in dirty-page handling, let's consider the following
> two cases:
> - Migrating a single domain at a time
> - Migrating multiple domains concurrently
>
> For each case, the metrics we are going to look at are the following:
> - page-table walk overhead: for handling a single dirty page, a
>   page-table walk requires 6us and vlpt (improved version) requires
>   1.5us. From this, we consider 4.5us of pure overhead compared to
>   vlpt. And it happens for every dirty page.

map_domain_page has a hash table structure in which the PTE entries
are reference counted; however, we don't clear the PTE when the
refcount reaches 0, so if we immediately use it again we don't need
to flush. But we may need to flush if there is a hash table
collision. So in practice there will be a bit more overhead; I'm not
sure how significant that will be. I suppose the chance of collision
depends on the size of the guest.

> - vlpt overhead: the only vlpt overhead is the flushes at context
>   switch. Flushing a 34MB virtual address range (which is what is
>   needed to support a 16GB domU) requires 130us. And it happens
>   whenever two migrating domUs are context switched.
>
> Here are the results:
>
> - Migrating a single domain at a time:
>   * page-table walk overhead: 4.5us * 611 times = 2.7ms
>   * vlpt overhead: 0 (no flush required)
>
> - Migrating two domains concurrently:
>   * page-table walk overhead: 4.5us * 8653 times = 39ms
>   * vlpt overhead: 130us * 357 times = 46ms

The 611, 8653 and 357 here are from an actual test, right? Out of
interest, what was the total time for each case?

> Although the page-table walk gives slightly better performance when
> migrating two domains, I think it is better to choose vlpt for the
> following reasons:
> - In the above tests, I did not run any workloads in the migrating
>   domU, and IIRC, when I run gzip or bonnie++ in the domU, the
>   number of dirty pages grows to a few thousand. Then the page-table
>   walk overhead becomes a few hundred milliseconds even when
>   migrating a single domain.
> - I would expect that migrating a single domain would be used more
>   frequently than migrating multiple domains at a time.

Both of those seem like sound arguments to me.

> One more thing: regarding your comments about TLB lockdown, which
> were:
> > It occurs to me now that with 16 slots changing on context switch
> > and a further 16 aliasing them (and hence requiring maintenance
> > too) for the super pages, it is possible that the TLB maintenance
> > at context switch might get prohibitively expensive. We could
> > address this by firstly only doing it when switching to/from
> > domains which have log dirty mode enabled, and then secondly by
> > seeing if we can make use of global or locked down mappings for
> > the static Xen .text/.data/.xenheap mappings and therefore allow
> > us to use a bigger global flush.
>
> Unfortunately, the Cortex A15 does not appear to support TLB
> lockdown:
> http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.ddi0438d/CHDGEDAE.html

Oh well.

> And I am not sure that setting the global bit of a page table entry
> prevents it from being flushed by a TLB flush operation. If it
> works, we may decrease the vlpt overhead a lot.

Yes, this is something to investigate, but not urgently, I don't
think.

Ian.
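[Editor's note: as a sanity check of the arithmetic above, here is a
minimal C sketch that reproduces the overhead comparison. The per-fault
costs, flush cost, and fault/context-switch counts are the figures
quoted in the thread; the break-even ratio at the end is derived from
those figures, not measured.]

    #include <stdio.h>

    /* Figures quoted in the thread above. */
    static const double walk_us  = 6.0;   /* page-table walk per dirty page */
    static const double vlpt_us  = 1.5;   /* vlpt per dirty page */
    static const double flush_us = 130.0; /* one flush of the 34MB vlpt range */

    int main(void)
    {
        const double delta_us = walk_us - vlpt_us; /* 4.5us extra per fault */

        /* One migrating domain: 611 faults, no flushes required. */
        printf("one domain:  walk +%.1fms, vlpt +%.1fms\n",
               delta_us * 611 / 1000, 0.0);

        /* Two concurrent migrations: 8653 faults, 357 context switches
         * between the two migrating domUs. */
        printf("two domains: walk +%.1fms, vlpt +%.1fms\n",
               delta_us * 8653 / 1000, flush_us * 357 / 1000);

        /* vlpt wins once a domain takes more than flush_us/delta_us
         * ~= 29 dirty-page faults per context switch; the two-domain
         * test saw 8653/357 ~= 24, hence the slight win for the walk. */
        printf("break-even:  %.1f faults per context switch\n",
               flush_us / delta_us);
        return 0;
    }

This also illustrates Jaeyong's point about workloads: the walk-side
cost scales linearly with the dirty-page count, while the flush-side
cost depends only on how often the migrating domUs interleave.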
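[Editor's note: the first mitigation in the quoted suggestion, flushing
only when switching to/from a domain with log dirty enabled, is simple
to express. Below is a minimal sketch of that idea under assumed names;
struct domain_ctx, log_dirty_enabled, flush_tlb_range_va and the VLPT_*
constants are all hypothetical placeholders, not the real Xen
interfaces.]

    #include <stdbool.h>

    #define VLPT_VIRT_START 0x10000000UL  /* placeholder base address */
    #define VLPT_SIZE       (34UL << 20)  /* 34MB window, per the thread */

    struct domain_ctx {
        bool log_dirty_enabled;  /* set while the domain is being migrated */
        /* ... */
    };

    /* Stub for a ranged TLB-flush-by-VA primitive; a real implementation
     * would issue TLB invalidates covering [va, va + size). */
    static void flush_tlb_range_va(unsigned long va, unsigned long size)
    {
        (void)va;
        (void)size;
    }

    static void vlpt_context_switch(struct domain_ctx *prev,
                                    struct domain_ctx *next)
    {
        /* Domains not in log-dirty mode skip the flush entirely, so the
         * 130us cost is only paid when migrating domUs interleave. */
        if ( !prev->log_dirty_enabled && !next->log_dirty_enabled )
            return;

        /* Remap the per-domain vlpt slots for 'next' here (omitted),
         * then drop the stale translations for the whole window. */
        flush_tlb_range_va(VLPT_VIRT_START, VLPT_SIZE);
    }

    int main(void)
    {
        struct domain_ctx idle      = { .log_dirty_enabled = false };
        struct domain_ctx migrating = { .log_dirty_enabled = true };

        vlpt_context_switch(&idle, &migrating);  /* flushes */
        vlpt_context_switch(&idle, &idle);       /* no flush */
        return 0;
    }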