[Xen-ia64-devel] [PATCH] Enable hash vtlb
This patch enables the hash vTLB on para domains.

Kernel build time is about 2040s without this patch and about 2085s with it, so the patch costs about 2% performance. This is due to the two reasons below.

1. This patch lets dom0 support non-contiguous memory, even though the memory allocated to dom0 is contiguous. No TLB entries with page size > 16K are inserted into the machine TLB cache.
2. The itc instruction is now fully emulated, to fix a potential issue.

Previously, the emulation of itc inserted only one 16K TLB entry into the VHPT, without purging the VHPT. The issue arises when the guest inserts a >16K TLB mapping: the old TLB mappings are not purged. The reason this issue has not shown up so far is that all TLB mappings with page size > 16K are identity mappings, so neither the mappings nor their attributes ever change. But once hugetlb is considered, the issue pops up. See the scenario below.

1. A process uses hugetlb to map a file and creates a child process that shares this memory block.
2. The Linux kernel uses copy-on-write to handle this sharing, which means that at this point the hugetlb mapping is read-only for the child process.
3. The child process may read this memory block, which causes many 16K TLB entries with the read-only attribute to be inserted into the VHPT.
4. Then one of the processes may write to this memory block, which causes a hugetlb entry with the r/w attribute to be inserted. With the old emulation of itc, only one 16K TLB entry is inserted into the VHPT, without a VHPT purge, so many old TLB entries with the read-only attribute still reside in the VHPT.

When the child process accesses memory that still has the read-only attribute, an ACCESS_RIGHT fault is delivered to the Linux kernel. The kernel gets confused: this area already has the r/w attribute, so why is an ACCESS_RIGHT fault happening on it? I don't know the exact result, but this is definitely not correct.

Another issue with the emulation of itc: the hypervisor should check whether any guest TRs overlap with this mapping. If so, an MCA happens on the guest OS.
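To make the two itc fixes concrete, here is a minimal sketch of the order of operations described above: check for overlap with guest TRs, purge every cached entry in the range of the new mapping, then insert one 16K entry. It is not the code in the attached diff. The names (sketch_tlb, vhpt, purge_range, emulate_itc, and so on) are all hypothetical, the table is a tiny linear array rather than a hashed VHPT, and address wrap-around at the top of the address space is ignored.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT_16K 14   /* 16K entries, as in the text above */
#define SKETCH_SLOTS   64   /* hypothetical tiny table, not the real VHPT size */

/* Hypothetical software view of one translation; not a real Xen structure. */
struct sketch_tlb {
    uint64_t va;        /* start address, aligned to the page size */
    unsigned ps;        /* log2 of the page size */
    bool     writable;
    bool     valid;
};

enum itc_result { ITC_OK, ITC_MCA };

static struct sketch_tlb vhpt[SKETCH_SLOTS];   /* stand-in for the VHPT */

/* True if [a, a + 2^a_ps) and [b, b + 2^b_ps) share at least one byte. */
static bool ranges_overlap(uint64_t a, unsigned a_ps, uint64_t b, unsigned b_ps)
{
    return a < b + (1ULL << b_ps) && b < a + (1ULL << a_ps);
}

/* Invalidate every cached entry overlapping the range; returns how many. */
static size_t purge_range(uint64_t va, unsigned ps)
{
    size_t purged = 0;
    for (size_t i = 0; i < SKETCH_SLOTS; i++) {
        if (vhpt[i].valid && ranges_overlap(vhpt[i].va, vhpt[i].ps, va, ps)) {
            vhpt[i].valid = false;
            purged++;
        }
    }
    return purged;
}

/* Put one 16K entry into the first free slot; false if the table is full. */
static bool insert_16k(uint64_t va, bool writable)
{
    for (size_t i = 0; i < SKETCH_SLOTS; i++) {
        if (!vhpt[i].valid) {
            vhpt[i] = (struct sketch_tlb){ va, PAGE_SHIFT_16K, writable, true };
            return true;
        }
    }
    return false;
}

/* Return the cached entry covering va, or NULL if there is none. */
static const struct sketch_tlb *lookup(uint64_t va)
{
    for (size_t i = 0; i < SKETCH_SLOTS; i++) {
        if (vhpt[i].valid && va >= vhpt[i].va &&
            va - vhpt[i].va < (1ULL << vhpt[i].ps))
            return &vhpt[i];
    }
    return NULL;
}

/* Sketch of full itc emulation for a guest mapping of size 2^ps at ifa. */
static enum itc_result emulate_itc(const struct sketch_tlb *trs, size_t n_trs,
                                   uint64_t ifa, unsigned ps, bool writable)
{
    assert(ps >= PAGE_SHIFT_16K);
    uint64_t base = ifa & ~((1ULL << ps) - 1);

    /* 1. A mapping that overlaps a guest TR is reported as an MCA. */
    for (size_t i = 0; i < n_trs; i++) {
        if (trs[i].valid && ranges_overlap(trs[i].va, trs[i].ps, base, ps))
            return ITC_MCA;
    }

    /* 2. Drop all stale entries in the range, e.g. old read-only 16K pieces. */
    purge_range(base, ps);

    /* 3. Cache only the 16K piece that covers ifa. */
    insert_16k(ifa & ~((1ULL << PAGE_SHIFT_16K) - 1), writable);
    return ITC_OK;
}
```

In the hugetlb scenario above, step 2 is what removes the leftover read-only 16K entries before the r/w mapping is cached, so a later access refaults and picks up the new attribute instead of hitting a stale entry.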
With the handling above added, the hypervisor can fully virtualize the itc instruction.

Moreover, this patch implements the collision chain of the long-format VHPT, and there is a lot of room to tune performance.

1. What should the ratio be between the memory used by the hash table and the memory used by the collision chains? Currently it is 1:1.
2. What is the maximum collision chain length? Currently it is 15.
3. How should collision chains be recycled? The current implementation recycles all collision chains.
4. What is the best way of mangling the rid? Currently we exchange bytes 1 and 3.
.....

Comments welcome.

Signed-off-by: Anthony Xu <anthony.xu@xxxxxxxxx>

Thanks,
-Anthony

Attachment:
enable_hash_vtlb_0407.diff
_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel