Re: [Xen-ia64-devel] RE: rid virtualization
On Tue, Sep 06, 2005 at 08:31:26AM +0800, Dong, Eddie wrote:
> Do u have any measurement data for the locality in LVHPT Linux
> code? You are the right person know this :-)

I don't have any good measurement data, however in Linux:

(1) RIDs of processes which communicate tend to be close together, because:
    - the communicating processes may have been started together, or
    - one process was created from the other with fork(), which assigns
      two new (probably sequential) RIDs, or
    - one process was created from a server process using fork() in
      response to a client request; all three will then likely have RIDs
      close together.

(2) VPNs tend to be close together, clustered at the bottom of regions.
Text, data, and libraries are all allocated sequentially from the bottom
of regions 2, 3, and 1 respectively. Communicating processes often have
similar address-space layouts (either because they are the same binary,
or because they use similar libraries).

Thus, given that most of the entropy is in the bottom bits of both the
RID and the VPN, RID xor VPN is a very bad hash function: we need to
give the bottom RID bits higher significance.

I now realise that the current Xen mangling function, which moves the
bottom RID bits into bits 16..23, doesn't actually achieve this unless
the VHPT is very large. Bit 16 of the RID produces bit 21 of the thash
address, i.e. sequential RIDs are spaced 2MB apart in the VHPT. RIDs
spaced 8 apart, as those of consecutive Linux processes are, are spaced
16MB apart in the VHPT. Since I think the VHPT in Xen is 16MB, Linux
processes with consecutive RIDs actually collide, which is almost as
pathological as having no mangling at all.

I think the ideal mangling function would reverse the bottom n bits of
the RID, where n is the number of bits used in the hash (and depends on
the VHPT size). Consecutive RIDs would then access diametrically
opposite portions of the VHPT, while consecutive VPNs would retain
cache locality within each half.
Further-apart processes would be progressively more likely to collide,
but are also less likely to be communicating.

Unfortunately, Itanium doesn't provide bit-reversal instructions, which
is why in the Linux long-VHPT work I decided to just do byte reversal on
the bottom n/8 bytes, to approximate the bit reversal. For typical VHPT
sizes n/8 is around 2. Obviously one could write functions that better
approximate bit reversal, in the extreme case using a lookup table for
each byte or nibble, though I'm not sure whether that's worthwhile.

I think it would be worth changing the Xen mangling so that it swaps
bytes 1 and 2 instead of bytes 1 and 3, and seeing if that makes an
improvement.

Matt

_______________________________________________
Xen-ia64-devel mailing list
Xen-ia64-devel@xxxxxxxxxxxxxxxxxxx
http://lists.xensource.com/xen-ia64-devel