[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v9 15/27] xsplice, symbols: Implement fast symbol names -> virtual addresses lookup



>>> On 25.04.16 at 17:35, <konrad.wilk@xxxxxxxxxx> wrote:
> The current mechanism is geared towards fast virtual address ->
> symbol names lookup. This is fine for the normal use cases
> (BUG_ON, WARN_ON, etc), but for xSplice - where we need to find
> hypervisor symbols - it is slow.
> 
> To understand this patch, a description of the existing
> method is explained first. For folks familar go to 'NEW CODE:'.
> 
> HOW IT WORKS:
> 
> The symbol table lookup mechanism uses a simple encoding mechanism
> where it extracts the common ascii characters that the symbol's use.
> 
> This saves us space. The lookup mechanism is geared towards looking
> up symbols based on address. We have one 0..N (where N is
> the number of symbols, so 6849 for example) table:
> 
> symbols_addresses[0..N]
> 
> And an 1-1 (in a loose fashion) of the symbols (encoded) in a
> symbols_names stream of size N.
> 
> The N is variable (later on that below)
> 
> The symbols_names are sorted based on symbols_addresses, which
> means that the decoded entries inside symbols_names are not in
> ascending or descending order.
> 
> There is also the encoding mechanism - the table of 255 entries
> called symbols_token_index[]. And the symbols_token_table which
> is an stream of ASCIIZ characters, such as (it really
> is not a table as the values are variable):
> 
> @0   .asciz  "credit"
> @6   .asciz  "mask"
> ..
> @300 .asciz  "S"
> 
> And the symbols_token_index:
> @0        .short  0
> @1        .short  7
> @2        .short  12
> @4        .short  16
> ...
> @84         .short  300
> 
> The relationship between them is that the symbols_token_index
> gives us the offset to symbols_token_table.
> 
> The symbol_names[] array is a stream of encoded values. Each value
> follows the same pattern - <len> followed by <encoding values>.
> And the another <len> followed by <encoding values>.
> 
> Hence to find the right one you need to read <len>, add <len>
> (to skip over), read <len>, add <len>, and so on until one
> finds the right tuple offset.
> 
> The <encoding values> are the indicies into the symbols_token_index.
> 
> Meaning if you have:
>   0x04, 0x54, 0xda, 0xe2, 0x74
>   [4, 84, 218, 226, 116 in human numbering]
> 
> The 0x04 tells us that the symbol is four bytes past this one (so next
> symbol offset starts at 5). If we lookup symbols_token_index[84] we get 300.
> symbols_token[300] gets us the "S". And so on, the string eventually
> end up being decode to be 'S_stext'. The first character is the type,
> then optionally follwed by the filename (and # right after filename)
> and then lastly the symbol, such as:
> 
> tvpmu_intel.c#core2_vpmu_do_interrupt
> 
> Keep in mind that there are two fixed sized tables:
> symbols_addresses[0..symbols_num_syms], and
> symbols_markers[0..symbols_num_syms/255].
> 
> The symbols_markers is used to speed searching for the right address.
> It gives us the offsets within symbol_names that start at the <len><encoded 
>value>.
> 
> The way to find a symbol based on the address is:
> 1) Figure out the 'tuple offset' from symbols_address[0..symbols_num_syms].
>    This table is sorted by virtual addresses so finding the value is simple.
> 2) Get starting offset of symbol_names by retrieving value of
>    symbol_markers['tuple offset' / 255].
> 3). Iterate up to 'tuple_offset & 255' in symbols_markers stream starting
>    at 'offset'.
> 4). Decode the <len><encoded value>
> 
> This however does not work very well if we want to search the other
> way - we have the symbol name and want to find the address.
> 
> NEW CODE:
> 
> To make that work we add one fixed size table called symbols_sorted_offsets 
> which
> has two elements: offset in symbol stream, offset in the symbol-address.
> 
> This whole array is sorted on the original symbol name during build-time
> (in case of collision we also take into account the type).
> 
> The values are for example:
> 
> symbols_sorted_offsets:
>     .long 83363, 6302 # [.bss, len=5]
>     .long 80459, 6084 # [.data, len=5]
> ..
> [The # added for clarity]
> 
> Which makes it incredibly easy to get in the symbols_names and also
> symbols_addresses (or symbols_offsets)
> 
> Searching for symbols is simplified as we can do a binary search
> on symbols_sorted_offsets. Since the symbols are sorted it takes on
> average 13 calls to symbols_expand_symbol.
> 
> Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx>

Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx>

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.