[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading



El 29/2/16 a les 10:31, Jan Beulich ha escrit:
>>>> On 26.02.16 at 18:02, <roger.pau@xxxxxxxxxx> wrote:
>> El 26/2/16 a les 14:15, Jan Beulich ha escrit:
>>>>>> On 16.02.16 at 18:37, <roger.pau@xxxxxxxxxx> wrote:
>>>> --- a/xen/common/libelf/libelf-loader.c
>>>> +++ b/xen/common/libelf/libelf-loader.c
>>>> @@ -164,20 +164,33 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
>>>> uint64_t pstart)
>>>>      sz = sizeof(uint32_t);
>>>>  
>>>>      /* Space for the elf and elf section headers */
>>>> -    sz += (elf_uval(elf, elf->ehdr, e_ehsize) +
>>>> -           elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize));
>>>> +    sz += elf_uval(elf, elf->ehdr, e_ehsize) +
>>>> +          3 * elf_uval(elf, elf->ehdr, e_shentsize);
>>>
>>> I think a literal 3 like this either needs a #define (at once serving
>>> as documentation) or a comment. Perhaps rather the former,
>>> considering that the (same?) 3 appears again further down. Plus -
>>> what guarantees there are 3 section headers in the image?
>>
>> This is not related to the image itself, but to the metadata that libelf
>> places at the end of the loaded kernel in order to stash the symtab and
>> strtab for the guest to use.
> 
> I understand this, yet imo the literal 3 still should be replaced.

Yes, I agree.

>> The layout is as follows (I should add this to the patch itself as a
>> comment, since I guess this is still quite confusing):
>>
>> +------------------------+
>> |                        |
>> | KERNEL                 |
>> |                        |
>> +------------------------+ pend
>> | ALIGNMENT              |
>> +------------------------+ bsd_symtab_pstart
>> |                        |
>> | size                   | bsd_symtab_pend - bsd_symtab_pstart
>> |                        |
>> +------------------------+
>> |                        |
>> | ELF header             |
>> |                        |
>> +------------------------+
>> |                        |
>> | ELF section header 0   | Dummy section header
>> |                        |
>> +------------------------+
>> |                        |
>> | ELF section header 1   | SYMTAB section header
>> |                        |
>> +------------------------+
>> |                        |
>> | ELF section header 2   | STRTAB section header
>> |                        |
>> +------------------------+
>> |                        |
>> | SYMTAB                 |
>> |                        |
>> +------------------------+
>> |                        |
>> | STRTAB                 |
>> |                        |
>> +------------------------+ bsd_symtab_pend
>>
>> There are always only tree sections because that's all we need, section
>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is
>> used to describe the STRTAB.
> 
> All fine, but this still doesn't clarify how the kernel learns where
> bsd_symtab_pstart is.

The BSDs linker scripts places an "end" symbol after all the loaded
sections:

https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870&view=co

If you execute:

$ nm -n /boot/kernel/kernel

Against a FreeBSD kernel the output is as follows:

[...]
ffffffff81dc4760 B cpu_info
ffffffff81dc4b60 B dpcpu
ffffffff81dc4b68 B smp_tlb_pmap
ffffffff81dc4b70 B drq_rman
ffffffff81dc4bb8 B mem_rman
ffffffff81dc4c00 B irq_rman
ffffffff81dc4c48 B port_rman
ffffffff81dc4c90 B tsc_is_invariant
ffffffff81dc4c94 B tsc_perf_stat
ffffffff81dc4c98 B tsc_freq
ffffffff81dc4ca0 B smp_tsc
ffffffff81dc4ca8 B HYPERVISOR_shared_info
ffffffff81dc4cb0 B xen_vector_callback_enabled
ffffffff81dc4cb8 B HYPERVISOR_start_info
ffffffff81dc4cc0 A _end
ffffffff81dc4cc0 A end

>> According to the ELF spec there can only be
>> one SYMTAB, so that's all we need.
> 
> Did you perhaps overlook the spec saying "... but this restriction
> may be relaxed in the future", plus the fact that we're talking
> about an executable file here (which commonly have two symbol
> tables - .dynsym and .symtab), not an object one? (This isn't to
> say that you need to make the code handle multiple ones, if you
> know that *BSD will only ever have one.)

DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict
requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM
is already loaded by default because it's covered by the program headers.

I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we
wait until the spec is updated? As I read the spec ATM, an ELF file with
multiple SYMTABs is invalid. I assume that if the ELF format ever allows
more than one SYMTAB, the version is going to be bumped at least (or
some other field is going to be used to signal this change).

>>>>      sz = elf_round_up(elf, sz);
>>>>  
>>>> -    /* Space for the symbol and string tables. */
>>>> +    /* Space for the symbol and string table. */
>>>>      for ( i = 0; i < elf_shdr_count(elf); i++ )
>>>>      {
>>>>          shdr = elf_shdr_by_index(elf, i);
>>>>          if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>>>              /* input has an insane section header count field */
>>>>              break;
>>>> -        type = elf_uval(elf, shdr, sh_type);
>>>> -        if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) )
>>>> -            sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>>> +
>>>> +        if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB )
>>>> +            continue;
>>>> +
>>>> +        sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size));
>>>> +        shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link));
>>>> +
>>>> +        if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) )
>>>> +            /* input has an insane section header count field */
>>>> +            break;
>>>> +
>>>> +        if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB )
>>>> +            /* Invalid symtab -> strtab link */
>>>> +            break;
>>>
>>> This is not sufficient - what if sh_link is out of bounds, but the
>>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly
>>> enough you have at least an SHN_UNDEF check in the second
>>> loop below.)
>>
>> The out-of-bounds access would be detected by elf_access_ok (if it's out
>> of the scope of the ELF file, which is all we care about). In fact the
>> elf_access_ok above could be removed because elf_uval already performs
>> out-of-bounds checks. The result is definitely no worse that what we are
>> doing ATM.
> 
> No, the out of bounds check should be more strict than just
> considering the whole image: The image is broken if the link
> points to a non-existing section.

Ah, do you mean I should mark the image as broken if either of the
checks fail?

In this case the image is broken if the sh_link field points to anything
different than a STRTAB section, so I should add a elf_mark_broken
before the break. elf_access_ok already marks the image as broken if an
out-of-bounds access is attempted.

>>>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, 
>>>> uint64_t pstart)
>>>>  
>>>>  static void elf_load_bsdsyms(struct elf_binary *elf)
>>>>  {
>>>> -    ELF_HANDLE_DECL(elf_ehdr) sym_ehdr;
>>>> -    unsigned long sz;
>>>> -    elf_ptrval maxva;
>>>> -    elf_ptrval symbase;
>>>> -    elf_ptrval symtab_addr;
>>>> -    ELF_HANDLE_DECL(elf_shdr) shdr;
>>>> -    unsigned i, type;
>>>> +    /*
>>>> +     * Header that is placed at the end of the kernel and allows
>>>> +     * the OS to find where the symtab and strtab have been loaded.
>>>> +     * It mimics a valid ELF file header, although it only contains
>>>> +     * a symtab and a strtab section.
>>>> +     *
>>>> +     * NB: according to the ELF spec there's only ONE symtab per ELF
>>>> +     * file, and accordingly we will only load the corresponding
>>>> +     * strtab, so we only need three section headers in our fake ELF
>>>> +     * header (first section header is always a dummy).
>>>> +     */
>>>> +    struct {
>>>> +        uint32_t size;
>>>> +        struct {
>>>> +            elf_ehdr header;
>>>> +            elf_shdr section[3];
>>>> +        } __attribute__((packed)) elf_header;
>>>> +    } __attribute__((packed)) header;
>>>
>>> If this is placed at the end, how can the OS reasonably use it
>>> without knowing that there are exactly 3 section header
>>> entries? I.e. how would it find the base of this structure?
>>
>> See the commend below, the base is found at the end of the kernel, and
>> then the ELF header contains the right pointers to the sections headers
>> (by appropriately setting e_shoff).
> 
> Thing is - I can't see which "comment below" you try to refer me to.

I was trying to point you to the big diagram/layout that I've posted at
the start of the email.

It doesn't need to know there are exactly 3 sections, this is fetched
form the ELF header e_shnum field. And the ELF header is fetched using
the "end" symbol as a reference to pend.

>>>> +    ELF_HANDLE_DECL(elf_ehdr) header_handle;
>>>> +    unsigned long shdr_size;
>>>> +    ELF_HANDLE_DECL(elf_shdr) section_handle;
>>>> +    ELF_HANDLE_DECL(elf_shdr) image_handle;
>>>> +    unsigned int i, link;
>>>> +    elf_ptrval header_base;
>>>> +    elf_ptrval elf_header_base;
>>>> +    elf_ptrval symtab_base;
>>>> +    elf_ptrval strtab_base;
>>>>  
>>>>      if ( !elf->bsd_symtab_pstart )
>>>>          return;
>>>>  
>>>> -#define elf_hdr_elm(_elf, _hdr, _elm, _val)     \
>>>> -do {                                            \
>>>> -    if ( elf_64bit(_elf) )                      \
>>>> -        elf_store_field(_elf, _hdr, e64._elm, _val);  \
>>>> -    else                                        \
>>>> -        elf_store_field(_elf, _hdr, e32._elm, _val);  \
>>>> +#define elf_store_field_bitness(_elf, _hdr, _elm, _val)             \
>>>> +do {                                                                \
>>>> +    if ( elf_64bit(_elf) )                                          \
>>>> +        elf_store_field(_elf, _hdr, e64._elm, _val);                \
>>>> +    else                                                            \
>>>> +        elf_store_field(_elf, _hdr, e32._elm, _val);                \
>>>>  } while ( 0 )
>>>>  
>>>> -    symbase = elf_get_ptr(elf, elf->bsd_symtab_pstart);
>>>> -    symtab_addr = maxva = symbase + sizeof(uint32_t);
>>>> -
>>>> -    /* Set up Elf header. */
>>>> -    sym_ehdr = ELF_MAKE_HANDLE(elf_ehdr, symtab_addr);
>>>> -    sz = elf_uval(elf, elf->ehdr, e_ehsize);
>>>> -    elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(sym_ehdr), 
>>>> ELF_HANDLE_PTRVAL(elf->ehdr), sz);
>>>> -    maxva += sz; /* no round up */
>>>> +#define SYMTAB_INDEX    1
>>>> +#define STRTAB_INDEX    2
>>>>  
>>>> -    elf_hdr_elm(elf, sym_ehdr, e_phoff, 0);
>>>> -    elf_hdr_elm(elf, sym_ehdr, e_shoff, elf_uval(elf, elf->ehdr, 
>>>> e_ehsize));
>>>> -    elf_hdr_elm(elf, sym_ehdr, e_phentsize, 0);
>>>> -    elf_hdr_elm(elf, sym_ehdr, e_phnum, 0);
>>>> +    /* Allow elf_memcpy_safe to write to symbol_header. */
>>>> +    elf->caller_xdest_base = &header;
>>>> +    elf->caller_xdest_size = sizeof(header);
>>>>  
>>>> -    /* Copy Elf section headers. */
>>>> -    shdr = ELF_MAKE_HANDLE(elf_shdr, maxva);
>>>> -    sz = elf_shdr_count(elf) * elf_uval(elf, elf->ehdr, e_shentsize);
>>>> -    elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(shdr),
>>>> -                    ELF_IMAGE_BASE(elf) + elf_uval(elf, elf->ehdr, 
>>>> e_shoff),
>>>> -                    sz);
>>>> -    maxva = elf_round_up(elf, (unsigned long)maxva + sz);
>>>> +    /*
>>>> +     * Calculate the position of the various elements in GUEST MEMORY 
>>>> SPACE.
>>>> +     * This addresses MUST only be used with elf_load_image.
>>>> +     *
>>>> +     * NB: strtab_base cannot be calculated at this point because we don't
>>>> +     * know the size of the symtab yet, and the strtab will be placed 
>>>> after it.
>>>> +     */
>>>> +    header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart);
>>>> +    elf_header_base = elf_get_ptr(elf, elf->bsd_symtab_pstart) +
>>>> +                      sizeof(uint32_t);
>>>> +    symtab_base = elf_round_up(elf, header_base + sizeof(header));
>>>> +
>>>> +    /* Fill the ELF header, copied from the original ELF header. */
>>>> +    header_handle = ELF_MAKE_HANDLE(elf_ehdr,
>>>> +                                
>>>> ELF_REALPTR2PTRVAL(&header.elf_header.header));
>>>> +    elf_memcpy_safe(elf, ELF_HANDLE_PTRVAL(header_handle),
>>>> +                    ELF_HANDLE_PTRVAL(elf->ehdr),
>>>> +                    elf_uval(elf, elf->ehdr, e_ehsize));
>>>> +
>>>> +    /* Set the offset to the shdr array. */
>>>> +    elf_store_field_bitness(elf, header_handle, e_shoff,
>>>> +                            offsetof(typeof(header.elf_header), section));
>>>> +
>>>> +    /* Set the right number of section headers. */
>>>> +    elf_store_field_bitness(elf, header_handle, e_shnum, 3);
>>>> +
>>>> +    /* Clear a couple of fields we don't use. */
>>>> +    elf_store_field_bitness(elf, header_handle, e_phoff, 0);
>>>> +    elf_store_field_bitness(elf, header_handle, e_phentsize, 0);
>>>> +    elf_store_field_bitness(elf, header_handle, e_phnum, 0);
>>>
>>> Perhaps better just give the structure above an initializer?
>>
>> Not really, the problem is that we copy the original header from the ELF
>> binary, and then we mangle it. This is just a fixup to make it clear
>> that no program headers are present (we just use section headers in
>> order to describe the SYMTAB and STRTAB positions).
> 
> Ah, I see now.
> 
>>>> +    /* Zero the dummy section. */
>>>
>>> Dummy? And anyway I think you mean "section header" here.
>>
>> Yes, the right comment would be: zero the dummy section header.
> 
> But then still - why "dummy"? The 0-th section header has a purpose
> beyond being a filler, even if that purpose doesn't matter for the
> intentions you have here.

OK, what about if I replace the comment with:

/* Zero the undefined section header. */

I think that's more inline with the specification. I could also fill it
with specific values if you think it's clearer:

elf_store_field_bitness(..., sh_name, 0);
elf_store_field_bitness(..., sh_type, SHT_NULL);
elf_store_field_bitness(..., sh_flags, 0);
[...]

Which is equivalent to zeroing it.

Thanks for the review, Roger.


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxx
http://lists.xen.org/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.