[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v4 3/4] libelf: rewrite symtab/strtab loading
>>> On 29.02.16 at 11:57, <roger.pau@xxxxxxxxxx> wrote: > El 29/2/16 a les 10:31, Jan Beulich ha escrit: >>>>> On 26.02.16 at 18:02, <roger.pau@xxxxxxxxxx> wrote: >>> The layout is as follows (I should add this to the patch itself as a >>> comment, since I guess this is still quite confusing): >>> >>> +------------------------+ >>> | | >>> | KERNEL | >>> | | >>> +------------------------+ pend >>> | ALIGNMENT | >>> +------------------------+ bsd_symtab_pstart >>> | | >>> | size | bsd_symtab_pend - bsd_symtab_pstart >>> | | >>> +------------------------+ >>> | | >>> | ELF header | >>> | | >>> +------------------------+ >>> | | >>> | ELF section header 0 | Dummy section header >>> | | >>> +------------------------+ >>> | | >>> | ELF section header 1 | SYMTAB section header >>> | | >>> +------------------------+ >>> | | >>> | ELF section header 2 | STRTAB section header >>> | | >>> +------------------------+ >>> | | >>> | SYMTAB | >>> | | >>> +------------------------+ >>> | | >>> | STRTAB | >>> | | >>> +------------------------+ bsd_symtab_pend >>> >>> There are always only tree sections because that's all we need, section >>> 0 is ignored, section 1 is used to describe the SYMTAB, and section 2 is >>> used to describe the STRTAB. >> >> All fine, but this still doesn't clarify how the kernel learns where >> bsd_symtab_pstart is. > > The BSDs linker scripts places an "end" symbol after all the loaded > sections: > > https://svnweb.freebsd.org/base/head/sys/conf/ldscript.amd64?revision=284870& > view=co That's fine. But how do they know it is legitimate to even touch what follows, not to speak of assign meaning to the value found there? And are there no alignment/padding considerations necessary? >>> According to the ELF spec there can only be >>> one SYMTAB, so that's all we need. >> >> Did you perhaps overlook the spec saying "... but this restriction >> may be relaxed in the future", plus the fact that we're talking >> about an executable file here (which commonly have two symbol >> tables - .dynsym and .symtab), not an object one? (This isn't to >> say that you need to make the code handle multiple ones, if you >> know that *BSD will only ever have one.) > > DYNSYM is just a subset of SYMTAB, BSDs prefer (it's not a strict > requirement) the SYMTAB because of the in-kernel debugger. Also DYNSYM > is already loaded by default because it's covered by the program headers. > > I can add support for loading multiple SYMTABs/STRTABs, but shouldn't we > wait until the spec is updated? As I read the spec ATM, an ELF file with > multiple SYMTABs is invalid. I assume that if the ELF format ever allows > more than one SYMTAB, the version is going to be bumped at least (or > some other field is going to be used to signal this change). As said I don't see the need for you to add multiple symbol table support. I only meant to point out that the potential exists. >>>>> sz = elf_round_up(elf, sz); >>>>> >>>>> - /* Space for the symbol and string tables. */ >>>>> + /* Space for the symbol and string table. */ >>>>> for ( i = 0; i < elf_shdr_count(elf); i++ ) >>>>> { >>>>> shdr = elf_shdr_by_index(elf, i); >>>>> if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>> /* input has an insane section header count field */ >>>>> break; >>>>> - type = elf_uval(elf, shdr, sh_type); >>>>> - if ( (type == SHT_STRTAB) || (type == SHT_SYMTAB) ) >>>>> - sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>> + >>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_SYMTAB ) >>>>> + continue; >>>>> + >>>>> + sz = elf_round_up(elf, sz + elf_uval(elf, shdr, sh_size)); >>>>> + shdr = elf_shdr_by_index(elf, elf_uval(elf, shdr, sh_link)); >>>>> + >>>>> + if ( !elf_access_ok(elf, ELF_HANDLE_PTRVAL(shdr), 1) ) >>>>> + /* input has an insane section header count field */ >>>>> + break; >>>>> + >>>>> + if ( elf_uval(elf, shdr, sh_type) != SHT_STRTAB ) >>>>> + /* Invalid symtab -> strtab link */ >>>>> + break; >>>> >>>> This is not sufficient - what if sh_link is out of bounds, but the >>>> bogusly accessed field happens to hold SHT_STRTAB? (Oddly >>>> enough you have at least an SHN_UNDEF check in the second >>>> loop below.) >>> >>> The out-of-bounds access would be detected by elf_access_ok (if it's out >>> of the scope of the ELF file, which is all we care about). In fact the >>> elf_access_ok above could be removed because elf_uval already performs >>> out-of-bounds checks. The result is definitely no worse that what we are >>> doing ATM. >> >> No, the out of bounds check should be more strict than just >> considering the whole image: The image is broken if the link >> points to a non-existing section. > > Ah, do you mean I should mark the image as broken if either of the > checks fail? Perhaps, but my main point continue to be that there is a check missing here. >>>>> @@ -186,79 +199,169 @@ void elf_parse_bsdsyms(struct elf_binary *elf, >>>>> uint64_t pstart) >>>>> >>>>> static void elf_load_bsdsyms(struct elf_binary *elf) >>>>> { >>>>> - ELF_HANDLE_DECL(elf_ehdr) sym_ehdr; >>>>> - unsigned long sz; >>>>> - elf_ptrval maxva; >>>>> - elf_ptrval symbase; >>>>> - elf_ptrval symtab_addr; >>>>> - ELF_HANDLE_DECL(elf_shdr) shdr; >>>>> - unsigned i, type; >>>>> + /* >>>>> + * Header that is placed at the end of the kernel and allows >>>>> + * the OS to find where the symtab and strtab have been loaded. >>>>> + * It mimics a valid ELF file header, although it only contains >>>>> + * a symtab and a strtab section. >>>>> + * >>>>> + * NB: according to the ELF spec there's only ONE symtab per ELF >>>>> + * file, and accordingly we will only load the corresponding >>>>> + * strtab, so we only need three section headers in our fake ELF >>>>> + * header (first section header is always a dummy). >>>>> + */ >>>>> + struct { >>>>> + uint32_t size; >>>>> + struct { >>>>> + elf_ehdr header; >>>>> + elf_shdr section[3]; >>>>> + } __attribute__((packed)) elf_header; >>>>> + } __attribute__((packed)) header; >>>> >>>> If this is placed at the end, how can the OS reasonably use it >>>> without knowing that there are exactly 3 section header >>>> entries? I.e. how would it find the base of this structure? >>> >>> See the commend below, the base is found at the end of the kernel, and >>> then the ELF header contains the right pointers to the sections headers >>> (by appropriately setting e_shoff). >> >> Thing is - I can't see which "comment below" you try to refer me to. > > I was trying to point you to the big diagram/layout that I've posted at > the start of the email. > > It doesn't need to know there are exactly 3 sections, this is fetched > form the ELF header e_shnum field. And the ELF header is fetched using > the "end" symbol as a reference to pend. Ah, okay, so the ABI is _not_ for the kernel to start looking from the end, but to start looking from its own image end. >>>>> + /* Zero the dummy section. */ >>>> >>>> Dummy? And anyway I think you mean "section header" here. >>> >>> Yes, the right comment would be: zero the dummy section header. >> >> But then still - why "dummy"? The 0-th section header has a purpose >> beyond being a filler, even if that purpose doesn't matter for the >> intentions you have here. > > OK, what about if I replace the comment with: > > /* Zero the undefined section header. */ > > I think that's more inline with the specification. I could also fill it > with specific values if you think it's clearer: > > elf_store_field_bitness(..., sh_name, 0); > elf_store_field_bitness(..., sh_type, SHT_NULL); > elf_store_field_bitness(..., sh_flags, 0); > [...] > > Which is equivalent to zeroing it. Just zeroing is fine afaic. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxx http://lists.xen.org/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |