Re: [PATCH RFC] x86/lld: fix symbol map generation


  • To: Jan Beulich <jbeulich@xxxxxxxx>
  • From: Roger Pau Monné <roger.pau@xxxxxxxxxx>
  • Date: Tue, 3 May 2022 11:15:02 +0200
  • Cc: Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Wei Liu <wl@xxxxxxx>, xen-devel@xxxxxxxxxxxxxxxxxxxx
  • Delivery-date: Tue, 03 May 2022 09:15:29 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On Tue, May 03, 2022 at 10:17:44AM +0200, Jan Beulich wrote:
> On 02.05.2022 17:20, Roger Pau Monne wrote:
> > The symbol map generation (and thus the debug info attached to Xen) is
> > partially broken when using LLVM LD.  That's due to LLD converting
> > almost all symbols from global to local in the last linking step, and
> 
> I'm puzzled by "almost" - is there a pattern of which ones aren't
> converted?

This is the list of the ones that aren't converted:

__x86_indirect_thunk_r11
s3_resume
start
__image_base__
__high_start
wakeup_stack
wakeup_stack_start
handle_exception
dom_crash_sync_extable
common_interrupt
__x86_indirect_thunk_rbx
__x86_indirect_thunk_rcx
__x86_indirect_thunk_rax
__x86_indirect_thunk_rdx
__x86_indirect_thunk_rbp
__x86_indirect_thunk_rsi
__x86_indirect_thunk_rdi
__x86_indirect_thunk_r8
__x86_indirect_thunk_r9
__x86_indirect_thunk_r10
__x86_indirect_thunk_r12
__x86_indirect_thunk_r13
__x86_indirect_thunk_r14
__x86_indirect_thunk_r15

I assume there's some kind of pattern, but I haven't yet been able to
spot what triggers the conversion from global to local in lld.
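
One way to narrow down the pattern is to compare the symbol bindings
before and after the link. The following is only a sketch: the helper
name demoted_syms and the listing file names are made up for
illustration, and it assumes both inputs are `nm -pa --format=bsd`
listings (one of prelink.o, one of the linked image).

```sh
#!/bin/sh
# Print every symbol that is global text ("T") in the first nm listing
# but local text ("t") in the second one.
# $1: nm listing of the link input, $2: nm listing of the link output.
demoted_syms() {
    awk 'NR == FNR { if (NF == 3 && $2 == "T") g[$3] = 1; next }
         NF == 3 && $2 == "t" && ($3 in g) { print $3 }' "$1" "$2"
}
```

Usage would be along the lines of dumping both listings to files and
running `demoted_syms before.nm after.nm`; symbols that are global in
the input but missing from the output of this helper are the ones that
stayed global.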

> Also "last linking step" is ambiguous, as we link three binaries and
> aiui the issue is present on each of these passes. May I suggest
> "... when linking actual executables" or (still somewhat ambiguous)
> "... when linking final binaries"?
> 
> > thus confusing tools/symbols into adding a file prefix to all text
> > symbols. The result looks like:
> > 
> > Xen call trace:
> >    [<ffff82d040449fe8>] R xxhash64.c#__start_xen+0x3938/0x39c0
> >    [<ffff82d040203734>] F __high_start+0x94/0xa0
> > 
> > In order to work around this, create a list of global symbols prior to
> > the linking step, and use objcopy to convert the symbols in the final
> > binary back to global before processing with tools/symbols.
> > 
> > Signed-off-by: Roger Pau Monné <roger.pau@xxxxxxxxxx>
> > ---
> > I haven't found a way to prevent LLD from converting the symbols, so
> > I've come up with this rather crappy workaround.
> 
> Perhaps a map file (like we use for shared libraries in tools/) would
> allow doing so? But of course this would want to be machine-generated,
> not manually maintained.
> 
> Have you gained any insight into _why_ they are doing what they do?

I've informally asked on IRC but got no reply.  I've now created this:

https://discourse.llvm.org/t/conversion-of-text-symbols-from-global-to-local

> > Not applied to EFI, partially because I don't have an environment with
> > LLD capable of generating the EFI binary.
> > 
> > Obtaining the global symbol list could likely be a target of its own,
> > if it is to be shared between the ELF and the EFI binary generation.
> 
> If, as the last paragraph of the description is worded, you did this
> just once (as a prereq), I could see this working.

Yes, my comment was about splitting the:

$(NM) -pa --format=bsd $< | awk '{ if($$2 == "T") print $$3}' \
      > $(@D)/.$(@F).global-syms

rune into a separate $(TARGET)-syms.global-syms target or some such.
Not sure it's really worth it.

> Otherwise (as you
> have it now, with it done 3 times) it would first require splitting
> the linking rules into many separate ones (which has been the plan
> anyway, but so far I didn't get to it).
> 
> > --- a/xen/arch/x86/Makefile
> > +++ b/xen/arch/x86/Makefile
> > @@ -134,24 +134,34 @@ $(TARGET): $(TARGET)-syms $(efi-y) $(obj)/boot/mkelf32
> >  CFLAGS-$(XEN_BUILD_EFI) += -DXEN_BUILD_EFI
> >  
> >  $(TARGET)-syms: $(objtree)/prelink.o $(obj)/xen.lds
> > +   # Dump global text symbols before the linking step
> > +   $(NM) -pa --format=bsd $< | awk '{ if($$2 == "T") print $$3}' \
> > +       > $(@D)/.$(@F).global-syms
> >     $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
> > -       $(objtree)/common/symbols-dummy.o -o $(@D)/.$(@F).0
> > +       $(objtree)/common/symbols-dummy.o -o $(@D)/.$(@F).0.tmp
> > +   # LLVM LD has converted global symbols into local ones as part of the
> > +   # linking step, convert those back to global before using tools/symbols.
> > +   $(OBJCOPY) --globalize-symbols=$(@D)/.$(@F).global-syms \
> > +       $(@D)/.$(@F).0.tmp $(@D)/.$(@F).0
> >     $(NM) -pa --format=sysv $(@D)/.$(@F).0 \
> >             | $(objtree)/tools/symbols $(all_symbols) --sysv --sort \
> >             >$(@D)/.$(@F).0.S
> >     $(MAKE) $(build)=$(@D) $(@D)/.$(@F).0.o
> >     $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
> > -       $(@D)/.$(@F).0.o -o $(@D)/.$(@F).1
> > +       $(@D)/.$(@F).0.o -o $(@D)/.$(@F).1.tmp
> > +   $(OBJCOPY) --globalize-symbols=$(@D)/.$(@F).global-syms \
> > +       $(@D)/.$(@F).1.tmp $(@D)/.$(@F).1
> >     $(NM) -pa --format=sysv $(@D)/.$(@F).1 \
> >             | $(objtree)/tools/symbols $(all_symbols) --sysv --sort $(syms-warn-dup-y) \
> >             >$(@D)/.$(@F).1.S
> >     $(MAKE) $(build)=$(@D) $(@D)/.$(@F).1.o
> >     $(LD) $(XEN_LDFLAGS) -T $(obj)/xen.lds -N $< $(build_id_linker) \
> > -       $(orphan-handling-y) $(@D)/.$(@F).1.o -o $@
> > +       $(orphan-handling-y) $(@D)/.$(@F).1.o -o $@.tmp
> > +   $(OBJCOPY) --globalize-symbols=$(@D)/.$(@F).global-syms $@.tmp $@
> 
> Is this very useful? It only affects ...
> 
> >     $(NM) -pa --format=sysv $(@D)/$(@F) \
> >             | $(objtree)/tools/symbols --all-symbols --xensyms --sysv --sort \
> >             >$(@D)/$(@F).map
> 
> ... the actual map file; what's in the binary and in this map file doesn't
> depend on local vs global anymore (and you limit this to text symbols
> anyway; I wonder in how far livepatching might also be affected by the
> same issue with data symbols).

If I don't add this step then the map file will also end up with lines
like:

0xffff82d0405b6968 b lib/xxhash64.c#iommuv2_enabled
0xffff82d0405b6970 b lib/xxhash64.c#nr_ioapic_sbdf
0xffff82d0405b6980 b lib/xxhash64.c#ioapic_sbdf

I see the same happen with other non-text symbols, so I would likely
need to extend the fixing to preserve all global symbols from the
input file, not just text ones.
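
As a rough sketch of what such an extension could look like, the awk
filter can accept any upper-case (global) type letter instead of only
"T". The function name global_syms is hypothetical, and the set of
letters below is an assumption: undefined (U) and weak (V, W) symbols
are deliberately left out as a conservative choice. Note that outside
of a Makefile recipe the `$$` escapes become a plain `$`.

```sh
#!/bin/sh
# Read `nm -pa --format=bsd` output on stdin and print the name of each
# defined symbol whose type letter marks it as global:
# A (absolute), B (bss), C (common), D (data), G (small data),
# R (read-only data), S (small bss), T (text).
# Lines with fewer than three fields (e.g. undefined symbols, which have
# no value column) are skipped.
global_syms() {
    awk 'NF == 3 && $2 ~ /^[ABCDGRST]$/ { print $3 }'
}
```

The output could then be fed to the same
`$(OBJCOPY) --globalize-symbols=` invocation as in the patch, e.g.
`nm -pa --format=bsd prelink.o | global_syms > list`.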

> In any event I would like to ask that the objcopy invocations be tied to
> lld being in use. No matter that it shouldn't, objcopy can alter binaries
> even if no actual change is being made (I've just recently observed this
> with xen.efi, see the thread rooted at "EFI: strip xen.efi when putting it
> on the EFI partition", and recall that at least for GNU binutils objcopy
> and strip are effectively [almost] the same binary).

Right, that's fine.  I would still hope to find a better solution,
this is quite crappy IMO.
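
A minimal sketch of gating the objcopy step on the linker, assuming the
`--version` banner of LLD contains the string "LLD" (the helper names
is_lld and fixup_globals are made up for illustration, and the real
build would more likely do this check once at configuration time):

```sh
#!/bin/sh
# Succeed if the given linker version banner looks like LLVM LLD.
is_lld() {
    case "$1" in
        *LLD*) return 0 ;;
        *)     return 1 ;;
    esac
}

# $1: linker banner, $2: file listing the symbols to globalize,
# $3: freshly linked temporary binary, $4: final output name.
# With LLD, convert the listed symbols back to global; with any other
# linker, only rename the file so the binary is not touched by objcopy.
fixup_globals() {
    if is_lld "$1"; then
        "${OBJCOPY:-objcopy}" --globalize-symbols="$2" "$3" "$4"
    else
        mv -f "$3" "$4"
    fi
}
```

It would be called as
`fixup_globals "$($LD --version | head -n 1)" list in.tmp out`, so that
GNU ld builds never run objcopy on the image at all.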

Thanks, Roger.



 

