[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v8 01/28] linkage: new macros for assembler symbols



Hi,

this time a more detailed look. :)

> Subject: Re: [PATCH v8 01/28] linkage: new macros for assembler symbols

Patch subject needs a verb, like, for example:

"linkage: Introduce new macros for assembler symbols"

On Thu, Aug 08, 2019 at 12:38:27PM +0200, Jiri Slaby wrote:
> Introduce new C macros for annotations of functions and data in
> assembly. There is a long-standing mess in macros like ENTRY, END,
> ENDPROC and similar. They are used in different manners and sometimes
> incorrectly.
> 
> So introduce macros with clear use to annotate assembly as follows:
> 
> a) Support macros for the ones below
>    SYM_T_FUNC -- type used by assembler to mark functions
>    SYM_T_OBJECT -- type used by assembler to mark data
>    SYM_T_NONE -- type used by assembler to mark entries of unknown type
> 
>    They are defined as STT_FUNC, STT_OBJECT, and STT_NOTYPE
>    respectively. According to the gas manual, this is the most portable
>    way. I am not sure about other assemblers, so we can switch this back
>    to %function and %object if this turns into a problem. Architectures
>    can also override them by something like ", @function" if they need.
> 
>    SYM_A_ALIGN, SYM_A_NONE -- align the symbol?
>    SYM_L_GLOBAL, SYM_L_WEAK, SYM_L_LOCAL -- linkage of symbols
> 
> b) Mostly internal annotations, used by the ones below
>    SYM_ENTRY -- use only if you have to (for non-paired symbols)
>    SYM_START -- use only if you have to (for paired symbols)
>    SYM_END -- use only if you have to (for paired symbols)
> 
> c) Annotations for code
>    SYM_INNER_LABEL_ALIGN -- only for labels in the middle of code
>    SYM_INNER_LABEL -- only for labels in the middle of code
> 
>    SYM_FUNC_START_LOCAL_ALIAS -- use where there are two local names for
>       one function
>    SYM_FUNC_START_ALIAS -- use where there are two global names for one
>       function
>    SYM_FUNC_END_ALIAS -- the end of LOCAL_ALIASed or ALIASed function
> 
>    SYM_FUNC_START -- use for global functions
>    SYM_FUNC_START_NOALIGN -- use for global functions, w/o alignment
>    SYM_FUNC_START_LOCAL -- use for local functions
>    SYM_FUNC_START_LOCAL_NOALIGN -- use for local functions, w/o
>       alignment
>    SYM_FUNC_START_WEAK -- use for weak functions
>    SYM_FUNC_START_WEAK_NOALIGN -- use for weak functions, w/o alignment
>    SYM_FUNC_END -- the end of SYM_FUNC_START_LOCAL, SYM_FUNC_START,
>       SYM_FUNC_START_WEAK, ...
> 
>    For functions with special (non-C) calling conventions:
>    SYM_CODE_START -- use for non-C (special) functions
>    SYM_CODE_START_NOALIGN -- use for non-C (special) functions, w/o
>       alignment
>    SYM_CODE_START_LOCAL -- use for local non-C (special) functions
>    SYM_CODE_START_LOCAL_NOALIGN -- use for local non-C (special)
>       functions, w/o alignment
>    SYM_CODE_END -- the end of SYM_CODE_START_LOCAL or SYM_CODE_START
> 
> d) For data
>    SYM_DATA_START -- global data symbol
>    SYM_DATA_START_LOCAL -- local data symbol
>    SYM_DATA_END -- the end of the SYM_DATA_START symbol
>    SYM_DATA_END_LABEL -- the labeled end of SYM_DATA_START symbol
>    SYM_DATA -- start+end wrapper around simple global data
>    SYM_DATA_LOCAL -- start+end wrapper around simple local data
> 
> ==========
> 
> The macros allow to pair starts and ends of functions and mark functions
> correctly in the output ELF objects.
> 
> All users of the old macros in x86 are converted to use these in further
> patches.
> 
> [v2]
> * use SYM_ prefix and sane names
> * add SYM_START and SYM_END and parametrize all the macros
> 
> [v3]
> * add SYM_DATA, SYM_DATA_LOCAL, and SYM_DATA_END_LABEL
> 
> [v4]
> * add _NOALIGN versions of some macros
> * add _CODE_ derivates of _FUNC_ macros
> 
> [v5]
> * drop "SIMPLE" from data annotations
> * switch NOALIGN and ALIGN variants of inner labels
> * s/visibility/linkage/; s@SYM_V_@SYM_L_@
> * add Documentation
> 
> [v6]
> * fixed typos found by Randy Dunlap
> * remove doubled INNER_LABEL macros, one pair was unused
> 
> [v8]
> * use lkml.kernel.org for links
> * link the docs from index.rst (by bpetkov)
> * fixed typos on the docs

Patch version history which doesn't belong in the commit message goes...

> Signed-off-by: Jiri Slaby <jslaby@xxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Boris Ostrovsky <boris.ostrovsky@xxxxxxxxxx>
> Cc: hpa@xxxxxxxxx
> Cc: Ingo Molnar <mingo@xxxxxxxxxx>
> Cc: jpoimboe@xxxxxxxxxx
> Cc: Juergen Gross <jgross@xxxxxxxx>
> Cc: Len Brown <len.brown@xxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: linux-kernel@xxxxxxxxxxxxxxx
> Cc: linux-pm@xxxxxxxxxxxxxxx
> Cc: mingo@xxxxxxxxxx
> Cc: Pavel Machek <pavel@xxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Cc: "Rafael J. Wysocki" <rjw@xxxxxxxxxxxxx>
> Cc: Thomas Gleixner <tglx@xxxxxxxxxxxxx>
> Cc: xen-devel@xxxxxxxxxxxxxxxxxxxx
> Cc: x86@xxxxxxxxxx
> ---

... here, under the "---" so that git can ignore it when applying.

>  Documentation/asm-annotations.rst | 217 ++++++++++++++++++++++++++
>  Documentation/index.rst           |   8 +
>  arch/x86/include/asm/linkage.h    |  10 +-
>  include/linux/linkage.h           | 245 +++++++++++++++++++++++++++++-
>  4 files changed, 469 insertions(+), 11 deletions(-)
>  create mode 100644 Documentation/asm-annotations.rst
> 
> diff --git a/Documentation/asm-annotations.rst 
> b/Documentation/asm-annotations.rst
> new file mode 100644
> index 000000000000..a967648e4325
> --- /dev/null
> +++ b/Documentation/asm-annotations.rst
> @@ -0,0 +1,217 @@
> +Assembler Annotations
> +=====================
> +
> +Copyright (c) 2017-2019 Jiri Slaby
> +
> +This document describes the new macros for annotation of data and code in
> +assembly. In particular, it contains information about ``SYM_FUNC_START``,
> +``SYM_FUNC_END``, ``SYM_CODE_START``, and similar.
> +
> +Rationale
> +---------
> +Some code like entries, trampolines, or boot code needs to be written in
> +assembly. The same as in C, we group such code into functions and accompany

"we" is? Make this passive: "... such code is grouped into... "

> +them with data. Standard assemblers do not force users into precisely marking
> +these pieces as code, data, or even specifying their length. Nevertheless,
> +assemblers provide developers with such marks to aid debuggers throughout

s/marks/annotations/

> +assembly. On the top of that, developers also want to stamp some functions as

On top of that, ...

s/stamp/mark/

> +*global* to be visible outside of their translation units.

s/to be visible/in order to be visible/

> +
> +Over time, the Linux kernel took over macros from various projects (like

s/took over/has adopted/

> +``binutils``) to ease these markings.

s/ease these markings/facilitate such annotations/

> So for historic reasons, we have been

s/we have been/people have been/

> +using ``ENTRY``, ``END``, ``ENDPROC``, and other annotations in assembly. Due
> +to the lack of their documentation, the macros are used in rather wrong
> +contexts at some locations. Clearly, ``ENTRY`` was intended for starts of
> +global symbols (be it data or code).

s/for starts of global symbols/to denote the beginning of global symbols/

> ``END`` used to be the end of data or end

s/used to be/used to mark/

> +of special functions with *non-standard* calling convention. In contrast,
> +``ENDPROC`` should annotate only ends of *standard* functions.
> +
> +When these macros are used correctly, they help assemblers to generate a nice

s/to //

> +object with both sizes and types set correctly. For example the result of

comma: For example, the result of...

> +``arch/x86/lib/putuser.S``::
> +
> +   Num:    Value          Size Type    Bind   Vis      Ndx Name
> +    25: 0000000000000000    33 FUNC    GLOBAL DEFAULT    1 __put_user_1
> +    29: 0000000000000030    37 FUNC    GLOBAL DEFAULT    1 __put_user_2
> +    32: 0000000000000060    36 FUNC    GLOBAL DEFAULT    1 __put_user_4
> +    35: 0000000000000090    37 FUNC    GLOBAL DEFAULT    1 __put_user_8
> +
> +This is not only important for debugging purposes. When we have properly
> +marked objects like this, we can run tools on them and let the tools generate
> +more useful information. In particular, on properly marked objects, we can 
> run
> +``objtool`` and let it check and fix the object if needed.

Passive formulation pls, i.e., get rid of the "we". Pls check and fix
the whole text.

> Currently, it can
> +report missing frame pointer setup/destruction in functions. It can also
> +automatically generate annotations for *ORC unwinder* (cf.
> +<Documentation/x86/orc-unwinder.txt>) for most code. Both of these are
> +especially important to support reliable stack traces which are in turn
> +necessary for *Kernel live patching* (see
> +<Documentation/livepatch/livepatch.txt>).
> +
> +Caveat and Discussion
> +---------------------
> +As one might realize, there were only three macros previously. That is indeed
> +insufficient to cover all the combinations of cases:
> +
> +* standard/non-standard function
> +* code/data
> +* global/local symbol
> +
> +We had a discussion_ and instead of extending the current ``ENTRY/END*``
> +macros, it was decided that we should introduce brand new macros instead::
> +
> +    So how about using macro names that actually show the purpose, instead
> +    of importing all the crappy, historic, essentially randomly chosen
> +    debug symbol macro names from the binutils and older kernels?
> +
> +.. _discussion: 
> https://lkml.kernel.org/r/20170217104757.28588-1-jslaby@xxxxxxx
> +
> +Macros Description
> +------------------
> +
> +The new macros are prefixed with the ``SYM_`` prefix and can be divided into
> +three main groups:
> +
> +1. ``SYM_FUNC_*`` -- to annotate C-like functions. This means functions with
> +   standard C calling conventions, i.e. the stack contains a return address 
> at
> +   the predefined place and a return from the function can happen in a
> +   standard way. When frame pointers are enabled, save/restore of frame
> +   pointer shall happen at the start/end of a function, respectively, too.
> +
> +   Checking tools like ``objtool`` should ensure such marked functions 
> conform
> +   to these rules. The tools can also easily annotate these functions with
> +   debugging information (like *ORC data*) automatically.
> +
> +2. ``SYM_CODE_*`` -- special functions called with special stack. Be it
> +   interrupt handlers with special stack content, trampolines, or startup
> +   functions.
> +
> +   Checking tools mostly ignore checking of these functions. But some debug
> +   information still can be generated automatically. For correct debug data,
> +   this code needs hints like ``UNWIND_HINT_REGS`` provided by developers.
> +
> +3. ``SYM_DATA*`` -- obviously data belonging to ``.data`` sections and not to
> +   ``.text``. Data do not contain instructions, so they have to be treated
> +   specially by the tools: they should not treat the bytes as instructions,
> +   nor assign any debug information to them.
> +
> +Instruction Macros
> +~~~~~~~~~~~~~~~~~~
> +This section covers ``SYM_FUNC_*`` and ``SYM_CODE_*`` enumerated above.
> +
> +* ``SYM_FUNC_START`` and ``SYM_FUNC_START_LOCAL`` are supposed to be **the
> +  most frequent markings**. They are used for functions with standard calling
> +  conventions -- global and local. Like in C, they both align the functions 
> to
> +  architecture specific ``__ALIGN`` bytes. There are also ``_NOALIGN`` 
> variants
> +  for special cases where developers do not want this implicit alignment.
> +
> +  We offer also ``SYM_FUNC_START_WEAK`` and ``SYM_FUNC_START_WEAK_NOALIGN``
> +  marks as an assembler counterpart of the *weak* attribute known from C.

s/marks/markings/

                        ... counterpart to the *weak*...

> +
> +  All of these **shall** be coupled with ``SYM_FUNC_END``. First, it marks
> +  the sequence of instructions as a function and computes its size to the
> +  generated object file. Second, it also eases checking and processing such
> +  object files as the tools can trivially find exact start and end of a
> +  function.

I'd change that to "... find exact function boundaries."

> +
> +  So in most cases, developers should write something like in the following
> +  example, having more instructions in between the macros, of course::
> +
> +    SYM_FUNC_START(function_hook)
> +        retq
> +    SYM_FUNC_END(function_hook)

Just use "...asm insns ..." in there so that you don't have to say
"having more instructions in between the macros, of course":

        SYM_FUNC_START(function_hook)
                ... asm insns ...
        SYM_FUNC_END(function_hook)


> +  In fact, this kind of annotation corresponds to now deprecated ``ENTRY`` 
> and
> +  ``ENDPROC``.

"... corresponds to the now deprecated ``ENTRY`` and ``ENDPROC`` macros."

> +
> +* ``SYM_FUNC_START_ALIAS`` and ``SYM_FUNC_START_LOCAL_ALIAS`` serve for those
> +  who decided to have two or more names for one function. The typical use 
> is::
> +
> +    SYM_FUNC_START_ALIAS(__memset)
> +    SYM_FUNC_START(memset)
> +        ...
> +    SYM_FUNC_END(memset)
> +    SYM_FUNC_END_ALIAS(__memset)
> +
> +  In this example, one can call ``__memset`` or ``memset`` with the same
> +  result. Except the debug information for the instructions is generated to
> +  the object file only once -- for the non-``ALIAS`` case.

One sentence:

"... with the same result, except for the debugging information which is
generated in the object file only once - for the non-aliasing function."

> +
> +* ``SYM_CODE_START`` and ``SYM_CODE_START_LOCAL`` should be used only in
> +  special cases -- if you know what you are doing. This is used exclusively
> +  for interrupt handlers and similar where the calling convention is not the 
> C
> +  one. ``_NOALIGN`` variants exist too. The use is the same as for the 
> ``FUNC``
> +  category above::
> +
> +    SYM_CODE_START_LOCAL(bad_put_user)
> +        movl $-EFAULT,%eax
> +        EXIT

        ... asm insns ...

> +    SYM_CODE_END(bad_put_user)
> +
> +  Again, every ``SYM_CODE_START*`` **shall** be coupled by ``SYM_CODE_END``.

Btw, this coupling: I haven't gone through the whole patchset but do we
have automatic checking which makes sure a starting macro is coupled
with the correct ending macro?

> +
> +  To some extent, this category corresponds to deprecated ``ENTRY`` and
> +  ``END``. Except ``END`` had several other meanings too.
> +
> +* ``SYM_INNER_LABEL*`` is used to denote a label inside some
> +  ``SYM_{CODE,FUNC}_START`` and ``SYM_{CODE,FUNC}_END``.  They are very 
> similar
> +  to C labels, except they can be made global. An example of use::
> +
> +    SYM_CODE_START(ftrace_caller)
> +        /* save_mcount_regs fills in first two parameters */
> +        ...
> +
> +    SYM_INNER_LABEL(ftrace_caller_op_ptr, SYM_L_GLOBAL)
> +        /* Load the ftrace_ops into the 3rd parameter */
> +        ...
> +
> +    SYM_INNER_LABEL(ftrace_call, SYM_L_GLOBAL)
> +        call ftrace_stub
> +        ...
> +        retq
> +    SYM_CODE_END(ftrace_caller)
> +
> +Data Macros
> +~~~~~~~~~~~
> +Similar to instructions, we have a couple of macros to describe data in the
> +assembly.

> Again, they help debuggers to understand the layout of the resulting
> +object files.

No need for that sentence - I think it is clear by now what they're for.
:-)

> +
> +* ``SYM_DATA_START`` and ``SYM_DATA_START_LOCAL`` mark the start of some data
> +  and shall be used in conjunction with either ``SYM_DATA_END``, or
> +  ``SYM_DATA_END_LABEL``. The latter adds also a label to the end, so that
> +  people can use ``lstack`` and (local) ``lstack_end`` in the following
> +  example::
> +
> +    SYM_DATA_START_LOCAL(lstack)
> +        .skip 4096
> +    SYM_DATA_END_LABEL(lstack, SYM_L_LOCAL, lstack_end)
> +
> +* ``SYM_DATA`` and ``SYM_DATA_LOCAL`` are variants for simple, mostly 
> one-line
> +  data::
> +
> +    SYM_DATA(HEAP,     .long rm_heap)
> +    SYM_DATA(heap_end, .long rm_stack)
> +
> +  In the end, they expand to ``SYM_DATA_START`` with ``SYM_DATA_END``
> +  internally.
> +
> +Support Macros
> +~~~~~~~~~~~~~~
> +All the above reduce themselves to some invocation of ``SYM_START``,
> +``SYM_END``, or ``SYM_ENTRY`` at last. Normally, developers should avoid 
> using
> +these.
> +
> +Further, in the above examples, one could see ``SYM_L_LOCAL``. There are also
> +``SYM_L_GLOBAL`` and ``SYM_L_WEAK``. All are intended to denote linkage of a
> +symbol marked by them. They are used either in ``_LABEL`` variants of the
> +earlier macros, or in ``SYM_START``.
> +
> +
> +Overriding Macros
> +~~~~~~~~~~~~~~~~~
> +Architecture can also override any of the macros in their own

"Other architectures... "

> +``asm/linkage.h``, including macros specifying the type of a symbol
> +(``SYM_T_FUNC``, ``SYM_T_OBJECT``, and ``SYM_T_NONE``).  As every macro
> +described in this file is surrounded by ``#ifdef`` + ``#endif``, it is enough
> +to define the macros differently in the aforementioned architecture-dependent
> +header.

Thx.

-- 
Regards/Gruss,
    Boris.

Good mailing practices for 400: avoid top-posting and trim the reply.

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.