[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH 7/7] qobject atomics osdep: Make a few macros more hygienic



On Thu, Aug 31, 2023 at 03:25:46PM +0200, Markus Armbruster wrote:

[This paragraph written last: Bear with my stream of consciousness
review below, where I end up duplicating some of the conslusions you
reached before the point where I saw where the patch was headed]

> Variables declared in macros can shadow other variables.  Much of the
> time, this is harmless, e.g.:
> 
>     #define _FDT(exp)                                                  \
>         do {                                                           \
>             int ret = (exp);                                           \
>             if (ret < 0) {                                             \
>                 error_report("error creating device tree: %s: %s",   \
>                         #exp, fdt_strerror(ret));                      \
>                 exit(1);                                               \
>             }                                                          \
>         } while (0)

Which is why I've seen some projects require a strict namespace
separation: if all macro parameters and any identifiers declared in
macros use either a leading or a trailing _ (I prefer a trailing one,
to avoid risking conflicts with libc reserved namespace; but leading
is usually okay), and all other identifiers avoid that namespace, then
you will never have shadowing by calling a macro from normal code.

> 
> Harmless shadowing in h_client_architecture_support():
> 
>         target_ulong ret;
> 
>         [...]
> 
>         ret = do_client_architecture_support(cpu, spapr, vec, fdt_bufsize);
>         if (ret == H_SUCCESS) {
>             _FDT((fdt_pack(spapr->fdt_blob)));
>             [...]
>         }
> 
>         return ret;
> 
> However, we can get in trouble when the shadowed variable is used in a
> macro argument:
> 
>     #define QOBJECT(obj) ({                                 \
>         typeof(obj) o = (obj);                              \
>         o ? container_of(&(o)->base, QObject, base) : NULL; \
>      })
> 
> QOBJECT(o) expands into
> 
>     ({
> --->    typeof(o) o = (o);
>         o ? container_of(&(o)->base, QObject, base) : NULL;
>     })
> 
> Unintended variable name capture at --->.  We'd be saved by
> -Winit-self.  But I could certainly construct more elaborate death
> traps that don't trigger it.

Indeed, not fully understanding the preprocessor makes for some
interesting death traps.

> 
> To reduce the risk of trapping ourselves, we use variable names in
> macros that no sane person would use elsewhere.  Here's our actual
> definition of QOBJECT():
> 
>     #define QOBJECT(obj) ({                                         \
>         typeof(obj) _obj = (obj);                                   \
>         _obj ? container_of(&(_obj)->base, QObject, base) : NULL;   \
>     })

Yep, goes back to the policy I've seen of enforcing naming conventions
that ensure macros can't shadow normal code.

> 
> Works well enough until we nest macro calls.  For instance, with
> 
>     #define qobject_ref(obj) ({                     \
>         typeof(obj) _obj = (obj);                   \
>         qobject_ref_impl(QOBJECT(_obj));            \
>         _obj;                                       \
>     })
> 
> the expression qobject_ref(obj) expands into
> 
>     ({
>         typeof(obj) _obj = (obj);
>         qobject_ref_impl(
>             ({
> --->            typeof(_obj) _obj = (_obj);
>                 _obj ? container_of(&(_obj)->base, QObject, base) : NULL;
>             }));
>         _obj;
>     })
> 
> Unintended variable name capture at --->.

Yep, you've just proven how macros calling macros requires care, as we
no longer have the namespace protections.  If macro calls can only
form a DAG, it is possible to choose unique intermediate declarations
for every macro (although remembering to do that, and still keeping
the macro definition legible, may not be easy).  But if you can have
macros that form any sort of nesting loop (and we WANT to allow things
like MIN(MIN(a, b), c) - which is such a loop), dynamic naming becomes
the only solution.

> 
> The only reliable way to prevent unintended variable name capture is
> -Wshadow.

Yes, I would love to have that enabled eventually.

> 
> One blocker for enabling it is shadowing hiding in function-like
> macros like
> 
>      qdict_put(dict, "name", qobject_ref(...))
> 
> qdict_put() wraps its last argument in QOBJECT(), and the last
> argument here contains another QOBJECT().
> 
> Use dark preprocessor sorcery to make the macros that give us this
> problem use different variable names on every call.

Sounds foreboding; hopefully not many macros are affected...

> 
> Signed-off-by: Markus Armbruster <armbru@xxxxxxxxxx>
> ---
>  include/qapi/qmp/qobject.h |  8 +++++---
>  include/qemu/atomic.h      | 11 ++++++-----
>  include/qemu/osdep.h       | 34 +++++++++++++++++++---------------
>  3 files changed, 30 insertions(+), 23 deletions(-)

...okay, the size of the diffstat seems tolerable (good that we don't
have many macros that expand to a temporary variable declaration and
which are likely to be heavily reused from nested contexts).

> 
> diff --git a/include/qapi/qmp/qobject.h b/include/qapi/qmp/qobject.h
> index 9003b71fd3..7b50fc905d 100644
> --- a/include/qapi/qmp/qobject.h
> +++ b/include/qapi/qmp/qobject.h
> @@ -45,10 +45,12 @@ struct QObject {
>      struct QObjectBase_ base;
>  };
>  
> -#define QOBJECT(obj) ({                                         \
> -    typeof(obj) _obj = (obj);                                   \
> -    _obj ? container_of(&(_obj)->base, QObject, base) : NULL;   \
> +#define QOBJECT_INTERNAL(obj, l) ({                                     \
> +    typeof(obj) PASTE(_obj, l) = (obj);                                 \
> +    PASTE(_obj, l)                                                      \
> +        ? container_of(&(PASTE(_obj, l))->base, QObject, base) : NULL;  \
>  })
> +#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__)

Slick!  Every call to QOBJECT() defines a unique temporary variable
name.  But as written, QOBJECT(o) expands to this (when __COUNTER__ is
1):

({
  typeof((o)) _obj1 = ((o));
  _obj1 ? container_of(&(_obj1)->base, QObject, base) : NULL;
})

which has three sets of redundant parens that could be elided.  Why do
you need to add () around obj when forwarding to QOBJECT_INTERNAL()?
The only way the expansion of the text passed through the 'obj'
parameter can contain a comma is if the user has already parenthesized
it on their end (not that I expect a comma expression to be a frequent
argument to QOBJECT(), but who knows).  Arguing that it is to silence
a syntax checker is weak; since we must NOT add parens around the
parameters to QOBJECT_INTERNAL (calling PASTE((_obj), (l)) is
obviously wrong).

Meanwhile, the repetition of three calls to PASTE() is annoying.  How
about:

#define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \
  typeof _obj _tmp = _obj; \
  _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \
  )}
#define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp)
#define QOBJECT_INTERNAL(_arg, _ctr) \
  QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr))
#define QOBJECT(obj) QOBJECT_INTERNAL((obj), __COUNTER__)

or:

#define QOBJECT_INTERNAL_2(_obj, _tmp) ({ \
  typeof(_obj) _tmp = (_obj); \
  _tmp ? container_of(&_tmp->base, QObject, base) : NULL; \
  )}
#define QOBJECT_INTERNAL_1(_arg, _tmp) QOBJECT_INTERNAL_2(_arg, _tmp)
#define QOBJECT_INTERNAL(_arg, _ctr) \
  QOBJECT_INTERNAL_1(_arg, PASTE(_obj, _ctr))
#define QOBJECT(obj) QOBJECT_INTERNAL(obj, __COUNTER__)

where, in either case, QOBJECT(o) should then expand to

({
  typeof (o) _obj1 = (o);
  _obj1 ? container_of(&_obj1->base, QObject, base) : NULL;
})

>  
>  /* Required for qobject_to() */
>  #define QTYPE_CAST_TO_QNull     QTYPE_QNULL
> diff --git a/include/qemu/atomic.h b/include/qemu/atomic.h
> index d95612f7a0..3f80ffac69 100644
> --- a/include/qemu/atomic.h
> +++ b/include/qemu/atomic.h
> @@ -157,13 +157,14 @@
>      smp_read_barrier_depends();
>  #endif
>  
> -#define qatomic_rcu_read(ptr)                          \
> -    ({                                                 \
> +#define qatomic_rcu_read_internal(ptr, l)               \
> +    ({                                                  \
>      qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
> -    typeof_strip_qual(*ptr) _val;                      \
> -    qatomic_rcu_read__nocheck(ptr, &_val);             \
> -    _val;                                              \
> +    typeof_strip_qual(*ptr) PASTE(_val, l);             \
> +    qatomic_rcu_read__nocheck(ptr, &PASTE(_val, l));    \
> +    PASTE(_val, l);                                     \
>      })
> +#define qatomic_rcu_read(ptr) qatomic_rcu_read_internal((ptr), __COUNTER__)

Same observation about being able to reduce the number of PASTE()
calls by adding yet another layer of macro invocations.

>  
>  #define qatomic_rcu_set(ptr, i) do {                   \
>      qemu_build_assert(sizeof(*ptr) <= ATOMIC_REG_SIZE); \
> diff --git a/include/qemu/osdep.h b/include/qemu/osdep.h
> index 21ef8f1699..9c191ebe99 100644
> --- a/include/qemu/osdep.h
> +++ b/include/qemu/osdep.h
> @@ -371,18 +371,21 @@ void QEMU_ERROR("code path is reachable")
>   * have to open-code it.  Sadly, Coverity is severely confused by the
>   * constant variants, so we have to dumb things down there.
>   */
> +#define PASTE(a, b) a##b
> +#define MIN_INTERNAL(a, b, l)                                           \
> +    ({                                                                  \
> +        typeof(1 ? (a) : (b)) PASTE(_a, l) = (a), PASTE(_b, l) = (b);   \
> +        PASTE(_a, l) < PASTE(_b, l) ? PASTE(_a, l) : PASTE(_b, l);      \
> +    })

And again.

I think you are definitely on the right track to have all internal
variable declarations within a macro definition use multi-layered
expansion with the help of __COUNTER__ to ensure that the macro's
temporary variable is globally unique; so if you leave it as written,
I could live with:

Reviewed-by: Eric Blake <eblake@xxxxxxxxxx>

But if you respin it to pick up any of my suggestions, I'll definitely
spend another review cycle on the result.

If it helps, here's what I ended up using in nbdkit:

https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/unique-name.h#L36
/* https://stackoverflow.com/a/1597129
 * https://stackoverflow.com/a/12711226
 */
#define XXUNIQUE_NAME(name, counter) name ## counter
#define XUNIQUE_NAME(name, counter) XXUNIQUE_NAME (name, counter)
#define NBDKIT_UNIQUE_NAME(name) XUNIQUE_NAME (name, __COUNTER__)

https://gitlab.com/nbdkit/nbdkit/-/blame/master/common/include/minmax.h#L47
#include "unique-name.h"

#undef MIN
#define MIN(x, y) \
  MIN_1 ((x), (y), NBDKIT_UNIQUE_NAME (_x), NBDKIT_UNIQUE_NAME (_y))

...
#define MIN_1(x, y, _x, _y) ({                 \
      __auto_type _x = (x);                    \
      __auto_type _y = (y);                    \
      _x < _y ? _x : _y;                       \
    })

-- 
Eric Blake, Principal Software Engineer
Red Hat, Inc.
Virtualization:  qemu.org | libguestfs.org




 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.