
Re: Refactoring of a possibly unsafe pattern for variable initialization via function calls


  • To: nicola <nicola.vetrini@xxxxxxxxxxx>
  • From: Jan Beulich <jbeulich@xxxxxxxx>
  • Date: Fri, 16 Jun 2023 09:19:12 +0200
  • Cc: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>, Wei Liu <wl@xxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Fri, 16 Jun 2023 07:19:37 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

On 15.06.2023 18:39, nicola wrote:
> while investigating possible patches regarding Mandatory Rule 9.1, I
> found the following pattern, which is likely to result in a lot of possible
> positives from many (all) static analysis tools for this rule.
> 
> This is the current status (taken from `xen/common/device_tree.c:135')
> 
> 
> const struct dt_property *dt_find_property(const struct dt_device_node *np,
>                                             const char *name, u32 *lenp)
> {
>      const struct dt_property *pp;
> 
>      if ( !np )
>          return NULL;
> 
>      for ( pp = np->properties; pp; pp = pp->next )
>      {
>          if ( dt_prop_cmp(pp->name, name) == 0 )
>          {
>              if ( lenp )
>                  *lenp = pp->length;
>              break;
>          }
>      }
> 
>      return pp;
> }
> 
> 
> 
> 
> It's very hard to detect that the pointee is always written whenever a 
> non-NULL pointer for `lenp' is supplied, and it can safely be read in 
> the callee, so a sound analysis will err on the cautious side.

I'm having trouble seeing why this is hard to recognize: The loop can
only be exited two ways: pp == NULL or with *lenp written.
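
To make that concrete, here is a minimal sketch with made-up, simplified
types (struct prop, find_prop() and prop_len_or() are hypothetical
stand-ins, not the Xen code, and strcmp() stands in for dt_prop_cmp()).
It only illustrates the two exit paths and a caller that reads the
length solely when a match was returned:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for struct dt_property. */
struct prop {
    const char *name;
    uint32_t length;
    const struct prop *next;
};

/* Same control flow as the quoted function, minus the node argument. */
static const struct prop *find_prop(const struct prop *head,
                                    const char *name, uint32_t *lenp)
{
    const struct prop *pp;

    assert(name != NULL);

    for ( pp = head; pp; pp = pp->next )
    {
        if ( strcmp(pp->name, name) == 0 )
        {
            if ( lenp )
                *lenp = pp->length; /* exit via break: *lenp written */
            break;
        }
    }

    return pp; /* otherwise the loop ran off the end: pp == NULL */
}

/* Caller reading its local only on the path where it was written. */
static uint32_t prop_len_or(const struct prop *head, const char *name,
                            uint32_t fallback)
{
    uint32_t len; /* deliberately no initializer */

    if ( !find_prop(head, name, &len) )
        return fallback;

    return len;
}
```

A caller shaped like prop_len_or() never reads len on the NULL path;
whether a given tool can prove that is a separate question.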

For rule 9.1 I'd rather expect the scanning tool (and often the compiler)
to get into trouble with the NULL return value case, where *lenp isn't
written yet is apparently consumed in the caller. Then, however, ...

> My proposal, in a future patch, is to refactor these kinds of functions 
> as follows:
> 
> 
> const struct dt_property *dt_find_property(const struct dt_device_node *np,
>                                             const char *name, u32 *lenp)
> {
>      u32 len = 0;
>      const struct dt_property *pp;
> 
>      if ( !np )
>          return NULL;

... this path would be a problem as well.

>      for ( pp = np->properties; pp; pp = pp->next )
>      {
>          if ( dt_prop_cmp(pp->name, name) == 0 )
>          {
>              len = pp->length;
>              break;
>          }
>      }
> 
>      if ( lenp )
>          *lenp = len;
>      return pp;
> }
> 
> 
> The advantage here is that we can easily argue that `*lenp' is always
> initialized by the function (if not NULL) and inform the tool about
> this, which is a safer API and also resolves almost all subsequent
> "don't know"s about further uses of the variables involved (e.g. `lenp').

The disadvantage is that in a more complex case, and with the function
e.g. being static, the initializer of "len" may prevent the compiler /
tools from spotting cases where the variable would otherwise truly
(and wrongly) remain uninitialized, with that fact then propagating up
the call chain through (in this example) whatever variable's address
the caller passed for "lenp". IOW, I don't think a common pattern
can be agreed upon up front for cases like this one.

Jan



 

