Xen project Mailing List

Re: Proposal for deviations in static analyser findings

From: Luca Fancellu <Luca.Fancellu@xxxxxxx>

Date: Tue, 18 Oct 2022 15:49:50 +0000

Accept-language: en-GB, en-US

Arc-authentication-results: i=2; mx.microsoft.com 1; spf=pass (sender ip is 63.35.35.123) smtp.rcpttodomain=lists.xenproject.org smtp.mailfrom=arm.com; dmarc=pass (p=none sp=none pct=100) action=none header.from=arm.com; dkim=pass (signature was verified) header.d=armh.onmicrosoft.com; arc=pass (0 oda=1 ltdi=1 spf=[1,1,smtp.mailfrom=arm.com] dkim=[1,1,header.d=arm.com] dmarc=[1,1,header.from=arm.com])

Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass smtp.mailfrom=arm.com; dmarc=pass action=none header.from=arm.com; dkim=pass header.d=arm.com; arc=none

Arc-message-signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=/JK6iBwdWgdNikg4jzo96cher+aqzD3zrjGnzg1PmSk=; b=ThrkU0HNmcnGeFgTabUbEkvWXsgQZgmArOyQCJqcf+vIU4O9FPR9zxH0zG+XQ9xpD2K47qA71UJFPx+RZHcJ3nuiHCIXv0Sbs7/+uU/gSO7CjTH8bH9eiUC8wE4jmvLusqYjt+w5kSymWrDJp3O+FEFj3eZ/noEYbRNxt7aRdqPf/reV5IroxsIPCIp/EmImkvUvwQ2tDR3bub9RAdPrzCZKtLXbwgPKEMdJ8uRF5H2Uf6TrVgNrcFYIUteV04RrBbIEe4gICGMOOLOKxK3nb3R58ksw6QWVrhj6xewIyOFckxF/VCDlsWUbq/UeEwe9Z7StvwhGtZJU6CTl8lsjug==

Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=/JK6iBwdWgdNikg4jzo96cher+aqzD3zrjGnzg1PmSk=; b=NFM8uX6nPNKNbhVc2w3Nji9kr64RX9bhgpjm2GRo18IZQyxUtHf0Bd0YZGUPlY4bbORkT/+MHUfhSB6Q/w0J0XNssibxgnUwOgfvsa8T7wB7nzw8EkCIVAwyeZ0QP2Z+Np2Jo6AQ0T8tGO8bQw0OvxlQ+M0YC3AhYUJxkAwesMK3BQ7kBkFbibkGIeNBhDBaqgusyeInjWZmpM053GD38HK4BySCtlJGcJvIL615lLOwaXgK9sxIxPjZMcXvJ+EjoF0amBPcC354/rRK/d8zB72Rcdj5RYlB0W7k+/Fpw061GxV3K+fsq6lKbdbOfxHxYMO8YKJqrYxnaRVOIZQCiw==

Arc-seal: i=2; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=pass; b=Tp5WWOqgyIt0ou1FRI/J5O9+azC1Zey2K7voGUOVaFdGBiC6NoMJNFN55Yuy+QCQNplhoP/LiUr5lyimX7TjBCRMssgt07EA/aYbysv3M2NwJM631QoTuDHRX0faiqKpXQu8XDRIwcMmBuoqRq/MvGvsJswQ4uAJI2owfCwHJJweJGRmD5L7AYPuCMr1JdEv+x9VifeXycI00XD1hHzLUtocq26w06gy8KgCKQrVApLlRBheUejlX8Txw3Fp6QdgBOam6MUqmDCyrkvYxCEBxkl1wQDcmkUof8Y9WB22VLbH8GrHOiYmU5lb0JVzO63gFQh5Bkn+3XTE7kMczroKVA==

Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=QocmTyuVeCVYFhfJCB7pK9JhcK/CyFBcc8A3orqxUvc0G5+DHKc7atARVYMXv1+lgknKeoece1Ro2JtZwIsFcMpeaDrijUz5plJoL1QLDuB/VyRzFyfTMuuqS892O+PvUFp9K7xX+33UHyhh3CHCQvrqT4e6mqff1sjFCcZJg2kJgyhiA68eji3XYtHq9TRhwLunjh1xeYDLSpKJhP9Bc4gBv1EyrRDXrRkTzSiEWmSCNfPS+a1UicVi5hQ4dM+00hwQNka2T3t0T42T8Hzv4uOpiXEmLg7CndYvZoORXHN9yo0xaIWXJIjnuWoBeZPFKmMRunPDHzl0oc45waftSQ==

Authentication-results-original: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;

Cc: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Wei Chen <Wei.Chen@xxxxxxx>, Julien Grall <julien@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Roger Pau Monné <roger.pau@xxxxxxxxxx>, George Dunlap <george.dunlap@xxxxxxxxxx>, Xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 18 Oct 2022 15:50:20 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Nodisclaimer: true

Original-authentication-results: dkim=none (message not signed) header.d=none;dmarc=none action=none header.from=arm.com;

Thread-index: AQHY3lPHTZwvWtdyCU+0K6Hw9nMyM64L9DeAgAAneoCAABdHAIAIGdUAgAADdICAAAWaAA==

Thread-topic: Proposal for deviations in static analyser findings

> On 18 Oct 2022, at 16:29, Jan Beulich <jbeulich@xxxxxxxx> wrote: > > On 18.10.2022 17:17, Luca Fancellu wrote: >>> On 13 Oct 2022, at 12:34, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>> On 13.10.2022 12:11, Luca Fancellu wrote: >>>>> On 13 Oct 2022, at 08:50, Jan Beulich <jbeulich@xxxxxxxx> wrote: >>>>> On 12.10.2022 18:00, Luca Fancellu wrote: >>>>>> Entries in the database should never be removed, even if they are not >>>>>> used >>>>>> anymore in the code (if a patch is removing or modifying the faulty >>>>>> line). >>>>>> This is to make sure that numbers are not reused which could lead to >>>>>> conflicts >>>>>> with old branches or misleading justifications. >>>>> >>>>> Can we add provisions for shrinking such entries to e.g. just their "id" >>>>> line? Or is the intention to be able to re-use such an entry if a matching >>>>> instance appears again later? >>>> >>>> I prefer to don’t shrink it, the name itself is not very long, even using >>>> many digits of the incremental >>>> number, it removes also the dependency on the file name. >>> >>> Name length isn't relevant here, and I have no idea what dependency on a >>> file name you're thinking of. My question is a scalability one: Over time >>> the table will grow large. If all entries remain there in full forever, >>> table size may become unwieldy. >> >> Ok I misunderstood your question, now I understand what you are asking, we >> could remove the content >> of the “analyser” dictionary for sure, because if there is not anymore a >> link with the current code. >> >> Regarding removing the “name” and “text”, could it be that at some point we >> can introduce in the code >> a violation that requires the same justification provided by the “orphan” >> entry? >> In that case we could reuse that entry without creating a new one that will >> only waste space. >> What is the opinion on this? > > Well, yes, this is the one case that I, too, was wondering about. It's not > clear to me whether it wouldn't be better to allocate a fresh ID in such a > case. It’s not clear to me the opposite, that is why we would allocate a fresh ID in this case. > >>>>>> Here a brief explanation of the field inside an object of the "content" >>>>>> array: >>>>>> - id: it is a unique string that is used to refer to the finding, many >>>>>> finding >>>>>> can be tagged with the same id, if the justification holds for any >>>>>> applied >>>>>> case. >>>>>> It tells the tool to substitute a Xen in-code comment having this >>>>>> structure: >>>>>> /* SAF-0-safe [...] \*/ >>>>>> - analyser: it is an object containing pair of key-value strings, the >>>>>> key is >>>>>> the analyser, so it can be cppcheck, coverity or eclair. The value is the >>>>>> proprietary id corresponding on the finding, for example when coverity is >>>>>> used as analyser, the tool will translate the Xen in-code coment in this >>>>>> way: >>>>>> /* SAF-0-safe [...] \*/ -> /* coverity[coverity-id] \*/ >>>>> >>>>> In here, where would coverity-id come from? And how does the >>>>> transformation >>>>> here match up with the value of the "coverity": field in the table? >>>> >>>> I can put an example of that, as you pointed out it could be difficult to >>>> get where >>>> this proprietary tool ID comes from. >>>> >>>> The proprietary ID (Coverity in this case) comes from the report it >>>> produces: >>>> >>>> […] >>>> <file path>:<line number>: >>>> 1. proprietary_ID: […] >>>> […] >>>> >>>> after we see the finding, we take that ID, we put it in the “analyser” >>>> dictionary as: >>>> >>>> […] >>>> "id":”SAF-2-safe", >>>> “analyser”: { >>>> “coverity”: “proprietary_ID" >>>> }, >>>> […] >>>> >>>> So in the source code we will have: >>>> >>>> /* SAF-2-safe [optional text] */ >>>> C code affected line; >>>> >>>> And when the analysis will be performed, the tool (coverity for example) >>>> will run on this source code: >>>> >>>> /* coverity[proprietary_ID] */ >>>> C code affected line; >>>> >>>> The tool will write a report and will suppress the finding with >>>> “proprietary_ID” that comes in the C code >>>> line after the comment. >>> >>> Let me add some background to my earlier comment: >>> >>> If we wanted to add such IDs to the table, then I guess this would result in >>> a proliferation of entries. If my observations haven't misguided me, >>> Coverity might re-use the same ID for multiple similar new issues found in a >>> single run, but it would not re-use them across runs. Hence irrespective of >>> their similarity, multiple table entries would be needed just because of the >>> different Coverity IDs. >> >> Coverity will use every run the same id for the same class of violation, for >> example >> misra_c_2012_rule_8_6_violation for violation of rule 8.6. > > Hmm, I've never seen such. I always saw it use numeric IDs, and we've > actually been putting these in commits when addressing their findings. I don’t know how does it work for Coverity Scan, but it should just run static analysis on the code and not MISRA compliance. The “Coverity” suite instead use fixed IDs for every class of violations, also in case of static analysis, an example here: xen/arch/arm/arm64/lib/bitops.c:124: 1. set_but_not_used: variable "tmp" was set but never used > >>>> After the analysis, the source code will return as the original (with the >>>> SAF-* tag). >>> >>> While you mention something similar also as step 3 in the original document >>> near the top, I'm afraid I don't understand what this "return as the >>> original" >>> means. If you want to run the tool on an altered (comments modified) source >>> tree, what I'd expect you to do is clone the sources into a throw-away tree, >>> massage the comments, run the tool, and delete the massaged tree. >>>>>> if the object doesn't have a key-value, then the corresponding in-code >>>>>> comment won't be translated. >>>>> >>>>> Iirc at least Coverity ignores certain instances of what it might consider >>>>> violations (fall-through in switch() statements in particular) in case >>>>> _any_ comment is present. Therefore may I suggest that such comments be >>>>> deleted (really: replaced by a blank line, to maintain correct line >>>>> numbering) if there's no matching key-value pair? >>>> >>>> Yes the line won’t be altered if there is no match. This to ensure the >>>> correct line >>>> numbering is not affected. >>> >>> "won't be altered" is the opposite of what I've been asking to consider: >>> Observing that comments _regardless_ of their contents may silence findings, >>> the suggestion is to remove comments (leaving a blank line) when there's no >>> entry for the targeted tool in the table entry. >> >> Why? The tag comment won’t do anything, it would act as a blank line from >> the analyser >> perspective. > > The _tag_ won't do anything, but as said any _comment_ may have an effect. Yes, any comment that is using a proprietary syntax for the tools we use: /* cppcheck-suppress[proprietary_ID] */ /* coverity[proprietary_ID] */ /* -E> hide proprietary_ID 1 “" */ May have an effect. If an entry in the database has no match with the used tool, then it would stay as (for example): /* SAF-X-safe [blablabla] */ Which has no effect on any tool, hence I don’t see the needs to replace it with a blank line. Cheers, Luca > > Jan

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.