[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Deadcode discussion based on Arm NS phys timer


  • To: Bertrand Marquis <Bertrand.Marquis@xxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>, Julien Grall <julien@xxxxxxx>
  • From: Michal Orzel <michal.orzel@xxxxxxx>
  • Date: Wed, 26 Oct 2022 13:29:59 +0200
  • Arc-authentication-results: i=1; mx.microsoft.com 1; spf=pass (sender ip is 165.204.84.17) smtp.rcpttodomain=arm.com smtp.mailfrom=amd.com; dmarc=pass (p=quarantine sp=quarantine pct=100) action=none header.from=amd.com; dkim=none (message not signed); arc=none
  • Arc-message-signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=microsoft.com; s=arcselector9901; h=From:Date:Subject:Message-ID:Content-Type:MIME-Version:X-MS-Exchange-AntiSpam-MessageData-ChunkCount:X-MS-Exchange-AntiSpam-MessageData-0:X-MS-Exchange-AntiSpam-MessageData-1; bh=1MKL4/bJPIbrbqSmm1MNH2U23zvEtOLoWhA71QkNGbc=; b=Y7i9WHdbzf6t6vtz5KtWYY7wi96QNzTtXbh4kMcE02K+Ttcurkmj9LyB6F4hMJ402UEbKTFz6b2SkTUkx6wQ4Ecuj5zPEoD2ZE278gGOWCJobOavcZ8dPw73pjczZbHs26x1oOYAAYxyLwDqtM4ZA/7y3kVQE4bf7gsIiIVL+T4DSEU9cVksDM75U32zBTkiqHHsZ8Yss+CEkRVglAFqaIbV1pKr5XzcN6y0OU+GWSQY6QzKy8BP5GdOWPqLmapFzAoPURic+4/NsmatL438wMb4a6063Oa9hZ6noDthK993HLdPAkUx6J4DAiZVKqCrwjm4q5gN/djsi2k/Fru/Tw==
  • Arc-seal: i=1; a=rsa-sha256; s=arcselector9901; d=microsoft.com; cv=none; b=N87Bn6kKXWsIU0qB2EJsRsNajHsXcjNnXlwcHMEwkI4NF4psBn5HW2h1frPZ8tjf5a1Lb7wSANOURwQFCi6nAKE3684Gc+8KX15NuiH8ALOcoUhbNga/XeDOHX9dxfJkH+FZNrtM2acNmRi2CKOhp2eQ+19X5uKHwnaFuxwuYlc/y+KhFoJEglPNQ3z/1Vt6vV/XCHMxU/G/FsmTGF+qxrapQA9/ty4/F8ZYcUp30VKfipkvcnW0rQ6+MTyQZNpZDZT/DQDdF4OJJ6dyMwUdOoxmAJC8KK5cza/scy9QdXrKhoJ5F5aPOXSbpTPBEkanHfObMoPw90C98FL0nMaCzw==
  • Cc: "xen-devel@xxxxxxxxxxxxxxxxxxxx" <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Wed, 26 Oct 2022 11:30:39 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Hi all,

On 25/10/2022 10:20, Bertrand Marquis wrote:
> Caution: This message originated from an External Source. Use proper caution 
> when opening attachments, clicking links, or responding.
> 
> 
> Hi Michal,
> 
>> On 25 Oct 2022, at 09:07, Michal Orzel <michal.orzel@xxxxxxx> wrote:
>>
>> Hi Bertrand,
>>
>> On 25/10/2022 09:45, Bertrand Marquis wrote:
>>>
>>>
>>> Hi Michal,
>>>
>>>> On 25 Oct 2022, at 08:11, Michal Orzel <michal.orzel@xxxxxxx> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 25/10/2022 03:29, Stefano Stabellini wrote:
>>>>>
>>>>>
>>>>> On Mon, 24 Oct 2022, Julien Grall wrote:
>>>>>>> On 24/10/2022 12:51, Julien Grall wrote:
>>>>>>>> Caution: This message originated from an External Source. Use proper
>>>>>>>> caution when opening attachments, clicking links, or responding.
>>>>>>>>
>>>>>>>>
>>>>>>>> On 24/10/2022 10:07, Michal Orzel wrote:
>>>>>>>>> Hello,
>>>>>>>>
>>>>>>>> Hi Michal,
>>>>>>>>
>>>>>>>>> Recently I came across a deadcode in Xen Arm arch timer code. Briefly
>>>>>>>>> speaking, we are routing
>>>>>>>>> the NS phys timer (CNTP) IRQ to Xen, even though Xen does not make use
>>>>>>>>> of it (as it uses the hypervisor timer CNTHP).
>>>>>>>>> This timer is fully emulated, which means that there is nothing that 
>>>>>>>>> can
>>>>>>>>> trigger such IRQ. This code is
>>>>>>>>> a left over from early days, where the CNTHP was buggy on some models
>>>>>>>>> and we had to use the CNTP instead.
>>>>>>>>>
>>>>>>>>> As far as the problem itself is not really interesting, it raises a
>>>>>>>>> question of what to do with a deadcode,
>>>>>>>>> as there might be/are other deadcode places in Xen.
>>>>>>>>
>>>>>>>> There are multiple definition of deadcode. Depending on which one you
>>>>>>>> chose, then this could cover IS_ENABLED() and possibly #ifdef. So this
>>>>>>>> would result to a lot of places impacted with the decision.
>>>>>>>>
>>>>>>>> So can you clarify what you mean by deadcode?
>>>>>>> In the timer example, I think we have both a deadcode and unreachable 
>>>>>>> code.
>>>>>>> For the purpose of this discussion, let's take the MISRA definition of a
>>>>>>> deadcode which is a "code that can be executed
>>>>>>> but has no effect on the functional behavior of the program". This 
>>>>>>> differs
>>>>>>> from the unreachable code definition that is
>>>>>>> a "code that cannot be executed". Setting up the IRQ for Xen is an 
>>>>>>> example
>>>>>>> of a deadcode. Code within IRQ handler is an unreachable code
>>>>>>> (there is nothing that can trigger this IRQ).
>>>>>>>
>>>>>>> What I mean by deadcode happens to be the sum of the two cases above 
>>>>>>> i.e.
>>>>>>> the code that cannot be executed as well as the code that
>>>>>>> does not impact the functionality of the program.
>>>>>>>
>>>>>>>>
>>>>>>>>> One may say that it is useful to keep it, because one day,
>>>>>>>>> someone might need it when dealing with yet another broken HW. Such
>>>>>>>>> person would still need to modify the other
>>>>>>>>> part of the code (e.g. reprogram_timer), but there would be less work
>>>>>>>>> required overall. Personally, I'm not in favor of
>>>>>>>>> such approach, because we should not really support possible scenarios
>>>>>>>>> with broken HW (except for erratas listing known issues).
>>>>>>>>
>>>>>>>> The difference between "broken HW" and "HW with known errata" is a bit
>>>>>>>> unclear to me. Can you clarify how you would make the difference here?
>>>>>>>>
>>>>>>>> In particular, at which point do you consider that the HW should not be
>>>>>>>> supported by Xen?
>>>>>>> I'm not saying that HW should not be supported. The difference for me
>>>>>>> between broken HW and
>>>>>>> HW with known errata is that for the former, the incorrect behavior is 
>>>>>>> often
>>>>>>> due to the early support stage,
>>>>>>> using emulators/models instead of real HW, whereas for the latter, the 
>>>>>>> HW is
>>>>>>> already released and it happens to be that it is buggy
>>>>>>> (the HW vendor is aware of the issue and released erratas).
>>>>>>
>>>>>> Thanks for the clarification. What I would call broken is anything that 
>>>>>> can't
>>>>>> be fixed in software. For a not too fictional example, an HW where PCI 
>>>>>> devices
>>>>>> are using the same stream ID. So effectively, passthrough can't be safely
>>>>>> supported.
>>>>>>
>>>>>> Regarding, not yet released HW, I don't think Xen should have workaround 
>>>>>> for
>>>>>> them. I wouldn't even call it "broken" because they are not yet released 
>>>>>> and
>>>>>> it is common to have bug in early revision.
>>>>>>
>>>>>>> Do we have any example in Xen for supporting broken HW,
>>>>>>> whose vendor is not aware of the issue or did not release any errata?
>>>>>> I will not cite any HW on the ML. But from my experience, the vendors 
>>>>>> are not
>>>>>> very vocal about issues in public (some don't even seem to have public 
>>>>>> doc).
>>>>>> The best way to find the issues is to look at Linux commit.
>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>>> Also, as part of the certification/FUSA process, there should be no
>>>>>>>>> deadcode and we should have explanation for every block of code.
>>>>>>>>
>>>>>>>> See above. What are you trying to cover by deadcode? Would protecting
>>>>>>>> code with IS_ENABLED() (or #ifdef) ok?
>>>>>>> I think this would be ok from the certification point of view (this 
>>>>>>> would at
>>>>>>> least means, that we are aware of the issue
>>>>>>> and we took some steps). Otherwise, such code is just an example of a
>>>>>>> deadcode/unreachable code.
>>>>>>
>>>>>> Thanks for the clarification. So the exact approach will depend on the
>>>>>> context....
>>>>>>
>>>>>>>>> There are different ways to deal with a deadcode: > 1. Get rid of it
>>>>>>>>> completely
>>>>>>>>> 2. Leave it as it is
>>>>>>
>>>>>> ... this is my preference in the context of the timer.
>>>>>
>>>>> From a certification point of view, the fewer lines of code the better,
>>>>> and ideally all the lines of code used for the certified build should be
>>>>> testable and used.
>>>>>
>>>>> So I think 2. is the lest useful option from a certification
>>>>> perspective. For this reason, I'd prefer another alternative.
>>>>>
>>>>>
>>>>>> If the other don't like it, then 1 would be my preference.
>>>>>>
>>>>>> In general, my preference would be either 3.3 or 3.2 (see below).
>>>>>
>>>>> I also think that 3.2 and 3.3 are good options for the general case. For
>>>>> the timer, I can see why 1 is your (second) preference and I am fine
>>>>> with 1 as well.
>>>> Ok, sounds good to me. Let's still give Bertrand the chance to share his 
>>>> opinion.
>>>
>>> We need to get rid of dead code and removing it is not always the best 
>>> solution.
>>>
>>> If the code is or could be useful for someone some day, protecting it with 
>>> ifdef is ok.
>>>
>>> In the mid term we will have to introduce a lot more ifdef or IS_ENABLED in 
>>> the
>>> code so that we can compile out what we do not need and code not applying to
>>> some hardware is a case where we will do that (does not mean that by default
>>> we will not compile it in but we will make it easier to reduce the code 
>>> size for a
>>> specific use case).
>>>
>>> So 3.2 and 3.3 are ok for me.
>>
>> So we all agree that the code in the current form is a no go from 
>> certification purposes.
>> That is good :)
>>
>> The reason why I opt for solution 1 and not the others is that in the latter 
>> case it would
>> mean introducing the Kconfig option to allow user to select the timer to be 
>> used by Xen.
>> This is not really correct. Also in the current form, it would also require 
>> adding more
>> code to time.c code because at the moment using CNTP for Xen would not work 
>> out of the box.
>> The architecture defines the hypervisor timer for a purpose. If it does not 
>> work, it means
>> that the HW is problematic. I agree that we would want to support such use 
>> case but I'm not
>> really aware of any issue like that. Adding more code and Kconfig options 
>> just because
>> one day someone may face issues with a new HW is something I am not a fan of.
> 
> I see 2 solutions here:
> - somehow push the code to a different file (not quite sure this is feasible 
> here)
> - remove completely the code with a clean commit. Doing this it will be easy 
> for someone needing this to later revert the patch
> 
> It is definitely true here that adding more code to keep some unused code 
> does not really make sense.
> And let’s be realistic here, if we need that one day, it will not take ages 
> to support it somehow.
> 
> As said, from a pure certification point of view:
> - we must not have deadcode
> - proper ifdef is acceptable
> - if 0 is not acceptable
> - commented code is not acceptable

Given that we agree on that (+ IS_ENABLED option if possible), and the option 1 
seems
to be the best choice for the timer, I will create a patch removing the IRQ 
path to get rid
of the deadcode/unreachable code.

Do you think this is something we want for 4.17?
The risk is low as the code is already dead and the benefit is that we have no 
deadcode.
What do you think?

~Michal



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.