Xen project Mailing List

Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL

To: Jan Beulich <JBeulich@xxxxxxxx>, Stefano Stabellini <sstabellini@xxxxxxxxxx>

From: Stewart Hildebrand <Stewart.Hildebrand@xxxxxxxxxxxxxxx>

Date: Tue, 15 Jan 2019 20:03:45 +0000

Accept-language: en-US

Cc: Juergen Gross <jgross@xxxxxxxx>, Stefano Stabellini <stefanos@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien.grall@xxxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>

Delivery-date: Tue, 15 Jan 2019 20:04:13 +0000

List-id: Xen developer discussion <xen-devel.lists.xenproject.org>

Thread-index: AQHUqHT6EHkpRCxf9EupBCStWIThYqWoHrMAgABgAgCAAJhgAIAAIAUAgAAkzACAACAxgIAALXqAgACPqQCAAHnVgP//rUmwgAB9BID//6xewIAAZPuA//+u8MAAiZEUAAAWyCqAAAyogT8AF7od8A==

Thread-topic: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL

On Tuesday, January 15, 2019 3:21 AM, Jan Beulich wrote: > >>> On 14.01.19 at 22:18, <sstabellini@xxxxxxxxxx> wrote: > > Hi Jan, > > > > One question below to make a decision on the way forward. > > > > On Mon, 14 Jan 2019, Jan Beulich wrote: > >> >>> On 14.01.19 at 04:45, <Stewart.Hildebrand@xxxxxxxxxxxxxxx> wrote: > >> > The difference would be whether we want to rely on implementation-defined > >> > behavior or not. > >> > >> Why not? Simply specify that compilers with implementation defined > >> behavior not matching our expectations are unsuitable. And btw, I > >> suppose this is just the tiny tip of the iceberg of our reliance on > >> implementation defined behavior. > > > > The reason is that relying on undefined behavior is not reliable, it is > > not C compliant, it is not allowed by MISRA-C, and not guaranteed to > > work with any compiler. > > "undefined behavior" != "implementation defined behavior" > The C standard gives definitions for unspecified, implementation defined, and undefined behavior. To paraphrase: - Unspecified behavior: the C standard intentionally leaves a choice for the implementation to make. - Implementation defined behavior: the implementation's choice of the unspecified behavior. - Undefined behavior: the C standard does not impose any requirements. Annex J in the C99 standard lists cases of unspecified, implementation defined, and undefined behavior. Two relevant examples are: - The width of unsigned long is implementation-defined (i.e. ULONG_MAX is implementation-defined). The example given in Annex E in the C standard is "#define ULONG_MAX 4294967295", but that is to be replaced by the implementation's choice. - Performing subtraction on pointers to different objects is undefined behavior. > > Yes, this instance is only the tip of the > > iceberg, we have a long road ahead, but we shouldn't really give up > > because it is going to be difficult :-) Stewart's approach would > > actually be compliant and help toward reducing reliance on undefined > > behavior. > > > > Would you be OK if I rework the series to follow his approach using > > intermediate variables? See the attached patch as a reference, it only > > "converts" _start and _end as an example. Fortunately, it will be > > textually similar to the previous SYMBOL returning unsigned long version > > of the series. > > Well, I've given reasons why I dislike that, and why (I think) it was > done without such intermediate variables. Nevertheless, if this is > _the only way_ to achieve compliance, I don't think I could > reasonably NAK it. > > The thing that I don't understand though is how the undefined > behavior (if there really is any) goes away: Even if you compare > the contents of the variables instead of the original (perhaps > casted) pointers, in the end you still compare what C would > consider pointers to different objects. It's merely a different > way of hiding that fact from C. Undefined behavior would imo > go away only if those comparisons/subtractions didn't happen > in C anymore. IOW - see my .startof.() / .sizeof.() proposal. No, the C standard provides us a guarantee. To quote the ISO/IEC 9899 C99 standard regarding the subtract operator: > For subtraction, one of the following shall hold: > - both operands have arithmetic type; > - both operands are pointers to qualified or unqualified versions of > compatible object types; or > - the left operand is a pointer to an object type and the right operand > has integer type. > > If both operands have arithmetic type, the usual arithmetic conversions > are performed on them. > > When an expression that has integer type is added to or subtracted from > a pointer ... If both the pointer operand and the result point to > elements of the same array object, or one past the last element of the > array object, the evaluation shall not produce an overflow; otherwise, > the behavior is undefined. Here, "arithmetic type" is either integer type or floating point type (but NOT pointer type). There is similar language for the equality comparison operator that Stefano quoted earlier in the thread. My interpretation of the standard is: Subtract/compare operations where one or both operands are pointer types are potentially subject to the "pointers to different objects" issue, and the compiler is free to make that determination by any means available. Subtract/compare operations where both operands are integer types are well defined in the C standard, and, per the C standard, are NOT subject to the "pointers to different objects" issue. If the compiler starts to consider integer types being "pointers to different objects" then the compiler clearly does not adhere to the C standard. The compiler may look through *pointer type* casts, but if it started to look through *integer type* casts, we would have good reason to complain to the GCC mailing list. You could achieve both operands being integer types either by changing the type of _start (using intermediate variables) or by casting (using SYMBOL_HIDE with an integer return type). I would argue that changing the type of _start would be less prone to human error, since developers wouldn't have to remember to explicitly wrap each reference to _start and friends in a macro. That's probably not an issue for the patches submitted to xen-devel that are subject to informed review, but in potential downstream/forks of Xen, it would be easy for somebody to miss the requirement of having to use SYMBOL_HIDE every time the variable is referenced. Somebody wrote the code this way in the first place, and the potentially undefined behavior has been in upstream for years without any remediation. With the SYMBOL_HIDE approach, we are probably violating a few MISRA rules with all the tricks going on inside SYMBOL_HIDE, so we'd have to write up deviations with justification for that and track each occurrence. With the approach of changing the type of _start, I believe there will be fewer MISRA rule violations. Just to reiterate, MISRA C says: don't subtract/compare *pointer types* pointing to different objects, otherwise it's "undefined behavior" except in one irrelevant corner case (I'm paraphrasing since the actual text is copyrighted). If the operands are both integer types (not pointer types), we don't risk violating the MISRA rules pertaining to pointer types. For completeness, the corner case is pointer A pointing to one element past the end of object A, and pointer B pointing to the beginning of object B. > > > If you are OK with it, do you have any suggestions on how would you like > > the intermediate variables to be called? I went with _start/start_ and > > _end/end_ but I am open to suggestions. Also to which assembly file you > > would like the new variables being added -- I created a new one for the > > purpose named var.S in the attached example. > > First of all we should explore whether the variables could also be > linker generated, in particular to avoid the current symbols to be > global (thus making it impossible to access them from C files in the > first place). Interesting idea. That certainly would be ideal if the linker will allow it. > Failing that, I don't think it matters much where these > helper symbols live, and hence your choice is probably fine (I'd > prefer though if, just like on Arm, the x86 file didn't live in the > boot/ subdirectory; in the end it might even be possible to have > some of them in xen/common/var.S). > > Jan (if you're tired of reading my walls of text, you can stop here) Lastly, please let me take a moment to reiterate why MISRA C exists and how safety certification relates to the Xen project. Here is a quick definition two distinct concepts: Safety: preservation of human life and avoiding harmful behavior. Security: locking up your valuables. As embedded devices gain more connectivity and functionality, it is becoming more economical to consolidate functions (both safety and non-safety critical) on to a single hardware platform, hence the need for a hypervisor. One of the draws of potentially using Xen in a safety critical system is that it already has an extremely large user base that cares a lot about security. Although safety and security are two distinct concepts, there is still a lot of overlap. A coding error that allows a hacker to access a private database could just as well cause unintended acceleration in a drive-by-wire system. In the safety critical world, it is not enough to say that something works, we also have to do our due diligence to ensure that it won't exhibit unintended behavior and that it keeps working reliably. The rigor and assurance involved in a safety critical process is a pretty powerful benefit that I think carries over to those who care about security. From a MISRA perspective, we have to document and ensure developers understand implementation defined behavior (MISRA C:2012 Directive 1.1). We also can't use any undefined behavior (Rule 1.3). Where it is unavoidable to violate a rule, we have to write up deviations for MISRA rules that are violated, and justify/maintain each violation of a MISRA rule. It's much more maintainable and justifiable from a MISRA perspective to not violate the rule in the first place. The longer our list of violations becomes, the larger the burden imposed on the community that cares about safety certification. This is not something that would be necessary for the entire codebase, only a safety certified subset. For MISRA, we have no choice but to pick apart the entire safety certified subset of the iceberg. Let's assume, for example, that we would have to document why "inspecting the text of an asm() is something that should never happen", thus guaranteeing that the compiler won't make the assumption that the value passed through that inline assembly could be considered a "pointer to a different object". Documenting this could range anywhere from simply referencing compiler documentation to inspecting compiler source code and potentially imposing certain restrictions on what optimization flags we're allowed to use, and then likely pinning the compiler version. Once the safety critical subset has been fully defined, I wouldn't rule out the possibility that somebody will try to use a compiler built for safety critical applications. GCC has too many unregulated moving pieces to really be suitable for higher levels of safety assurance (unless you painstakingly perform manual object code verification - which we obviously try to avoid if we can due to the effort involved. And even with the right compiler, sometimes this still has to be done for certain software on commercial airliners - but Xen has quite a way to go before that would reasonably happen). At the end of the day, I have a harder time justifying more and more implementation defined behavior and rule violations when there is a proposed solution that avoids violating certain MISRA rules in the first place. I live in a world where *should never* isn't something I would trust my life to. Stew _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel

©2013 Xen Project, A Linux Foundation Collaborative Project. All Rights Reserved.
Linux Foundation is a registered trademark of The Linux Foundation.
Xen Project is a trademark of The Linux Foundation.