[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL


  • To: Stefano Stabellini <sstabellini@xxxxxxxxxx>, Jan Beulich <JBeulich@xxxxxxxx>
  • From: Juergen Gross <jgross@xxxxxxxx>
  • Date: Tue, 22 Jan 2019 07:08:08 +0100
  • Autocrypt: addr=jgross@xxxxxxxx; prefer-encrypt=mutual; keydata= xsBNBFOMcBYBCACgGjqjoGvbEouQZw/ToiBg9W98AlM2QHV+iNHsEs7kxWhKMjrioyspZKOB ycWxw3ie3j9uvg9EOB3aN4xiTv4qbnGiTr3oJhkB1gsb6ToJQZ8uxGq2kaV2KL9650I1SJve dYm8Of8Zd621lSmoKOwlNClALZNew72NjJLEzTalU1OdT7/i1TXkH09XSSI8mEQ/ouNcMvIJ NwQpd369y9bfIhWUiVXEK7MlRgUG6MvIj6Y3Am/BBLUVbDa4+gmzDC9ezlZkTZG2t14zWPvx XP3FAp2pkW0xqG7/377qptDmrk42GlSKN4z76ELnLxussxc7I2hx18NUcbP8+uty4bMxABEB AAHNHkp1ZXJnZW4gR3Jvc3MgPGpncm9zc0BzdXNlLmRlPsLAeQQTAQIAIwUCU4xw6wIbAwcL CQgHAwIBBhUIAgkKCwQWAgMBAh4BAheAAAoJELDendYovxMvi4UH/Ri+OXlObzqMANruTd4N zmVBAZgx1VW6jLc8JZjQuJPSsd/a+bNr3BZeLV6lu4Pf1Yl2Log129EX1KWYiFFvPbIiq5M5 kOXTO8Eas4CaScCvAZ9jCMQCgK3pFqYgirwTgfwnPtxFxO/F3ZcS8jovza5khkSKL9JGq8Nk czDTruQ/oy0WUHdUr9uwEfiD9yPFOGqp4S6cISuzBMvaAiC5YGdUGXuPZKXLpnGSjkZswUzY d9BVSitRL5ldsQCg6GhDoEAeIhUC4SQnT9SOWkoDOSFRXZ+7+WIBGLiWMd+yKDdRG5RyP/8f 3tgGiB6cyuYfPDRGsELGjUaTUq3H2xZgIPfOwE0EU4xwFgEIAMsx+gDjgzAY4H1hPVXgoLK8 B93sTQFN9oC6tsb46VpxyLPfJ3T1A6Z6MVkLoCejKTJ3K9MUsBZhxIJ0hIyvzwI6aYJsnOew cCiCN7FeKJ/oA1RSUemPGUcIJwQuZlTOiY0OcQ5PFkV5YxMUX1F/aTYXROXgTmSaw0aC1Jpo w7Ss1mg4SIP/tR88/d1+HwkJDVW1RSxC1PWzGizwRv8eauImGdpNnseneO2BNWRXTJumAWDD pYxpGSsGHXuZXTPZqOOZpsHtInFyi5KRHSFyk2Xigzvh3b9WqhbgHHHE4PUVw0I5sIQt8hJq 5nH5dPqz4ITtCL9zjiJsExHuHKN3NZsAEQEAAcLAXwQYAQIACQUCU4xwFgIbDAAKCRCw3p3W KL8TL0P4B/9YWver5uD/y/m0KScK2f3Z3mXJhME23vGBbMNlfwbr+meDMrJZ950CuWWnQ+d+ Ahe0w1X7e3wuLVODzjcReQ/v7b4JD3wwHxe+88tgB9byc0NXzlPJWBaWV01yB2/uefVKryAf AHYEd0gCRhx7eESgNBe3+YqWAQawunMlycsqKa09dBDL1PFRosF708ic9346GLHRc6Vj5SRA UTHnQqLetIOXZm3a2eQ1gpQK9MmruO86Vo93p39bS1mqnLLspVrL4rhoyhsOyh0Hd28QCzpJ wKeHTd0MAWAirmewHXWPco8p1Wg+V+5xfZzuQY0f4tQxvOpXpt4gQ1817GQ5/Ed/wsDtBBgB CAAgFiEEhRJncuj2BJSl0Jf3sN6d1ii/Ey8FAlrd8NACGwIAgQkQsN6d1ii/Ey92IAQZFggA HRYhBFMtsHpB9jjzHji4HoBcYbtP2GO+BQJa3fDQAAoJEIBcYbtP2GO+TYsA/30H/0V6cr/W V+J/FCayg6uNtm3MJLo4rE+o4sdpjjsGAQCooqffpgA+luTT13YZNV62hAnCLKXH9n3+ZAgJ RtAyDWk1B/0SMDVs1wxufMkKC3Q/1D3BYIvBlrTVKdBYXPxngcRoqV2J77lscEvkLNUGsu/z W2pf7+P3mWWlrPMJdlbax00vevyBeqtqNKjHstHatgMZ2W0CFC4hJ3YEetuRBURYPiGzuJXU pAd7a7BdsqWC4o+GTm5tnGrCyD+4gfDSpkOT53S/GNO07YkPkm/8J4OBoFfgSaCnQ1izwgJQ jIpcG2fPCI2/hxf2oqXPYbKr1v4Z1wthmoyUgGN0LPTIm+B5vdY82wI5qe9uN6UOGyTH2B3p hRQUWqCwu2sqkI3LLbTdrnyDZaixT2T0f4tyF5Lfs+Ha8xVMhIyzNb1byDI5FKCb
  • Cc: Stefano Stabellini <stefanos@xxxxxxxxxx>, Wei Liu <wei.liu2@xxxxxxxxxx>, Andrew Cooper <andrew.cooper3@xxxxxxxxxx>, Julien Grall <julien.grall@xxxxxxxxx>, Julien Grall <julien.grall@xxxxxxx>, Stewart Hildebrand <Stewart.Hildebrand@xxxxxxxxxxxxxxx>, xen-devel <xen-devel@xxxxxxxxxxxxxxxxxxxx>
  • Delivery-date: Tue, 22 Jan 2019 06:09:25 +0000
  • List-id: Xen developer discussion <xen-devel.lists.xenproject.org>
  • Openpgp: preference=signencrypt

On 22/01/2019 00:41, Stefano Stabellini wrote:
> On Mon, 21 Jan 2019, Jan Beulich wrote:
>>>>> On 19.01.19 at 00:05, <sstabellini@xxxxxxxxxx> wrote:
>>> On Fri, 18 Jan 2019, Jan Beulich wrote:
>>>>>>> On 18.01.19 at 02:24, <sstabellini@xxxxxxxxxx> wrote:
>>>>> On Thu, 17 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 17.01.19 at 01:37, <sstabellini@xxxxxxxxxx> wrote:
>>>>>>> On Wed, 16 Jan 2019, Jan Beulich wrote:
>>>>>>>> In any event - since intermediate variables merely hide the
>>>>>>>> casting from the compiler, but they don't remove the casts, the
>>>>>>>> solution involving casts is better imo, for incurring less overhead.
>>>>>>>
>>>>>>> This is where I completely disagree. The intermediate variables are not
>>>>>>> hiding casts from the compiler. There were never any pointers in this
>>>>>>> case.  The linker creates "symbols", not pointers, completely invisible
>>>>>>> from C land. Assembly uses these symbols to initialize variables. We
>>>>>>> expose these assembly variables as integer to C lands. LD scripts and
>>>>>>> assembly have their own terminology and rules: neither "_start" nor
>>>>>>> "start" are pointers at any point in time. The operations done in var.S
>>>>>>> is not a cast. The C spec is happy, the compiler is happy, MISRA-C is
>>>>>>> happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is
>>>>>>> really a win-win.
>>>>>>
>>>>>> Well, that's a position one can take. But we have to settle on another
>>>>>> aspect then first: Does what is not done in C underly C's rules? I
>>>>>> thought you were of the opinion that what comes from linker scripts
>>>>>> does. In which case what comes from assembly files ought to, too.
>>>>>> (FAOD my implication is: If the answer is yes, both approaches
>>>>>> violate C's rules. If the answer is no, no change is needed at all.)
>>>>>
>>>>> Great question, that is the core of the issue. Also, let me premise that
>>>>> I agree on the comments you made on the patches (I dislike "start_"
>>>>> too), and I can address them if we agree to continue down this path.
>>>>>
>>>>> But no, I do not think that what is done outside of C-land should follow
>>>>> C rules. But I do not agree with your conclusion that in that case there
>>>>> is no difference between the approaches. Let's get more into the
>>>>> details.
>>>>>
>>>>>
>>>>> 1) SYMBOL_HIDE returning pointer type
>>>>>
>>>>> Let's take _start and _end as an example. _start is born as a linker
>>>>> symbol, and it becomes a C pointer when we do:
>>>>>
>>>>>   extern char _start[], _end[]
>>>>>
>>>>> Now it is a pointer (actually I should say an array, but let's pretend
>>>>> they are the same thing for this discussion).
>>>>>
>>>>> When we do:
>>>>>
>>>>>   SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start)
>>>>>
>>>>> We are still subtracting pointers: the pointers returned by SYMBOL_HIDE.
>>>>> We cannot prove that they are pointers to the same object or subsequence
>>>>> objects in memory, so it is undefined behavior, which is not allowed.
>>>>
>>>> Stop. No. We very much can prove they are - _end points at
>>>> one past the last element of _start[]. It is the compiler which
>>>> can't prove the opposite, and hence it can't leverage
>>>> undefined behavior for optimization purposes.
>>>
>>> This is an interesting comment. However, even for normal pointers it is
>>> unreliable to count on one pointing one past the last element of the
>>> other. This was well explained in the GCC thread linked earlier in this
>>> thread. The vision of at least one of the GCC maintainers is that the
>>> compiler is free to place things in memory where it wishes, so as a
>>> programmer you cannot count on pointers pointing one past the last
>>> element of the other. Ever. In this case, where _start and _end are
>>> defined outside of C-land, I think it is even more true, and it remains
>>> undefined.
>>
>> You mix up two things: One is the chance of two objects being
>> adjacent to one another. We don't care about this. The other is
>> a pointer truly pointing one past the last element of an array (as
>> will naturally result with e.g.
>>
>>     for ( ptr = arr; ptr < arr + ARRAY_SIZE(arr); ++ptr )
>>
>> It is this second case which all the cases we care about here fall
>> into. As per my other mail, just like the same object can have multiple
>> names, symbols may also refer to places other than the start of
>> an object; the fact that plain C can't produce such symbols is not
>> relevant as long as there's no requirement that C code may
>> interface only with other C code.
> 
> The chance of two objects being adjacent to one another is relevant
> because the compiler could rightfully decide that the programmer can
> never rely on pointers pointing one past the last element of the other,
> even if they truly point one past the last element of an array.
> 
> Otherwise, Linux would have never needed to introduce RELOC_HIDE in the
> first place.
> 
> 
>>> Moreover, I went back to MISRAC (finally I have a copy) and rule 18.2
>>> says: "subtraction between pointers shall only be applied to pointers
>>> that address elements of the same array". So, all the evidence we have
>>> seems to say that we cannot rely on _end pointing one past the last
>>> element of _start in this matter.
>>
>> With the C standard's wording in mind, this surely is to include
>> the "one past the last element" case, in which case all is fine. _end
>> does not point at or into a different object, it points at the end of
>> _start[].
>>
>>>>> 3) var.S + start_ as unsigned long
>>>>>
>>>>> With this approach, _start is born as a linker symbol. It is never
>>>>> exported to C, so from C point of view, it doesn't exist. There is
>>>>> another variable named "start_" defined in assembly and initialized to
>>>>> _start. Now we go into C land with:
>>>>>
>>>>>   extern uintptr_t start_, end_
>>>>>
>>>>> start_ and end_ are uintptr_t from the beginning from C point of view.
>>>>> They have never been pointers or in any way connected to _start. They
>>>>> are "clean".
>>>>>
>>>>> When we do:
>>>>>
>>>>>   _end - _start
>>>>>
>>>>> it is a subtraction between uintptr_t, which is allowed. When we do:
>>>>>
>>>>>     for ( call = (const initcall_t *)initcall_start_;
>>>>>           (uintptr_t)call < presmp_initcall_end_;
>>>>>
>>>>> The comparison is still between uintptr_t types, and the value of "call"
>>>>> still comes from an unsigned long initially. There is never a comparison
>>>>> between dubious pointers. (Interger to pointer conversions and pointer
>>>>> to integer conversions are allowed by MISRA with some limitations, but I
>>>>> am double-checking.) Even:
>>>>>
>>>>>    (uintptr_t)random_pointer < presmp_initcall_end_
>>>>>
>>>>> would be acceptable because presmp_initcall_end_ is an integer and has
>>>>> always been an integer from C point of view.
>>>>
>>>> Well, as said - this is one of the possible positions to take. Personally
>>>> I see no difference between the helper symbols defined in
>>>> assembly sources, or in C sources the object files for which are never
>>>> made part of potential whole program optimization. 
>>>
>>> I don't think this is the case for MISRAC. C rules apply to C. Other
>>> rules apply to assembly and linker scripts. This is something that
>>> should be easy to check, and I hope that Stewart should be able to
>>> confirm.
>>
>> As per above - the interesting aspect is what rules apply to the
>> case of C interfacing with another language.
>>
>>>> Using C files for this is still in conflict with the supposed
>>>> undefined behavior, but I think you agree that C and assembly files
>>>> could be set up such that the resulting binary data is identical. In
>>>> which case it is bogus to call one satisfactory, but not the other.
>>>
>>> I see what you are saying, but it doesn't work that way from a spec
>>> compliance point of view.
>>>
>>>
>>>>> However, there are still a couple of issued not correctly solved by v8
>>>>> of the series. For starters: 
>>>>>
>>>>>         apply_alternatives((struct alt_instr *)alt_instructions_,
>>>>>                            (struct alt_instr *)alt_instructions_end_);
>>>>>
>>>>> I can see how the pointers comparisons in apply_alternatives could be
>>>>> considered wrong given the way the pointers are initialized:
>>>>>
>>>>>     for ( a = base = start; a < end; a++ )
>>>>>     {
>>>>>
>>>>> start and end come from alt_instructions_ and alt_instructions_end_. It
>>>>> doesn't matter that alt_instructions_ and alt_instructions_end_ are
>>>>> "special", they could be perfectly normal integers and we would still
>>>>> have the same problem: we cannot prove that "start" and "end" point to
>>>>> the same object or subsequent objects in memory.
>>>>>
>>>>> The way to fix it is by changing the parameters of apply_alternatives to
>>>>> interger types, making comparison between integers, and only using
>>>>> pointers to access the data.
>>>>
>>>> You know my position on casts from integer to pointer types, especially
>>>> ones taking a type out of thin air. This applies to your addition to the
>>>> apply_alternatives() construct as well as the alternative of adding such
>>>> in order to access memory. The quote from the standard that I gave
>>>> makes such casts not provably (by the compiler) defined behavior as
>>>> well, so it all boils down to the same distinction as pointed out above in
>>>> the first part of my reply here: _We_ can prove it, but the compiler
>>>> can't. Hence we're still depending on whose proof is necessary to
>>>> eliminate MISRA's undefined behavior concerns.
>>>
>>> Comparisons between pointers to different objects is undefined by the C
>>> spec, and not allowed by MISRAC.
>>>
>>> Casting pointers to integers and casting integers to pointers is
>>> implementation-defined, which is not the same thing as undefined.
>>
>> Of course it is not, but the result possibly not even being a valid
>> pointer can't make it much better than "undefined".
>>
>>> I don't make up the rules, I am only trying to follow them :-)
>>
>> Sure. But we shouldn't uglify our code just to follow insane
>> (exaggeration intended) rules.
> 
> We haven't managed to reach consensus on this topic. Your view might be
> correct, but it is not necessarily supported by compilers' behavior,
> which depends on the opinion of compilers engineers on the topic, and
> MISRAC compliance, which depends on the opinion of MISRAC specialists on
> the topic. If we take your suggested approach we end up with the code
> most likely to break in case the compilers engineers or the MISRAC
> experts disagree with you. In this case, being right doesn't necessarily
> lead to the code less likely to break.
> 
> Regardless, if that is the decision of the Xen community as a whole,
> I'll follow it. My preference remains with approach 3. (var.S), followed
> by approach 2. (SYMBOL_HIDE returns uintptr_t), but I am willing to
> refresh my series to do approach 1. (SYMBOL_HIDE returns pointer type)
> if that is the only way forward.
> 
> Let us come to a conclusion so that we can move on.
> 
> Jan, Julien, Juergen, anybody else interested, let me know what you want
> me to do.
> 

I am "only" the release manager, so I can opt for the series to go into
4.12 in case the committers are ready to give it a go. The decision for
4.12 depends on the time consensus is reached. Right now I'd give it my
Rab, but in case some more weeks are needed I might not want to take the
risk.


Juergen

_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.