[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL
On Thu, 17 Jan 2019, Jan Beulich wrote: > >>> On 17.01.19 at 01:37, <sstabellini@xxxxxxxxxx> wrote: > > On Wed, 16 Jan 2019, Jan Beulich wrote: > >> In any event - since intermediate variables merely hide the > >> casting from the compiler, but they don't remove the casts, the > >> solution involving casts is better imo, for incurring less overhead. > > > > This is where I completely disagree. The intermediate variables are not > > hiding casts from the compiler. There were never any pointers in this > > case. The linker creates "symbols", not pointers, completely invisible > > from C land. Assembly uses these symbols to initialize variables. We > > expose these assembly variables as integer to C lands. LD scripts and > > assembly have their own terminology and rules: neither "_start" nor > > "start" are pointers at any point in time. The operations done in var.S > > is not a cast. The C spec is happy, the compiler is happy, MISRA-C is > > happy. And we get to avoid the ugly SYMBOL macro that Linux uses. It is > > really a win-win. > > Well, that's a position one can take. But we have to settle on another > aspect then first: Does what is not done in C underly C's rules? I > thought you were of the opinion that what comes from linker scripts > does. In which case what comes from assembly files ought to, too. > (FAOD my implication is: If the answer is yes, both approaches > violate C's rules. If the answer is no, no change is needed at all.) Great question, that is the core of the issue. Also, let me premise that I agree on the comments you made on the patches (I dislike "start_" too), and I can address them if we agree to continue down this path. But no, I do not think that what is done outside of C-land should follow C rules. But I do not agree with your conclusion that in that case there is no difference between the approaches. Let's get more into the details. 1) SYMBOL_HIDE returning pointer type Let's take _start and _end as an example. _start is born as a linker symbol, and it becomes a C pointer when we do: extern char _start[], _end[] Now it is a pointer (actually I should say an array, but let's pretend they are the same thing for this discussion). When we do: SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start) We are still subtracting pointers: the pointers returned by SYMBOL_HIDE. We cannot prove that they are pointers to the same object or subsequence objects in memory, so it is undefined behavior, which is not allowed. This solution allows us to highlight the problematic call sites, it helps with compiler issues, but I am not convinced it helps with C compliance. Better than nothing, but the worst of the lot. 2) SYMBOL_HIDE returning unsigned long Similarly to the previous case, _start and _end are born as linker symbols and become pointers when we do: extern char _start[], _end[] Then the operation: SYMBOL_HIDE(_end) - SYMBOL_HIDE(_start) SYMBOL_HIDE returns unsigned long, so the pointers disappear. We are comparing unsigned long, which should solve the C compliance issue. Because the pointers to unsigned long conversion is done in assembly, C compliance does not have a say on the nature of the unsigned long returned by SYMBOL_HIDE. However, given that _start and _end are still defined as pointers in C-land and given that SYMBOL_HIDE, although assembly, is basically a fancy cast, I concede that the solution is not ideal. I still think is acceptable, but inferior to the next solution. 3) var.S + start_ as unsigned long With this approach, _start is born as a linker symbol. It is never exported to C, so from C point of view, it doesn't exist. There is another variable named "start_" defined in assembly and initialized to _start. Now we go into C land with: extern uintptr_t start_, end_ start_ and end_ are uintptr_t from the beginning from C point of view. They have never been pointers or in any way connected to _start. They are "clean". When we do: _end - _start it is a subtraction between uintptr_t, which is allowed. When we do: for ( call = (const initcall_t *)initcall_start_; (uintptr_t)call < presmp_initcall_end_; The comparison is still between uintptr_t types, and the value of "call" still comes from an unsigned long initially. There is never a comparison between dubious pointers. (Interger to pointer conversions and pointer to integer conversions are allowed by MISRA with some limitations, but I am double-checking.) Even: (uintptr_t)random_pointer < presmp_initcall_end_ would be acceptable because presmp_initcall_end_ is an integer and has always been an integer from C point of view. However, there are still a couple of issued not correctly solved by v8 of the series. For starters: apply_alternatives((struct alt_instr *)alt_instructions_, (struct alt_instr *)alt_instructions_end_); I can see how the pointers comparisons in apply_alternatives could be considered wrong given the way the pointers are initialized: for ( a = base = start; a < end; a++ ) { start and end come from alt_instructions_ and alt_instructions_end_. It doesn't matter that alt_instructions_ and alt_instructions_end_ are "special", they could be perfectly normal integers and we would still have the same problem: we cannot prove that "start" and "end" point to the same object or subsequent objects in memory. The way to fix it is by changing the parameters of apply_alternatives to interger types, making comparison between integers, and only using pointers to access the data. _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |