[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Xen-devel] [PATCH v6 1/4] xen: introduce SYMBOL



>>> On 15.01.19 at 13:23, <julien.grall@xxxxxxx> wrote:
> Hi,
> 
> On 1/15/19 12:04 PM, Jan Beulich wrote:
>>>>> On 15.01.19 at 12:51, <julien.grall@xxxxxxx> wrote:
>>> Hi Jan,
>>>
>>> On 1/15/19 8:21 AM, Jan Beulich wrote:
>>>>>>> On 14.01.19 at 22:18, <sstabellini@xxxxxxxxxx> wrote:
>>>>> Hi Jan,
>>>>>
>>>>> One question below to make a decision on the way forward.
>>>>>
>>>>> On Mon, 14 Jan 2019, Jan Beulich wrote:
>>>>>>>>> On 14.01.19 at 04:45, <Stewart.Hildebrand@xxxxxxxxxxxxxxx> wrote:
>>>>>>> So let's keep the linker-accessible variable as a type that works for 
>>>>>>> the
>>>>>>> linker (which really could be anything as long as you use the address, 
>>>>>>> not
>>>>>>> the value), but name it something else - a name that screams "DON'T USE 
>>>>>>> ME
>>>>>>> UNLESS YOU KNOW WHAT YOU'RE DOING". And then before the first use, copy
>>>>>>> that value to "uintptr_t _start;".
>>>>>>>
>>>>>>> The following is a quick proof of concept for aarch64. I changed the 
>>>>>>> type
>>>>>>> of _start and _end, and added code to copy the linker-assigned value to
>>>>>>> _start and _end. Upon booting, I see the correct values:
>>>>>>
>>>>>> Global symbols starting with underscores should already be shouting
>>>>>> enough. But what's worse - the whole idea if using array types is to
>>>>>> avoid the intermediate variables.
>>>>>>
>>>>>>> --- a/xen/arch/arm/setup.c
>>>>>>> +++ b/xen/arch/arm/setup.c
>>>>>>> @@ -726,6 +726,12 @@ static void __init setup_mm(unsigned long 
>>>>>>> dtb_paddr,
>>>>> size_t dtb_size)
>>>>>>>    
>>>>>>>    size_t __read_mostly dcache_line_bytes;
>>>>>>>    
>>>>>>> +typedef char TYPE_DOESNT_MATTER;
>>>>>>> +extern TYPE_DOESNT_MATTER _start_linker_assigned_dont_use_me,
>>>>>>> +                          _end_linker_assigned_dont_use_me;
>>>>>>
>>>>>> This and ...
>>>>>>
>>>>>>> @@ -770,10 +776,17 @@ void __init start_xen(unsigned long 
>>>>>>> boot_phys_offset,
>>>>>>>        printk("Command line: %s\n", cmdline);
>>>>>>>        cmdline_parse(cmdline);
>>>>>>>    
>>>>>>> +    _start = (uintptr_t)&_start_linker_assigned_dont_use_me;
>>>>>>> +    _end = (uintptr_t)&_end_linker_assigned_dont_use_me;
>>>>>>
>>>>>> ... this violates what the symbol names say. And if you want to
>>>>>> avoid issues, you'd want to keep out of C files uses of those
>>>>>> symbols altogether anyway, and you easily can: In any
>>>>>> assembly file, have
>>>>>>
>>>>>> _start:  .long _start_linker_assigned_dont_use_me
>>>>>> _end:    .long _end_linker_assigned_dont_use_me
>>>>>>
>>>>>> In particular, they don't need to be runtime initialized, saving
>>>>>> you from needing to set them before first use. But as said -
>>>>>> things are the way they are precisely to avoid such variables.
>>>>>>
>>>>>>>> But, instead of converting _start to unsigned long via SYMBOL_HIDE, we
>>>>>>>> could convert it to uintptr_t instead, it would be a trivial change on
>>>>>>>> top of the existing unsigned long series. Not sure if it is beneficial.
>>>>>>>
>>>>>>> The difference would be whether we want to rely on 
>>>>>>> implementation-defined
>>>>>>> behavior or not.
>>>>>>
>>>>>> Why not? Simply specify that compilers with implementation defined
>>>>>> behavior not matching our expectations are unsuitable. And btw, I
>>>>>> suppose this is just the tiny tip of the iceberg of our reliance on
>>>>>> implementation defined behavior.
>>>>>
>>>>> The reason is that relying on undefined behavior is not reliable, it is
>>>>> not C compliant, it is not allowed by MISRA-C, and not guaranteed to
>>>>> work with any compiler.
>>>>
>>>> "undefined behavior" != "implementation defined behavior"
>>>>
>>>>> Yes, this instance is only the tip of the
>>>>> iceberg, we have a long road ahead, but we shouldn't really give up
>>>>> because it is going to be difficult :-) Stewart's approach would
>>>>> actually be compliant and help toward reducing reliance on undefined
>>>>> behavior.
>>>>>
>>>>> Would you be OK if I rework the series to follow his approach using
>>>>> intermediate variables? See the attached patch as a reference, it only
>>>>> "converts" _start and _end as an example. Fortunately, it will be
>>>>> textually similar to the previous SYMBOL returning unsigned long version
>>>>> of the series.
>>>>
>>>> Well, I've given reasons why I dislike that, and why (I think) it was
>>>> done without such intermediate variables. Nevertheless, if this is
>>>> _the only way_ to achieve compliance, I don't think I could
>>>> reasonably NAK it. >
>>>> The thing that I don't understand though is how the undefined
>>>> behavior (if there really is any) goes away: Even if you compare
>>>> the contents of the variables instead of the original (perhaps
>>>> casted) pointers, in the end you still compare what C would
>>>> consider pointers to different objects. It's merely a different
>>>> way of hiding that fact from C. Undefined behavior would imo
>>>> go away only if those comparisons/subtractions didn't happen
>>>> in C anymore. IOW - see my .startof.() / .sizeof.() proposal.
>>>
>>> Do you have a pointer to the series using startof/sizeof?
>> 
>> https://lists.xenproject.org/archives/html/xen-devel/2016-08/msg02718.html 
>> 
> Thank you! Looking at the thread, Andrew had some concerns about it. Do 
> you have any update on them?

I withdrew the patch, as I don't see how to address the clang
concern (unless clang support these constructs, which I doubt).
Considering the context here, I nevertheless thought it may be
worthwhile to consider reviving it, but I didn't check whether
there were any other open points.

Jan


_______________________________________________
Xen-devel mailing list
Xen-devel@xxxxxxxxxxxxxxxxxxxx
https://lists.xenproject.org/mailman/listinfo/xen-devel

 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.