[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH] x86/altcall: further refine clang workaround



On Fri, Jul 26, 2024 at 05:38:11PM +0100, Alejandro Vallejo wrote:
> On Fri Jul 26, 2024 at 5:25 PM BST, Alejandro Vallejo wrote:
> > On Fri Jul 26, 2024 at 4:18 PM BST, Roger Pau Monné wrote:
> > > On Fri, Jul 26, 2024 at 03:25:08PM +0100, Alejandro Vallejo wrote:
> > > > On Fri Jul 26, 2024 at 3:17 PM BST, Alejandro Vallejo wrote:
> > > > > On Fri Jul 26, 2024 at 9:05 AM BST, Jan Beulich wrote:
> > > > > > On 26.07.2024 09:52, Roger Pau Monné wrote:
> > > > > > > On Fri, Jul 26, 2024 at 09:36:15AM +0200, Jan Beulich wrote:
> > > > > > >> On 26.07.2024 09:31, Roger Pau Monné wrote:
> > > > > > >>> On Thu, Jul 25, 2024 at 05:00:22PM +0200, Jan Beulich wrote:
> > > > > > >>>> On 25.07.2024 16:54, Roger Pau Monné wrote:
> > > > > > >>>>> On Thu, Jul 25, 2024 at 03:18:29PM +0200, Jan Beulich wrote:
> > > > > > >>>>>> On 25.07.2024 12:56, Roger Pau Monne wrote:
> > > > > > >>>>>>> --- a/xen/arch/x86/include/asm/alternative.h
> > > > > > >>>>>>> +++ b/xen/arch/x86/include/asm/alternative.h
> > > > > > >>>>>>> @@ -184,11 +184,11 @@ extern void 
> > > > > > >>>>>>> alternative_branches(void);
> > > > > > >>>>>>>   * https://github.com/llvm/llvm-project/issues/82598
> > > > > > >>>>>>>   */
> > > > > > >>>>>>>  #define ALT_CALL_ARG(arg, n)                               
> > > > > > >>>>>>>              \
> > > > > > >>>>>>> -    register union {                                       
> > > > > > >>>>>>>              \
> > > > > > >>>>>>> -        typeof(arg) e[sizeof(long) / sizeof(arg)];         
> > > > > > >>>>>>>              \
> > > > > > >>>>>>> -        unsigned long r;                                   
> > > > > > >>>>>>>              \
> > > > > > >>>>>>> +    register struct {                                      
> > > > > > >>>>>>>              \
> > > > > > >>>>>>> +        typeof(arg) e;                                     
> > > > > > >>>>>>>              \
> > > > > > >>>>>>> +        char pad[sizeof(void *) - sizeof(arg)];            
> > > > > > >>>>>>>              \
> > > > > > >>>>>>
> > > > > > >>>>>> One thing that occurred to me only after our discussion, and 
> > > > > > >>>>>> I then forgot
> > > > > > >>>>>> to mention this before you would send a patch: What if 
> > > > > > >>>>>> sizeof(void *) ==
> > > > > > >>>>>> sizeof(arg)? Zero-sized arrays are explicitly something 
> > > > > > >>>>>> we're trying to
> > > > > > >>>>>> get rid of.
> > > > > > >>>>>
> > > > > > >>>>> I wondered about this, but I though it was only [] that we 
> > > > > > >>>>> were trying
> > > > > > >>>>> to get rid of, not [0].
> > > > > > >>>>
> > > > > > >>>> Sadly (here) it's actually the other way around, aiui.
> > > > > > >>>
> > > > > > >>> The only other option I have in mind is using an oversized 
> > > > > > >>> array on
> > > > > > >>> the union, like:
> > > > > > >>>
> > > > > > >>> #define ALT_CALL_ARG(arg, n)                                    
> > > > > > >>>         \
> > > > > > >>>     union {                                                     
> > > > > > >>>         \
> > > > > > >>>         typeof(arg) e[(sizeof(long) + sizeof(arg) - 1) / 
> > > > > > >>> sizeof(arg)];  \
> > > > > > >>>         unsigned long r;                                        
> > > > > > >>>         \
> > > > > > >>>     } a ## n ## __  = {                                         
> > > > > > >>>         \
> > > > > > >>>         .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); 
> > > > > > >>> (arg); })\
> > > > > > >>>     };                                                          
> > > > > > >>>         \
> > > > > > >>>     register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n 
> > > > > > >>> ) =      \
> > > > > > >>>         a ## n ## __.r
> > > > > > >>
> > > > > > >> Yet that's likely awful code-gen wise?
> > > > > > > 
> > > > > > > Seems OK: https://godbolt.org/z/nsdo5Gs8W
> > > > > >
> > > > > > In which case why not go this route. If the compiler is doing fine 
> > > > > > with
> > > > > > that, maybe the array dimension expression could be further 
> > > > > > simplified,
> > > > > > accepting yet more over-sizing? Like "sizeof(void *) / sizeof (arg) 
> > > > > > + 1"
> > > > > > or even simply "sizeof(void *)"? Suitably commented of course ...
> > > > > >
> > > > > > >> For the time being, can we perhaps
> > > > > > >> just tighten the BUILD_BUG_ON(), as iirc Alejandro had suggested?
> > > > > > > 
> > > > > > > My main concern with tightening the BUILD_BUG_ON() is that then I
> > > > > > > would also like to do so for the GCC one, so that build fails
> > > > > > > uniformly.
> > > > > >
> > > > > > If we were to take that route, then yes, probably should constrain 
> > > > > > both
> > > > > > (with a suitable comment on the gcc one).
> > > > > >
> > > > > > Jan
> > > > >
> > > > > Yet another way would be to have an intermediate `long` to cast onto. 
> > > > > Compilers
> > > > > will optimise away the copy. It ignores the different-type aliasing 
> > > > > rules in
> > > > > the C spec, so there's an assumption that we have 
> > > > > -fno-strict-aliasing. But I
> > > > > belive we do? Otherwise it should pretty much work on anything.
> > > > >
> > > > > ```
> > > > >   #define ALT_CALL_ARG(arg, n)                                        
> > > > >       \
> > > > >       unsigned long __tmp = 0;                                        
> > > > >       \
> > > > >       *(typeof(arg) *)&__tmp =                                        
> > > > >       \
> > > > >           ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })    
> > > > >       \
> > > > >       register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = 
> > > > > __tmp; \
> > > > > ```
> > > > >
> > > > > fwiw, clang18 emits identical code compared with the previous godbolt 
> > > > > link.
> > > > >
> > > > > Link: https://godbolt.org/z/facd1M9xa
> > > > >
> > > > > Cheers,
> > > > > Alejandro
> > > > 
> > > > Bah. s/b/__tmp/ in line15. Same output though, so the point still 
> > > > stands.
> > >
> > > Had to adjust it to:
> > >
> > > #define ALT_CALL_ARG(arg, n)                                              
> > > \
> > >     unsigned long a ## n ## __ = 0;                                       
> > > \
> > >     *(typeof(arg) *)&a ## n ## __ =                                       
> > > \
> > >         ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); });         
> > > \
> > >     register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = a ## n 
> > > ## __
> > >
> > > So that tmp__ is not defined multiple times for repeated
> > > ALT_CALL_ARG() usages.
> > >
> > > Already tried something like this in the past, but it mixes code with
> > > declarations, and that's forbidden in the current C standard that Xen
> > > uses:
> > >
> > > ./arch/x86/include/asm/hvm/hvm.h:665:5: error: mixing declarations and 
> > > code is incompatible with standards before C99 
> > > [-Werror,-Wdeclaration-after-statement]
> > >
> > > The `*(typeof(arg) *)&__tmp = ...` line is considered code, and is
> > > followed by further declarations.  Even if we moved both declarations
> > > ahead of the assigns it would still complain when multiple
> > > ALT_CALL_ARG() instances are used in the same altcall block.
> > >
> > > Thanks, Roger.
> >
> > That _was_ forbidden in C89, but it has been allowed since. We have a 
> > warning
> > enabled to cause it to fail even if we always use C99-compatible compilers. 
> > I
> > think we should change that.
> >
> > Regardless, I think it can be worked around. This compiles (otherwise
> > untested):
> >
> > #define ALT_CALL_ARG(arg, n)
> >     register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = ({   \
> >         unsigned long tmp = 0;                                          \
> >         *(typeof(arg) *)&a ## n ## __ = (arg);                          \
> >         BUILD_BUG_ON(sizeof(arg) > sizeof(void *));                     \
> >         tmp;                                                            \
> >     })
> >
> > That said, if the oversized temp union works, I'm fine with that too.
> >
> > Cheers,
> > Alejandro
> 
> Sigh... I'm going at 1 mistake per email today. I meant:
> 
> #define ALT_CALL_ARG(arg, n)
>      register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = ({   \
>          unsigned long tmp = 0;                                          \
>          *(typeof(arg) *)&tmp = (arg);                                   \
>          BUILD_BUG_ON(sizeof(arg) > sizeof(void *));                     \
>          tmp;                                                            \
>      })

I see, so placing the temp variable on an inner scope.  The code
generated seems to be the same from godbolt.  I don't have a strong
opinion on using either.  Your suggestion is slightly shorter than the
union one, so I might go for it.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.