[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [PATCH for-4.19] x86/altcall: fix clang code-gen when using altcall in loop constructs



On Wed, Jul 24, 2024 at 08:39:05AM +0200, Jan Beulich wrote:
> On 23.07.2024 18:24, Alejandro Vallejo wrote:
> > On Tue Jul 23, 2024 at 5:09 PM BST, Roger Pau Monné wrote:
> >> On Tue, Jul 23, 2024 at 04:37:12PM +0100, Alejandro Vallejo wrote:
> >>> On Tue Jul 23, 2024 at 10:31 AM BST, Roger Pau Monne wrote:
> >>>> --- a/xen/arch/x86/include/asm/alternative.h
> >>>> +++ b/xen/arch/x86/include/asm/alternative.h
> >>>> @@ -185,10 +185,10 @@ extern void alternative_branches(void);
> >>>>   */
> >>>>  #define ALT_CALL_ARG(arg, n)                                            
> >>>> \
> >>>>      register union {                                                    
> >>>> \
> >>>> -        typeof(arg) e;                                                  
> >>>> \
> >>>> +        typeof(arg) e[sizeof(long) / sizeof(arg)];                      
> >>>> \
> >>>>          unsigned long r;                                                
> >>>> \
> >>>>      } a ## n ## _ asm ( ALT_CALL_arg ## n ) = {                         
> >>>> \
> >>>> -        .e = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })   
> >>>> \
> >>>> +        .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); 
> >>>> })\
> >>>>      }
> >>>>  #else
> >>>>  #define ALT_CALL_ARG(arg, n) \
> >>>
> >>> Don't we want BUILD_BUG_ON(sizeof(long) % sizeof(arg) == 0) instead?
> >>
> >> I think you meant BUILD_BUG_ON(sizeof(long) % sizeof(arg) != 0)?
> > 
> > Bah, yes. I wrote it as a COMPILE_ASSERT().
> > 
> >>
> >>> Otherwise
> >>> odd sizes will cause the wrong union size to prevail, and while I can't 
> >>> see
> >>> today how those might come to happen there's Murphy's law.
> >>
> >> The overall union size would still be fine, because it has the
> >> unsigned long element, it's just that the array won't cover all the
> >> space assigned to the long member?
> > 
> > I explained myself poorly. If the current BUILD_BUG_ON() stays as-is that's
> > right, but...
> > 
> >>
> >> IOW if sizeof(arg) == 7, then we would define an array with only 1
> >> element, which won't make the size of the union change, but won't
> >> cover the same space that's used by the long member.
> > 
> > ... I thought the point of the patch was to cover the full union with the
> > array, and not just a subset. My proposed alternative merely tries to ensure
> > the argument is always a submultiple in size of a long so the array is 
> > always a
> > perfect match.

I would be fine with the adjusted BUILD_BUG_ON(), but if we change the
instance for clang we should also update the BUILD_BUG_ON() used on
the gcc counterpart so they both match.  That might be undesirable
however since gcc doesn't exhibit any of those bugs, and hence
shouldn't have such constraints.

> Question is whether there's an issue with odd sized values in Clang. I
> wouldn't want to exclude such (admittedly somewhat exotic) uses "just
> in case". My understanding here is that the issue the patch addresses
> is not merely the treatment of the union by Clang, but the combination
> thereof with it violating the psABI when it comes to passing bool
> around.

So there are at least two issues, one is the lack of truncation of
register value done by the callee, which is a psABI violation.  The
other is clang not resetting the high bits of the register when the
altcall is inside a loop construct.  In the godbolt example provided
the high bits of %rdi are not cleared between loops.  This last issue
would also be 'fixed' if clang implemented the psABI properly and
truncated the register at the callee.

I've given a try in godbolt, and odd structure sizes seem to be
affected:

https://godbolt.org/z/778YsoWsn

We have no usages of such structures in Xen so far.

The only way I've found to cope with this is to use something like:

#define ALT_CALL_ARG(arg, n)                                            \
    union {                                                             \
        typeof(arg) e[(sizeof(long) + sizeof(arg) - 1) / sizeof(arg)];  \
        unsigned long r;                                                \
    } a ## n ## __  = {                                                 \
        .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })\
    };                                                                  \
    register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) =      \
        a ## n ## __.r

An oversized array that ensures all the space of the long is covered
by the array, but then we need an extra variable, as we would
otherwise require modifying ALT_CALL{0..6}_OUT to use aX_.r instead of
aX_.

Thanks, Roger.



 


Rackspace

Lists.xenproject.org is hosted with RackSpace, monitoring our
servers 24x7x365 and backed by RackSpace's Fanatical Support®.