|
[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [PATCH for-4.19] x86/altcall: fix clang code-gen when using altcall in loop constructs
On Wed, Jul 24, 2024 at 08:39:05AM +0200, Jan Beulich wrote:
> On 23.07.2024 18:24, Alejandro Vallejo wrote:
> > On Tue Jul 23, 2024 at 5:09 PM BST, Roger Pau Monné wrote:
> >> On Tue, Jul 23, 2024 at 04:37:12PM +0100, Alejandro Vallejo wrote:
> >>> On Tue Jul 23, 2024 at 10:31 AM BST, Roger Pau Monne wrote:
> >>>> --- a/xen/arch/x86/include/asm/alternative.h
> >>>> +++ b/xen/arch/x86/include/asm/alternative.h
> >>>> @@ -185,10 +185,10 @@ extern void alternative_branches(void);
> >>>> */
> >>>> #define ALT_CALL_ARG(arg, n)
> >>>> \
> >>>> register union {
> >>>> \
> >>>> - typeof(arg) e;
> >>>> \
> >>>> + typeof(arg) e[sizeof(long) / sizeof(arg)];
> >>>> \
> >>>> unsigned long r;
> >>>> \
> >>>> } a ## n ## _ asm ( ALT_CALL_arg ## n ) = {
> >>>> \
> >>>> - .e = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })
> >>>> \
> >>>> + .e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg);
> >>>> })\
> >>>> }
> >>>> #else
> >>>> #define ALT_CALL_ARG(arg, n) \
> >>>
> >>> Don't we want BUILD_BUG_ON(sizeof(long) % sizeof(arg) == 0) instead?
> >>
> >> I think you meant BUILD_BUG_ON(sizeof(long) % sizeof(arg) != 0)?
> >
> > Bah, yes. I wrote it as a COMPILE_ASSERT().
> >
> >>
> >>> Otherwise
> >>> odd sizes will cause the wrong union size to prevail, and while I can't
> >>> see
> >>> today how those might come to happen there's Murphy's law.
> >>
> >> The overall union size would still be fine, because it has the
> >> unsigned long element, it's just that the array won't cover all the
> >> space assigned to the long member?
> >
> > I explained myself poorly. If the current BUILD_BUG_ON() stays as-is that's
> > right, but...
> >
> >>
> >> IOW if sizeof(arg) == 7, then we would define an array with only 1
> >> element, which won't make the size of the union change, but won't
> >> cover the same space that's used by the long member.
> >
> > ... I thought the point of the patch was to cover the full union with the
> > array, and not just a subset. My proposed alternative merely tries to ensure
> > the argument is always a submultiple in size of a long so the array is
> > always a
> > perfect match.
I would be fine with the adjusted BUILD_BUG_ON(), but if we change the
instance for clang we should also update the BUILD_BUG_ON() used on
the gcc counterpart so they both match. That might be undesirable
however since gcc doesn't exhibit any of those bugs, and hence
shouldn't have such constraints.
> Question is whether there's an issue with odd sized values in Clang. I
> wouldn't want to exclude such (admittedly somewhat exotic) uses "just
> in case". My understanding here is that the issue the patch addresses
> is not merely the treatment of the union by Clang, but the combination
> thereof with it violating the psABI when it comes to passing bool
> around.
So there are at least two issues, one is the lack of truncation of
register value done by the callee, which is a psABI violation. The
other is clang not resetting the high bits of the register when the
altcall is inside a loop construct. In the godbolt example provided
the high bits of %rdi are not cleared between loops. This last issue
would also be 'fixed' if clang implemented the psABI properly and
truncated the register at the callee.
I've given a try in godbolt, and odd structure sizes seem to be
affected:
https://godbolt.org/z/778YsoWsn
We have no usages of such structures in Xen so far.
The only way I've found to cope with this is to use something like:
#define ALT_CALL_ARG(arg, n) \
union { \
typeof(arg) e[(sizeof(long) + sizeof(arg) - 1) / sizeof(arg)]; \
unsigned long r; \
} a ## n ## __ = { \
.e[0] = ({ BUILD_BUG_ON(sizeof(arg) > sizeof(void *)); (arg); })\
}; \
register unsigned long a ## n ## _ asm ( ALT_CALL_arg ## n ) = \
a ## n ## __.r
An oversized array that ensures all the space of the long is covered
by the array, but then we need an extra variable, as we would
otherwise require modifying ALT_CALL{0..6}_OUT to use aX_.r instead of
aX_.
Thanks, Roger.
|
![]() |
Lists.xenproject.org is hosted with RackSpace, monitoring our |