[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index] Re: [Xen-devel] [PATCH v2] x86/atomic: Improvements and simplifications to assembly constraints
>>> On 18.03.19 at 15:11, <andrew.cooper3@xxxxxxxxxx> wrote: > On 18/03/2019 13:20, Jan Beulich wrote: >> >>> On 18.03.19 at 12:27, <andrew.cooper3@xxxxxxxxxx> wrote: >>> * Some of the single-byte versions specify "=q" as the output. This is a >>> remnent of the 32bit build and can be relaxed to "=r" in 64bit builds. >> I have to admit that I don't understand the "relaxed" part of this: >> "q" and "r" represent the exact same set of registers on 64-bit. >> Unless the conversion allows further code folding, I think it wouldn't >> be a bad idea to retain the distinction, just for cases like code >> eventually getting shared via something like lib/x86/. > > The change from =q to =r is specifically to allow the folding to +r Hmm, I still don't understand. "+q" is as valid as "+r". And "q" does not make any size implications, i.e. registers in that group aren't implicitly byte ones (albeit this aspect doesn't even matter here). >>> The reason the volatile cast in __cmpxchg_user() can't be dropped is because >>> without it, the compiler uses a stack copy rather than the in-memory copy, >>> which ends up tripping: >>> >>> /* Allowed to change in Accessed/Dirty flags only. */ >>> BUG_ON((t ^ old) & ~(intpte_t)(_PAGE_ACCESSED|_PAGE_DIRTY)); >> Isn't this hinting at some other shortcoming or even flaw then? >> If the compiler generally did such transformations, I'm afraid a >> lot of other code would be at risk too, including some of what >> you modify here. > > I don't think there is any flaw or shortcoming. Without the volatile, > the compiler doesn't know that there are any side effects, so can > legitimately operate on a local stack copy so long as it copies things > back later. > > In practice, this is an operation on shared memory which has to happen > on the shared memory pointer. Yes, of course. But volatile isn't guaranteed to have this effect. Iirc there's even a statement to this effect in the gcc docs. It just so happens that the compiler wouldn't normally make a memory copy of a memory variable. But it wouldn't normally do so for non-volatile memory either (because it would be an anti-optimization), so there has to be something special here. >> In any event I think it would be a good idea to have a code >> comment for this as well. > > I don't see how that would help. The same applies to all atomic > operations, even test_bit(). I don't think any other operation has casts to void in it - certainly none of the others you touch here. In fact in (at least) __xchg() and __cmpxchg() you cast away volatile, which I think you want to avoid. test_bit() (as the example you give) doesn't cast to volatile, it merely alters the type, but the incoming pointer is already a volatile one. Then again this is a macro, so I guess the cast can be counted to be the equivalent of an implicit conversion due to a function parameter being pointer-to-volatile. >>> @@ -40,28 +37,24 @@ static always_inline unsigned long __xchg( >>> switch ( size ) >>> { >>> case 1: >>> - asm volatile ( "xchgb %b0,%1" >>> - : "=q" (x) >>> - : "m" (*__xg(ptr)), "0" (x) >>> - : "memory" ); >>> + asm volatile ( "xchg %b[x], %[ptr]" >>> + : [x] "+r" (x), [ptr] "+m" (*(uint8_t *)ptr) >>> + :: "memory" ); >>> break; >>> case 2: >>> - asm volatile ( "xchgw %w0,%1" >>> - : "=r" (x) >>> - : "m" (*__xg(ptr)), "0" (x) >>> - : "memory" ); >>> + asm volatile ( "xchg %w[x], %[ptr]" >>> + : [x] "+r" (x), [ptr] "+m" (*(uint16_t *)ptr) >>> + :: "memory" ); >>> break; >>> case 4: >>> - asm volatile ( "xchgl %k0,%1" >>> - : "=r" (x) >>> - : "m" (*__xg(ptr)), "0" (x) >>> - : "memory" ); >>> + asm volatile ( "xchg %k[x], %[ptr]" >>> + : [x] "+r" (x), [ptr] "+m" (*(uint32_t *)ptr) >>> + :: "memory" ); >>> break; >>> case 8: >>> - asm volatile ( "xchgq %0,%1" >>> - : "=r" (x) >>> - : "m" (*__xg(ptr)), "0" (x) >>> - : "memory" ); >>> + asm volatile ( "xchg %q[x], %[ptr]" >>> + : [x] "+r" (x), [ptr] "+m" (*(uint64_t *)ptr) >>> + :: "memory" ); >>> break; >> Is the q modifier really useful to have here (and elsewhere below)? > > Yes - it is strictly necessary, because otherwise it gets derived from > the type of (x) which is unsigned long even in the smaller size constructs. What you say explains why you need the b, w, and k modifiers. It doesn't explain the q one, since sizeof(unsigned long) is 8 and hence exactly what would result without the modifier. >>> @@ -63,36 +65,38 @@ static always_inline __uint128_t cmpxchg16b_local_( >>> * If no fault occurs then _o is updated to the value we saw at _p. If this >>> * is the same as the initial value of _o then _n is written to location >>> _p. >>> */ >>> -#define __cmpxchg_user(_p,_o,_n,_isuff,_oppre,_regtype) \ >>> +#define __cmpxchg_user(_p, _o, _n, _oppre) \ >>> stac(); \ >>> asm volatile ( \ >>> - "1: lock; cmpxchg"_isuff" %"_oppre"2,%3\n" \ >>> + "1: lock cmpxchg %"_oppre"[new], %[ptr]\n" \ >>> "2:\n" \ >>> ".section .fixup,\"ax\"\n" \ >>> - "3: movl $1,%1\n" \ >>> + "3: movl $1, %[rc]\n" \ >>> " jmp 2b\n" \ >>> ".previous\n" \ >>> _ASM_EXTABLE(1b, 3b) \ >>> - : "=a" (_o), "=r" (_rc) \ >>> - : _regtype (_n), "m" (*__xg((volatile void *)_p)), "0" (_o), "1" >>> (0) \ >>> + : "+a" (_o), [rc] "=r" (_rc), \ >>> + [ptr] "+m" (*(volatile typeof(*(_p)) *)(_p)) \ >>> + : [new] "r" (_n), "[rc]" (0) \ >> Wouldn't it further help readability a little if _rc was initialized to zero >> right when getting declared, eliminating the last input arg here (the >> output then would need to be "+r" of course)? > > I can do. > >> And since then you >> actually touch all lines containing uses of _rc, it would be a good >> opportunity to also rename the variable to get rid of the leading >> underscore. > > I'm not sure that is a sensible move. Its a macro-scope variable from > cmpxchg_user() which still needs disambiguating from potential names of > parameters. I don't understand. For one, the macro parameters are all (wrongly) underscore-prefixed too, so there's no risk of collision there. Disambiguation is needed against outer scope variables, yet this should be done by an underscore suffix (as we commonly do) or yet some other mechanism. A leading underscore here is not in line with the C spec. >> Anyway, with at least the "relaxed" part of the description changed >> (e.g. to "converted") or explained verbally in a reply, with or without >> the other items taken care of >> Reviewed-by: Jan Beulich <jbeulich@xxxxxxxx> > > Hopefully my reply is sufficient? Well, as per above not for now: I still don't see any relaxation you do. Jan _______________________________________________ Xen-devel mailing list Xen-devel@xxxxxxxxxxxxxxxxxxxx https://lists.xenproject.org/mailman/listinfo/xen-devel
|
Lists.xenproject.org is hosted with RackSpace, monitoring our |