MINOR: atomic/arm64: detect and use builtins for the double-word CAS

GCC 10.2 implements outline atomics on aarch64. These replace all inline atomic ops with a function call that checks if the machine supports LSE atomics. This comes with a small cost but allows modern machines to scale much better than with the old LL/SC ones, even when built for full v8.0 compatibility.

This patch enables the use of the __atomic_compare_exchange() builtin for the double-word CAS when it is detected as available, instead of using the hand-written LL/SC version. The extra cost of the function call is negligible because we do very few DWCAS operations (essentially FD migrations and shared pools), while under high contention the LSE version can still be beneficial. As expected, no performance difference was measured in either direction on 4-core machines with this change.

This could be backported to 2.3 if it were shown that FD migrations represent a significant source of contention, but for now it does not appear to be needed.
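
For illustration only, here is a minimal standalone sketch of the detection described above. It is not the code from this patch: the names `struct dw`, `demo_cas_dw()`, `demo_lock` and `demo_run()` are hypothetical, and the spinlock fallback is an assumption made so that the example still builds where the 128-bit builtin is not advertised (the real fallback is the hand-written LL/SC version).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 16-byte value, aligned as a double-word CAS requires. */
struct dw {
	uint64_t lo;
	uint64_t hi;
} __attribute__((aligned(16)));

#if defined(__SIZEOF_INT128__) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16)

/* Builtin path: the compiler advertises a 128-bit CAS.
 * Returns 0 on failure, non-zero on success. On failure, *compare is
 * refreshed with the value currently found in *target.
 */
static inline int demo_cas_dw(void *target, void *compare, const void *set)
{
	return __atomic_compare_exchange((__int128 *)target, (__int128 *)compare,
	                                 (const __int128 *)set,
	                                 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
}

#else

/* Assumed stand-in for this sketch only: a tiny test-and-set spinlock
 * emulating the same semantics when no 128-bit CAS is advertised.
 */
static char demo_lock;

static inline int demo_cas_dw(void *target, void *compare, const void *set)
{
	int ok;

	while (__atomic_test_and_set(&demo_lock, __ATOMIC_ACQUIRE))
		; /* spin until the lock is free */

	ok = memcmp(target, compare, 16) == 0;
	if (ok)
		memcpy(target, set, 16);     /* swap in the new value */
	else
		memcpy(compare, target, 16); /* report the current value */

	__atomic_clear(&demo_lock, __ATOMIC_RELEASE);
	return ok;
}

#endif

/* Runs one successful and one failed DWCAS; returns 1 if both behave
 * as described in the comments above.
 */
static int demo_run(void)
{
	struct dw shared = { 1, 100 };
	struct dw expect = { 1, 100 };
	struct dw stale  = { 1, 100 };
	struct dw next   = { 2, 101 };
	int ok1, ok2;

	ok1 = demo_cas_dw(&shared, &expect, &next); /* matches: succeeds */
	ok2 = demo_cas_dw(&shared, &stale, &next);  /* outdated: fails */
	assert(ok1 && !ok2);

	return shared.lo == 2 && shared.hi == 101 &&
	       stale.lo == 2 && stale.hi == 101;
}
```

Calling `demo_run()` exercises both outcomes: the second attempt fails because `shared` was already updated, and `stale` is rewritten with the current contents so that a caller could retry. Depending on the toolchain and target, the builtin path may need to be linked against libatomic.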
parent 184b21259b
commit 6756d95a8e
@@ -550,8 +550,27 @@ static forceinline int __ha_cas_dw(void *target, void *compare, const void *set)
 	return ret;
 }
 
-#else // no ARMv8.1-A atomics
+#elif defined(__SIZEOF_INT128__) && defined(__GCC_HAVE_SYNC_COMPARE_AND_SWAP_16) // no ARMv8.1-A atomics but 128-bit atomics
+
+/* According to https://gcc.gnu.org/onlinedocs/gcc/_005f_005fatomic-Builtins.html
+ * we can use atomics on __int128. The availability of CAS is defined there:
+ * https://gcc.gnu.org/onlinedocs/cpp/Common-Predefined-Macros.html
+ * However these usually involve a function call which can be expensive for some
+ * cases, but gcc 10.2 and above can reroute the function call to either LL/SC for
+ * v8.0 or LSE for v8.1+, which allows to use a more scalable version on v8.1+ at
+ * the extra cost of a function call.
+ */
+
+/* returns 0 on failure, non-zero on success */
+static __inline int __ha_cas_dw(void *target, void *compare, const void *set)
+{
+	return __atomic_compare_exchange((__int128*)target, (__int128*)compare, (const __int128*)set,
+	                                 0, __ATOMIC_RELAXED, __ATOMIC_RELAXED);
+}
+
+#else // neither ARMv8.1-A atomics nor 128-bit atomics
+
 /* returns 0 on failure, non-zero on success */
 static __inline int __ha_cas_dw(void *target, void *compare, void *set)
 {
 	void *value[2];