Fix: change order of _cds_lfht_new_with_alloc parameters The "flavor" parameter should come before the "alloc" parameter to match the order of cds_lfht_new_with_flavor_alloc() parameters. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: Ia704a0fd9cb90af966464e25e6202fed1a952eed
Add support for custom memory allocators for rculfhash The current implementation of rculfhash relies on calloc() to allocate memory for its buckets. This can in some cases lead to latency spikes when accessing the hash table, which can be avoided by using an optimized custom memory allocator. However, there is currently no way of replacing the default allocator with a custom one. This commit allows custom allocators to be used during the table initialization. The default behavior of the hash table remains unaffected, by using the stdlib calloc() and free(), if no custom allocator is given. Signed-off-by: Xenofon Foukas <fon1989@gmail.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: Id9a405e5dc42e5564ff8623394c86056a4d1ff48
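The allocator hook described in the entry above could look roughly like the following. This is a minimal sketch of the pattern only, not the rculfhash API: `struct sketch_alloc`, `struct sketch_table`, `sketch_table_new()` and the callback names are hypothetical and chosen for illustration. It shows an allocator descriptor carrying an opaque state pointer, and the fallback to the stdlib calloc() and free() when no custom allocator is given.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical allocator descriptor; not the liburcu structure. */
struct sketch_alloc {
	void *(*zalloc)(void *state, size_t nmemb, size_t size);
	void (*release)(void *state, void *ptr);
	void *state;	/* opaque pointer handed back to the callbacks */
};

/* Default callbacks: forward to the stdlib and ignore the state. */
static void *default_zalloc(void *state, size_t nmemb, size_t size)
{
	(void) state;
	return calloc(nmemb, size);
}

static void default_release(void *state, void *ptr)
{
	(void) state;
	free(ptr);
}

/* Example custom callback: counts allocations through the state pointer. */
static void *counting_zalloc(void *state, size_t nmemb, size_t size)
{
	(*(size_t *) state)++;
	return calloc(nmemb, size);
}

static const struct sketch_alloc default_alloc = {
	default_zalloc, default_release, NULL,
};

/* Hypothetical table holding only what the sketch needs. */
struct sketch_table {
	const struct sketch_alloc *alloc;
	void **buckets;
	size_t nr_buckets;
};

/* A NULL alloc selects the stdlib-backed default. */
static struct sketch_table *sketch_table_new(size_t nr_buckets,
		const struct sketch_alloc *alloc)
{
	struct sketch_table *t;

	if (!alloc)
		alloc = &default_alloc;
	t = alloc->zalloc(alloc->state, 1, sizeof(*t));
	if (!t)
		return NULL;
	t->buckets = alloc->zalloc(alloc->state, nr_buckets, sizeof(*t->buckets));
	if (!t->buckets) {
		alloc->release(alloc->state, t);
		return NULL;
	}
	t->alloc = alloc;
	t->nr_buckets = nr_buckets;
	return t;
}

/* Free through the same allocator that created the table. */
static void sketch_table_destroy(struct sketch_table *t)
{
	const struct sketch_alloc *alloc = t->alloc;

	alloc->release(alloc->state, t->buckets);
	alloc->release(alloc->state, t);
}
```

Storing the allocator pointer in the table keeps allocation and release paired, which matters once the custom allocator draws from a private pool.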
ppc.h: use mftb on ppc Older versions of GNU as do not support mftbl. The issue affects Darwin PowerPC, as well as some older versions of NetBSD and Linux. Since mftb is equivalent and universally understood, just use that. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: I098b70fa8bb077143d2d658835586b6b059b879f
fix: add missing SPDX licensing tags Change-Id: If7016a3c83211e88c102f8b395dc290859af4789 Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
urcu/uatomic/riscv: Mark RISC-V as broken Implementations of some atomic operations of GCC for RISC-V are insufficient for sequential consistency. For this reason Userspace RCU is currently marked as `broken' for RISC-V with GCC. However, it is still possible to use other toolchains. See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=104831 for details. For now, we mark every version of GCC as unsupported. Distribution package maintainers will have to cherry-pick the relevant patches in GCC then remove the #error in Userspace RCU if they want to support it. As for us, we will incrementally add specific versions of GCC that have fixed the issue whenever new stable releases are made from the GCC project. Change-Id: I2cd7c8f12068628b845a096e03f5f8100eacbe43 Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
LoongArch: Document that byte and short atomics are implemented with LL/SC Based on the LoongArch Reference Manual: https://loongson.github.io/LoongArch-Documentation/LoongArch-Vol1-EN.html Section 2.2.7 "Atomic Memory Access Instructions" only lists atomic operations for 32-bit and 64-bit integers. As detailed in Section 2.2.7.1, LL/SC instructions operating on 32-bit and 64-bit integers are also available. Those are used by the compiler to support atomics on byte and short types. This means atomics on 32-bit and 64-bit types have stronger forward progress guarantees than those operating on 8-bit and 16-bit types. Link: https://github.com/urcu/userspace-rcu/pull/11#issuecomment-1706528796 Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: I01569b718f7300a46d984c34065c0bbfbd2f7cc6
Add LoongArch support This commit completes LoongArch support. LoongArch supports byte and short atomic operations, and defines UATOMIC_HAS_ATOMIC_BYTE and UATOMIC_HAS_ATOMIC_SHORT. Signed-off-by: Wang Jing <wangjing@loongson.cn> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: I335e654939bfc90994275f2a4fad550c95f3eba4
Complete removal of urcu-signal flavor This commit completes removal of the urcu-signal flavor. Users can migrate to liburcu-memb with a kernel implementing the membarrier(2) system call to have similar read-side performance without requiring use of a reserved signal, and with improved grace period performance. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: I75b9171e705b9b2ef4c8eeabe6164e5587816fb4
Fix: Add missing cmm_smp_mb() in deprecated urcu-signal Commit 97d13221f8a1 ("Phase 1 of deprecating liburcu-signal") missed a cmm_smp_mb() at the beginning of the read-side critical sections, which causes spurious failures in the CI tests. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Change-Id: Id8d5822142bef5f418e2c4653369d93968dca637
urcu/compiler: Add urcu_static_assert Static assertion macros copied from LTTng-ust ust-compiler.h for compatibility with compilers that do not support static assertion. Change-Id: I5dfa8ba565041b522a1d5c226c7a9369979a3a02 Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
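One common way to provide a static assertion on compilers that lack native support is sketched below. The macro name `sketch_static_assert` and its three-argument layout (predicate, message, identifier-safe message) are assumptions made for illustration; the source does not spell out the signature of urcu_static_assert.

```c
#include <assert.h>

/*
 * Hypothetical macro; the argument layout is an assumption of this
 * sketch, not a statement about the liburcu header.
 */
#if defined(__STDC_VERSION__) && __STDC_VERSION__ >= 201112L
/* C11 and later: use the language-level static assertion. */
# define sketch_static_assert(pred, msg, c_identifier_msg) \
	_Static_assert(pred, msg)
#else
/* Older compilers: a negative array size forces a compile error. */
# define sketch_static_assert(pred, msg, c_identifier_msg) \
	typedef char sketch_static_assert_##c_identifier_msg[(pred) ? 1 : -1]
#endif

/* Checked at compile time; generates no code. */
sketch_static_assert(sizeof(int) >= 2, "int is too small", int_is_too_small);

/* Runtime twin of the predicate, so the sketch has observable behaviour. */
static int sketch_int_width_ok(void)
{
	return sizeof(int) >= 2;
}
```

The identifier-safe message exists only for the fallback, where the error text has to be smuggled into a typedef name.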
Phase 1 of deprecating liburcu-signal The first phase of liburcu-signal deprecation consists of implementing it in terms of liburcu-mb. In other words, liburcu-signal is identical to liburcu-mb with the exception of the function symbols and public header files. This is done by: 1) Removing the RCU_SIGNAL specific code in urcu.c 2) Making the RCU_MB specific code also specific to RCU_SIGNAL in urcu.c 3) Rewriting _urcu_signal_read_unlock_update_and_wakeup to use an atomic store with CMM_SEQ_CST instead of a CMM_RELAXED store with cmm_barrier() around it. We could keep the explicit barriers, but that would require adding some cmm_annotate annotations. Therefore, to be less intrusive in a public header file, simply use CMM_SEQ_CST as for the mb flavor. Change-Id: Ie406f7df2f47da0a9f464df94b968ad9204821f3 Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
uatomic/generic: Fix redundant declaration warning abort(3) was explicitly declared external to avoid including <stdlib.h>. However, this emits a redundant declaration warning if abort() was already declared before including <urcu/uatomic.h>. Fix this by including <stdlib.h> and not declaring abort(). Change-Id: If9557814c311e2b531e85fec8c41788462338fe4 Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Add cmm_emit_legacy_smp_mb() Some public APIs stipulate implicit memory barriers on operations. These were coherent with the memory model used at that time. However, with the migration to a memory model closer to the C11 memory model, these memory barriers are not strictly emitted by the atomic operations in the new memory model. Therefore, introduce the `--disable-legacy-mb' configuration option. By default, liburcu is configured to emit these legacy memory barriers, thus keeping backward compatibility at the expense of slower performance. However, users can opt out by disabling the legacy memory barriers. This option is publicly exported in the system configuration header file and can be overridden manually on a compilation unit basis by defining `CONFIG_RCU_EMIT_LEGACY_MB' before including any liburcu files. The usage of this macro requires rewriting atomic operations in terms of the CMM memory model. This is done for the queue and stack APIs. Change-Id: Ia5ce3b3d8cd1955556ce96fa4408a63aa098a1a6 Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
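The opt-out mechanism in the entry above can be pictured as a macro that expands either to a full fence or to nothing. The following is a hedged sketch, assuming GCC/Clang `__atomic` builtins; `SKETCH_EMIT_LEGACY_MB`, `sketch_emit_legacy_smp_mb()` and `sketch_xchg()` are hypothetical names, and the real header may wire the configuration macro differently.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the configuration macro. A compilation
 * unit may define it before this point to override the default.
 */
#ifndef SKETCH_EMIT_LEGACY_MB
# define SKETCH_EMIT_LEGACY_MB 1	/* default: keep the legacy barriers */
#endif

#if SKETCH_EMIT_LEGACY_MB
/* Legacy behaviour: a full barrier around the operation. */
# define sketch_emit_legacy_smp_mb()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
#else
/* Opted out: the operation carries only the ordering it asks for. */
# define sketch_emit_legacy_smp_mb()	do { } while (0)
#endif

/*
 * Exchange whose documented contract used to imply full barriers.
 * The relaxed exchange is bracketed by the optional legacy fences.
 */
static int sketch_xchg(int *addr, int v)
{
	int old;

	sketch_emit_legacy_smp_mb();
	old = __atomic_exchange_n(addr, v, __ATOMIC_RELAXED);
	sketch_emit_legacy_smp_mb();
	return old;
}
```

With the default setting callers see the old implicit ordering; building with the macro set to 0 leaves only the relaxed exchange.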
urcu/annotate: Add CMM annotation The CMM annotation is highly experimental and not meant to be used by users for now, even though it is exposed in the public API since some parts of the liburcu public API require those annotations. The main primitive is the cmm_annotate_t which denotes a group of memory operations associated with a memory barrier. A group follows a state machine, starting from the `CMM_ANNOTATE_VOID' state. The following are the only valid transitions: CMM_ANNOTATE_VOID -> CMM_ANNOTATE_MB (acquire & release MB) CMM_ANNOTATE_VOID -> CMM_ANNOTATE_LOAD (acquire memory) CMM_ANNOTATE_LOAD -> CMM_ANNOTATE_MB (acquire MB) The macro `cmm_annotate_define(name)' can be used to create an annotation object on the stack. The rest of the `cmm_annotate_*' macros can be used to change the state of the group after validating that the transition is allowed. Some of these macros also inject TSAN annotations to help TSAN understand the flow of events in the program, since it does not currently support thread fences. Sometimes, a single memory access does not need to be associated with a group. In that case, the acquire/release macro variants without the `group' infix can be used to annotate memory accesses. Note that TSAN cannot be used on the liburcu-signal flavor. This is because TSAN hijacks calls to sigaction(3) and places its own handler that will deliver the signal to the application at a synchronization point. Thus, the usage of TSAN on the signal flavor is undefined behavior. There is at least one known failure mode: a deadlock between readers that want to unregister themselves by locking the `rcu_registry_lock' and a writer performing a synchronize RCU, which has already locked that mutex and holds it until all the registered readers execute a memory barrier in a signal handler defined by liburcu-signal. However, TSAN will not call the registered handler while waiting on the mutex. 
Therefore, the writer spins infinitely on pthread_kill(3p) because the reader simply never completes the handshake. See the deadlock minimal reproducer below.

Deadlock reproducer:

```
#include <poll.h>
#include <signal.h>
#include <pthread.h>

#define SIGURCU SIGUSR1

static pthread_mutex_t rcu_registry_lock = PTHREAD_MUTEX_INITIALIZER;
static int need_mb = 0;

/* Reader: blocks on the registry lock that the writer already holds. */
static void *reader_side(void *nil)
{
	(void) nil;

	pthread_mutex_lock(&rcu_registry_lock);
	pthread_mutex_unlock(&rcu_registry_lock);

	return NULL;
}

/* Writer: signals the reader until the handler clears need_mb. */
static void writer_side(pthread_t reader)
{
	__atomic_store_n(&need_mb, 1, __ATOMIC_RELEASE);
	while (__atomic_load_n(&need_mb, __ATOMIC_ACQUIRE)) {
		pthread_kill(reader, SIGURCU);
		(void) poll(NULL, 0, 1);
	}
	pthread_mutex_unlock(&rcu_registry_lock);

	pthread_join(reader, NULL);
}

/* Handler: completes the handshake on behalf of the reader. */
static void sigrcu_handler(int signo, siginfo_t *siginfo, void *context)
{
	(void) signo;
	(void) siginfo;
	(void) context;

	__atomic_store_n(&need_mb, 0, __ATOMIC_SEQ_CST);
}

static void install_signal(void)
{
	struct sigaction act;

	act.sa_sigaction = sigrcu_handler;
	act.sa_flags = SA_SIGINFO | SA_RESTART;
	sigemptyset(&act.sa_mask);

	(void) sigaction(SIGURCU, &act, NULL);
}

int main(void)
{
	pthread_t th;

	install_signal();

	/* Take the lock before the reader starts, as the writer side does. */
	pthread_mutex_lock(&rcu_registry_lock);
	pthread_create(&th, NULL, reader_side, NULL);

	writer_side(th);

	return 0;
}
```

Change-Id: I9c234bb311cc0f82ea9dbefdf4fee07047ab93f9 Co-authored-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
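The group state machine listed in the "urcu/annotate: Add CMM annotation" entry above can be sketched as a small transition validator. The enum and function names here are hypothetical mirrors of the three states named in the message, and the sketch leaves out the TSAN annotations entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the three states named in the commit message. */
enum sketch_annotate {
	SKETCH_ANNOTATE_VOID,
	SKETCH_ANNOTATE_LOAD,
	SKETCH_ANNOTATE_MB,
};

/* Only VOID -> MB, VOID -> LOAD and LOAD -> MB are accepted. */
static bool sketch_transition_valid(enum sketch_annotate from,
		enum sketch_annotate to)
{
	switch (from) {
	case SKETCH_ANNOTATE_VOID:
		return to == SKETCH_ANNOTATE_MB || to == SKETCH_ANNOTATE_LOAD;
	case SKETCH_ANNOTATE_LOAD:
		return to == SKETCH_ANNOTATE_MB;
	default:
		/* MB is terminal: no transition leaves it. */
		return false;
	}
}

/*
 * Move the group to a new state. Returns false and leaves the group
 * unchanged when the transition is not allowed.
 */
static bool sketch_annotate_set(enum sketch_annotate *group,
		enum sketch_annotate to)
{
	if (!sketch_transition_valid(*group, to))
		return false;
	*group = to;
	return true;
}
```

A group that performed an acquire load and then reached a barrier walks VOID, LOAD, MB; any attempt to go back from MB is rejected.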
Add CMM memory model Introducing the CMM memory model with the following new primitives: - uatomic_load(addr, memory_order) - uatomic_store(addr, value, memory_order) - uatomic_and_mo(addr, mask, memory_order) - uatomic_or_mo(addr, mask, memory_order) - uatomic_add_mo(addr, value, memory_order) - uatomic_sub_mo(addr, value, memory_order) - uatomic_inc_mo(addr, memory_order) - uatomic_dec_mo(addr, memory_order) - uatomic_add_return_mo(addr, value, memory_order) - uatomic_sub_return_mo(addr, value, memory_order) - uatomic_xchg_mo(addr, value, memory_order) - uatomic_cmpxchg_mo(addr, old, new, memory_order_success, memory_order_failure) The CMM memory model reflects the C11 memory model with an additional CMM_SEQ_CST_FENCE memory order. The memory order can be selected through the enum cmm_memorder. * With Atomic Builtins If configured with atomic builtins, the correspondence between the CMM memory model and the C11 memory model is one-to-one, with the exception of the CMM_SEQ_CST_FENCE memory order, which implies the memory order CMM_SEQ_CST and a thread fence after the operation. * Without Atomic Builtins However, if not configured with atomic builtins, the following rules define the memory model. For load operations with uatomic_load(), the memory orders CMM_RELAXED, CMM_CONSUME, CMM_ACQUIRE, CMM_SEQ_CST and CMM_SEQ_CST_FENCE are allowed. A barrier may be inserted before and after the load from memory depending on the memory order: - CMM_RELAXED: No barrier - CMM_CONSUME: Memory barrier after read - CMM_ACQUIRE: Memory barrier after read - CMM_SEQ_CST: Memory barriers before and after read - CMM_SEQ_CST_FENCE: Memory barriers before and after read For store operations with uatomic_store(), the memory orders CMM_RELAXED, CMM_RELEASE, CMM_SEQ_CST and CMM_SEQ_CST_FENCE are allowed. 
A barrier may be inserted before and after the store to memory depending on the memory order: - CMM_RELAXED: No barrier - CMM_RELEASE: Memory barrier before operation - CMM_SEQ_CST: Memory barriers before and after operation - CMM_SEQ_CST_FENCE: Memory barriers before and after operation For load/store operations with uatomic_and_mo(), uatomic_or_mo(), uatomic_add_mo(), uatomic_sub_mo(), uatomic_inc_mo(), uatomic_dec_mo(), uatomic_add_return_mo() and uatomic_sub_return_mo(), all memory orders are allowed. A barrier may be inserted before and after the operation depending on the memory order: - CMM_RELAXED: No barrier - CMM_ACQUIRE: Memory barrier after operation - CMM_CONSUME: Memory barrier after operation - CMM_RELEASE: Memory barrier before operation - CMM_ACQ_REL: Memory barriers before and after operation - CMM_SEQ_CST: Memory barriers before and after operation - CMM_SEQ_CST_FENCE: Memory barriers before and after operation For the exchange operation uatomic_xchg_mo(), any memory order is valid. A barrier may be inserted before and after the exchange to memory depending on the memory order: - CMM_RELAXED: No barrier - CMM_ACQUIRE: Memory barrier after operation - CMM_CONSUME: Memory barrier after operation - CMM_RELEASE: Memory barrier before operation - CMM_ACQ_REL: Memory barriers before and after operation - CMM_SEQ_CST: Memory barriers before and after operation - CMM_SEQ_CST_FENCE: Memory barriers before and after operation For the compare exchange operation uatomic_cmpxchg_mo(), the success memory order can be anything while the failure memory order cannot be CMM_RELEASE nor CMM_ACQ_REL and cannot be stronger than the success memory order. 
A barrier may be inserted before and after the store to memory depending on the memory orders: Success memory order: - CMM_RELAXED: No barrier - CMM_ACQUIRE: Memory barrier after operation - CMM_CONSUME: Memory barrier after operation - CMM_RELEASE: Memory barrier before operation - CMM_ACQ_REL: Memory barriers before and after operation - CMM_SEQ_CST: Memory barriers before and after operation - CMM_SEQ_CST_FENCE: Memory barriers before and after operation Barriers after the operation are only emitted if the compare exchange succeeds. Failure memory order: - CMM_RELAXED: No barrier - CMM_ACQUIRE: Memory barrier after operation - CMM_CONSUME: Memory barrier after operation - CMM_SEQ_CST: Memory barriers before and after operation - CMM_SEQ_CST_FENCE: Memory barriers before and after operation Barriers after the operation are only emitted if the compare exchange fails. Barriers before the operation are never emitted by this memory order. Change-Id: I213ba19c84e82a63083f00143a3142ffbdab1d52 Co-authored-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
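The barrier tables in the "Add CMM memory model" entry above (the case without atomic builtins) can be read as a mapping from memory order to barrier placement. The sketch below encodes the table given for the read-modify-write and exchange operations, plus the rule that a compare-exchange failure order may not be a release order. All names are hypothetical, and the additional rule that the failure order cannot be stronger than the success order is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical enum following the memory orders listed in the message. */
enum sketch_memorder {
	SKETCH_RELAXED,
	SKETCH_CONSUME,
	SKETCH_ACQUIRE,
	SKETCH_RELEASE,
	SKETCH_ACQ_REL,
	SKETCH_SEQ_CST,
	SKETCH_SEQ_CST_FENCE,
};

/* Where a memory barrier goes relative to the operation. */
struct sketch_barriers {
	bool before;
	bool after;
};

/* Barrier placement for read-modify-write and exchange operations. */
static struct sketch_barriers sketch_rmw_barriers(enum sketch_memorder mo)
{
	struct sketch_barriers b = { false, false };

	switch (mo) {
	case SKETCH_RELAXED:
		break;
	case SKETCH_CONSUME:
	case SKETCH_ACQUIRE:
		b.after = true;
		break;
	case SKETCH_RELEASE:
		b.before = true;
		break;
	case SKETCH_ACQ_REL:
	case SKETCH_SEQ_CST:
	case SKETCH_SEQ_CST_FENCE:
		b.before = true;
		b.after = true;
		break;
	}
	return b;
}

/* A compare-exchange failure order may not be RELEASE or ACQ_REL. */
static bool sketch_cmpxchg_failure_order_allowed(enum sketch_memorder mo)
{
	return mo != SKETCH_RELEASE && mo != SKETCH_ACQ_REL;
}
```

Keeping the placement in one table-like function makes it easy to compare against the lists in the message line by line.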
urcu/arch/generic: Use atomic builtins if configured If configured to use atomic builtins, implement SMP memory barriers in terms of atomic builtins if the architecture does not implement its own version. Change-Id: Iddc4283606e0fce572e104d2d3f03b5c0d9926fb Co-authored-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
urcu/compiler: Use atomic builtins if configured Use __atomic_signal_fence(__ATOMIC_SEQ_CST) for cmm_barrier() if configured to use atomic builtins. Change-Id: Ib168b50f1e97a8da861b92d6882c56db230ebb2c Co-authored-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
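The two entries above (SMP barriers and the compiler barrier built on atomic builtins) might be pictured as follows. This is a sketch under assumptions: it requires GCC or Clang, `SKETCH_USE_ATOMIC_BUILTINS` is a hypothetical stand-in for the configure result, and the `sketch_*` macros and functions are illustrative names rather than the liburcu ones. The non-builtin branch is likewise a guess at a typical fallback.

```c
#include <assert.h>

/* Hypothetical switch standing in for the configure-time choice. */
#ifndef SKETCH_USE_ATOMIC_BUILTINS
# define SKETCH_USE_ATOMIC_BUILTINS 1
#endif

#if SKETCH_USE_ATOMIC_BUILTINS
/* Compiler-only barrier: constrains reordering, emits no fence instruction. */
# define sketch_barrier()	__atomic_signal_fence(__ATOMIC_SEQ_CST)
/* Full SMP barrier, used unless the architecture already defined one. */
# ifndef sketch_smp_mb
#  define sketch_smp_mb()	__atomic_thread_fence(__ATOMIC_SEQ_CST)
# endif
#else
/* Assumed fallbacks when builtins are not selected. */
# define sketch_barrier()	__asm__ __volatile__ ("" : : : "memory")
# ifndef sketch_smp_mb
#  define sketch_smp_mb()	__sync_synchronize()
# endif
#endif

static int sketch_data;
static int sketch_ready;

/* Publish data, then raise the flag; the barrier orders the two stores. */
static void sketch_publish(int v)
{
	sketch_data = v;
	sketch_smp_mb();
	__atomic_store_n(&sketch_ready, 1, __ATOMIC_RELAXED);
}

/* Returns 1 and fills *out once the flag has been observed, else 0. */
static int sketch_try_consume(int *out)
{
	if (!__atomic_load_n(&sketch_ready, __ATOMIC_RELAXED))
		return 0;
	sketch_smp_mb();
	*out = sketch_data;
	return 1;
}
```

The `#ifndef` guard around `sketch_smp_mb()` is what lets an architecture-specific definition take precedence over the builtin-based one.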
configure: Add --enable-compiler-atomic-builtins option If the toolchain supports atomic builtins and the user asks for atomic builtins, use them for the uatomic API. This requires that the toolchains used to compile the library and the user application support such builtins. The advantage of using these builtins is that they are well-known synchronization primitives for several tools such as TSAN. However, they may introduce redundant memory barriers, mainly on strongly ordered architectures. Change-Id: Ia8e97112681f744f17816dbc4cbbec805a483331 Co-authored-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Olivier Dion <odion@efficios.com> Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>