Wrap calls to fmt::format to catch formatting exceptions fmt::format throws when a formatting error is encountered. Unfortunately, we can't ensure complete coverage of all logging call sites (e.g. error paths) and it is not desirable for such an exception to be thrown in those cases. The formatting error is returned as the formatted string so that it ends up in the logs or exception messages more or less transparently. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I1cb33a5fe87221139eaf9de918b47e0397daa89c
Move create_unique_class util to the memory namespace create_unique_class is helpful to write unique_ptr wrappers and is now accessed in numerous places even though it lives in the `details` namespace. Moving it to the `memory` namespace to live with other memory management facilities. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: Id4bb5100c1eb3a7e1e2d65f5b2d40ff8f97c786e
Remove fcntl wrapper Replace the questionnable sync_file_range wrapper by more descriptive util functions. Remove the unimplemented splice wrapper, I'd rather have a build failure / adjust the build system on some platforms than have hard to diagnose runtime issues. Change-Id: I4114d0d9765ae3d95a1488c945e5d66a20c2029d Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Build fix: brace-enclosed initlializer lists error with g++ 4.8 A build error occurs when building using g++ 4.8 : rotation-thread.cpp: In constructor 'lttng::sessiond::rotation_thread::rotation_thread(lttng::sessiond::rotation_thread_timer_queue&, notification_thread_handle&)': rotation-thread.cpp:400:58: error: invalid initialization of non-const reference of type 'lttng::sessiond::rotation_thread_timer_queue&' from an rvalue of type '<brace-enclosed initializer list>' _notification_thread_handle{ notification_thread_handle } Use old-style initialization of references instead. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: Ia3392a88b8a2d8dd8c60330c16229f507338e7cd
Clean-up: run format-cpp on the tree The original re-format commit missed a number of files that were caught by format-cpp. Hopefully this is the last large reformat commit for a while. Change-Id: I493ee6d9fe6187e0bd087c68ed346af69c929c1e Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Cleanup: rotation-thread: enforce conding standard following fix A fix introducing rotate_notification_channel_subscription_change_eventfd didn't follow the current coding standard so as to make it easier to backport to the stable branches. Clean-up the affected code to follow the current standard: - Replace the use of a raw eventfd to use the eventfd utility, - Subscribe and unsubscribe functions made use of global variables to communicate with the rotation thread: replace that with the rotation_thread class to centralize the interface, - Make the code using eventfd exception-safe (automatic memory management, use of various RAII utils), - Replacement of non-null pointers by references. Change-Id: I7e363e21b829fd0939a336aca2570fdbcc346967 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: sessiond: size-based notification occasionally not triggered Issue observed ============== When tracing to multiple sessions with scheduled size-based rotations that occur simultaneously (typically because they trace the same events and use the same rotation schedule configuration), the start of some rotations seems to be delayed indefinitely. Cause ===== The size-based rotations are implemented by piggy-backing onto the channel monitoring facilities. Essentially, a per-channel timer samples a number of statistics on the consumer daemon end, which transmits them to the session daemon. The session daemon's notification subsystem evaluates the statistics against the various registered triggers bound to the channels being monitored when a statistics sample is received. To implement size-based rotations, internal triggers are registered with the "consumed size" condition set to a given threshold. A session rotation management thread (which also performs other tasks) uses a notification channel to wait for sessions to reach their target size, starts rotations as needed, and sets a new threshold according to the sessions' configured rotation schedule. The rotation thread uses liblttng-ctl's API to consume notifications from a notification channel. At any time, a notification channel may have multiple notifications queued-up internally in its buffers. This is because a notification channel multiplexes command replies and notifications over the same UNIX socket. The current protocol specifies that multiple notifications can be received before the reply to a command. In such cases, the notification channel client implementation internally queues them and provides them on the next calls to lttng_notification_channel_get_next_notification(). This is correct with respect to the public API, which is intended to be used in "blocking mode". However, this internal user uses the notification channel's raw file descriptor to wake-up when a notification is available. This is problematic because notifications may be queued by the notification channel (and thus removed from the socket) while waiting for command replies (subscribing and unsubscribing from notification conditions). In such a case, a notification is available but the rotation thread does not wake-up to consume it as nothing is available in the socket's buffer. When this happens, a session that is supposed to rotate automatically appears to grow indefinitely. It will typically eventually rotate as new notifications become available and cause the rotation thread to wake-up. However, a "lag" builds up as the notification that caused the wake-up is not consumed. Instead, the last buffered notification is provided to the rotation thread. Solution ======== Use an event_fd to wake-up the rotation thread whenever a command completes on the notification channel. This ensures that any notification that was queued while waiting for a reply to the command is eventually consumed. Known drawbacks =============== None. Note ==== The use of C++ features is kept to a minimum in this patch in order to make it easier to backport to the stable releases. A clean-up patch follows and makes the code conform to the coding standards. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I8974b10124704d1e66e8da32d495fee738e3d43f
Replace explicit rcu_read_lock/unlock with lttng::urcu::read_lock_guard The explicit use of rcu_read_lock/unlock() has been a frequent source of bugs in the code base as the locks can often end-up imbalanced when error paths are taken (through goto's). Since lttng::urcu::read_lock_guard was implemented in a previous commit, replace all usages of the rcu_read_lock/unlock() API with an RAII lock_guard wrapper. Essentially, lttng::urcu::read_lock_guard acquires the RCU reader lock on construction, and releases it when it goes out of scope. This automates what is accomplished in the various error paths by jumping to error handling labels. For more info, see: https://en.cppreference.com/w/cpp/thread/lock_guard No user-visible change of behaviour is intended by this patch. The scope of some critical sections has been reduced when possible and when it didn't matter from a correctness standpoint. The RCU reader lock would often be held longer than necessary to simplify the error paths. Explicit scopes are used to limit the lifetime of the reader lock guards when used in long functions or when it is only intended to protect the iteration over cds_lfht instances. In those cases, the following pattern is used: ```cpp void foo() { [...] { lttng::urcu::read_lock_guard read_guard; cds_lfht_for_each(...) { [...] } [...] } ``` This explicitly bounds the critical section and also serves as a hint as to why the read guard is being used. It is fairly verbose, but I think it's a good compromise until we add an idiomatic urcu wrapper that automates this stuff too. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: Ie3ef8ddf0f1ab813971522176115688939696370
clang-tidy: add Chrome-inspired checks Add the checks used by the Chrome project. Most of these changes were produced and applied by clang-tidy. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I4058dafbdef00e81ac04f202fdfe377384a99e6b
Run clang-format on the whole tree Generated by running: find -not \( -path "./src/vendor*" -prune \) -iname "*.h" -o -iname "*.hpp" -o -iname "*.c" -o -iname "*.cpp" -exec clang-format -i {} \; Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I9557e7d84e305187475ef88124857cf2f438bbb1
Cleanup: remove ignored flags from poll events bitmasks The POLLHUP and POLLERR flags are only valid in 'revents', they are implicitly enabled regardless of the fact they were set in 'events' or not. As such remove those flags from all poll events to reduce possible confusion as to which flags can be returned by poll. Change-Id: Id22c78c38257d96dfc47e1337795f13c70dd5f91 Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: sessiond: handle empty scheduled rotations A number of error codes were added to cmd_rotate_session since the implementation of size-based rotations. The rotation thread doesn't expect LTTNG_ERR_ROTATION_MULTIPLE_AFTER_STOP and LTTNG_ERR_ROTATION_AFTER_STOP_CLEAR which are not fatal failures. These rotations would simply result in an empty trace archive and are, therefore, not produced. In both cases, it is safe to wait for the next size cycle. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: Ib90526b586de17c0b14d16970c862d9f981ed464
Fix: sessiond: size-based rotation threshold exceeded in per-pid tracing (1/2) Issue observed -------------- When tracing short-lived applications with buffers configured in per-pid mode, the size-based rotation threshold is often greatly exceeded. In the CI, this occasionally causes the size-based rotation tests to timeout for the per-pid case. Cause ----- There is a scenario where a session's consumed size is miscalculated. When an application exits during per-pid tracing, both the session and consumer daemons notice it. The session daemon sees the application's command pipe hanging-up, while the consumer daemon sees the application's data-ready pipe hanging-up. Upon handling these events, both daemons tear down their representation of the channels. In an ideal world, we'd want to sample the streams' "consumed_size" at the last possible moment to get the size of all consumed data for this stream. However, this is problematic in the following scenario: - the sessiond destroys the channel before the consumer daemon, - the consumer daemon sends a final buffer stats sample on tear down, - the sessiond can do nothing with the sample as it doesn't know that channel anymore. (Note that the session daemon gracefully handles the case where it doesn't know a channel.) When applications have a short lifetime and are traced in per-PID buffering mode, there is a high likelihood that the last buffer statistics sample sent for a given channel will target a channel that the session daemon has already torn down. Solution -------- Consumed-size conditions are somewhat special: they are bound to a session, but they are evaluated through a per-channel event (buffer statistics samples taken by the channels' monitoring timer). To work around the problem of lifetime of channels, we can rely on the fact that sessions outlive channels to perform the accounting of the consumed size. This patch is the first step to implement this fix: new notification-thread commands are introduced to announce the creation and destruction of an `ltt_session`. Currently, the notification thread implies the existence of a session by tracking its channels' creation and destruction. With this change, it no longer needs to do so; session are explicitly created and destroyed. Their unique ID is also kept stored. The key of `sessions_ht` becomes the `id` of the session to allow efficient look-ups on the reception of a buffer statistics sample. The existing callsites that make use of the session's name to perform a look-up are modified to look-up the id by name (see sample_session_id_by_name()). The add/remove channel commands and rotation ongoing/completed commands are modified to refer to sessions by ID since they can assume the notification thread knows about the session. Note ---- In a follow-up patch, buffer statistics samples are modified to include the session's ID and the consumed size is modified to become a "delta" relative to the previous sample associated with a given channel. This makes it possible to perform the accounting of a session's consumed size beyond the lifetime of its channels. The follow-up patch is the "core" of the fix, but it requires these prior changes. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I865e9ac5e1a63e62123209be63957dad28c588a8
Fix: sessiond: rotation trigger leak ==1801304==ERROR: LeakSanitizer: detected memory leaks Direct leak of 224 byte(s) in 2 object(s) allocated from: #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559fbeb64175 in zmalloc_internal ../../src/common/macros.hpp:60 #2 0x559fbeb6a291 in lttng_trigger* zmalloc<lttng_trigger>() ../../src/common/macros.hpp:89 #3 0x559fbeb64aa6 in lttng_trigger_create /home/jgalar/EfficiOS/src/lttng-tools/src/common/trigger.cpp:58 #4 0x559fbe9dc417 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:87 #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993 #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246 #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624 #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68 #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) Indirect leak of 208 byte(s) in 2 object(s) allocated from: #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559fbeb16e21 in zmalloc_internal ../../src/common/macros.hpp:60 #2 0x559fbeb16e31 in lttng_action_notify* zmalloc<lttng_action_notify>() ../../src/common/macros.hpp:89 #3 0x559fbeb168a0 in lttng_action_notify_create actions/notify.cpp:135 #4 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80 #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993 #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246 #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624 #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68 #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) Indirect leak of 160 byte(s) in 2 object(s) allocated from: #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559fbeb3d7a1 in zmalloc_internal ../../src/common/macros.hpp:60 #2 0x559fbeb3fa35 in lttng_condition_session_consumed_size* zmalloc<lttng_condition_session_consumed_size>() ../../src/common/macros.hpp:89 #3 0x559fbeb3e6fd in lttng_condition_session_consumed_size_create conditions/session-consumed-size.cpp:206 #4 0x559fbe9dc0f1 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:54 #5 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993 #6 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246 #7 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624 #8 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68 #9 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) Indirect leak of 112 byte(s) in 2 object(s) allocated from: #0 0x7fe0f4e73fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x559fbeb242ad in zmalloc_internal ../../src/common/macros.hpp:60 #2 0x559fbeb27062 in zmalloc<(anonymous namespace)::lttng_rate_policy_every_n> ../../src/common/macros.hpp:89 #3 0x559fbeb25e9f in lttng_rate_policy_every_n_create actions/rate-policy.cpp:492 #4 0x559fbeb168b9 in lttng_action_notify_create actions/notify.cpp:141 #5 0x559fbe9dc34b in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:80 #6 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993 #7 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246 #8 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624 #9 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68 #10 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) Indirect leak of 34 byte(s) in 2 object(s) allocated from: #0 0x7fe0f4e19319 in __interceptor_strdup /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:454 #1 0x559fbeb3f603 in lttng_condition_session_consumed_size_set_session_name conditions/session-consumed-size.cpp:442 #2 0x559fbe9dc2c4 in subscribe_session_consumed_size_rotation(ltt_session*, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/rotate.cpp:71 #3 0x559fbe995d6f in cmd_rotation_set_schedule(ltt_session*, bool, lttng_rotation_schedule_type, unsigned long, notification_thread_handle*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/cmd.cpp:5993 #4 0x559fbe9fe559 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2246 #5 0x559fbea01378 in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2624 #6 0x559fbe9ea642 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:68 #7 0x7fe0f44935c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) The rotation trigger of a session (used for size-based rotations) is never cleaned-up. It is now cleaned up every time its condition is hit and whenever the session is destroyed. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I5a89341535f87b7851b548ded9838c18bd1ccb95
Fix: sessiond: ODR violation results in memory corruption Issue observed ============== Address sanitizer reports the following invalid accesses while running the test_mi test. ❯ ASAN_OPTIONS=detect_odr_violation=0 lttng-sessiond ================================================================= ==289173==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60400000e280 at pc 0x55cbbe35e2e0 bp 0x7f01672f1550 sp 0x7f01672f1540 WRITE of size 4 at 0x60400000e280 thread T13 #0 0x55cbbe35e2df in mark_thread_as_ready /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:32 #1 0x55cbbe360160 in thread_consumer_management /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:267 #2 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66 #3 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) #4 0x7f0172a46583 in __clone (/usr/lib/libc.so.6+0x112583) 0x60400000e280 is located 8 bytes to the right of 40-byte region [0x60400000e250,0x60400000e278) allocated by thread T7 here: #0 0x7f01733b1fb9 in __interceptor_calloc /usr/src/debug/gcc/libsanitizer/asan/asan_malloc_linux.cpp:154 #1 0x55cbbe33adf3 in zmalloc_internal ../../../src/common/macros.hpp:60 #2 0x55cbbe33ae03 in thread_notifiers* zmalloc<thread_notifiers>() ../../../src/common/macros.hpp:89 #3 0x55cbbe3617f9 in launch_consumer_management_thread(consumer_data*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:440 #4 0x55cbbe33cf49 in spawn_consumer_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:188 #5 0x55cbbe33f7cf in start_consumerd /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:394 #6 0x55cbbe345713 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1277 #7 0x55cbbe34d74b in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2622 #8 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66 #9 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) Thread T13 created by T7 here: #0 0x7f0173353eb7 in __interceptor_pthread_create /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x55cbbe336f9e in lttng_thread_create(char const*, void* (*)(void*), bool (*)(void*), void (*)(void*), void*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:106 #2 0x55cbbe3618cc in launch_consumer_management_thread(consumer_data*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:453 #3 0x55cbbe33cf49 in spawn_consumer_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:188 #4 0x55cbbe33f7cf in start_consumerd /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:394 #5 0x55cbbe345713 in process_client_msg /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:1277 #6 0x55cbbe34d74b in thread_manage_clients /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2622 #7 0x55cbbe336ac4 in launch_thread /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:66 #8 0x7f01729c15c1 in start_thread (/usr/lib/libc.so.6+0x8d5c1) Thread T7 created by T0 here: #0 0x7f0173353eb7 in __interceptor_pthread_create /usr/src/debug/gcc/libsanitizer/asan/asan_interceptors.cpp:216 #1 0x55cbbe336f9e in lttng_thread_create(char const*, void* (*)(void*), bool (*)(void*), void (*)(void*), void*) /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/thread.cpp:106 #2 0x55cbbe34eebf in launch_client_thread() /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/client.cpp:2756 #3 0x55cbbe27f31a in main /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/main.cpp:1838 #4 0x7f017296130f in __libc_start_call_main (/usr/lib/libc.so.6+0x2d30f) SUMMARY: AddressSanitizer: heap-buffer-overflow /home/jgalar/EfficiOS/src/lttng-tools/src/bin/lttng-sessiond/manage-consumer.cpp:32 in mark_thread_as_ready Shadow bytes around the buggy address: 0x0c087fff9c00: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa 0x0c087fff9c10: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa 0x0c087fff9c20: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa 0x0c087fff9c30: fa fa fd fd fd fd fd fa fa fa fd fd fd fd fd fa 0x0c087fff9c40: fa fa fd fd fd fd fd fa fa fa 00 00 00 00 00 fa =>0x0c087fff9c50:[fa]fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c087fff9c60: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c087fff9c70: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c087fff9c80: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c087fff9c90: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa 0x0c087fff9ca0: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa Shadow byte legend (one shadow byte represents 8 application bytes): Addressable: 00 Partially addressable: 01 02 03 04 05 06 07 Heap left redzone: fa Freed heap region: fd Stack left redzone: f1 Stack mid redzone: f2 Stack right redzone: f3 Stack after return: f5 Stack use after scope: f8 Global redzone: f9 Global init order: f6 Poisoned by user: f7 Container overflow: fc Array cookie: ac Intra object redzone: bb ASan internal: fe Left alloca redzone: ca Right alloca redzone: cb Shadow gap: cc ==289173==ABORTING Cause ===== The start functions of the various worker threads of the session daemon are implemented in separate translation units (TU). To make use of the lttng_thread API, they all define different control structures to control their shutdown. Those structures are all named 'thread_notifiers' and are all allocated using zmalloc<>. The various instances of zmalloc<thread_notifiers> all end up having the same mangled name (e.g. _Z7zmallocI16thread_notifiersEPT_v). At link time, only one instance of zmalloc<thread_notifiers> is kept. Since those structures all have different layout/sizes, this is problematic. However, it is an acceptable behaviour according to the ODR [1]. I first considered making the various memory allocation functions in macros.hpp 'static' which results in each TU holding the appropriate specialization of the various functions. While this works, it doesn't make us ODR-compliant. To make a long story short, a program defining multiple types sharing the same name, in the same namespace, is ill-formed. Another concern is that marking all templated free-functions as static will eventually result in code bloat. Solution ======== All structures defined in TUs (but not in a header) are placed in unnamed namespaces (also called anonymous namespaces) [2]. This results in separate copies of the templated functions being generated when specialized using a structure in an anonymous namespace (e.g. _Z7zmallocIN12_GLOBAL__N_116thread_notifiersEEPT_v). We could have renamed the various `thread_notifiers` structures to give them different names. However, I found those are not the only structures sharing a name in different TUs. For instance, the same problem applies to `struct lttng_index` (index in a stream, index in a map). I propose we systematically namespace structures defined in TUs in the future. This will also save us trouble if those POD structures eventually become non-POD: we would experience the same "clashes" if those structures had constructors, for example. References ========== [1] https://en.cppreference.com/w/cpp/language/definition [2] https://en.cppreference.com/w/cpp/language/namespace Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I867e5a287ad8cf3ada617335bc1a80b800bf0833
Add type-checked versions of allocation and deallocations functions A common mistake when porting things from C to C++ is to use malloc for types that have a on-trivial constructor (and to not call the constructor explicitly either). For example: struct foo { std::vector<int> field; }; foo *f = (foo *) zmalloc(sizeof(*f)); This allocates a `foo` without calling the constructor, leaving it in an invalid state. Same idea when free-ing with free something that has a non-trivial destructor. To avoid this, I suggest adding templated allocation functions that we will use throughout, that verify if the given type is safe to malloc (and generate a compilation failure if not). The existing code barely needs changes. For example: - (foo *) zmalloc(sizeof(*f)) + zmalloc<foo>() For simplicity I propose that as soon as a type is non-POD (std::is_pod<T>::value is false), we prevent using malloc/free on it. It would be ok in theory to allocate such a type with malloc and free with free, but call the constructor (using placement-new) and destructor explicitly, but I don't see why we would want to do that. It might also be technically more correct to use a combination of std::is_trivially_constructible and std::is_trivially_destructible (std::is_pod being not fine-grained enough), but using std::is_pod just keeps things simpler. This patch introduces the following templated allocation functions: 1. zmalloc<T>() 2. zmalloc<T>(size) 3. malloc<T>() 4. malloc<T>(size) 5. calloc<T>(nmemb) 1. Allocate one T, zero-initialized 2. Allocate a buffer of size `size`, zero-initialized, this is used when the caller calculates the size to allocate, like when using flexible array members 3. Same as 1, but without the zero-initialization 4. Same as 2, but without the zero-initialization 5. Allocate an array of `nmemb` elements of type T, zero-initialized For the de-allocation side, add templated `free` function declaration that uses std::enable_if (SFINAE) to declare a deleted prototype if the type T isn't safe to free (causing a compilation error). There are a lot of places where we pass pointers to void to free. These can't be checked, as we don't know what type of object the pointer really points to. We could forbid that and fix all callers to pass a typed pointer, but that seems a bit too much to chew for the moment. So for now, simply accept that freeing pointers to void won't be checked. It's a best effort. As an example, if I add an explicit constructor to type ctf_trace (in src/bin/lttng-relayd/ctf-trace.h), I get the following errors with clang. For the allocation: /home/simark/src/lttng-tools/src/common/macros.h:57:2: error: static_assert failed due to requirement 'std::is_pod<ctf_trace>::value' "type is POD" static_assert (std::is_pod<T>::value, "type is POD"); ^ ~~~~~~~~~~~~~~~~~~~~~ /home/simark/src/lttng-tools/src/bin/lttng-relayd/ctf-trace.cpp:84:10: note: in instantiation of function template specialization 'zmalloc<ctf_trace>' requested here trace = zmalloc<ctf_trace>(); ^ For the de-allocation: /home/simark/src/lttng-tools/src/bin/lttng-relayd/ctf-trace.cpp:29:2: error: call to deleted function 'free' free(trace); ^~~~ /home/simark/src/lttng-tools/src/common/macros.h:125:6: note: candidate function [with T = ctf_trace, $1 = void] has been explicitly deleted void free(T *p) = delete; ^ /usr/include/stdlib.h:565:13: note: candidate function extern void free (void *__ptr) __THROW; ^ Change-Id: I246a9113d08fa36b81a49137f4e80a5e808de913 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Rename C++ header files to .hpp Rename all C++ header files (include/**/*-internal.h, src/**/*.h except argpar and msgpack, some headers in tests) to have the .hpp extension. Doing so highlights that we include some C++ header files in some test files still compiled as C. This is ok for now, as the files they include don't actually contain C++ code incompatible with C yet, but they could eventually. This is something we can fix later. Change-Id: I8bf326b6b2946a3e26704f3ef3ac5831bbe9bc26 Signed-off-by: Simon Marchi <simon.marchi@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>