tests: add check_skip_kernel_test to check root user and lttng kernel modules The current tests run both userspace and kernel testing. Some use cases only use lttng for one kind of tracing on an embedded device (e.g. userspace). In this scenario, the kernel modules might not be installed to the target rootfs, so the test cases would fail and exit. Add LTTNG_TOOLS_DISABLE_KERNEL_TESTS to skip the lttng kernel feature tests. This flag can be set via "make": make check LTTNG_TOOLS_DISABLE_KERNEL_TESTS=1 When this flag is set, all kernel-related test cases are marked as SKIP in the results. Since LTTNG_TOOLS_DISABLE_KERNEL_TESTS is checked in the check_skip_kernel_test function, and lots of test cases also need to check for root permission, merge the root permission check into check_skip_kernel_test. Change-Id: I49a1f642a9869c21a69e0186c296fd917bd7b525 Signed-off-by: Xiangyu Chen <xiangyu.chen@windriver.com> Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
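The combined check described above can be sketched roughly as follows. This is a minimal illustration, not the helper from the test utilities: the function name should_skip_kernel_test, its two arguments and the messages it prints are assumptions made for the example.

```shell
#!/bin/sh
# Hypothetical sketch of a combined "disable flag + root" check.
# Prints a skip reason and returns 0 when kernel tests should be skipped;
# returns 1 when they should run.
should_skip_kernel_test() {
	disable_flag="$1"   # value of LTTNG_TOOLS_DISABLE_KERNEL_TESTS (may be empty)
	uid="$2"            # effective user id, e.g. "$(id -u)"

	# The explicit opt-out is checked first.
	if [ "$disable_flag" = "1" ]; then
		echo "skip: LTTNG_TOOLS_DISABLE_KERNEL_TESTS was set"
		return 0
	fi

	# Kernel tracing needs root.
	if [ "$uid" != "0" ]; then
		echo "skip: root access is needed for kernel testing"
		return 0
	fi

	return 1
}

# Example call site: run the kernel block only when nothing asks to skip it.
if should_skip_kernel_test "${LTTNG_TOOLS_DISABLE_KERNEL_TESTS:-}" "$(id -u)"; then
	:
else
	echo "running kernel tests"
fi
```

Passing the flag and the uid as arguments keeps the sketch easy to exercise without being root; a real helper would more likely read them directly.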
port: add support for BSD mktemp Use '-t' which is portable instead of the GNU specific '--tmpdir'. Change-Id: I430af6b96c27c2766a2cc4b5574af8563297d717 Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
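As an illustration of the portable form, a sketch assuming a template ending in a run of X characters. The template names are placeholders, and the exact resulting file name may differ between the GNU and BSD implementations.

```shell
#!/bin/sh
# '-t' is accepted by both GNU and BSD mktemp, unlike the GNU-only '--tmpdir'.
# The file and directory are created under $TMPDIR, or a system default
# temporary directory when TMPDIR is unset.
tmp_file=$(mktemp -t "tmp.example_file.XXXXXX")
tmp_dir=$(mktemp -d -t "tmp.example_dir.XXXXXX")

# Remove both when the script exits.
trap 'rm -rf "$tmp_file" "$tmp_dir"' EXIT

echo "temporary file: $tmp_file"
echo "temporary directory: $tmp_dir"
```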
Fix: tests: test definitions arrays contain invalid data Observed issue ============== The long_regression CI job fails on test_thread_stall. 11:17:16 # export LTTNG_SESSION_CONFIG_XSD_PATH=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/common/ 11:17:16 # env /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-sessiond/lttng-sessiond --background --consumerd64-path=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-consumerd/lttng-consumerd 1 11:17:16 ok 16 - Start session daemon 11:17:16 # Check after running for 30 seconds 11:17:16 not ok 17 - Validation failure 11:17:16 # Failed test 'Validation failure' 11:17:16 # in /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../utils/tap/tap.sh:fail() at line 159. 11:17:16 # Health returned: 11:17:16 # stdout: 11:17:16 # stderr: 11:17:16 # Killing (signal SIGKILL) lttng-sessiond and lt-lttng-sessiond pids: 1840601 1840602 11:17:16 ok 18 - Wait after kill session daemon ...
17:57:01 # Test health problem detection with LTTNG_RELAYD_THREAD_DISPATCHER 17:57:01 # Start session daemon 17:57:01 # export LTTNG_SESSION_CONFIG_XSD_PATH=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/common/ 17:57:01 # env /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-sessiond/lttng-sessiond --background --consumerd64-path=/home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/../src/bin/lttng-consumerd/lttng-consumerd 1 17:57:01 ok 38 - Start session daemon 17:57:01 # /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../../src/bin/lttng/lttng create health_thread_stall --no-output 17:57:01 ok 39 - Create session health_thread_stall in no-output mode 17:57:01 # With UST consumer daemons 17:57:01 # /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../../src/bin/lttng/lttng enable-event tp:tptest -c testchan -s health_thread_stall -u 17:57:01 ok 40 - Enable ust event tp:tptest for session health_thread_stall 17:57:01 ok 41 # skip: Root access is needed. Skipping kernel consumer health check test. 
17:57:01 # /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../../src/bin/lttng/lttng start health_thread_stall 17:57:01 ok 42 - Start tracing for session health_thread_stall 17:57:01 # Check after running for 30 seconds 17:57:01 not ok 43 - Validation failure 17:57:01 # Failed test 'Validation failure' 17:57:01 # in /home/jenkins/workspace/lttng-tools_master_long_regression/arch/amd64/babeltrace_version/stable-2.0/build/std/conf/std/liburcu_version/master/test_type/full/src/lttng-tools/tests/regression/tools/health/../../../utils/tap/tap.sh:fail() at line 159. 17:57:01 # Health returned: 17:57:01 # stdout: 17:57:01 # stderr: 17:57:01 # Killing (signal SIGTERM) lttng-consumerd pids: 690297 690299 17:57:01 Error: consumer closed the command socket 17:57:01 Error: Health error occurred in thread_consumer_management 17:57:01 ok 44 - Wait after kill consumer daemon Cause ===== After investigation, commit 3c3390532736cfb5198f863d0d2b218e21fcf76d [1] introduced the test regression. Although [1] removes `LTTNG_SESSIOND_THREAD_HT_CLEANUP` from the `THREAD` array and the corresponding error message from `ERROR_STRING`, it does not modify the `NEEDS_ROOT`, `TEST_CONSUMERD` and `TEST_RELAYD` arrays, so their entries no longer line up with `THREAD`. The test count is also not adjusted to reflect the removal of the `THREAD` element. Solution ======== Remove the unused data from `NEEDS_ROOT`, `TEST_CONSUMERD` and `TEST_RELAYD` and adjust the test count. Known drawbacks ========= None. References ========== [1] https://github.com/lttng/lttng-tools/commit/3c3390532736cfb5198f863d0d2b218e21fcf76d Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I9c16fa8d76b41f1a28fd342d9f076969f4ff1b13
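The failure mode above comes from parallel arrays that are indexed by the same position but edited independently. A small guard along these lines could catch such a mismatch early. This is only a sketch: the array names come from the commit message, but the element values and the check_arrays_aligned helper are invented for the example, and the real test script may not contain any such check.

```shell
#!/bin/bash
# Illustrative values only; the real arrays hold different entries.
THREAD=("EXAMPLE_THREAD_A" "EXAMPLE_THREAD_B")
ERROR_STRING=("Example error A" "Example error B")
NEEDS_ROOT=(0 1)
TEST_CONSUMERD=(0 1)
TEST_RELAYD=(0 0)

# Hypothetical helper: succeeds only if every length given after the first
# argument equals the first argument.
check_arrays_aligned() {
	local expected="$1"
	shift
	local len
	for len in "$@"; do
		[ "$len" -eq "$expected" ] || return 1
	done
	return 0
}

if check_arrays_aligned "${#THREAD[@]}" "${#ERROR_STRING[@]}" \
		"${#NEEDS_ROOT[@]}" "${#TEST_CONSUMERD[@]}" "${#TEST_RELAYD[@]}"; then
	echo "test definition arrays are aligned"
else
	echo "test definition arrays have different lengths" >&2
fi
```

A length check only detects missing or extra elements; it cannot tell whether the remaining entries are in the right order.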
Remove ht-cleanup thread The hashtable cleanup thread was introduced to prevent deadlocks that occurred when the `cds_lfht_destroy()` function was called concurrently with userspace-rcu hashtable resizes. This was fixed in the userspace-rcu project in commit: commit d0ec0ed2fcb5d67a28587dcb778606e64f5b7b83 Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Date: Tue May 30 15:51:45 2017 -0400 Use workqueue in rculfhash That commit makes it so that the `cds_lfht_destroy()` function can safely be called within RCU read-side critical sections. This commit is included in the 0.10 release of urcu. The LTTng-Tools project now has a minimum version dependency on urcu 0.11. Because it's now safe to call `cds_lfht_destroy()` within RCU critical sections, the need for the hash table cleanup thread disappears. This commit replaces all uses of `ht_cleanup_push()` with `lttng_ht_destroy()` and removes all uses and mentions of the ht_cleanup thread. Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I163a281b64a6b3eed62c58515932f71f3b52fea6
Cleanup: tests: name all temporary files to better identify leakage When using a template, we need to add `--tmpdir` to the `mktemp` arguments to place the tmp files in `/tmp` or `$TMPDIR`. Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: Id107530578d91700b726ceec016a8cef772e94b0
Tests: fix: test_tp_fail: bail out on non-existing relay daemon Fatal thread errors simulated by the tp_fail test cause the relay daemon to shut down. This is unexpected by stop_lttng_relayd_notap, which bails out, causing the test to fail. Since a0f8e310, we bail out when the daemon is already dead in order to catch crashes during the test suite. Use the clean-up variant so that we don't fail the tests for this expected outcome. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I02e7f35451b3c81e7e808e9ff96b6c824fa8f904
Fix: tests: health thread stall: only stop consumerd when required Since a0f8e3109, stop_lttng_consumerd will report a failure when there is no consumer daemon to kill. This fix ensures it is only invoked for tests that launch a consumer daemon. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I6831fbe7563d2e1804f10588494d126fbb4202ff
Tests: remove unused libhealthexit code libhealthexit is no longer used since 89c453960. Remove the now-unused code of that test library. A comment referencing that library is also adjusted. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com> Change-Id: I1d088e9032a3b2e1a9f7956e81c7cb662473a7fd
tests: Move to kernel style SPDX license identifiers The SPDX identifier is a legally binding shorthand, which can be used instead of the full boiler plate text. See https://spdx.org/ids-how for details. Signed-off-by: Michael Jeanson <mjeanson@efficios.com> Change-Id: I89cd4b4b7440f71f52426a5508252932bb46e796 Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Tests: consolidate session creation with a uri parameter in utils.sh Introduce a new create_lttng_session_uri test helper. Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Tests: use SIGKILL to shutdown daemons in test_thread_exit and test_tp_fail A current design limitation of the lttng-consumerd will cause it to hang on shutdown if the timer management thread exits, as the teardown of channels switches off the channel's timers. The timer thread is then expected to purge timer signals and signal when it is done. This state will never be reached since signals are no longer being processed. This is not a concern here since it is not what this test is meant to exercise; we only want to make sure the health check signals that something went wrong. Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Fix: tests: skip tests on static build Skip tests that depend on shared objects on static build rather than bailing out, which will let the overall test suite succeed. Fixes: #977 Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
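The skip-instead-of-bail-out approach can be sketched as a simple existence check on the shared object a test depends on. The helper name can_run_preload_test and the library path below are placeholders invented for the example, not names from the test suite.

```shell
#!/bin/sh
# Hypothetical helper: succeeds only if the shared object the test needs exists.
can_run_preload_test() {
	so_path="$1"
	[ -f "$so_path" ]
}

so="./.libs/libexample.so"   # placeholder path, absent on a static build

if can_run_preload_test "$so"; then
	echo "running test with $so"
else
	# Report a skip rather than aborting, so the overall suite can still succeed.
	echo "skip: shared object $so not found (static build?)"
fi
```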
Fix: regression tests Fix racy session/relayd wait-after-kill scheme. Fix racy live test where application may not have generated events yet when we attach to the live trace. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com> Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
health check tests: test relayd and consumerd - Cover relayd and consumerd, - Add a test_thread_ok test (no issue found by test) to fast_regression, - Merge duplicated code into test_health.sh. Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>