lttng-tools.git
3 months agoFix: sessiond: double free on duplicate removal of tracer source
Jérémie Galarneau [Thu, 28 Jan 2021 19:39:37 +0000 (14:39 -0500)] 
Fix: sessiond: double free on duplicate removal of tracer source

An unrelated bug (fixed in a separate commit) can cause an event source
to be removed from the notification thread's monitored sources twice.

The event source removal starts by searching for the source to remove
based on the source pipe's read-end fd number and assumes that it will
always be found. After iterating on the list, it asserts that
`source_element` is not NULL, on the assumption that NULL would
mean that the source was not found.

This is incorrect since, if the source is not found, `source_element`
will simply point to the last element of the list, causing the assertion
to succeed.

Then, the last source in the list is torn down, but not removed from the
list. This causes that event source to be free'd twice when it is
actually removed later on.

The assumption that an event source can always be found does not hold
for the moment. For instance, an application can exit, closing its end
of the notification pipe, and the notification thread could wake up
before the application management thread.

In that case, the notification thread will react to the event by
removing the application's source from its monitored sources. Then, when
the application management thread wakes up, it will ask the notification
thread to (again) remove the event source, which will fail as it will
not be found.
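
The lookup pitfall described above can be sketched as follows. This is a
minimal illustration, not the sessiond code: `struct source_element`,
`add_source()` and `remove_source()` are hypothetical names, and a plain
singly linked list stands in for the list used by the notification thread.
The point is that "not found" is reported explicitly instead of being
inferred from the loop cursor, and that an element is unlinked before it
is freed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for a monitored tracer event source. */
struct source_element {
	int fd;
	struct source_element *next;
};

/* Prepend a new element; returns false on allocation failure. */
static bool add_source(struct source_element **head, int fd)
{
	struct source_element *element = malloc(sizeof(*element));

	assert(head);
	if (!element) {
		return false;
	}

	element->fd = fd;
	element->next = *head;
	*head = element;
	return true;
}

/*
 * Unlink and free the element monitoring `fd`.
 * Returns true if an element was removed, false if `fd` was not found.
 */
static bool remove_source(struct source_element **head, int fd)
{
	struct source_element **link;

	assert(head);
	for (link = head; *link; link = &(*link)->next) {
		struct source_element *element = *link;

		if (element->fd != fd) {
			continue;
		}

		/* Unlink before freeing so no later removal can reach it. */
		*link = element->next;
		free(element);
		return true;
	}

	/* Not found: report it instead of touching an unrelated element. */
	return false;
}
```

With this shape, a duplicate removal request simply returns false and
leaves the remaining sources untouched.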

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7b5ebf90b868faded47a4e9675e01e1fb2b77a70

3 months agosessiond: kernel triggers: add infrastructure to create event notifiers
Jonathan Rajotte [Mon, 23 Mar 2020 21:09:18 +0000 (17:09 -0400)] 
sessiond: kernel triggers: add infrastructure to create event notifiers

Add the infrastructure to initialize the kernel tracer event notifier
group and individual event notifiers from event rules issued from
triggers.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I20127f655018260e45566d09d11c2852cd3b3f97
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479

3 months agokernel: event notifier: kernel-ctl interface
Jonathan Rajotte [Mon, 23 Mar 2020 21:03:37 +0000 (17:03 -0400)] 
kernel: event notifier: kernel-ctl interface

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idd3983c594ffecff0a20b71b7cd3d297e77446a1
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479

3 months agokernel: load lttng-ring-buffer-event-notifier-client module
Jonathan Rajotte [Mon, 3 Feb 2020 19:03:25 +0000 (14:03 -0500)] 
kernel: load lttng-ring-buffer-event-notifier-client module

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9449e895b1eb88eb6db47dc3f8eb2864a2eb816d
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479

3 months agosessiond: kernel: make modules required/optional property per-module
Jérémie Galarneau [Tue, 19 Jan 2021 19:56:49 +0000 (14:56 -0500)] 
sessiond: kernel: make modules required/optional property per-module

Modules are considered required or optional based on their
category (control or data probes). Make the load policy per-probe since
optional control probes will be introduced in a follow-up change.

No change in behaviour is intended by this change.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I0048b60bee3969d2fa2b9ed94b6fb24d3b5ae659

3 months agoFix: add rcu_barrier() after sessiond_cleanup()
Francis Deslauriers [Mon, 14 Dec 2020 22:30:12 +0000 (17:30 -0500)] 
Fix: add rcu_barrier() after sessiond_cleanup()

This ensures that tracer event sources (event notifier sockets) are
removed from the notification thread's list.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2ca8f72c023132c341193bf626c0dac20b89e1f2

3 months agoust-app: implement event notifier support
Jonathan Rajotte [Mon, 13 Jan 2020 18:59:39 +0000 (13:59 -0500)] 
ust-app: implement event notifier support

Event notifier support mostly resembles how it is done for regular
events.

We end up implementing ust_app_synchronize_event_notifier_rules which is
used in a similar fashion to ust_app_synchronize minus the dependency on
a ltt_ust_session session.

The lttng_event_rule_generate_bytecode interface is modified to return a
status code since it could fail (return NULL) for reasons other than not
having exclusions (e.g. an allocation failure).
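
As an illustration of why a status code is preferable to a bare NULL
return, here is a minimal sketch. The names (`generate_bytecode()`,
`enum generate_status`, `struct bytecode`) are hypothetical and do not
reflect the actual lttng-tools interface; the "bytecode" is just a copy of
the expression. It only shows how "nothing to generate" and "generation
failed" become distinguishable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical status codes; not the lttng-tools enumeration. */
enum generate_status {
	GENERATE_STATUS_OK = 0,
	GENERATE_STATUS_ERROR = -1,
};

/* Hypothetical bytecode container. */
struct bytecode {
	size_t len;
	unsigned char data[];
};

/*
 * On success, `*out` is either a new bytecode object (owned by the
 * caller) or NULL when there was nothing to generate. A NULL output no
 * longer doubles as an error indication.
 */
static enum generate_status generate_bytecode(const char *expression,
		struct bytecode **out)
{
	struct bytecode *bc;
	size_t len;

	assert(out);
	*out = NULL;

	if (!expression) {
		/* Nothing to generate is a valid outcome, not a failure. */
		return GENERATE_STATUS_OK;
	}

	len = strlen(expression);
	bc = malloc(sizeof(*bc) + len);
	if (!bc) {
		/* Allocation failure is reported distinctly. */
		return GENERATE_STATUS_ERROR;
	}

	bc->len = len;
	memcpy(bc->data, expression, len);
	*out = bc;
	return GENERATE_STATUS_OK;
}
```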

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2cde2b3d2530e2114bff99b1b26ac6d83f575ad9
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479

3 months agoFix: liblttng-ctl: unreported truncations when copying strings
Jérémie Galarneau [Tue, 12 Jan 2021 22:41:54 +0000 (17:41 -0500)] 
Fix: liblttng-ctl: unreported truncations when copying strings

gcc 10.2 reports a large number of string truncation warnings in
liblttng-ctl. Replace the uses of lttng_ctl_copy_string() util by
lttng_strncpy() (handling the null source case when applicable) and
report the truncations when they occur.

Example gcc warning:
  lttng-ctl.c:86:3: warning: ‘strncpy’ output may be truncated copying 254 bytes from a string of length 254 [-Wstringop-truncation]

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Icca5f4c2490c6796b451999d7694db8597bae719

3 months agoFix: sessiond: event name truncation during listing
Jérémie Galarneau [Tue, 12 Jan 2021 22:08:56 +0000 (17:08 -0500)] 
Fix: sessiond: event name truncation during listing

The use of strncpy can lead to silently-truncated event names. Replace
its use by the internal lttng_strncpy which fails on truncation.
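
A copy helper that fails on truncation could look like the sketch below.
`copy_string_checked()` is a hypothetical name used for illustration; it
is not the lttng_strncpy implementation, only an example of the same
"fail rather than truncate" contract.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy `src` into `dst`, whose capacity is `dst_len` bytes (terminator
 * included). Returns 0 on success and -1 if `src` does not fit, in which
 * case `dst` is left untouched.
 */
static int copy_string_checked(char *dst, const char *src, size_t dst_len)
{
	assert(dst && src);

	/* Look for the terminator within the first `dst_len` bytes only. */
	if (!memchr(src, '\0', dst_len)) {
		return -1;
	}

	/* The terminator was found in range, so the copy cannot overflow. */
	strcpy(dst, src);
	return 0;
}
```

Callers can then turn the -1 return into an explicit error instead of
silently listing a shortened event name.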

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I65d6bd46208dc7b62a83e4600a52a6669fd99d55

3 months agoClean-up: replace erroneous use of empty parameter list by void
Jérémie Galarneau [Tue, 12 Jan 2021 20:36:00 +0000 (15:36 -0500)] 
Clean-up: replace erroneous use of empty parameter list by void

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I951feeed92b346e79e34bec45a14f8b226283ae4

3 months agosessiond: setup event notifier group for registering app
Jonathan Rajotte [Fri, 10 Jan 2020 21:05:48 +0000 (16:05 -0500)] 
sessiond: setup event notifier group for registering app

Create a pipe for each application and setup an event notifier group
associated with that pipe. Transfer the write side to the app, and
transfer the read side to the notification thread as an application
event source ([...]_ADD_TRACER_EVENT_SOURCE).
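
The pipe set-up can be pictured with the sketch below. It is an
illustration under assumptions: `struct app_notifier_pipe` and its helpers
are hypothetical names, and the actual hand-off of each end (to the
application and to the notification thread) is not shown.

```c
#include <assert.h>
#include <unistd.h>

/* Hypothetical pair of ends for one application's notification pipe. */
struct app_notifier_pipe {
	int read_fd;  /* End that would be monitored by the notification thread. */
	int write_fd; /* End that would be handed over to the application. */
};

/* Create the pipe; returns 0 on success, -1 on error (errno is set). */
static int app_notifier_pipe_create(struct app_notifier_pipe *p)
{
	int fds[2];

	assert(p);
	if (pipe(fds) < 0) {
		return -1;
	}

	p->read_fd = fds[0];
	p->write_fd = fds[1];
	return 0;
}

/* Close both ends of the pipe. */
static void app_notifier_pipe_close(struct app_notifier_pipe *p)
{
	assert(p);
	(void) close(p->read_fd);
	(void) close(p->write_fd);
}
```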

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3e4aab84e3270ddef1f50f72f946a4d80b3f36e0
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479

3 months agoFix: configure: support Autoconf 2.70
Jérémie Galarneau [Mon, 11 Jan 2021 23:11:46 +0000 (18:11 -0500)] 
Fix: configure: support Autoconf 2.70

The newly-released autoconf 2.70 introduces a number of breaking
changes [1] and is being rolled-out by some distros.

Amongst those changes, the AC_PROG_CC_STDC macro is marked as obsolete
and was merged into AC_PROG_CC, which we already use. On 2.70, this
results in a warning which we handle as an error.

A version check is added to invoke the AC_PROG_CC_STDC macro only when
running a pre-2.70 version of autoconf, fixing the issue.

A single use of the AC_HELP_STRING macro is replaced by AS_HELP_STRING
as the former was marked as obsolete.

The AC_PROG_LEX now takes an argument, and the argument-less version is
marked as obsolete. The macro is invoked with the `noyywrap` option, as
recommended in the documentation.

Also, the AX_PTHREAD macro makes use of the $as_echo built-in shell
variable which no longer exists in 2.70. A patch was submitted to the
GNU Autoconf archive in March, but there have been no signs of life
given since then [2].

As such, our local copy is updated to the latest version and the patch
(which looks fairly straight-forward / safe) is applied. This should
minimize changes once we go back to an "official" version of the macro.

[1] https://lwn.net/Articles/839395/
[2] https://savannah.gnu.org/patch/?9906

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Reviewed-by: Michael Jeanson <michael.jeanson@efficios.com>
Change-Id: Ie949de73442770f60cbef55300265205527731c6

4 months agoFix: different pthread_getname_np() signature on macOS causes build failure
Michael Jeanson [Wed, 16 Dec 2020 17:40:53 +0000 (12:40 -0500)] 
Fix: different pthread_getname_np() signature on macOS causes build failure

macOS likes to be special so it has pthread_setname_np() without a
thread id parameter, but a pthread_getname_np() with it. Split the
detection macro in two and modify the compat layer to handle it.

Change-Id: I8034c54057d68eef59546960c75afe8fbe07f5ad
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

4 months agolttng-ust abi: sync _UST_CMD() values
Francis Deslauriers [Fri, 18 Dec 2020 22:00:40 +0000 (17:00 -0500)] 
lttng-ust abi: sync _UST_CMD() values

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ieacee6ecab41855cebae7113e7c512d4d684eb98

4 months agohashtable: silence -fsanitize=address warning for `hashlittle()` function
Francis Deslauriers [Fri, 4 Dec 2020 18:47:32 +0000 (13:47 -0500)] 
hashtable: silence -fsanitize=address warning for `hashlittle()` function

Issue
=====
The code of this function triggers the following heap-buffer-overflow
warning when compiled with `-fsanitize=address` in specific situations:

  ==247225==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000001310 at pc 0x5559db6c575a bp 0x7f193e6faeb0 sp 0x7f193e6faea0
  READ of size 4 at 0x602000001310 thread T4 (Notification)
      #0 0x5559db6c5759 in hashlittle /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:315
      #1 0x5559db6c6df4 in hash_key_str /home/frdeso/projets/lttng/tools/src/common/hashtable/utils.c:490
      #2 0x5559db5e3282 in hash_trigger_by_name_uid /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:378
      #3 0x5559db5ecbe3 in trigger_name_taken /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2333
      #4 0x5559db5ecd7c in generate_trigger_name /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2362
      #5 0x5559db5ed6e0 in handle_notification_thread_command_register_trigger /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2491
      #6 0x5559db5ef967 in handle_notification_thread_command /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread-events.c:2927
      #7 0x5559db5ddbb7 in thread_notification /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/notification-thread.c:693
      #8 0x5559db60e56d in launch_thread /home/frdeso/projets/lttng/tools/src/bin/lttng-sessiond/thread.c:66
      #9 0x7f19456ec608 in start_thread /build/glibc-ZN95T4/glibc-2.31/nptl/pthread_create.c:477
      #10 0x7f1945602292 in __clone (/lib/x86_64-linux-gnu/libc.so.6+0x122292)

Given that the `k` pointer used in this loop is a `uint32_t *` we might
read bytes outside of the allocated key if the key is less than 4 bytes
long. As the comment about Valgrind explains, this is not a real problem
because memory protections are typically word bounded.

I tried to use the `__SANITIZE_ADDRESS__` define to select the
Valgrind implementation of this code when building with AddressSanitizer
but that still triggers the same heap-buffer-overflow warning.

Why wasn't that a problem before?
=================================
The trigger feature will use small default names like "T0".

Workaround
==========
Exclude this function from the sanitizing using the compiler attribute
"no_sanitize_address".
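
For reference, applying the attribute looks like the sketch below. The
`NO_SANITIZE_ADDRESS` macro and `hash_bytes()` are hypothetical names: the
function is a simple byte-wise FNV-1a hash standing in for `hashlittle()`
and, unlike it, never reads past the key. Only the attribute placement is
the point here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wrapper macro; expands to nothing on other compilers. */
#if defined(__GNUC__) || defined(__clang__)
#define NO_SANITIZE_ADDRESS __attribute__((no_sanitize_address))
#else
#define NO_SANITIZE_ADDRESS
#endif

/*
 * Byte-wise FNV-1a hash used as a stand-in. AddressSanitizer does not
 * instrument the memory accesses made inside this function.
 */
NO_SANITIZE_ADDRESS
static uint32_t hash_bytes(const void *key, size_t len)
{
	const unsigned char *bytes = key;
	uint32_t hash = 2166136261u; /* FNV-1a 32-bit offset basis. */
	size_t i;

	assert(key || len == 0);
	for (i = 0; i < len; i++) {
		hash ^= bytes[i];
		hash *= 16777619u; /* FNV-1a 32-bit prime. */
	}

	return hash;
}
```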

Drawback
========
This removes our sanitizing coverage for this function.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I82d0d3539916ed889faa93871f9b700064f2c52a

4 months agoTests: Fail test if sessiond is not running when it should
Francis Deslauriers [Wed, 9 Dec 2020 03:05:22 +0000 (22:05 -0500)] 
Tests: Fail test if sessiond is not running when it should

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9b39bfe6bfb9f404fe2a32c27de0276386a36212

4 months agoCleanup: erroneous use of CDS_INIT_LIST_HEAD() on node
Francis Deslauriers [Thu, 10 Dec 2020 20:42:22 +0000 (15:42 -0500)] 
Cleanup: erroneous use of CDS_INIT_LIST_HEAD() on node

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6caf957af4d3e325e9f2086441d0552d64a77db5

4 months agoUST: update ABI for event notifier
Jonathan Rajotte [Fri, 10 Jan 2020 22:03:05 +0000 (17:03 -0500)] 
UST: update ABI for event notifier

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia3088ebdf0fe64e57e93c2bec02625176460ffc9

4 months agouserspace-probe: Decouple `userspace_probe_add_callsite()` from event and session
Francis Deslauriers [Wed, 22 Jan 2020 16:15:10 +0000 (11:15 -0500)] 
userspace-probe: Decouple `userspace_probe_add_callsite()` from event and session

Currently this function takes event and session pointers:
  - The event is used to get the location type of the probe,
  - the session is used to get the uid and gid of the user to use them
    with the `run_as_*()` functions.

With the incoming trigger support, we want to reuse this function to add
trigger userspace-probe callsites.

This commit extracts what will be common in both event and trigger
implementations by creating a specialized
`userspace_probe_event_add_callsite()` function that uses a generalized
`userspace_probe_add_callsite()`.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia3b90050a7bd227a30af0c11395dcdf5aca13583
Depends-on: lttng-ust: I5a800fc92e588c2a6a0e26282b0ad5f31c044479

4 months agoGeneralize disable_ust_event to support multiple types of ust object
Jonathan Rajotte [Mon, 23 Mar 2020 15:56:05 +0000 (11:56 -0400)] 
Generalize disable_ust_event to support multiple types of ust object

This will allow us to pass a trigger object later.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2d2632c80c9bcb2d2ca6966e080d0fa5d3422796

4 months agoGeneralize enable_ust_event to support multiple types of ust object
Jonathan Rajotte [Fri, 10 Jan 2020 19:47:30 +0000 (14:47 -0500)] 
Generalize enable_ust_event to support multiple types of ust object

This will allow us to pass a trigger object later.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I594c3eb0437345406e3a24fcbfbf4b7a8162908b

4 months agoGeneralize set_ust_event_exclusion to support multiple types of ust object
Jonathan Rajotte [Mon, 16 Dec 2019 21:01:03 +0000 (16:01 -0500)] 
Generalize set_ust_event_exclusion to support multiple types of ust object

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibe8f6b2c459afc698971b23b3f1a72d0a45e036f

4 months agoGeneralize set_ust_event_filter to support multiple types of ust object
Jonathan Rajotte [Mon, 16 Dec 2019 20:48:40 +0000 (15:48 -0500)] 
Generalize set_ust_event_filter to support multiple types of ust object

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iab5bc94b8895b6470c4c3339a691f71c5e9a2e3c

4 months agonotification: mark tracer source element as out of poll set
Jérémie Galarneau [Fri, 18 Dec 2020 21:15:18 +0000 (16:15 -0500)] 
notification: mark tracer source element as out of poll set

Mark the tracer source element as being out of the notification thread's
poll set once it has been removed. This has no effect right now, but it
is less error-prone considering future changes to this function.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia4d08dabd6b07ec455fe3120b7188e414232536e

4 months agoIntroduce trigger hash table with tracer token as key
Jonathan Rajotte [Mon, 13 Jan 2020 18:52:51 +0000 (13:52 -0500)] 
Introduce trigger hash table with tracer token as key

This will allow easy lookup on reception of the tracer token coming
from the tracer.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iee42539f0a664ead5ca03534549c6bbd5e505953

4 months agonotification: add/remove tracer event source
Jonathan Rajotte [Wed, 25 Mar 2020 22:49:32 +0000 (18:49 -0400)] 
notification: add/remove tracer event source

The notification thread will be responsible for consuming the tracer
notification events coming from the UST tracers and the kernel tracer.

On an 'add' operation, the tracer event source (i.e. the read side of a
pipe) is added to the notification poll set. Book-keeping is also done
via a list for later lookup.

On 'remove', the event source is removed from the pollset and from the
list.

On cleanup (notification_thread_handle_destroy), it is expected that all
added tracer event sources be removed by their respective "adder". No
bulk cleanup is performed.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I23679922a58849c9bc86f30b2aae17b39fa2e222

4 months agoDBG: add debug statement for trigger not bound to any object
Jonathan Rajotte [Mon, 17 Aug 2020 22:24:35 +0000 (18:24 -0400)] 
DBG: add debug statement for trigger not bound to any object

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2c21c8c702017daf38648bcb835e711335a8fd77

4 months agosessiond: Extract condition hashing functions
Francis Deslauriers [Wed, 9 Sep 2020 21:36:12 +0000 (17:36 -0400)] 
sessiond: Extract condition hashing functions

Extract these functions so they can be used by other files.

The lttng_condition hashing code is kept in this (rather than
common/condition/condition.c) since it makes use of GPLv2 code
(hashtable utils), which we don't want to link in liblttng-ctl.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iaafe1402b2d198a00920d939502004038e78fff0

4 months agoCleanup: misplaced white space in `ERR()` statement
Francis Deslauriers [Wed, 9 Dec 2020 13:43:58 +0000 (08:43 -0500)] 
Cleanup: misplaced white space in `ERR()` statement

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I44291907c9973394c3edaf9b470230c59bb75eec

4 months agoAdd base support for event rule hit
Jonathan Rajotte [Mon, 17 Aug 2020 22:23:27 +0000 (18:23 -0400)] 
Add base support for event rule hit

Add some of the scaffolding to support event-rule hit conditions.
This includes the hashing of event rule conditions and, consequently,
of event rules and the various probe location types.

The kernel module ABI is checked to verify that the kernel tracer
supports event notifiers.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iab4db4fc7e9f0c5a7206106fa6a4781b6b95d306

4 months agosessiond: return 'invalid protocol' error on reception error
Jérémie Galarneau [Thu, 26 Nov 2020 20:36:03 +0000 (15:36 -0500)] 
sessiond: return 'invalid protocol' error on reception error

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I15758543ae51dd2ce30b40d88a05ef0492ce0e51

4 months agoOnly perform notification related unregistering when action is notify
Jonathan Rajotte [Mon, 17 Aug 2020 22:19:47 +0000 (18:19 -0400)] 
Only perform notification related unregistering when action is notify

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id89eaf1c8f550e20adbd0b0d82462f6bf0b8ba21

4 months agoUse lttng_trigger_is_equal when iterating over the trigger ht
Jonathan Rajotte [Mon, 13 Jan 2020 18:40:12 +0000 (13:40 -0500)] 
Use lttng_trigger_is_equal when iterating over the trigger ht

Since a trigger can now have types of actions other than the notify
one, we must account for it. We use lttng_trigger_is_equal to perform
that task.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3300b0fff66e760152c1f7065d8fbfb945cce48e

4 months agoGenerate bytecodes related to the trigger on reception
Jonathan Rajotte [Mon, 23 Mar 2020 21:26:47 +0000 (17:26 -0400)] 
Generate bytecodes related to the trigger on reception

The compositing objects of a trigger might need to generate internal
bytecode. Doing it at the registration step allows an early validation
of the filter expressions.

There is no need to generate it for the unregister command since
bytecodes are not used for comparison and are for internal use only.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia1282d55f028e6b056e8ff3877790894c582acdb

4 months agokernel: Add token field to `struct lttng_kernel_event`
Francis Deslauriers [Fri, 13 Nov 2020 21:27:59 +0000 (16:27 -0500)] 
kernel: Add token field to `struct lttng_kernel_event`

This field will be used by event notifier and counters features.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I48d71a240150355d80b5a325717ca695467c5451

4 months agokernel: Add `struct lttng_kernel_syscall` to ABI
Francis Deslauriers [Wed, 25 Nov 2020 17:06:38 +0000 (12:06 -0500)] 
kernel: Add `struct lttng_kernel_syscall` to ABI

This struct is now used by the kernel tracer to selectively turn the
firing of syscall events on and off.

This way, the sessiond can decide to turn on only syscall entries,
exits, or both.

This will be used by the upcoming event notifier features to only
generate a notification on syscall entry.

This new struct doesn't change the layout of the `lttng_kernel_event`
structure.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I478de30b91b415f517e9d0ac0686f3130f79d86b

5 months agoExtras: Perl 5.26 requires { to be escaped by \
Anders Wallin via lttng-dev [Wed, 25 Nov 2020 08:31:40 +0000 (09:31 +0100)] 
Extras: Perl 5.26 requires { to be escaped by \

Unescaped literal "{" characters in regular expression patterns are no
longer permissible

Signed-off-by: Anders Wallin <wallinux@gmail.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

5 months agoFix: sessiond: metadata not created on app unregistration during start
Jérémie Galarneau [Tue, 1 Dec 2020 21:51:23 +0000 (16:51 -0500)] 
Fix: sessiond: metadata not created on app unregistration during start

Issue observed
==============

A test for an incoming feature (trigger actions on on-event conditions)
hangs. While this problem was discovered using this test, it exercises a
scenario that is problematic as of this fix.

The destruction of a session can hang if a single application being
traced unregisters (dies) during the 'start' of a session.

Cause
=====

When a per-uid session is started, its buffers (channels and streams)
are allocated only if an instrumented application is registered to the
session daemon at that moment.

For historical reasons, the 'data' and 'metadata' buffers are allocated
in separate code paths. The 'data' buffers are allocated in
ust_app_synchronize() and the 'metadata' buffers are allocated in
ust_app_start_trace(). Both functions perform their own look-up for an
application session and will gracefully fail if an application session
can't be found; it typically means the application has exited.

This leaves a race window open where ust_app_synchronize() can succeed
in looking-up the application session, and ust_app_start_trace() can
fail following the death of the application.

When this occurs, the session is left with 'data' buffers allocated and
unallocated 'metadata' buffers. This is an unexpected state and results
in the rotation code attempting to rotate a partially initialized
metadata stream.

The rotation of this partially initialized metadata stream never
completes which, in turn, never allows the session to complete its
implicit rotation on destruction.

This race window is fairly narrow, but can be reproduced by sleep()-ing
at the beginning of ust_app_start_trace() and killing an application
that is being traced during the sleep period.

Solution
========

The creation of the metadata channel is performed as part of
ust_app_synchronize() if the application look-up succeeds. When it
fails, both 'data' and 'metadata' streams will fail to be created
resulting in an expected and valid state.

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ice0ec16734a39a6bb885986d3ad70d20cd2618e0

5 months agotest: utils: lttng_pgrep performs lookup on non-existing pid
Jonathan Rajotte [Mon, 30 Nov 2020 18:54:14 +0000 (13:54 -0500)] 
test: utils: lttng_pgrep performs lookup on non-existing pid

Observed issue
==============

 # Killing (signal SIGTERM) lttng-sessiond and lt-lttng-sessiond pids: 20962 20963
 ./tests/regression/tools/trigger/start-stop//../../../../utils/utils.sh: line 103: /proc/20963/cmdline: No such file or directory

Cause
=====

lttng_pgrep performs a two-step search/validation for the pattern. Since
lttng_pgrep is used during the tear-down of processes (staged
termination signalling), a process returned by pgrep might exit before
the second check.

Solution
========

Simply silence the error. The code flow already acknowledges the
possibility of failure here.

Known drawbacks
===============

None

References
==========

Fixes: #1292

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I82cb9fd4754c10a5104af495a8a959f4fbd92664

5 months agoFix: missing `_mutex_lock()` before signaling a condition variable
Francis Deslauriers [Mon, 30 Nov 2020 19:54:18 +0000 (14:54 -0500)] 
Fix: missing `_mutex_lock()` before signaling a condition variable

According to the PTHREAD_COND(3) man page, signaling and broadcasting
on a condition variable should always be protected with a mutex.

This commit fixes two calls to `pthread_cond_signal()` function without
holding the right lock.

This commit also adds an assertion right before two calls to
`pthread_cond_broadcast()` where it's less obvious from the surrounding
code that the mutex is held. This documents the code and may be useful
for future debugging.
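
The locking pattern in question can be sketched as follows. `struct
ready_flag` and its helpers are hypothetical names, not code from the
session daemon; the sketch only shows the predicate being updated and the
condition variable being signaled while the associated mutex is held.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical flag protected by a mutex and paired with a condition. */
struct ready_flag {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool ready;
};

/* Set the flag and wake one waiter, all while holding the mutex. */
static void ready_flag_set(struct ready_flag *flag)
{
	assert(flag);
	pthread_mutex_lock(&flag->lock);
	flag->ready = true;
	pthread_cond_signal(&flag->cond);
	pthread_mutex_unlock(&flag->lock);
}

/* Block until the flag is set; the loop handles spurious wake-ups. */
static void ready_flag_wait(struct ready_flag *flag)
{
	assert(flag);
	pthread_mutex_lock(&flag->lock);
	while (!flag->ready) {
		pthread_cond_wait(&flag->cond, &flag->lock);
	}
	pthread_mutex_unlock(&flag->lock);
}
```

Because the waiter checks the predicate under the same mutex, a wake-up
cannot slip in between its check and its call to pthread_cond_wait().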

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iebf5a8b2e4251bd1ff4cd462e548cd3486c6cb75

5 months agoCleanup: use `modprobe --remove` rather than `rmmod`
Francis Deslauriers [Tue, 15 Sep 2020 16:10:18 +0000 (12:10 -0400)] 
Cleanup: use `modprobe --remove` rather than `rmmod`

Background
==========
According to the rmmod(8) man page:
  rmmod is a trivial program to remove a module (when module unloading
  support is provided) from the kernel. Most users will want to use
  modprobe(8) with the -r option instead.

`rmmod` simply unloads the provided module and decrements the refcount
of the modules it depended on but doesn't unload those dependencies if
their refcount is zero.

Issue
=====
With the following scenario we can end up with modules with a zero
refcount still loaded in the kernel:
  modprobe lttng-test
  lttng-sessiond
  ... (test case) ...
  ctrl+c sessiond
  rmmod lttng-test

When we tear down the lttng-sessiond, some modules are kept in the kernel
because the `lttng-test` module depends on them. So unloading
`lttng-test` using `rmmod` keeps those dependencies in the kernel.

Solution
========
Use `modprobe --remove` to unload modules and their now unused
dependencies.

From the modprobe(8) man page:
  -r, --remove
     This option causes modprobe to remove rather than insert a module.
     If the modules it depends on are also unused, modprobe will try to
     remove them too. Unlike insertion, more than one module can be
     specified on the command line

Note
====
This commit also replaces existing uses of `modprobe -r` with `modprobe
--remove` for consistency.

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7be83a645097e1eddd478cfbb717906b971f04ea

5 months agotrigger: consider domain on register and unregister
Jonathan Rajotte [Mon, 10 Feb 2020 01:33:55 +0000 (20:33 -0500)] 
trigger: consider domain on register and unregister

This allows the sessiond to inform the client if a trigger that requires a
particular domain (event rule based condition, for example) is at all
valid.

This is useful to fail early when a trigger being registered requires an
unavailable tracer.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I660937e64b294f6239ba15faeef705438a93a41a

5 months agotrigger: lttng_trigger_get_underlying_domain_type_restriction
Jonathan Rajotte [Wed, 25 Mar 2020 14:41:17 +0000 (10:41 -0400)] 
trigger: lttng_trigger_get_underlying_domain_type_restriction

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5fe156a09e4e4c833f84a0fe9027c838b73fe728

5 months agoaction-executor: missing include of internal event-rule header
Jonathan Rajotte [Thu, 24 Sep 2020 19:36:43 +0000 (15:36 -0400)] 
action-executor: missing include of internal event-rule header

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If541bb203f1d851750ee485fe9bd1a12d9963774

5 months agoTests: unit: lttng_condition_event_rule
Jonathan Rajotte [Wed, 4 Dec 2019 19:30:38 +0000 (14:30 -0500)] 
Tests: unit: lttng_condition_event_rule

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I281df3b2267d6ddf3b0591d181b7f276802d8963

5 months agocondition: implement event rule based condition
Jonathan Rajotte [Tue, 3 Dec 2019 20:57:08 +0000 (15:57 -0500)] 
condition: implement event rule based condition

An event rule condition is met when a tracer hits an event matching the
associated event rule.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I550903c231d83cb3852e8ef8aee2abafe9069b10

5 months agoMove conditions source files to src/common/conditions directory
Jonathan Rajotte [Tue, 3 Dec 2019 21:07:34 +0000 (16:07 -0500)] 
Move conditions source files to src/common/conditions directory

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I43165eacf82a1bf537e7187313664e32ca2833a9

5 months agotrigger: implement listing of registered trigger
Jonathan Rajotte [Thu, 23 Jan 2020 19:13:11 +0000 (14:13 -0500)] 
trigger: implement listing of registered trigger

Each client has visibility over triggers matching its user id (uid).

The root user has visibility over all registered triggers.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3e5ae75939214ed85c376bea12f1e4b307d78976

5 months agoApply policy on channel sampling
Jonathan Rajotte [Tue, 4 Feb 2020 20:14:34 +0000 (15:14 -0500)] 
Apply policy on channel sampling

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id755b73c1f976a5a7d7a188656a1de21bd703143

5 months agotrigger: introduce firing policies
Jonathan Rajotte [Thu, 23 Jan 2020 19:14:14 +0000 (14:14 -0500)] 
trigger: introduce firing policies

A firing policy controls the rate of firing of a trigger.

Two firing policy modes are implemented:
    LTTNG_TRIGGER_FIRING_POLICY_FIRE_EVERY_N
       The trigger's actions are executed every N times the
       condition occurs.
    LTTNG_TRIGGER_FIRING_POLICY_ONCE_AFTER_N
       The trigger's actions are executed once the condition was met N
       times.

Firing policies will be moved to the specific `action` objects
in a follow-up commit as not all actions can implement the firing
policies.
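
The two modes can be sketched as a small counter-based check. This is
only an illustration under assumptions: the enum, the struct and
`should_fire()` are hypothetical names, not the lttng-tools
implementation, and the handling of N = 0 is a guess.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical names, for illustration only (not the lttng-tools API). */
enum firing_policy {
	POLICY_FIRE_EVERY_N,
	POLICY_ONCE_AFTER_N,
};

struct firing_state {
	enum firing_policy policy;
	uint64_t threshold;   /* N */
	uint64_t occurrences; /* Times the condition was met so far. */
};

/* Record one occurrence of the condition; return true if actions run. */
static bool should_fire(struct firing_state *state)
{
	assert(state);
	state->occurrences++;

	switch (state->policy) {
	case POLICY_FIRE_EVERY_N:
		/* Assumption: N == 0 never fires. */
		return state->threshold != 0 &&
				state->occurrences % state->threshold == 0;
	case POLICY_ONCE_AFTER_N:
		/* True exactly once, on the Nth occurrence. */
		return state->occurrences == state->threshold;
	}

	return false;
}
```

With N = 3 and nine occurrences of the condition, the first mode runs
the actions three times and the second mode runs them once.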

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifaeeaaec7b6f2bed57d0d5f4ed8546762ec02e8d

5 months agoFix: lttng-ctl: deserialize on orderly shutdown of sessiond
Francis Deslauriers [Mon, 16 Nov 2020 21:50:41 +0000 (16:50 -0500)] 
Fix: lttng-ctl: deserialize on orderly shutdown of sessiond

Issue
=====
The `recv_data_sessiond()` function may return zero if the socket peer
has performed an orderly shutdown. This happens if the session daemon is
killed while the client is blocked on the `recv_data_sessiond()` call.
Currently, when this happens, the client simply goes on to decode the
uninitialized reply buffer.

This bug was witnessed while developing the upcoming event-notifier
feature where complex objects are received from sessiond and attempts to
deserialize these objects resulted in segmentation faults.

Solution
========
Return -LTTNG_ERR_NO_SESSIOND when `recvmsg()` returns zero. This way,
the client can simply tell the user that the session daemon is no longer
available.
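
A minimal sketch of the idea, assuming a stream socket:
`recv_checked()` and `SKETCH_ERR_NO_PEER` are hypothetical stand-ins
for the real helper and for LTTNG_ERR_NO_SESSIOND. Only `recvmsg()`
and its return convention are taken as given.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

/* Hypothetical error code standing in for LTTNG_ERR_NO_SESSIOND. */
#define SKETCH_ERR_NO_PEER 1000

/*
 * Receive up to `len` bytes. Returns the byte count on success, a
 * negative errno value on failure, or -SKETCH_ERR_NO_PEER when the
 * peer performed an orderly shutdown.
 */
static ssize_t recv_checked(int sock, void *buf, size_t len)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };
	struct msghdr msg = { .msg_iov = &iov, .msg_iovlen = 1 };
	ssize_t ret;

	assert(buf);

	do {
		ret = recvmsg(sock, &msg, 0);
	} while (ret < 0 && errno == EINTR);

	if (ret < 0) {
		return -errno;
	}

	if (ret == 0 && len != 0) {
		/* Nothing was written to `buf`: do not decode it. */
		return -SKETCH_ERR_NO_PEER;
	}

	return ret;
}
```

The caller can then map the dedicated error to a "session daemon is no
longer available" message instead of parsing the reply buffer.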

Signed-off-by: Francis Deslauriers <francis.deslauriers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib2387526c4101e3bae706e38181bfeb25da26fa3

5 months agoFix: trigger: erroneous check for success of trigger creation
Jérémie Galarneau [Wed, 18 Nov 2020 22:12:45 +0000 (17:12 -0500)] 
Fix: trigger: erroneous check for success of trigger creation

6808ef55e added a check for `ret == 0` to determine if a trigger
could be created from a payload. The function returns >= 0 on
success, leading to crashes when a trigger is de-serialized.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Icd769dcb04f7637aa1877436e9a5570e7f20b63b

5 months agoFix: trigger: leak of trigger on failure to set name
Jérémie Galarneau [Wed, 18 Nov 2020 19:14:02 +0000 (14:14 -0500)] 
Fix: trigger: leak of trigger on failure to set name

lttng_trigger_create_from_payload() leaks its newly-created
trigger when it fails to set the trigger's name. Drop
the reference to the new trigger whenever the function fails.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9dbf91d404fd67e4b79f2af550f3768680d6d4ec

5 months agoClean-up: trigger: use condition and action put
Jérémie Galarneau [Wed, 18 Nov 2020 19:04:03 +0000 (14:04 -0500)] 
Clean-up: trigger: use condition and action put

Use the internal *_put() functions to discard condition and
action references rather than the public *_destroy() functions
as they may cause confusion.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Idfdfda3ea2289315408245074f7cc0de6541167a

5 months agoDocs: payload/buffer view: validate is missing an argument description
Jérémie Galarneau [Wed, 18 Nov 2020 16:55:24 +0000 (11:55 -0500)] 
Docs: payload/buffer view: validate is missing an argument description

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I3d575dcda6c3e6820e911ab3c4e28b18d29f045c

5 months agoFix: unchecked buffer size for communication header
Jérémie Galarneau [Sat, 14 Nov 2020 02:39:36 +0000 (21:39 -0500)] 
Fix: unchecked buffer size for communication header

A number of object de-serialization functions rely on a
fixed-size communication header to create an object from
a payload.

A large number of those functions assume that the initial
header fits in the provided buffer or payload view. Also,
the functions that do validate that the header fits do so
in different ways:
  - checking the view's size,
  - creating a new fixed-size view and checking the 'data' pointer.

To harmonize all of those checks, the following utils are added:
  - lttng_buffer_view_is_valid()
  - lttng_payload_view_is_valid()

These functions should be used whenever a fixed-size view is
created (not passing -1 as the length parameter).

The checks are added and/or harmonized to:
  - create a new 'header' view,
  - validate it with the corresponding *_is_valid() function,
  - initialize the header pointer using the header view.
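
The three steps above can be sketched as follows. The `struct view`
type, `view_from_view()`, `view_is_valid()` and `struct comm_header`
are simplified, hypothetical stand-ins; they are not the definitions
of the lttng_buffer_view or lttng_payload_view utilities.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view type (not the lttng-tools one). */
struct view {
	const char *data; /* NULL when the requested range did not fit. */
	size_t size;
};

/* Hypothetical fixed-size communication header. */
struct comm_header {
	uint32_t name_len;
	uint32_t payload_len;
};

/* Create a fixed-size sub-view; invalid if it does not fit in `src`. */
static struct view view_from_view(const struct view *src, size_t offset,
		size_t len)
{
	struct view v = { NULL, 0 };

	assert(src);
	if (src->data && offset <= src->size && len <= src->size - offset) {
		v.data = src->data + offset;
		v.size = len;
	}

	return v;
}

static bool view_is_valid(const struct view *v)
{
	return v->data != NULL && v->size != 0;
}

/* Returns the header, or NULL if `src` is too small to contain one. */
static const struct comm_header *get_header(const struct view *src)
{
	/* 1. Create a new 'header' view. */
	const struct view header_view =
			view_from_view(src, 0, sizeof(struct comm_header));

	/* 2. Validate it. */
	if (!view_is_valid(&header_view)) {
		return NULL;
	}

	/* 3. Initialize the header pointer using the header view. */
	return (const struct comm_header *) header_view.data;
}
```

The cast assumes the underlying buffer is suitably aligned for the
header type.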

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I763946feac714ecef4fc5bd427dab2d3fe5dc1a4

5 months agorelayd: logging of `trace chunk exists` command refers to the wrong command
Jérémie Galarneau [Mon, 16 Nov 2020 21:10:09 +0000 (16:10 -0500)] 
relayd: logging of `trace chunk exists` command refers to the wrong command

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6e2bf4eee379f4e1d42333779dfeaf8f087d8217

5 months agotrigger: lttng_triggers: implement a container for multiple triggers
Jonathan Rajotte [Tue, 21 Jan 2020 19:22:37 +0000 (14:22 -0500)] 
trigger: lttng_triggers: implement a container for multiple triggers

This container is exposed for the listing of triggers.

We also plan on using it internally in the sessiond for inter-thread
communication.

The current implementation is backed by a lttng_dynamic_pointer_array.

The caller of lttng_triggers_add is responsible for managing ownership via
ref-counting of the lttng_trigger object.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib541027a6d7d856daa746de5aa49f0002bbe036f

5 months agoaction-executor: evaluated object credentials are optional
Jonathan Rajotte [Wed, 23 Sep 2020 20:13:37 +0000 (16:13 -0400)] 
action-executor: evaluated object credentials are optional

Use the is_set member instead of the LTTNG_OPTIONAL_GET_PTR macro
which asserts whenever an optional member is unset.
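
A sketch of the difference between the two access styles. The
`OPTIONAL()` and `OPTIONAL_GET()` macros and the `credentials` struct
below are hypothetical stand-ins written for this example; they only
mimic the behaviour described above (an asserting accessor next to an
`is_set` member).

```c
#include <assert.h>
#include <stdbool.h>
#include <sys/types.h>

/* Hypothetical stand-ins for the LTTNG_OPTIONAL helpers. */
#define OPTIONAL(type) struct { bool is_set; type value; }
#define OPTIONAL_GET(opt) (assert((opt).is_set), (opt).value)

struct credentials {
	OPTIONAL(uid_t) uid;
	OPTIONAL(gid_t) gid;
};

/* Returns true and writes `*uid` only when a uid was provided. */
static bool credentials_get_uid(const struct credentials *creds, uid_t *uid)
{
	assert(creds && uid);

	if (!creds->uid.is_set) {
		/* An unset uid is a valid state here, so no assertion. */
		return false;
	}

	*uid = OPTIONAL_GET(creds->uid);
	return true;
}
```

Checking `is_set` first keeps the asserting accessor for the cases
where an unset value would really be a programming error.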

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia00e4a7f5f9b8198061a742bf6bd750c721908cf

5 months agotrigger: generate and add tracer token on registration
Jonathan Rajotte [Wed, 9 Sep 2020 21:16:53 +0000 (17:16 -0400)] 
trigger: generate and add tracer token on registration

Assign a unique tracer token to a trigger.

This token will be used as the unique id that will be communicated back
to the sessiond by the tracers for tracer notification.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I2033dcaa4c5536b29dd4d7c57933e1aa686082cd

5 months agoaction-executor: add trigger name to debugging output
Jonathan Rajotte [Thu, 24 Sep 2020 19:14:47 +0000 (15:14 -0400)] 
action-executor: add trigger name to debugging output

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I73f8fda4b7fee331700988ea73471e3cb1516ed6

5 months agotrigger: implement trigger naming
Jonathan Rajotte [Mon, 23 Mar 2020 22:27:59 +0000 (18:27 -0400)] 
trigger: implement trigger naming

A trigger can now have an optional name on the client side.

If no name is provided the sessiond will generate a name and return a
trigger object to populate the client side object.

For now, the name generation code generates the following pattern: TN

Where `N` is incremented each time a name has to be generated. If a
collision occurs, we increment `N` as needed.
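
The scheme can be sketched like this. `name_exists()` and
`generate_name()` are hypothetical helpers, the list of existing names
is a plain array for the sake of the example, and starting the counter
at 0 is an assumption.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical lookup: is `name` already used by a registered trigger? */
static bool name_exists(const char *const *names, size_t count,
		const char *name)
{
	for (size_t i = 0; i < count; i++) {
		if (strcmp(names[i], name) == 0) {
			return true;
		}
	}

	return false;
}

/*
 * Write the next free "T<N>" name to `out` and advance `*next_n`.
 * Returns 0 on success, -1 if `out` is too small.
 */
static int generate_name(const char *const *names, size_t count,
		uint64_t *next_n, char *out, size_t out_len)
{
	assert(next_n && out);

	do {
		const int ret = snprintf(out, out_len, "T%" PRIu64,
				(*next_n)++);

		if (ret < 0 || (size_t) ret >= out_len) {
			return -1;
		}
	} while (name_exists(names, count, out)); /* Collision: try N + 1. */

	return 0;
}
```

If "T0" and "T1" are already taken by user-named triggers, the first
generated name is "T2".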

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5f303610713c049177e53937bfc9824cd61501e4

5 months agoport: run namespace tests only on Linux
Michael Jeanson [Tue, 13 Oct 2020 23:19:10 +0000 (19:19 -0400)] 
port: run namespace tests only on Linux

Change-Id: I574d6e7419715e191fb9102e4cfc916ea0e529aa
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

5 months agoport: FreeBSD does support fchown and fchmod on a shm fd
Michael Jeanson [Wed, 4 Nov 2020 15:04:12 +0000 (10:04 -0500)] 
port: FreeBSD does support fchown and fchmod on a shm fd

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Iadff886d593ae3f77a4e96dfbfe02d1c1ea45f1e

5 months agoport: Add pthread_setname_np FreeBSD compat
Michael Jeanson [Thu, 29 Oct 2020 10:09:41 +0000 (06:09 -0400)] 
port: Add pthread_setname_np FreeBSD compat

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I7ca8334c4ce28bc240c898aeb5a6857ff951143b

5 months agoport: only enable userspace callstack context on Linux
Michael Jeanson [Wed, 14 Oct 2020 14:32:14 +0000 (10:32 -0400)] 
port: only enable userspace callstack context on Linux

Change-Id: I55402a7058f7d0bbe11d4c59197197130fe88665
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

5 months agotrigger: implement is_equal
Jonathan Rajotte [Tue, 24 Mar 2020 15:32:08 +0000 (11:32 -0400)] 
trigger: implement is_equal

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I646c13e7fb26fda66b888ce90253e87567b2cab8

5 months agotrigger: expose trigger owner uid
Jonathan Rajotte [Fri, 18 Sep 2020 20:37:50 +0000 (16:37 -0400)] 
trigger: expose trigger owner uid

To facilitate behavior management for the root user and to allow
duplicate trigger names across users, enforce the usage of the trigger
owner user id.

The root user will be able to register and unregister triggers on behalf
of other users. The root user will also have visibility on triggers of
other users.

Only the root user can use the `lttng_trigger_set_owner_uid` function
successfully. As indicated in the comments, this function performs
a client-side validation step to catch misuses, but this is
properly enforced on the sessiond's end in the register/unregister
trigger commands.

With the future addition of a trigger name (id), the owner id and the
name will act as a key tuple allowing identically named triggers across
users.

We plan on exposing the `--user` switch in the upcoming command line
(add-trigger, remove-trigger).

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ifca3c41b7ffd97b67e16fb80c18472b667cb2f56

6 months agoClean-up: action-executor: typo and missing tab
Jonathan Rajotte [Fri, 22 May 2020 15:27:37 +0000 (11:27 -0400)] 
Clean-up: action-executor: typo and missing tab

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5646fae2db98a3c8ffa0ba92aff9815f4ce0cf53

6 months agoTests: Fix: 99% fill ratio for high buffer usage is too high for larger events
Jonathan Rajotte [Thu, 28 May 2020 01:29:05 +0000 (21:29 -0400)] 
Tests: Fix: 99% fill ratio for high buffer usage is too high for larger events

If the event being registered is bigger than 1% of a subbuffer, the 99%
ratio cannot be achieved since the "last event" necessary to go over 99%
will always be dropped by the tracer.

e.g:
  DBG1 - 19:31:07.665963875 [Notification]: [notification-thread] High buffer usage condition being evaluated: threshold = 16220, highest usage = 16196 (in evaluate_buffer_usage_condition() at notification-thread-events.c:3733)

We use a ratio of 90% to keep a little headroom.
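
The arithmetic can be checked with a simplified model. Everything in
it is an assumption: events have a fixed size, packet headers are
ignored, and the 16384-byte capacity is only a guess that happens to
give the 16220 threshold seen in the log above at 99%.
`threshold_reachable()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Simplified model (assumption): fixed-size events, no packet headers.
 * Returns true if buffer usage can reach `percent` of `capacity` bytes
 * before the next event no longer fits and is dropped.
 */
static bool threshold_reachable(size_t capacity, size_t event_size,
		unsigned int percent)
{
	assert(event_size > 0 && event_size <= capacity && percent <= 100);

	/* Usage level that triggers the condition. */
	const size_t threshold = capacity * percent / 100;
	/* Highest usage reachable with whole events only. */
	const size_t max_usage = (capacity / event_size) * event_size;

	return max_usage >= threshold;
}
```

In this model, 200-byte events in a 16384-byte buffer top out at 16200
bytes, which is below the 99% threshold (16220) but above the 90%
threshold (14745).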

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I06180735e0b5e88209b888e51cc83b4ac7d98193

6 months agoFix: action: invalid header offset used when serializing snapshot action
Jonathan Rajotte [Wed, 8 Jul 2020 02:51:27 +0000 (22:51 -0400)] 
Fix: action: invalid header offset used when serializing snapshot action

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I77f5fab214f6721773147968ea3b85dddfea8d62

6 months agoport: FreeBSD has no ENODATA, alias it to ENOATTR
Michael Jeanson [Tue, 13 Oct 2020 21:33:44 +0000 (17:33 -0400)] 
port: FreeBSD has no ENODATA, alias it to ENOATTR

According to 'the internet' ENOATTR is used in a similar fashion to
ENODATA on the BSDs and we used it internally only anyway.

Change-Id: Ia4e77fd6d28c9dfb43f99ddba6c32369384827f0
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

6 months agoport: tests: /proc/self/fd is Linux only, use /dev/fd on other Unices
Michael Jeanson [Tue, 20 Oct 2020 19:02:45 +0000 (15:02 -0400)] 
port: tests: /proc/self/fd is Linux only, use /dev/fd on other Unices

Change-Id: I2be8120c7dce3f12daaf12a190810a145afa50b6
Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

6 months agoCleanup: Use pkg-config to detect liburcu
Michael Jeanson [Fri, 30 Oct 2020 06:48:08 +0000 (02:48 -0400)] 
Cleanup: Use pkg-config to detect liburcu

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I88d3f853c8ee0e14a38a462ce24626800e0a4caf

6 months agoClean-up: sessiond: silence negative index warning
Jérémie Galarneau [Thu, 29 Oct 2020 15:43:35 +0000 (11:43 -0400)] 
Clean-up: sessiond: silence negative index warning

Coverity warns that `lttng_action_get_type()` can return
a negative index (LTTNG_ACTION_TYPE_UNKNOWN). This scenario
is not reachable, but a check is added to silence the analyzer.

Original report:
  1435955 Negative array index read

  A memory location at a negative offset from the beginning of the array
  will be read, resulting in incorrect values.

  In get_action_name: Negative value used to index an array in a read
  operation (CWE-129)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I5952096a1d29f0d4a3c4350a2a842874d5f3973b

6 months agocredentials: uid and gid now use LTTNG_OPTIONAL
Jonathan Rajotte [Fri, 25 Sep 2020 20:35:28 +0000 (16:35 -0400)] 
credentials: uid and gid now use LTTNG_OPTIONAL

The triggers will only use the uid element.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia96e7def5ab560d9af1476920426635fc49f92ef

6 months agoport: Add missing sock_cred macros on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 23:06:09 +0000 (19:06 -0400)] 
port: Add missing sock_cred macros on FreeBSD

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I71f51ef61bf659c758edba6fd27faeef56654acf

6 months agoport: use compat lttng_fls()
Michael Jeanson [Tue, 13 Oct 2020 22:55:23 +0000 (18:55 -0400)] 
port: use compat lttng_fls()

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I698b31a24c5b442a00fe570a0ac53e23bb817bec

6 months agoport: FreeBSD has no LOGIN_NAME_MAX, use sysconf instead
Michael Jeanson [Tue, 13 Oct 2020 22:44:40 +0000 (18:44 -0400)] 
port: FreeBSD has no LOGIN_NAME_MAX, use sysconf instead
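
A minimal sketch of a runtime query through `sysconf()`. The helper
name and the fallback value are assumptions made for this example, not
what the commit uses.

```c
#include <unistd.h>

/* Arbitrary fallback used when the limit is indeterminate (assumption). */
#define FALLBACK_LOGIN_NAME_MAX 256

/* Returns the maximum login name length, including the terminator. */
static long get_login_name_max(void)
{
	const long max = sysconf(_SC_LOGIN_NAME_MAX);

	/* sysconf() returns -1 on error or when there is no fixed limit. */
	return max > 0 ? max : FALLBACK_LOGIN_NAME_MAX;
}
```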

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Id058e15608ce0332500343ce389365a6fb1a40cc

6 months agoport: no eventfd support on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 22:44:19 +0000 (18:44 -0400)] 
port: no eventfd support on FreeBSD

It's only used in the tests to create dummy fds; use fcntl to duplicate
the stdout fd instead.
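
One way to do this is `fcntl()` with `F_DUPFD`; the exact call used
by the tests is not shown here, so treat this as a sketch.
`make_dummy_fd()` is a hypothetical helper name.

```c
#include <fcntl.h>
#include <unistd.h>

/* Returns a new fd referring to stdout, or -1 on error (errno is set). */
static int make_dummy_fd(void)
{
	/* Lowest available descriptor greater than or equal to 0. */
	return fcntl(STDOUT_FILENO, F_DUPFD, 0);
}
```

The caller owns the returned descriptor and must `close()` it.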

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I401f2bfe6a2375a9bf4d895956071f74e5684783

6 months agooptional: Add LTTNG_OPTIONAL_INIT_VALUE
Jérémie Galarneau [Fri, 2 Oct 2020 21:25:11 +0000 (17:25 -0400)] 
optional: Add LTTNG_OPTIONAL_INIT_VALUE

Add helper to initialize an optional field to a 'set' value.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I439302ebec2433abcf7edb6167bf5b02db5a9a55

6 months agoaction: Mark parameter of lttng_action_get_type as const
Jonathan Rajotte [Wed, 23 Sep 2020 18:34:59 +0000 (14:34 -0400)] 
action: Mark parameter of lttng_action_get_type as const

Remove lttng_action_get_type_const as it is no longer needed.

Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I1525bc2c89eb37ab3e75d915c6ff50bd2a7f5d21

6 months agoIntroduce lttng_domain_type_str utility
Jonathan Rajotte [Tue, 29 Sep 2020 15:46:24 +0000 (11:46 -0400)] 
Introduce lttng_domain_type_str utility

Change-Id: I1d2c7be968da6658e93407cdba26a6042177badd
Signed-off-by: Jonathan Rajotte <jonathan.rajotte-julien@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>

6 months agoport: no HOST_NAME_MAX on FreeBSD, use LTTNG_HOST_NAME_MAX
Michael Jeanson [Tue, 13 Oct 2020 21:54:27 +0000 (17:54 -0400)] 
port: no HOST_NAME_MAX on FreeBSD, use LTTNG_HOST_NAME_MAX

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I83cc40a123539a668c25828144905b628df9fdef

6 months agoport: ELF_ST_TYPE is defined in elf.h on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 21:54:04 +0000 (17:54 -0400)] 
port: ELF_ST_TYPE is defined in elf.h on FreeBSD

No need to alias ELF32_ST_TYPE to ELF_ST_TYPE.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8afa2fb9d96b81d994b90c8291f2f457a037a525

6 months agoport: posix_fadvise is available in FreeBSD >= 10.0
Michael Jeanson [Tue, 13 Oct 2020 21:32:14 +0000 (17:32 -0400)] 
port: posix_fadvise is available in FreeBSD >= 10.0

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I85f823ad7be94a5860ce0104c20e5a49ce030eda

6 months agoport: fix compat/endian.h on FreeBSD
Michael Jeanson [Tue, 13 Oct 2020 21:32:00 +0000 (17:32 -0400)] 
port: fix compat/endian.h on FreeBSD

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: If591ed8d1cf50c1914a613976e9e285c3647906c

6 months agoport: ls --ignore= is a GNU extension
Michael Jeanson [Wed, 14 Oct 2020 18:32:37 +0000 (14:32 -0400)] 
port: ls --ignore= is a GNU extension

Use grep -v instead to filter README.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8fb6aba97ba1484aff511d59eeb4584e9672659e

6 months agoTests: poll: test all possible combinations of active fds in a poll set
Jérémie Galarneau [Tue, 27 Oct 2020 21:23:45 +0000 (17:23 -0400)] 
Tests: poll: test all possible combinations of active fds in a poll set

The poll compatibility layer used on all non-Linux platforms would
hang for certain combinations of active file descriptors reported
by poll.

A new test is introduced to try all combinations of active file
descriptors for a given number of file descriptors in a poll set.

The unit test tries all combinations of 8 file descriptors which
exercises all the current compatibility code and ensures the
test concludes rapidly.
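
Enumerating every combination amounts to walking all 8-bit masks, one
bit per file descriptor. The sketch below only shows that enumeration;
`FD_COUNT`, `COMBINATION_COUNT` and `pattern_from_mask()` are
hypothetical names, not taken from the unit test.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FD_COUNT 8
#define COMBINATION_COUNT (1U << FD_COUNT) /* 2^8 = 256 patterns */

/* Bit i of `mask` decides whether fd number i is reported as active. */
static void pattern_from_mask(unsigned int mask, bool active[FD_COUNT])
{
	assert(active);
	assert(mask < COMBINATION_COUNT);

	for (size_t i = 0; i < FD_COUNT; i++) {
		active[i] = (mask >> i) & 1U;
	}
}
```

A test loop would call `pattern_from_mask()` for each mask from 0 to
`COMBINATION_COUNT - 1`, make the selected fds readable, and check that
the poll wrapper reports exactly those.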

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie479c4f2d85917713d3f2bdc1e4f0423ca9243af

6 months agoFix: common: poll: compat_poll_wait never finishes
Jérémie Galarneau [Fri, 16 Oct 2020 18:43:39 +0000 (14:43 -0400)] 
Fix: common: poll: compat_poll_wait never finishes

compat_poll_wait hangs when poll returns an array of file
descriptors of the form:
  [ Inactive Active ]

The logic to find the first idle pollfd entry is bogus and actually
skips the first idle entry. This causes the follow-up loop to never
conclude.

The pollfd array defragmentation logic is re-written in a simpler
style to handle those cases appropriately.
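
A simple single-pass compaction handles the `[ Inactive Active ]` case
without a separate "find the first idle entry" step. This is a sketch
of the general technique, not the code from this commit;
`compact_active_pollfds()` is a hypothetical name, and it overwrites
idle entries, so it is meant to run on a working copy of the array.

```c
#include <assert.h>
#include <poll.h>
#include <stddef.h>

/*
 * Move every entry with a non-zero `revents` to the front of `fds`,
 * preserving their order. Returns the number of active entries.
 */
static size_t compact_active_pollfds(struct pollfd *fds, size_t count)
{
	size_t active = 0;

	assert(fds || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (fds[i].revents == 0) {
			continue; /* Idle entry: skip it. */
		}

		if (i != active) {
			fds[active] = fds[i];
		}

		active++;
	}

	return active;
}
```

Because the loop index always advances, the pass terminates for any
mix of active and idle entries.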

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I8669a870df1ec1160f05e35e83671917bb80d6f9

6 months agoTests: Add syscall enable/disable scenarios
Michael Jeanson [Thu, 3 Sep 2020 15:04:46 +0000 (11:04 -0400)] 
Tests: Add syscall enable/disable scenarios

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ic3d9e739b01a0cf2bffb7c103911b3b51520010e

6 months agoCleanup: simplify 'poll' wrapper build
Michael Jeanson [Wed, 26 Aug 2020 15:39:15 +0000 (11:39 -0400)] 
Cleanup: simplify 'poll' wrapper build

Remove the AM conditional and merge the sources in single files like the
other wrappers. This removes a special case from the build system.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I6078f6013e52c3bc7c74cb8937f3741453c65874

6 months agoCleanup: autoconf 'dirfd' detection
Michael Jeanson [Wed, 26 Aug 2020 15:17:10 +0000 (11:17 -0400)] 
Cleanup: autoconf 'dirfd' detection

Remove the unused AM conditional and use the 'HAVE_' prefix for the
define like the other detected features.

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I9a001051a14e2360e7f66fd4f627f97b11563c4f

6 months agoSet version to 2.13-pre
Michael Jeanson [Tue, 6 Oct 2020 14:24:56 +0000 (10:24 -0400)] 
Set version to 2.13-pre

Signed-off-by: Michael Jeanson <mjeanson@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ib63daa19b91c4cd94caf4fd6cbdfd6fd1e8f015b

6 months agorelayd: silence null dereference warning during viewer stream creation
Jérémie Galarneau [Fri, 16 Oct 2020 12:25:10 +0000 (08:25 -0400)] 
relayd: silence null dereference warning during viewer stream creation

Coverity warns that the vstream's trace chunk may be used while NULL.
However, this won't happen if the corresponding relay stream has
an active trace chunk.

Coverity report:
  1433620 Dereference after null check

  Either the check against null is unnecessary, or there may be a
  null pointer dereference.

  In viewer_stream_create: Pointer is checked against null but then
  dereferenced anyway (CWE-476)

Reported-by: Coverity Scan
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie032ed415a99cfff149e3325d05f37ededb52d33

6 months agoFix: relayd: failure to read index entry or stream packet after clear
Jérémie Galarneau [Wed, 7 Oct 2020 18:10:35 +0000 (14:10 -0400)] 
Fix: relayd: failure to read index entry or stream packet after clear

Observed issue
==============

The clear tests occasionally fail with the following babeltrace error
when a live session is stopped following a "clear". Unfortunately, this
problem only seems to occur on certain machines. In my case, I only
managed to reproduce this on the CI's workers.

  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_get_stream_bytes@viewer-connection.c:1610 [lttng-live] Received get_data_packet response: error
  10-07 12:39:48.333  7679  7679 E PLUGIN/CTF/MSG-ITER request_medium_bytes@msg-iter.c:563 [lttng-live] User function failed: status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/CTF/MSG-ITER ctf_msg_iter_get_next_message@msg-iter.c:2899 [lttng-live] Cannot handle state: msg-it-addr=0x5603c28e2830, state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_iterator_next_handle_one_active_data_stream@lttng-live.c:845 [lttng-live] CTF message iterator failed to get next message: msg-iter=0x5603c28e2830, msg-iter-status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE lttng_live_msg_iter_next@lttng-live.c:1665 [lttng-live] Error preparing the next batch of messages: live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  10-07 12:39:48.333  7679  7679 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0x5603c28cb0f0, iter-upstream-comp-name="lttng-live", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE, iter-upstream-comp-class-name="lttng-live", iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER muxer_upstream_msg_iter_next@muxer.c:454 [muxer] Upstream iterator's next method returned an error: status=ERROR
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER validate_muxer_upstream_msg_iters@muxer.c:991 [muxer] Cannot validate muxer's upstream message iterator wrapper: muxer-msg-iter-addr=0x5603c28dbe70, muxer-upstream-msg-iter-wrap-addr=0x5603c28cd0f0
  10-07 12:39:48.333  7679  7679 E PLUGIN/FLT.UTILS.MUXER muxer_msg_iter_next@muxer.c:1415 [muxer] Cannot get next message: comp-addr=0x5603c28dc960, muxer-comp-addr=0x5603c28db0a0, muxer-msg-iter-addr=0x5603c28dbe70, msg-iter-addr=0x5603c28caf80, status=ERROR
  10-07 12:39:48.333  7679  7679 W LIB/MSG-ITER bt_message_iterator_next@iterator.c:864 Component input port message iterator's "next" method failed: iter-addr=0x5603c28caf80, iter-upstream-comp-name="muxer", iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER, iter-upstream-comp-class-name="muxer", iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu", iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  10-07 12:39:48.333  7679  7679 W LIB/GRAPH consume_graph_sink@graph.c:473 Component's "consume" method failed: status=ERROR, comp-addr=0x5603c28dcb60, comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK, comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x5603c28c8140, comp-class-so-handle-path="/home/jenkins/jgalar-debug/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so", comp-input-port-count=1, comp-output-port-count=0
  10-07 12:39:48.333  7679  7679 E CLI cmd_run@babeltrace2.c:2548 Graph failed to complete successfully
  10-07 12:39:48.333  7679  7679 E PLUGIN/SRC.CTF.LTTNG-LIVE/VIEWER lttng_live_session_detach@viewer-connection.c:1227 [lttng-live] Unknown detach return code 0

  ERROR:    [Babeltrace CLI] (babeltrace2.c:2548)
    Graph failed to complete successfully
  CAUSED BY [libbabeltrace2] (graph.c:473)
    Component's "consume" method failed: status=ERROR, comp-addr=0x5603c28dcb60,
    comp-name="pretty", comp-log-level=WARNING, comp-class-type=SINK,
    comp-class-name="pretty", comp-class-partial-descr="Pretty-print messages
    (`text` fo", comp-class-is-frozen=0, comp-class-so-handle-addr=0x5603c28c8140,
    comp-class-so-handle-path="/home/jenkins/jgalar-debug/build/usr/lib/babeltrace2/plugins/babeltrace-plugin-text.so",
    comp-input-port-count=1, comp-output-port-count=0
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x5603c28caf80, iter-upstream-comp-name="muxer",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=FILTER,
    iter-upstream-comp-class-name="muxer",
    iter-upstream-comp-class-partial-descr="Sort messages from multiple inpu",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:991)
    Cannot validate muxer's upstream message iterator wrapper:
    muxer-msg-iter-addr=0x5603c28dbe70,
    muxer-upstream-msg-iter-wrap-addr=0x5603c28cd0f0
  CAUSED BY [muxer: 'filter.utils.muxer'] (muxer.c:454)
    Upstream iterator's next method returned an error: status=ERROR
  CAUSED BY [libbabeltrace2] (iterator.c:864)
    Component input port message iterator's "next" method failed:
    iter-addr=0x5603c28cb0f0, iter-upstream-comp-name="lttng-live",
    iter-upstream-comp-log-level=WARNING, iter-upstream-comp-class-type=SOURCE,
    iter-upstream-comp-class-name="lttng-live",
    iter-upstream-comp-class-partial-descr="Connect to an LTTng relay daemon",
    iter-upstream-port-type=OUTPUT, iter-upstream-port-name="out", status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:1665)
    Error preparing the next batch of messages:
    live-iter-status=LTTNG_LIVE_ITERATOR_STATUS_ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (lttng-live.c:845)
    CTF message iterator failed to get next message: msg-iter=0x5603c28e2830,
    msg-iter-status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (msg-iter.c:2899)
    Cannot handle state: msg-it-addr=0x5603c28e2830,
    state=DSCOPE_TRACE_PACKET_HEADER_BEGIN
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (msg-iter.c:563)
    User function failed: status=ERROR
  CAUSED BY [lttng-live: 'source.ctf.lttng-live'] (viewer-connection.c:1610)
    Received get_data_packet response: error

This occurs immediately following a 'stop' on the session. As the error
indicates, a request to obtain a data packet fails with a generic
error reply.

Moreover, the following LTTNG_VIEWER_DETACH_SESSION appears to fail
with an invalid status code. This is addressed in a different commit.

Reproducing the test's failure without redirecting the relay daemon's
output allows us to see the following errors after the first stop:
  PERROR - 14:33:44.929675253 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.030037417 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.130429370 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.230829447 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)
  PERROR - 14:33:45.331223320 [25108/25115]: Failed to open fs handle to ust/uid/1001/64-bit/index/chan_0.idx, open() returned: No such file or directory (in fd_tracker_open_fs_handle() at fd-tracker.c:550)

This is produced with the following back-trace:
  (gdb) bt
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007ffff69648b1 in __GI_abort () at abort.c:79
  #2  0x00005555555b4f1f in fd_tracker_open_fs_handle (tracker=0x55555582c620, directory=0x7fffe8006680,
      path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx", flags=0, mode=0x7ffff0a24508) at fd-tracker.c:550
  #3  0x0000555555595c34 in _lttng_trace_chunk_open_fs_handle_locked (chunk=0x7fffe0002130, file_path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx",
      flags=0, mode=432, out_handle=0x7ffff0a24710, expect_no_file=true) at trace-chunk.c:1388
  #4  0x0000555555595eef in lttng_trace_chunk_open_fs_handle (chunk=0x7fffe0002130, file_path=0x7ffff0a25870 "ust/uid/1001/64-bit/index/chan_1.idx", flags=0,
      mode=432, out_handle=0x7ffff0a24710, expect_no_file=true) at trace-chunk.c:1433
  #5  0x00005555555da6c2 in _lttng_index_file_create_from_trace_chunk (chunk=0x7fffe0002130, channel_path=0x7fffe8018c30 "ust/uid/1001/64-bit",
      stream_name=0x7fffe8018c10 "chan_1", stream_file_size=0, stream_file_index=0, index_major=1, index_minor=1, unlink_existing_file=false, flags=0,
      expect_no_file=true, file=0x7fffe0002270) at index.c:97
  #6  0x00005555555dad8a in lttng_index_file_create_from_trace_chunk_read_only (chunk=0x7fffe0002130, channel_path=0x7fffe8018c30 "ust/uid/1001/64-bit",
      stream_name=0x7fffe8018c10 "chan_1", stream_file_size=0, stream_file_index=0, index_major=1, index_minor=1, expect_no_file=true, file=0x7fffe0002270)
      at index.c:186
  #7  0x000055555557640f in try_open_index (vstream=0x7fffe0002250, rstream=0x7fffe8018c50) at live.c:1378
  #8  0x0000555555577155 in viewer_get_next_index (conn=0x7fffd4001440) at live.c:1643
  #9  0x0000555555579a01 in process_control (recv_hdr=0x7ffff0a27c30, conn=0x7fffd4001440) at live.c:2311
  #10 0x000055555557a1db in thread_worker (data=0x0) at live.c:2482
  #11 0x00007ffff6d1c6db in start_thread (arg=0x7ffff0a28700) at pthread_create.c:463
  #12 0x00007ffff6a45a3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

That problem is mostly cosmetic in nature: the open can fail
"legitimately", so the PERROR should simply not be printed. It is
addressed in a different commit.

This error is also produced after a 'clear' is issued:
  PERROR - 14:33:45.532782268 [25108/25115]: Failed to read from file system handle of viewer stream id 1, offset: 4096: No such file or directory (in viewer_get_packet() at live.c:1849)

Which is produced with the following back-trace:
  #0  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
  #1  0x00007f53e297c8b1 in __GI_abort () at abort.c:79
  #2  0x000055dd77ccef2c in viewer_get_packet (conn=0x7f53c4001100) at live.c:1850
  #3  0x000055dd77cd0a15 in process_control (recv_hdr=0x7f53dca3fc30, conn=0x7f53c4001100) at live.c:2315
  #4  0x000055dd77cd11db in thread_worker (data=0x0) at live.c:2483
  #5  0x00007f53e2d346db in start_thread (arg=0x7f53dca40700) at pthread_create.c:463
  #6  0x00007f53e2a5da3f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

A similar problem occurs, although more rarely, when reading an
index entry in viewer_get_next_index().

Cause
=====

The following situation leads to both failures to get a
packet and failures to get the next index:
  - Viewer connects to an existing session,
  - Viewer consumes a number of packets, alternating the
    GET_NEXT_INDEX and GET_PACKET commands,
  - The session's streams are rotated to a new trace chunk
    (as part of a clear),
  - The session is started and stopped, causing new packets
    to be produced and received,
  - The session is stopped and destroyed, causing the session's
    streams to rotate into a "null" trace chunk (no active
    trace files),
  - Viewer issues GET_NEXT_INDEX or GET_PACKET, but the fact
    that a rotation occurred on the receiving end is not detected
    as the relay streams' trace chunks are "null".

The crux of the problem is that lttng_trace_chunk_ids_equal() is
bypassed when the current trace chunk of a relay stream is "null".

The rationale for skipping this check is that it is assumed that the
files currently opened by the live server can still be used even
if the consumer has rotated the corresponding streams into a 'null'
trace chunk, meaning no trace chunk is 'set' for those streams.

This makes sense in one scenario: the session was destroyed and we wish
to allow a connected live client to finish consuming the trace packets
up to the end of the session's lifetime.

Here, the situation is different. The viewer is reading chunk 'A'.
Meanwhile, a rotation occurs into chunk 'B' and packets are received for
chunk 'B'. Then, a rotation to a 'null' chunk (no active chunk) occurs.

In essence, the live server never sees the rotation between chunk 'A'
and 'B', and simply assumes that a rotation from 'A' to 'null' occurred,
as would happen at the end of a session.

In terms of the code, in viewer_get_next_index(), a call to
check_index_status() is performed to determine if an index is available.
The function checks that `index_received_seqcount` is greater than
`index_sent_seqcount`. In that case, it determines that an index must be
available.

Unfortunately, there is no way for the live server to determine that the
remaining indexes are in a chunk that doesn't exist anymore (chunk 'B').
Thus, viewer_get_next_index() attempts to read an index entry from the
current index file and fails.
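
The availability check described above can be sketched as follows. This
is a minimal illustration, not the relay daemon's code: the
`example_`-prefixed types and function are hypothetical, and attaching
each counter to the relay stream and the viewer stream respectively is
an assumption made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-ins for the relay and viewer streams. */
struct example_relay_stream {
	/* Number of index entries received from the consumer (assumed owner). */
	uint64_t index_received_seqcount;
};

struct example_viewer_stream {
	/* Number of index entries already sent to the viewer (assumed owner). */
	uint64_t index_sent_seqcount;
};

/*
 * Returns true when at least one received index has not been sent to the
 * viewer yet. Note that the comparison carries no information about which
 * trace chunk holds the pending index entries.
 */
static bool example_index_available(const struct example_relay_stream *rstream,
		const struct example_viewer_stream *vstream)
{
	assert(rstream && vstream);
	return rstream->index_received_seqcount > vstream->index_sent_seqcount;
}
```

Since the comparison only involves sequence counts, it keeps reporting
pending indexes even when they belong to a chunk whose files are gone,
which is the blind spot described in this section.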

Solution
========

1) lttng_trace_chunk_ids_equal() is modified to properly handle
'null' trace chunks:
  - A null and a non-null trace chunk are not equal,
  - Two null trace chunks are equal.

2) Rotation count
  A rotation counter is introduced to track the number of rotations
  that occurred during a relay stream's lifetime. This counter is
  sampled by the matching viewer streams on creation and on rotation
  and is used to determine if all rotations were "seen" by the viewer
  stream.

  Hence, this allows us to handle the special case where a viewer
  is consuming the contents of a relay stream that just transitioned
  into a 'null' trace chunk (see comments in patch).
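
Both points can be illustrated with a small sketch. It is not the code
of the patch: all `example_`-prefixed names are hypothetical, a NULL
pointer is assumed to stand for a 'null' trace chunk, chunks are reduced
to a bare id, and the "missed exactly one rotation into a null chunk"
rule is one plausible reading of the special case described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified trace chunk: it only carries an id. */
struct example_chunk {
	uint64_t id;
};

/* A NULL pointer stands for a 'null' trace chunk (no active chunk). */
static bool example_chunk_ids_equal(const struct example_chunk *a,
		const struct example_chunk *b)
{
	if (!a && !b) {
		/* Two null trace chunks are equal. */
		return true;
	}
	if (!a || !b) {
		/* A null and a non-null trace chunk are not equal. */
		return false;
	}
	return a->id == b->id;
}

struct example_relay_stream {
	const struct example_chunk *chunk;
	uint64_t rotation_count;
};

struct example_viewer_stream {
	const struct example_chunk *chunk;
	uint64_t last_seen_rotation_count;
};

/* Rotate the relay stream into new_chunk (NULL for a 'null' chunk). */
static void example_relay_stream_rotate(struct example_relay_stream *rstream,
		const struct example_chunk *new_chunk)
{
	assert(rstream);
	rstream->chunk = new_chunk;
	rstream->rotation_count++;
}

/* Sample the relay stream's state on viewer stream creation or rotation. */
static void example_viewer_stream_sample(struct example_viewer_stream *vstream,
		const struct example_relay_stream *rstream)
{
	assert(vstream && rstream);
	vstream->chunk = rstream->chunk;
	vstream->last_seen_rotation_count = rstream->rotation_count;
}

/*
 * The viewer may keep using its opened files if it has seen every rotation,
 * or if the only rotation it missed is the one into a 'null' chunk.
 */
static bool example_viewer_files_still_valid(
		const struct example_viewer_stream *vstream,
		const struct example_relay_stream *rstream)
{
	uint64_t missed;

	assert(vstream && rstream);
	missed = rstream->rotation_count - vstream->last_seen_rotation_count;
	if (missed == 0) {
		return true;
	}
	return missed == 1 && !rstream->chunk;
}
```

With this sketch, a viewer that sampled chunk 'A' remains valid after a
single rotation from 'A' to 'null', but not after 'A' to 'B' to 'null',
since two rotations were missed in the latter case.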

The rest of the modifications simply allow the live server to handle
null trace chunks in viewer streams. This fixes another unrelated bug
that I observed while investigating this: sessions that don't have an
active trace chunk are not shown when listing sessions with babeltrace.

To reproduce, simply stop, clear a session, and attempt to list the
sessions of the associated relay daemon.

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ibb3116990e34b7ec3b477f3482d0c0ff1e848d09

6 months agoFix: lttng-ctl: erroneous uses of LTTNG_PACKED
Jérémie Galarneau [Tue, 13 Oct 2020 18:55:33 +0000 (14:55 -0400)] 
Fix: lttng-ctl: erroneous uses of LTTNG_PACKED

The LTTNG_PACKED macro uses gcc attributes to indicate that a structure
should be packed. Hence, this macro obeys the same rules as the gcc
attribute.

Various mis-uses of the LTTNG_PACKED macro may result in structures not
being packed:
  - The LTTNG_PACKED macro should always be placed _before_ an identifier
    when a structure is declared in-place.
  - Adding LTTNG_PACKED at the definition site has no effect if the
    structure was declared elsewhere.
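
The first rule can be illustrated with the sketch below. It assumes a
gcc-compatible compiler in C11 mode; `EXAMPLE_PACKED` is a hypothetical
stand-in for LTTNG_PACKED, and the structures are made up for the
example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for LTTNG_PACKED (gcc-style attribute assumed). */
#define EXAMPLE_PACKED __attribute__((__packed__))

/*
 * Attribute placed before the identifier: it applies to the in-place
 * structure type, which is therefore packed.
 */
struct example_outer_ok {
	struct {
		uint8_t tag;
		uint32_t value;
	} EXAMPLE_PACKED inner;
} EXAMPLE_PACKED;

/*
 * Attribute placed after the identifier: it applies to the member `inner`
 * rather than to the in-place structure type, so any padding inside that
 * type is left untouched.
 */
struct example_outer_misuse {
	struct {
		uint8_t tag;
		uint32_t value;
	} inner EXAMPLE_PACKED;
} EXAMPLE_PACKED;

/* One byte for `tag` followed immediately by four bytes for `value`. */
static_assert(sizeof(struct example_outer_ok) == 5,
		"in-place structure is expected to be packed");
```

On ABIs where uint32_t is aligned on 4 bytes, the second structure ends
up larger than the first since the padding after `tag` remains. That
kind of layout difference is what breaks mixed 32/64-bit setups.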

Those mis-uses cause issues when mixing the bitness (32/64) of the
session daemon and liblttng-ctl.

Outstanding issues include the following structures that are not
tagged as LTTNG_PACKED:
  - struct lttng_event
  - struct lttng_channel
  - struct lttng_event_context

Unfortunately, those structures are exposed by the public API and
can't be tagged as being "packed". Doing so would break the ABI
of liblttng-ctl.

These structures should be packed/unpacked explicitly.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I095dc0dffc6bf9e15dc7a7ec797958a5780ef150

6 months agoFix: relayd: live: invalid return code on DETACH_SESSION
Jérémie Galarneau [Fri, 9 Oct 2020 16:04:10 +0000 (12:04 -0400)] 
Fix: relayd: live: invalid return code on DETACH_SESSION

Babeltrace 2 reports an invalid return code being returned in reply to a
DETACH_SESSION command.

Reviewing the relevant Babeltrace 2 code, the logging can only be
produced if the reception of the lttng_viewer_detach_session_response
structure succeeds.

This eliminated my first guess that this was caused by the relay
daemon closing the socket before sending the reply. In that case, an
invalid status code of '0' could have been erroneously reported
since the recv() call on the socket would return 0.

It turns out that on a failure to return a packet, viewer_get_packet()
returns an error status code, but also sends a zero-initialized payload
buffer of the size of the requested packet.

This causes live clients which detach following the error of the
GET_PACKET command to interpret the still-enqueued zero-initialized
buffer as a reply to the DETACH_SESSION command. Since zero is not a
valid status code, it is correctly interpreted as a protocol error.

The reply_size is set to the header's size to only transmit the header
when an error reply is sent.
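
A minimal sketch of that rule follows. The header layout, the status
values and the `example_reply_size()` helper are hypothetical and only
serve to show the size computation; they are not the live protocol's
definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical reply header; the actual protocol structure may differ. */
struct example_packet_reply_header {
	uint32_t status;
	uint32_t len;
	uint32_t flags;
};

/* Hypothetical status codes; zero is deliberately not a valid status. */
enum example_status {
	EXAMPLE_STATUS_OK = 1,
	EXAMPLE_STATUS_ERROR = 2,
};

/*
 * Number of bytes to send in reply to a packet request: the header alone
 * on error, the header followed by the packet's contents on success.
 */
static size_t example_reply_size(enum example_status status,
		uint32_t packet_len)
{
	const size_t header_size = sizeof(struct example_packet_reply_header);

	assert(status == EXAMPLE_STATUS_OK || status == EXAMPLE_STATUS_ERROR);
	if (status != EXAMPLE_STATUS_OK) {
		/* No payload follows an error reply. */
		return header_size;
	}
	return header_size + packet_len;
}
```

Sending only the header on error leaves no stale bytes in the socket
for the client to mistake for the reply to its next command.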

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I69ed74f83404a16353d2bdbaa9f3adcdc2a03892

6 months agoTests: clear: remove test workspace directory
Jérémie Galarneau [Thu, 8 Oct 2020 22:15:34 +0000 (18:15 -0400)] 
Tests: clear: remove test workspace directory

The clear test only removes its workspace's subdirectory, but
leaves an empty directory behind. Remove the wildcard and remove
the root of the workspace on clean-up.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: I551f892af5423c6ed5933beb0c1a13f41a30a26e

7 months agoTests: ns_contexts: discarded events result in test failure
Jérémie Galarneau [Mon, 21 Sep 2020 21:24:50 +0000 (17:24 -0400)] 
Tests: ns_contexts: discarded events result in test failure

A follow-up change makes all events emitted by gen-ust-events
a bit larger, which causes them to no longer fit in the default
channel configuration's buffers.

This causes the test to fail occasionally when the consumer daemon
fails to consume the packets fast enough to leave room in the
buffers for new events.

The test doesn't need to produce 10,000 events; reducing to
1,000 produced events makes no material difference and works
around the problem.

Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ie87583bb9bb9cdd813f80443231a65164ef67df1
