From: Jérémie Galarneau
Date: Wed, 9 May 2018 01:41:08 +0000 (-0400)
Subject: Fix: propagate archive id to the consumer daemon on stream creation
X-Git-Tag: v2.11.0-rc1~184
X-Git-Url: https://git.lttng.org/?a=commitdiff_plain;h=e098433c90550d74288498f8c4474ef4c2daea68;hp=e098433c90550d74288498f8c4474ef4c2daea68;p=lttng-tools.git

Fix: propagate archive id to the consumer daemon on stream creation

This is the first of a series of fixes addressing a number of problems
with the way session rotation completions are handled.

Those issues can result in:
  - A stop never completing,
  - A rotation never completing,
  - A rotation being marked as completed while the consumerd/relayd
    are still writing to the completed chunk's trace archive,
    resulting in a temporarily corrupted trace.

This first commit performs a relatively simple modification to ensure
that the session's current archive id is propagated to the consumer
daemon.

Detailed description of the problems
---

At the core of the problem is the fact that in per-pid buffering, we
are not guaranteed that the sessiond will be able to see an
application's channel(s) if it was torn down before (or even during)
the rotation.

When an application is torn down, it is removed from the ust_app_ht.
That doesn't mean its buffers were received by the relayd or even
consumed by the consumerd. The session daemon issues a "flush channel"
command, but there is no guarantee/synchronization to ensure the
buffers have been consumed.

The current design assumes that the sessiond knows all the channels to
rotate and that we can monitor those channels for the completion of a
rotation. Given that an application can disappear or appear while we
iterate on the ust_app_ht, this assumption does not hold.

We also don't want to prevent/delay applications from registering or
exiting just because a rotation is ongoing.

* Problem 1 *

A rename can happen before the relay has received all data for a given
chunk, leading to the data pending issue explained previously.

Rename should be performed as the last action after the rotation has
been completed since data can still be in-flight, causing the creation
of indexes upon its arrival on the relayd's end.

See: https://github.com/lttng/lttng-tools/blob/cea6c68/src/bin/lttng-sessiond/rotation-thread.c#L392

Currently, the rotation thread waits for all channels (known to the
sessiond at the start of the rotation) to have reached their rotation
point. More specifically, the consumer will write to the
channel_rotation pipe every time a channel's subbuffers have been read
up to the point of the rotation position. This does not guarantee that
the data has been committed to disk on the relay's end.

At that point, the command to rename the destination folder is sent to
the relayd and the sessiond checks for the pending rotation
periodically (every 200ms) if the output was to a relayd. That check is
assumed not to be needed when tracing locally since reaching the
rotation point implies the contents being written to disk.

This scheme is not safe. If the sessiond sees no channel to iterate on,
it will issue the rename command immediately. If an application's
buffers were being flushed by the consumerd, the relayd will receive
the data, attempt to create index files, and fail since the folder has
been moved.

From an architectural standpoint, the rename command also leaves the
'path' of streams that were unknown to the sessiond pointing to a path
that does not exist anymore.

* Problem 2 *

In per-pid tracing mode, an application can appear after the rotation
was initiated and cause the rotate pending check to never complete.

A RELAYD_ROTATE_PENDING command is applied to a unique session id and a
chunk id.

When handling a RELAYD_ROTATE_PENDING command, the relayd will perform
the following check:
  - Iterate on every stream known at that point:
    - Check if the stream is rotating
      (stream->rotate_at_seq_num != -1ULL)
    - If the stream is not rotating, "stream->chunk_id < chunk_id" is
      checked.
      - If true, the rotation is considered incomplete.

See: https://github.com/lttng/lttng-tools/blob/cea6c68/src/bin/lttng-relayd/main.c#L2850

Given that streams, at their creation, are initialized with their
current "chunk_id" set to 0, the rotation will never be considered
complete if a stream is created between a ROTATE_STREAM and
ROTATE_PENDING command. This can happen whenever an application is
registered during a rotation.

* Problem 3 *

Since the sessiond can't accurately monitor the channels that have to
be rotated, the "rotation completed" notification (and state, if
queried with the lttng_rotation_handle_get_state() interface) is not
reliable. A client could see that the rotation is marked as completed
and attempt to read a trace archive that has not been completely
written.

Signed-off-by: Jérémie Galarneau
---
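
Note: the relayd check described under Problem 2 can be illustrated
with a minimal, stand-alone sketch. This is not the code found in
lttng-relayd/main.c. The struct and function names (sketch_stream,
sketch_stream_init, sketch_rotation_is_pending) are hypothetical; only
the two fields, rotate_at_seq_num and chunk_id, and the two comparisons
come from the description above. Treating a stream that is still
rotating as "pending" is an assumption, since the description only
spells out the non-rotating branch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced stand-in for a relayd stream. */
struct sketch_stream {
	/* -1ULL when the stream has no rotation in progress. */
	uint64_t rotate_at_seq_num;
	/* Id of the chunk the stream is currently associated with. */
	uint64_t chunk_id;
};

/*
 * Mimics stream creation as described in the commit message: a new
 * stream is not rotating and its chunk_id starts at 0.
 */
static void sketch_stream_init(struct sketch_stream *stream)
{
	assert(stream != NULL);
	stream->rotate_at_seq_num = -1ULL;
	stream->chunk_id = 0;
}

/*
 * Returns true if the rotation to 'chunk_id' must still be reported
 * as pending for the given set of streams.
 */
static bool sketch_rotation_is_pending(const struct sketch_stream *streams,
		size_t count, uint64_t chunk_id)
{
	for (size_t i = 0; i < count; i++) {
		if (streams[i].rotate_at_seq_num != -1ULL) {
			/* Assumption: a stream still rotating is pending. */
			return true;
		}
		if (streams[i].chunk_id < chunk_id) {
			/* Not rotating, but still behind the target chunk. */
			return true;
		}
	}
	return false;
}
```

With two streams that already moved to chunk 1, the check reports the
rotation as complete. Adding a freshly initialized stream (chunk_id 0,
not rotating) makes the same check report "pending", and nothing will
ever advance that stream to chunk 1 for this rotation. That is the
situation this commit addresses by propagating the session's current
archive id on stream creation.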