From d8f644d930cbd10bccec3c522bc8de95099827f3 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Thu, 18 Nov 2021 10:37:29 -0500 Subject: [PATCH] Fix: relayd: compare viewer chunks by ID rather than address MIME-Version: 1.0 Content-Type: text/plain; charset=utf8 Content-Transfer-Encoding: 8bit Observed issue ============== In "Fix: relayd: live: erroneous message timestamp observed from live viewer", we observe that sometimes viewer streams unexpectedly end up performing a viewer stream rotation at session destroy. Cause ===== This may happen in the following scenario: 1) Relay stream A is rotated to NULL. 2) viewer_get_next_index for viewer stream A: 2.1) observes a NULL rstream->trace_chunk, updates the viewer session current trace chunk to NULL (viewer_session_set_trace_chunk_copy). 2.2) "Transition the viewer stream into the latest trace chunk available." does not issue viewer_stream_rotate_to_trace_chunk, because the condition (rstream->completed_rotation_count == vstream->last_seen_rotation_count + 1 && !rstream->trace_chunk) evaluates to "true", and thus the entire if () evaluates to false. 3) check_index_status detects rstream->closed and index_received_seqcount == index_sent_seqcount, thus replying HUP to viewer, effectively releasing ownership of the viewer stream. 4) viewer_get_next_index for viewer stream B (not rotated to NULL yet): 4.1) observes a non-NULL rstream->trace_chunk, updates the viewer session current trace chunk to *a new copy* of the non-NULL rstream->trace_chunk (viewer_session_set_trace_chunk_copy). 4.2) the comparison (conn->viewer_session->current_trace_chunk != vstream->stream_file.trace_chunk) done by pointer don't match, because the viewer session current trace chunk is a new copy. Therefore, due to those stream close scenarios where the viewer session can go back and forth between NULL and _different copies_ of the relay chunk, we cannot use a comparison of chunks by address on the viewer chunks. Solution ======== Compare the viewer stream chunks by ID rather than address. Known drawbacks =============== The comparison is probably slightly slower, but I don't expect this to be significant. Signed-off-by: Mathieu Desnoyers Signed-off-by: Jérémie Galarneau Change-Id: I05e4f97f26b2659c007726cc29d3edafa17bdb98 --- src/bin/lttng-relayd/live.cpp | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/src/bin/lttng-relayd/live.cpp b/src/bin/lttng-relayd/live.cpp index 92c3bf4a7..bfef8b9f9 100644 --- a/src/bin/lttng-relayd/live.cpp +++ b/src/bin/lttng-relayd/live.cpp @@ -1661,9 +1661,9 @@ int viewer_get_next_index(struct relay_connection *conn) * This allows clients to consume all the packets of a trace chunk * after a session's destruction. */ - if (conn->viewer_session->current_trace_chunk != vstream->stream_file.trace_chunk && + if (!lttng_trace_chunk_ids_equal(conn->viewer_session->current_trace_chunk, vstream->stream_file.trace_chunk) && !(rstream->completed_rotation_count == vstream->last_seen_rotation_count + 1 && !rstream->trace_chunk)) { - DBG("Viewer session and viewer stream chunk differ: " + DBG("Viewer session and viewer stream chunk IDs differ: " "vsession chunk %p vstream chunk %p", conn->viewer_session->current_trace_chunk, vstream->stream_file.trace_chunk); @@ -2032,8 +2032,8 @@ int viewer_get_metadata(struct relay_connection *conn) } if (conn->viewer_session->current_trace_chunk && - conn->viewer_session->current_trace_chunk != - vstream->stream_file.trace_chunk) { + !lttng_trace_chunk_ids_equal(conn->viewer_session->current_trace_chunk, + vstream->stream_file.trace_chunk)) { bool acquired_reference; DBG("Viewer session and viewer stream chunk differ: " -- 2.34.1