From: Jérémie Galarneau
Date: Fri, 31 Jul 2020 15:20:40 +0000 (-0400)
Subject: Fix: extraneous empty/inactive flush on rotation out of a trace chunk
X-Git-Tag: v2.13.0-rc1~554
X-Git-Url: https://git.lttng.org/?a=commitdiff_plain;h=c1dcb8bb058481eb59317fb442fb9b3cc01a7e35;hp=c1dcb8bb058481eb59317fb442fb9b3cc01a7e35;p=lttng-tools.git

Observed issue
==============

A test (tests/regression/tools/tracefile-limits/test_tracefile_count)
occasionally fails on ppc64. The trace validation steps fail in the
case where the trace file count limit is set to 1.

Examining the resulting trace shows that the last packet of data
produced by the test application appears to be missing.

The test case enables a channel in "overwrite" mode. Normally, this
would guarantee that the last data produced will always be available
in the resulting trace.

Cause
=====

An empty/inactive flush is performed when rotating "out" of a non-null
trace chunk to ensure that the trace chunk contained at least one
packet.

Looking at the test's resulting trace and following the consumerd
logs, we see that the test application runs on one CPU for most of its
lifetime. The stream file is repeatedly replaced to make room for the
latest data.

Eventually, the application is migrated to another CPU. A number of
packets are written to that CPU's stream. The session is then stopped,
which causes an active flush to occur to close the current packet of
all streams (see `ust_app_stop_trace_all`).

Then, when the session is destroyed, an empty/inactive flush is
performed to ensure that at least one packet was produced in the
current trace chunk [1].

At the moment of writing this empty packet, the consumer daemon sees
that there is not enough space left in the stream file to honour the
trace file size restriction. It thus overwrites the file, discarding
the last events and replacing them with the empty "end of chunk"
packet, which occupies a single page.

While the problem is not specific to PowerPC 64, it is far more likely
to occur there as pages are typically configured to be 64 KiB long.
Due to current implementation limitations, empty packets have a size
of one page. In other words, a 4 KiB empty packet typically fits in
the space left in the file, which makes the problem hard to reproduce
on x64.

Note that while the file size limit is specified as "3 * PAGE_SIZE" in
the test, it is rounded up to 512 KiB to accommodate at least one
sub-buffer.

Solution
========

[1] This empty/inactive flush has not been necessary since f96af312b:
an "open packet" operation (which performs an empty/inactive flush) is
performed when a stream enters a non-null trace chunk. There is no
concern that a trace chunk will be left empty unless this initial
flush fails (see patch comments and work-around).

The empty flush that was performed for data streams is converted into
an active flush under most circumstances; the packet is simply closed.

Known drawbacks
===============

None.

Signed-off-by: Jérémie Galarneau
Change-Id: I5602b7ab8318374f75060489cf9c27af4e058805
---
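
The size arithmetic described under "Cause" can be illustrated with a
small stand-alone model. This is only a sketch and not lttng-tools
code: the function name `empty_packet_fits` and the 480 KiB fill level
are hypothetical, chosen for illustration. Only the 512 KiB file size
limit, the one-page empty packet, and the 4 KiB / 64 KiB page sizes
come from the description above.

```python
KIB = 1024


def empty_packet_fits(bytes_written, page_size, file_size_limit=512 * KIB):
    """Return True if a one-page empty packet fits in the current file.

    Hypothetical model: an empty packet occupies exactly one page, and a
    packet that does not fit forces the stream file to be replaced.
    """
    return bytes_written + page_size <= file_size_limit


# Assumed fill level: 480 KiB of event data already in the stream file.
bytes_written = 480 * KIB

for page_size in (4 * KIB, 64 * KIB):
    fits = empty_packet_fits(bytes_written, page_size)
    outcome = "appended" if fits else "file overwritten, last events lost"
    print(f"{page_size // KIB:>2} KiB page: {outcome}")
```

With the same amount of data already written, the 4 KiB empty packet
is appended while the 64 KiB one no longer fits, which is consistent
with the failure showing up mostly on ppc64.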