From: David Goulet
Date: Tue, 11 Sep 2012 17:48:56 +0000 (-0400)
Subject: Add new thread in consumer for metadata handling
X-Git-Tag: v2.1.0-rc3~2
X-Git-Url: https://git.lttng.org/?p=lttng-tools.git;a=commitdiff_plain;h=fb3a43a9284f3300e9b66edc2f2c2d2767895423;hp=fb3a43a9284f3300e9b66edc2f2c2d2767895423

Add new thread in consumer for metadata handling

To prioritize the consumption of metadata, this patch introduces a new
thread in the consumer which exclusively handles metadata, in order to
separate it from the trace data.

The motivation behind this change is that once a start command is done
on the tracer (kernel or UST), the start waits up to 10 seconds for the
metadata to be written (LTTNG_METADATA_TIMEOUT_MSEC). However, there is
a case where there is not enough space in the metadata buffers and the
tracer waits so as not to drop data. After the timeout, if the write(s)
are unsuccessful, the start session command fails.

This problem can occur with network streaming of high-throughput data,
such as enable-event -a -k, over a low-bandwidth connection.

Separating metadata from trace data does the trick: consuming metadata
no longer depends on the arbitrary time needed to stream trace data
while the metadata buffers need to be consumed.

Of course, this fix is more _visible_ on multiprocessor/core machines
but can also help on a single processor to prioritize metadata
consumption.

It helps on a single processor too because the scheduler will schedule
both the data and metadata threads. Even if the data thread needs to
send many MB of data, if the metadata thread sends small enough metadata
we should be good with half of the CPU time.

I see that the metadata easily reaches 192k for kernel traces though. On
a 5KB/s connection, this amounts to about 38s. However, because the 10s
delay is allowed between each sub-buffer, we don't reach the limit. This
limits us to small trace packet sizes though, if we ever have lots of
metadata. E.g. on a 5KB/s connection, metadata buffers configured as
2x64KB, with a metadata size of e.g. 512KB, would trigger the 10s delay
error.

So we should be good for now, but removing this arbitrary 10s delay is
something to keep in mind as a future improvement.

Acked-by: Mathieu Desnoyers
Signed-off-by: David Goulet
---
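
The timing figures quoted in the message can be checked with a small
back-of-the-envelope sketch. It is not code from this patch: the helper
names drain_time_msec() and subbuf_drain_exceeds_timeout() are
hypothetical, 1KB is assumed to be 1024 bytes, and the model is a
simplification in which the timeout is hit whenever sending a single
sub-buffer over the link takes longer than the 10s value quoted for
LTTNG_METADATA_TIMEOUT_MSEC.

```c
#include <assert.h>
#include <stdbool.h>

/* 10 s, the value the message quotes for LTTNG_METADATA_TIMEOUT_MSEC. */
#define METADATA_TIMEOUT_MSEC 10000LL

/*
 * Hypothetical helper: milliseconds needed to push `bytes` through a
 * link sustaining `bytes_per_sec` (integer division, rounds down).
 */
long long drain_time_msec(long long bytes, long long bytes_per_sec)
{
	assert(bytes_per_sec > 0);
	return (bytes * 1000LL) / bytes_per_sec;
}

/*
 * Simplified model (an assumption, not the tracer's actual logic):
 * the wait applies between sub-buffers, so the timeout is exceeded
 * when draining one sub-buffer alone takes longer than the timeout.
 */
bool subbuf_drain_exceeds_timeout(long long subbuf_bytes,
		long long bytes_per_sec)
{
	return drain_time_msec(subbuf_bytes, bytes_per_sec)
			> METADATA_TIMEOUT_MSEC;
}
```

Under these assumptions, drain_time_msec(192 * 1024, 5 * 1024) gives
38400 ms, matching the roughly 38s above, while a 64KB sub-buffer on the
same link takes 12800 ms, which is past the 10s limit.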