[lttng-tools.git] / doc / calibrate.txt
LTTng calibrate command documentation
Mathieu Desnoyers, August 6, 2011

The LTTng calibrate command can be used to find out the combined average
overhead of the LTTng tracer and the instrumentation mechanisms used.
This overhead can be calibrated in terms of time or using any of the PMU
performance counters available on the system.

For now, the only calibration implemented is that of the kernel function
instrumentation (kretprobes).


* Calibrate kernel function instrumentation

Let's use an example to show this calibration. We use an i7 processor
with 4 general-purpose PMU registers. This information is available by
issuing dmesg and looking for "generic registers".

This sequence of commands gathers a trace while executing a kretprobe
hooked on an empty function, collecting PMU counter information about
LLC (Last Level Cache) misses (see lttng add-context --help for the
list of available PMU counters).

(as root)
lttng create calibrate-function
lttng enable-event calibrate --kernel --function lttng_calibrate_kretprobe
lttng add-context --kernel -t perf:LLC-load-misses -t perf:LLC-store-misses \
      -t perf:LLC-prefetch-misses
lttng start
for a in $(seq 1 10); do \
        lttng calibrate --kernel --function;
done
lttng destroy
babeltrace $(ls -1drt ~/lttng-traces/calibrate-function-* | tail -n 1)

The output from babeltrace can be saved to a text file and opened in a
spreadsheet (e.g. oocalc) to focus on the per-PMU counter delta between
consecutive "calibrate_entry" and "calibrate_return" events. Note that
these counters are per-CPU, so scheduling events would need to be
present to account for migration between CPUs. Therefore, for
calibration purposes, only event pairs staying on the same CPU should
be considered.
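
As a sketch of that spreadsheet step, the same-CPU deltas can also be
extracted with a small script. The llc_load_delta helper below is
hypothetical (it is not part of lttng-tools), and the babeltrace line
layout it parses (event name on the line, plus "cpu_id = N" and
"perf_LLC_load_misses = N" fields) is an assumption; adapt the patterns
to your actual babeltrace output.

```shell
# Hypothetical helper: reads babeltrace text output on stdin and prints,
# for each calibrate_entry/calibrate_return pair that stayed on the same
# CPU, the delta of the perf_LLC_load_misses counter.
llc_load_delta() {
    awk '
    # field(name): extract the number following "name = " on this line
    function field(name,    s) {
        if (match($0, name " = [0-9]+")) {
            s = substr($0, RSTART, RLENGTH)
            sub(/.*= /, "", s)
            return s
        }
        return ""
    }
    /calibrate_entry/ {
        cpu = field("cpu_id")
        entry[cpu] = field("perf_LLC_load_misses")
    }
    /calibrate_return/ {
        cpu = field("cpu_id")
        if (cpu in entry) {    # pairing per CPU keeps same-CPU pairs only
            print "cpu " cpu ": " (field("perf_LLC_load_misses") - entry[cpu])
            delete entry[cpu]
        }
    }'
}
```

It would be invoked on the trace gathered above, e.g.:
babeltrace $(ls -1drt ~/lttng-traces/calibrate-function-* | tail -n 1) | llc_load_delta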

The average result, for the i7, on 10 samples:

                            Average    Std.Dev.
perf_LLC_load_misses:         5.0       0.577
perf_LLC_store_misses:        1.6       0.516
perf_LLC_prefetch_misses:     9.0      14.742
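
The Average and Std.Dev. columns can be reproduced from the per-run
deltas with a short script. mean_stddev is a hypothetical helper name;
it reads one delta per line on stdin and prints the mean and the sample
standard deviation (n - 1 in the denominator, the usual choice for a
small number of samples such as the 10 runs here).

```shell
# Hypothetical helper: print "mean stddev" for the numbers read on stdin,
# one per line, in the same format as the table above.
mean_stddev() {
    awk '
    { sum += $1; sumsq += $1 * $1; n++ }
    END {
        if (n > 1) {
            mean = sum / n
            # sample standard deviation (n - 1 denominator)
            sd = sqrt((sumsq - n * mean * mean) / (n - 1))
            printf "%.1f %.3f\n", mean, sd
        }
    }'
}
```

For example, the deltas extracted for one counter could be piped into it
as: cut -d' ' -f3 deltas.txt | mean_stddev (assuming one delta in the
third column of each line).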

As we can see, the load and store misses are relatively stable across
runs (their standard deviation is relatively low) compared to the
prefetch misses. We can conclude from this information that LLC load
and store misses can be accounted for quite precisely, but prefetches
within a function behave too erratically (there is little causal link
between the code executed and the CPU prefetch activity) to be
accounted for.