+++ /dev/null
-LTTng calibrate command documentation
-Mathieu Desnoyers, August 6, 2011
-
-The LTTng calibrate command can be used to find out the combined average
-overhead of the LTTng tracer and the instrumentation mechanisms used.
-This overhead can be calibrated in terms of time or using any of the PMU
-performance counters available on the system.
-
-For now, the only calibration implemented is that of the kernel function
-instrumentation (kretprobes).
-
-
-* Calibrate kernel function instrumentation
-
-Let's use an example to show this calibration. We use an i7 processor
-with 4 general-purpose PMU registers. To find this number, run dmesg
-and look for "generic registers".
-
-The following sequence of commands records a trace of a kretprobe hooked
-on an empty function, together with the LLC (Last Level Cache) miss PMU
-counters (see lttng add-context --help for the list of available PMU
-counters).
-
-(as root)
-lttng create calibrate-function
-lttng enable-event calibrate --kernel --function lttng_calibrate_kretprobe
-lttng add-context --kernel -t perf:LLC-load-misses -t perf:LLC-store-misses \
- -t perf:LLC-prefetch-misses
-lttng start
-for a in $(seq 1 10); do \
- lttng calibrate --kernel --function;
-done
-lttng destroy
-babeltrace $(ls -1drt ~/lttng-traces/calibrate-function-* | tail -n 1)
-
-The output from babeltrace can be saved to a text file and opened in a
-spreadsheet (e.g. oocalc) to focus on the per-PMU counter delta between
-consecutive "calibrate_entry" and "calibrate_return" events. Note that
-these counters are per-CPU, so scheduling events would need to be
-present to account for migration between CPUs. Therefore, for calibration
-purposes, only entry/return pairs that stay on the same CPU must be
-considered.
-
-The average result, for the i7, on 10 samples:
-
-                            Average    Std.Dev.
-perf_LLC_load_misses:           5.0       0.577
-perf_LLC_store_misses:          1.6       0.516
-perf_LLC_prefetch_misses:       9.0      14.742
-
-As we can see, the load and store misses are relatively stable across
-runs (their standard deviation is relatively low) compared to the
-prefetch misses. We can conclude from this information that LLC load and
-store misses can be accounted for quite precisely, but prefetches within
-a function seem to behave too erratically (there is little causal link
-between the code executed and the CPU prefetch activity) to be accounted
-for.
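
The spreadsheet step described in the removed document (per-counter delta between consecutive "calibrate_entry" and "calibrate_return" events on the same CPU, then average and standard deviation) could also be scripted. The following is only a sketch: calibrate_delta is a hypothetical helper, not part of LTTng or babeltrace, and it assumes that each line of babeltrace text output contains the event name, a "cpu_id = N" field and a "<counter> = N" field. The exact output layout depends on the babeltrace version, so the patterns may need adjusting.

```shell
#!/bin/sh
# calibrate_delta: hypothetical helper, not part of LTTng.
# Reads babeltrace text output on stdin and prints, for one counter,
# each entry-to-return delta, then the mean and sample standard deviation.
# ASSUMPTION: every event line contains "cpu_id = N" and "<counter> = N".
calibrate_delta() {
    awk -v counter="$1" '
    # Return the integer following "name = " on the current line, or "".
    function field(name,    re) {
        re = name " = [0-9]+"
        if (!match($0, re)) return ""
        return substr($0, RSTART + length(name) + 3, RLENGTH - length(name) - 3)
    }
    /calibrate_entry/ {
        # Remember the CPU and counter value seen at function entry.
        entry_cpu = field("cpu_id")
        entry_val = field(counter)
        have_entry = 1
        next
    }
    /calibrate_return/ && have_entry {
        have_entry = 0
        # Counters are per-CPU: skip pairs that migrated between CPUs.
        if (field("cpu_id") != entry_cpu) next
        d = field(counter) - entry_val
        n++; sum += d; sumsq += d * d
        print "delta", d
    }
    END {
        if (n == 0) exit 1
        mean = sum / n
        # Sample variance (n - 1 denominator), clamped against rounding.
        var = (n > 1) ? (sumsq - n * mean * mean) / (n - 1) : 0
        if (var < 0) var = 0
        printf "mean %.3f stddev %.3f n %d\n", mean, sqrt(var), n
    }'
}
```

Under those assumptions, it would be used as: babeltrace <trace directory> | calibrate_delta perf_LLC_load_misses. Pairs whose entry and return events report different cpu_id values are dropped rather than corrected, which matches the rule that only events staying on the same CPU are considered.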