The LTTng trace format

Last update: 2008/06/02

This document describes the LTTng trace format. It is mainly useful to developers who work on the LTTng tracer or on the traceread LTTV library; other users do not need it, because that library offers all the necessary abstractions on top of the raw trace data.

A trace is contained in a directory tree. To send a trace remotely, the directory tree may be tar-gzipped. The trace foo, placed in the home directory of user john, /home/john, would have the following contents:


$ cd /home/john
$ tree foo
foo/
|-- control
|   |-- facilities_0
|   |-- facilities_1
|   |-- facilities_...
|   |-- interrupts_0
|   |-- interrupts_1
|   |-- interrupts_...
|   |-- modules_0
|   |-- modules_1
|   |-- modules_...
|   |-- network_0
|   |-- network_1
|   |-- network_...
|   |-- processes_0
|   |-- processes_1
|   `-- processes_...
|-- cpu_0
|-- cpu_1
`-- cpu_...

The root directory contains one tracefile per CPU, numbered from 0, in .trace format. The trace of a uniprocessor system thus contains only the file cpu_0. A multiprocessor system with some unused (possibly hotplug) CPU slots may have gaps in the CPU numbering. For instance, an 8-way SMP board with 6 CPUs installed in arbitrary slots may produce tracefiles named cpu_0, cpu_1, cpu_2, cpu_4, cpu_6 and cpu_7.
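The following is a minimal sketch of how a reader might enumerate the per-CPU tracefiles while tolerating gaps in the CPU numbering. The function name list_cpu_tracefiles is hypothetical and is not part of LTTng or LTTV; the only assumption taken from this document is the cpu_N file naming.

```python
import re
from pathlib import Path

# Per-CPU tracefiles are named cpu_<number>, as in the directory listing above.
CPU_FILE = re.compile(r"cpu_(\d+)")


def list_cpu_tracefiles(trace_dir):
    """Return {cpu_number: path} for every cpu_N file found in trace_dir.

    Hypothetical helper: CPU numbers may have gaps, so the directory is
    scanned instead of counting up from 0.
    """
    found = {}
    for entry in Path(trace_dir).iterdir():
        match = CPU_FILE.fullmatch(entry.name)
        if match and entry.is_file():
            found[int(match.group(1))] = entry
    # Sort by CPU number so callers get a deterministic order.
    return dict(sorted(found.items()))
```

Scanning the directory rather than probing cpu_0, cpu_1, ... in sequence avoids stopping at the first missing CPU number.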

The files in the control directory also follow the .trace format and are also per CPU. The "facilities" files contain only the "core" marker_id, marker_format and time_heartbeat events. The first two describe the events that are in the trace. The other control files contain the initial system state and various important subsequent events, for example process creation and exit. The advantage of placing such subsequent events in control tracefiles instead of (or in addition to) the per-CPU tracefiles is that they can be accessed more quickly and conveniently, and that they can be kept even when the per-CPU files are overwritten in "flight recorder mode".

Trace format

Each tracefile is divided into equal-size blocks, each beginning with a header. Events are packed sequentially in the block, starting right after the block header.

Each block consists of:


block start/end header
trace header
event 1 header
event 1 variable length data
event 2 header
event 2 variable length data
....
padding

The block start/end header


begin
	* the beginning of buffer information
	uint64 cycle_count
		* TSC at the beginning of the buffer
	uint64 freq
		* frequency of the CPUs at the beginning of the buffer.
end
	* the end of buffer information
	uint64 cycle_count
		* TSC at the end of the buffer
	uint64 freq
		* frequency of the CPUs at the end of the buffer.
uint32 lost_size
	* number of bytes of padding at the end of the buffer.
uint32 buf_size
	* size of the sub-buffer.
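As an illustration, the block start/end header above could be decoded as follows. This is a sketch under stated assumptions, not the traceread implementation: it assumes the fields are stored in the listed order with no padding between them (40 bytes in total), and that the caller already knows the trace byte order (it can be derived from the magic number in the trace header). The names parse_block_header and block_offsets are hypothetical.

```python
import struct

# Assumed packed layout, in listing order:
# begin.cycle_count, begin.freq, end.cycle_count, end.freq (uint64 each),
# then lost_size and buf_size (uint32 each).
BLOCK_HEADER_FIELDS = "QQQQII"


def parse_block_header(data, byte_order="<"):
    """Decode a block start/end header from the first bytes of a block.

    byte_order is a struct prefix: "<" (little endian) or ">" (big endian).
    """
    fmt = byte_order + BLOCK_HEADER_FIELDS
    (begin_tsc, begin_freq, end_tsc, end_freq,
     lost_size, buf_size) = struct.unpack_from(fmt, data, 0)
    return {
        "begin_cycle_count": begin_tsc,
        "begin_freq": begin_freq,
        "end_cycle_count": end_tsc,
        "end_freq": end_freq,
        "lost_size": lost_size,
        "buf_size": buf_size,
        # Bytes of the block that hold headers and events, padding excluded.
        "used_size": buf_size - lost_size,
    }


def block_offsets(file_size, buf_size):
    """Return the file offset of each equal-size block in a tracefile."""
    return list(range(0, file_size, buf_size))
```

A reader would typically stop decoding events in a block once it reaches used_size, since the remaining lost_size bytes are padding.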

The trace header


uint32 magic_number
	* 0x00D6B7ED, used to check the trace byte order vs host byte order.
uint32 arch_type
	* Architecture type of the traced machine.
uint32 arch_variant
	* Architecture variant of the traced machine. May be unused on some arch.
uint32 float_word_order
	* Byte order of floats and doubles, sometimes different from integer byte
	  order. Useful only for user space traces.
uint8 arch_size
	* Size (in bytes) of the void * on the traced machine.
uint8 major_version
	* major version of the trace.
uint8 minor_version
	* minor version of the trace.
uint8 flight_recorder
	* Is flight recorder mode activated? If yes, data might be missing
	  (overwritten) in the trace.
uint8 has_heartbeat
	* Does this trace have the heartbeat timer event activated?
		Yes (1) -> Event header has a 32-bit TSC
		No (0) -> Event header has a 64-bit TSC
uint8 alignment
	* Are event headers in this trace aligned?
		Yes -> the value indicates the alignment
		No (0) -> data is packed.
uint8 tsc_lsb_truncate
	* Used for compact channels
uint8 tscbits
	* Used for compact channels
uint8 compact_data_shift
	* Used for compact channels
uint32 freq_scale
	* Scaling factor applied to the frequency. Event time is always calculated as:
		trace_start_time + ((event_tsc - trace_start_tsc) / (freq / freq_scale))
uint64 start_freq
	* CPU clock frequency at the beginning of the trace.
uint64 start_tsc
	* TSC at the beginning of the trace.
uint64 start_monotonic
	* monotonically increasing time at the beginning of the trace.
		(currently not supported)
start_time
	* Real time at the beginning of the trace (as given by date, adjusted by NTP).
	  This is the only time reference with the real world: the rest of the trace
	  has monotonically increasing time from this point (computed from the TSC
	  difference and the clock frequency).
	uint32 seconds
	uint32 nanoseconds
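Two uses of the trace header fields can be sketched as follows: checking the magic number to find the trace byte order, and converting an event TSC to a wall-clock time. This is a hedged illustration only. The names LTT_MAGIC, detect_byte_order and event_time_ns are hypothetical, and the time conversion assumes that the elapsed cycles are divided by the effective frequency freq / freq_scale, expressed in Hz, which is the reading that yields a time.

```python
import struct

LTT_MAGIC = 0x00D6B7ED  # magic_number value given in the trace header


def detect_byte_order(magic_bytes):
    """Return the struct prefix ("<" or ">") under which the magic matches."""
    for order in ("<", ">"):
        if struct.unpack(order + "I", magic_bytes[:4])[0] == LTT_MAGIC:
            return order
    raise ValueError("bad magic number: not a recognized trace header")


def event_time_ns(event_tsc, start_tsc, start_freq, freq_scale,
                  start_sec, start_nsec):
    """Convert an event TSC to nanoseconds since the epoch.

    Assumption: start_freq / freq_scale is the counter rate in Hz, so
    elapsed seconds = (event_tsc - start_tsc) * freq_scale / start_freq.
    Integer arithmetic is used to avoid floating-point rounding.
    """
    start_ns = start_sec * 1_000_000_000 + start_nsec
    elapsed_ns = (event_tsc - start_tsc) * 1_000_000_000 * freq_scale // start_freq
    return start_ns + elapsed_ns
```

For example, with a 2 GHz counter and freq_scale of 1, an event recorded 2,000,000,000 cycles after start_tsc falls exactly one second after start_time.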

Event header

Event headers differ according to two conditions: does the traced system have a heartbeat timer, and is tracing alignment activated?

Event header:


{ uint32 timestamp
	or
	uint64 timestamp }
	* if has_heartbeat: 32 least significant bits of the cycle counter at the
	  event record time.
	* else: complete 64-bit cycle counter.
uint8 facility_id
	* Numerical ID of the facility corresponding to the event. See the facilities
	  tracefile to know which facility ID matches which facility name and
	  description.
uint8 event_id
	* Numerical ID of the event inside the facility.
uint16 event_size
	* Size of the variable length data that follows this header.
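A minimal sketch of decoding this event header is given below. It assumes a packed trace (alignment of 0), so the fields follow each other with no padding, and it takes the byte order as a struct prefix. The function names are hypothetical. The second helper shows one possible way to rebuild a full cycle counter from 32-bit timestamps; it relies on the assumption that events (including heartbeat events) are frequent enough that the low 32 bits wrap at most once between two consecutive events.

```python
import struct


def parse_event_header(data, offset, has_heartbeat, byte_order="<"):
    """Decode one packed event header starting at offset.

    Returns (timestamp, facility_id, event_id, event_size, payload_offset),
    where payload_offset is where the variable length data begins.
    """
    ts_code = "I" if has_heartbeat else "Q"  # 32-bit or 64-bit timestamp
    fmt = byte_order + ts_code + "BBH"
    timestamp, facility_id, event_id, event_size = struct.unpack_from(
        fmt, data, offset)
    payload_offset = offset + struct.calcsize(fmt)
    return timestamp, facility_id, event_id, event_size, payload_offset


def extend_tsc(prev_full_tsc, low32):
    """Rebuild a 64-bit TSC from its 32 least significant bits.

    Assumption: at most one 32-bit wraparound since prev_full_tsc.
    """
    full = (prev_full_tsc & ~0xFFFFFFFF) | low32
    if full < prev_full_tsc:
        full += 1 << 32  # the low 32 bits wrapped since the previous event
    return full
```

The next event header of a packed trace would then start at payload_offset + event_size.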

Event header alignment

If trace alignment is activated (the alignment field of the trace header is nonzero), the event header is aligned on that value. In addition, padding is added after the event header so that the variable length data is aligned on the architecture size.
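The padding implied by these rules can be computed with the usual round-up-to-multiple arithmetic. The sketch below is an assumption about how a reader might do it, not the tracer's actual code: align_up is a hypothetical helper, and offsets are assumed to be measured from the start of the block.

```python
def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment.

    An alignment of 0 means the trace is packed, so the offset is unchanged.
    """
    if alignment == 0:
        return offset
    return (offset + alignment - 1) // alignment * alignment


def padding_before(offset, alignment):
    """Number of padding bytes needed to reach the aligned offset."""
    return align_up(offset, alignment) - offset
```

For instance, with an alignment of 8, a header that would otherwise start at offset 13 starts at offset 16 after 3 bytes of padding.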