0bc7c2cd 1<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
2<html>
3<head>
4 <title>The LTTng trace format</title>
5</head>
6 <body>
7
8<h1>The LTTng trace format</h1>
9
10<P>
11This document describes the LTTng trace format. It should be used only by
12developers who code the LTTng tracer or the traceread LTTV library, as this
13library offers all the necessary abstractions on top of the raw trace data.
14
15<P>
16A trace is contained in a directory tree. To send a trace remotely,
17the directory tree may be tar-gzipped. Trace foo, placed in the home
18directory of user john, /home/john, would have the following content:
19
20<PRE><TT>
21$ cd /home/john
22$ tree foo
23foo/
24|-- eventdefs
25| |-- core.xml
26| |-- fs.xml
27| |-- ipc.xml
28| |-- kernel.xml
29| |-- memory.xml
30| |-- network.xml
31| |-- process.xml
32| |-- s390_kernel.xml
33| |-- socket.xml
34| |-- timer.xml
35| `-- ...
36|-- info
37| |-- bookmarks.xml
38| `-- system.xml
39|-- control
40| |-- facilities_0
41| |-- facilities_1
42| |-- facilities_...
43| |-- interrupts_0
44| |-- interrupts_1
45| |-- interrupts_...
46| |-- modules_0
47| |-- modules_1
48| |-- modules_...
49| |-- processes_0
50| |-- processes_1
51| `-- processes_...
52|-- cpu_0
53|-- cpu_1
54`-- cpu_...
55
56</TT></PRE>
57
58<P>
59The eventdefs directory contains the event descriptions for all the
60facilities used. The syntax is a simple subset of XML; XML is widely
61known and easily parsed or hand edited. Each file contains one or more
62&lt;FACILITY NAME=name&gt;...&lt;/FACILITY&gt; elements. Indeed, several
63facilities may have the same name but different content (and thus will
64generate a different checksum). This typically happens when, while tracing
65is enabled, a module using the named facility is unloaded, modified
66(along with the description of some events), recompiled and reloaded.
67Then, the trace will contain events from two different, similarly named,
68facility versions.
69
70<P>
71A small number of events are predefined, part of the "core" facility,
72and are not present there. These "core" events include "facility_load",
73"facility_unload", "time_heartbeat" and "state_dump_facility_load".
74
75<P>
76The root directory contains a tracefile for each cpu, numbered from 0,
77in .trace format. A uniprocessor thus only contains the file cpu_0.
78A multi-processor with some unused (possibly hotplug) CPU slots may have some
79unused CPU numbers. For instance, an 8-way SMP board with 6 CPUs randomly
80installed may produce tracefiles numbered 0, 1, 2, 4, 6 and 7.
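
<P>
As a minimal sketch of how a reader might discover the per-CPU tracefiles,
the Python function below scans a trace directory and tolerates gaps in the
CPU numbering. The cpu_&lt;number&gt; file name pattern is an assumption taken
from the directory listing above, and list_cpu_tracefiles is a hypothetical
helper, not part of the traceread library.

```python
import re
from pathlib import Path

# Assumed naming pattern, taken from the directory listing: cpu_<number>.
CPU_FILE = re.compile(r"^cpu_(\d+)$")


def list_cpu_tracefiles(trace_dir):
    """Return {cpu_number: path} for the per-CPU tracefiles in trace_dir."""
    found = {}
    for entry in Path(trace_dir).iterdir():
        match = CPU_FILE.match(entry.name)
        if match and entry.is_file():
            found[int(match.group(1))] = entry
    # Sort by CPU number; gaps are expected when some CPU slots are unused.
    return dict(sorted(found.items()))
```

<P>
Sorting on the numeric suffix rather than on the file name keeps cpu_10 after
cpu_2.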
81
82<P>
83The files in the control directory also follow the .trace format and are also
84per cpu.
85The "facilities" file only contains "core" facility_load, facility_unload,
86time_heartbeat and state_dump_facility_load events
87and is used to determine the facilities used and the code range assigned
88to each facility. The other control files contain the initial system
89state and various subsequent important events, for example process
90creation and exit. The benefit of placing such subsequent events
91in control trace files instead of (or in addition to) in the per cpu
92trace files is that they may be accessed more quickly/conveniently
93and that they may be kept even when the per cpu files are overwritten
94in "flight recorder mode".
95
96<P>
97The info directory contains in system.xml a description of the system on which
98the trace was created as well as different user annotations in bookmarks.xml.
99This directory may also contain various information about the trace, generated
100during trace analysis (statistics, index...).
101
102
103<H2>Trace format</H2>
104
105<P>
106Each tracefile is divided into equal size blocks with a header at the beginning
107of the block. Events are packed sequentially in the block starting right after
108the block header.
109<P>
110Each block consists of:
111<PRE><TT>
112block start/end header
113trace header
114event 1 header
115event 1 variable length data
116event 2 header
117event 2 variable length data
118....
119padding
120</TT></PRE>
121
122<P>
123The block start/end header
124
125<PRE><TT>
126begin
127 * the beginning of buffer information
128 uint64 cycle_count
129 * TSC at the beginning of the buffer
130 uint64 freq
131 * frequency of the CPUs at the beginning of the buffer.
132end
133 * the end of buffer information
134 uint64 cycle_count
135 * TSC at the end of the buffer
136 uint64 freq
137 * frequency of the CPUs at the end of the buffer.
138uint32 lost_size
139 * number of bytes of padding at the end of the buffer.
140uint32 buf_size
141 * size of the sub-buffer.
142</TT></PRE>
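
<P>
The field list above can be decoded with Python's struct module. This is a
sketch under two assumptions that the text does not state: the fields are
stored in the listed order without padding, and the byte order is already
known. The names BlockHeader and parse_block_header are hypothetical.

```python
import struct
from collections import namedtuple

BlockHeader = namedtuple(
    "BlockHeader",
    "begin_cycle_count begin_freq end_cycle_count end_freq lost_size buf_size",
)


def parse_block_header(data, byte_order="<"):
    """Decode the block start/end header found at the start of a block.

    Assumption: fields are packed in the listed order with no padding.
    byte_order is "<" for little endian or ">" for big endian.
    """
    # Four uint64 fields (begin and end) followed by two uint32 fields.
    layout = struct.Struct(byte_order + "QQQQII")
    return BlockHeader(*layout.unpack_from(data, 0))
```

<P>
With these assumptions the header occupies 40 bytes, and buf_size - lost_size
gives the number of bytes of the sub-buffer that are actually used.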
143
144
145
146<P>
147The trace header
148
149<PRE><TT>
150uint32 magic_number
151 * 0x00D6B7ED, used to check the trace byte order vs host byte order.
152uint32 arch_type
153 * Architecture type of the traced machine.
154uint32 arch_variant
155 * Architecture variant of the traced machine. May be unused on some arch.
156uint32 float_word_order
157 * Byte order of floats and doubles, sometimes different from integer byte
158 order. Useful only for user space traces.
159uint8 arch_size
160 * Size (in bytes) of the void * on the traced machine.
161uint8 major_version
162 * major version of the trace.
163uint8 minor_version
164 * minor version of the trace.
165uint8 flight_recorder
166 * Is flight recorder mode activated? If yes, data might be missing
167 (overwritten) in the trace.
168uint8 has_heartbeat
169 * Does this trace have the heartbeat timer event activated?
170 Yes (1) -> Event header has a 32-bit TSC
171 No (0) -> Event header has a 64-bit TSC
172uint8 has_alignment
173 * Is the information in this trace aligned?
174 Yes (1) -> aligned on min(arch size, atomic data size).
175 No (0) -> data is packed.
176uint32 freq_scale
177 * event time is always calculated from:
178 trace_start_time + ((event_tsc - trace_start_tsc) * (freq / freq_scale))
179uint64 start_freq
180 * CPU clock frequency at the beginning of the trace.
181uint64 start_tsc
182 * TSC at the beginning of the trace.
183uint64 start_monotonic
184 * monotonically increasing time at the beginning of the trace.
185 (currently not supported)
186start_time
187 * Real time at the beginning of the trace (as given by date, adjusted by NTP)
188 This is the only time reference with the real world : the rest of the trace
189 has monotonically increasing time from this point (with TSC difference and
190 clock frequency).
191 uint32 seconds
192 uint32 nanoseconds
193</TT></PRE>
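
<P>
The magic number check can be sketched as follows: the reader interprets the
first four bytes of the trace header in both byte orders and keeps the one
that yields 0x00D6B7ED. The function name detect_byte_order is hypothetical,
and the sketch assumes the magic number is the very first field, as listed
above.

```python
import struct

TRACE_MAGIC = 0x00D6B7ED


def detect_byte_order(header_bytes):
    """Return the struct byte-order prefix ("<" or ">") of a trace header."""
    for prefix in ("<", ">"):
        (magic,) = struct.unpack_from(prefix + "I", header_bytes, 0)
        if magic == TRACE_MAGIC:
            return prefix
    raise ValueError("magic number not found: not a trace header?")
```

<P>
The returned prefix can then be reused for every other multi-byte integer
field of the trace.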
194
195
196<P>
197Event header
198
199<P>
200Event headers differ depending on two conditions: does the traced system have
201a heartbeat timer? Is trace alignment activated?
202
203<P>
204Event header:
205<PRE><TT>
206{ uint32 timestamp
207 or
208 uint64 timestamp }
209 * if has_heartbeat : 32 LSB of the cycle counter at the event record time.
210 * else : 64 bits complete cycle counter.
211uint8 facility_id
212 * Numerical ID of the facility corresponding to the event. See the facility
213 tracefile to know which facility ID matches which facility name and
214 description.
215uint8 event_id
216 * Numerical ID of the event inside the facility.
217uint16 event_size
218 * Size of the variable length data that follows this header.
219</TT></PRE>
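
<P>
A possible decoder for this header is sketched below. It assumes a packed
trace (has_alignment is 0), so no padding is inserted, and takes the byte
order as a parameter. EventHeader and parse_event_header are hypothetical
names.

```python
import struct
from collections import namedtuple

EventHeader = namedtuple(
    "EventHeader", "timestamp facility_id event_id event_size"
)


def parse_event_header(data, offset, has_heartbeat, byte_order="<"):
    """Decode one event header; return (header, offset of the event data).

    Assumption: the trace is packed (has_alignment is 0).
    """
    # 32-bit timestamp when the heartbeat is active, 64-bit otherwise.
    timestamp_code = "I" if has_heartbeat else "Q"
    layout = struct.Struct(byte_order + timestamp_code + "BBH")
    header = EventHeader(*layout.unpack_from(data, offset))
    return header, offset + layout.size
```

<P>
The next event header starts event_size bytes after the returned offset, again
only when the trace is packed.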
220
221<P>
222Event header alignment
223
224<P>
225If trace alignment is activated (has_alignment), the event header is aligned
226on the architecture size (void pointer size). In addition, padding is
227automatically added after the event header so the variable length data is
228automatically aligned on the architecture size.
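
<P>
The amount of padding follows from rounding an offset up to the next multiple
of the architecture size. The helpers below are a sketch of that arithmetic
only; their names are hypothetical, and offsets are assumed to be counted from
an aligned starting point.

```python
def align_up(offset, arch_size):
    """Round offset up to the next multiple of arch_size (bytes in a void *)."""
    return (offset + arch_size - 1) // arch_size * arch_size


def padding_after_header(header_end, arch_size):
    """Number of padding bytes between the header end and the aligned data."""
    return align_up(header_end, arch_size) - header_end
```

<P>
For example, a 12 byte header ending at offset 12 needs 4 bytes of padding on
a machine with 8 byte pointers and none on a machine with 4 byte pointers.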
229
230<P>
231
232<H2>System description</H2>
233
234<P>
235The system type description, in system.xml, looks like:
236
237<PRE><TT>
238&lt;system
239 node_name="vaucluse"
240 domainname="polymtl.ca"
241 cpu="4"
242 arch_size="ILP32"
243 endian="little"
244 kernel_name="Linux"
245 kernel_release="2.4.18-686-smp"
246 kernel_version="#1 SMP Sun Apr 14 12:07:19 EST 2002"
247 machine="i686"
248 processor="unknown"
249 hardware_platform="unknown"
250 operating_system="Linux"
251 ltt_major_version="2"
252 ltt_minor_version="0"
253 ltt_block_size="100000"
254&gt;
255Some comments about the system
256&lt;/system&gt;
257</TT></PRE>
258
259<P>
260The system attributes kernel_name, node_name, kernel_release,
261 kernel_version, machine, processor, hardware_platform and operating_system
262come from the uname(1) program. The domainname attribute is obtained from
263the "hostname --domain" command. The arch_size attribute is one of
264LP32, ILP32, LP64 or ILP64 and specifies the length in bits of integers (I),
265long (L) and pointers (P). The endian attribute is "little" or "big".
266While the arch_size and endian attributes could be deduced from the platform
267type, stating them explicitly allows analysing traces from as yet unknown
268platforms. The cpu attribute specifies the maximum number of processors in
269the system; only tracefiles 0 to this maximum - 1 may exist in the cpu
270directory.
271
272<P>
273Within the system element, the text enclosed may describe further the
274system traced.
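
<P>
Reading such a description could look like the sketch below, which uses
Python's xml.etree.ElementTree. It assumes that every attribute value is
quoted, so that the file is well-formed XML, and the helper names
decode_arch_size and read_system_description are hypothetical. Only the types
named by the letters of arch_size are reported.

```python
import xml.etree.ElementTree as ET

TYPE_NAMES = {"I": "int", "L": "long", "P": "pointer"}


def decode_arch_size(arch_size):
    """Split e.g. "ILP32" into {"int": 32, "long": 32, "pointer": 32}."""
    letters = arch_size.rstrip("0123456789")
    bits = int(arch_size[len(letters):])
    return {TYPE_NAMES[letter]: bits for letter in letters}


def read_system_description(xml_text):
    """Return the system attributes, the free text and the decoded sizes."""
    system = ET.fromstring(xml_text)
    info = dict(system.attrib)
    info["comment"] = (system.text or "").strip()
    info["type_bits"] = decode_arch_size(info["arch_size"])
    return info
```

<P>
Attribute values are returned as strings; a caller that needs the processor
count would convert info["cpu"] with int().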
275
276
277<H2>Event type descriptions</H2>
278
279<P>
280A facility contains the descriptions of several event types. When a structure
281is reused in several event types, a named type is defined and may be referenced
282by several other event types or named types.
283
284<PRE><TT>
285&lt;facility name=facility_name&gt;
286 &lt;description&gt;Some text&lt;/description&gt;
287 &lt;event name=eventtype_name&gt;
288 &lt;description&gt;Some text&lt;/description&gt;
289 --type structure--
290 &lt;/event&gt;
291 ...
292 &lt;type name=type_name&gt;
293 --type structure--
294 &lt;/type&gt;
295&lt;/facility&gt;
296</TT></PRE>
297
298<P>
299The type structure may be one of the following primitive type elements.
300Whenever the keyword isize is used, the allowed values are
301short, medium, long, 1, 2, 4, 8, indicating the size in bytes.
302The fsize keyword represents one of medium, long, 4 and 8 bytes.
303
304<PRE><TT>
305&lt;int size=isize format="printf format"/&gt;
306
307&lt;uint size=isize format="printf format"/&gt;
308
309&lt;float size=fsize format="printf format"/&gt;
310
311&lt;string format="printf format"/&gt;
312
313&lt;enum size=isize format="printf format"&gt;label1 label2 ...&lt;/enum&gt;
314</TT></PRE>
315
316<P>
317The string is null terminated. For the enumeration, the size of the integer
318used for its representation is specified.
319
320<P>
321The type structure may also be a compound type.
322
323<PRE><TT>
324&lt;array size=n&gt; --type structure-- &lt;/array&gt;
325
326&lt;sequence lengthsize=isize&gt; --type structure-- &lt;/sequence&gt;
327
328&lt;struct&gt;
329 &lt;field name=field_name&gt;
330 &lt;description&gt;Some text&lt;/description&gt;
331 --type structure--
332 &lt;/field&gt;
333 ...
334&lt;/struct&gt;
335
336&lt;union typecodesize=isize&gt;
337 &lt;field name=field_name&gt;
338 &lt;description&gt;Some text&lt;/description&gt;
339 --type structure--
340 &lt;/field&gt;
341 ...
342&lt;/union&gt;
343</TT></PRE>
344
345<P>
346Array is a fixed size array of length size. Sequence is a variable size
347array with its length stored as a prepended uint of length lengthsize.
348A structure is simply an aggregation of fields. A union is one of its n
349fields (variant record), as indicated by a preceding code (0 to n - 1)
350of the specified size typecodesize.
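
<P>
As an illustration of the sequence encoding, the sketch below decodes a
sequence of unsigned integers: it reads the prepended length, then that many
elements. It assumes a packed trace and numeric sizes (1, 2, 4 or 8 bytes);
the symbolic sizes short, medium and long are not handled because their byte
sizes depend on the traced machine. decode_uint_sequence is a hypothetical
name.

```python
import struct

# struct format characters for unsigned integers of 1, 2, 4 and 8 bytes.
UINT_CODES = {1: "B", 2: "H", 4: "I", 8: "Q"}


def decode_uint_sequence(data, offset, lengthsize, elemsize, byte_order="<"):
    """Decode a sequence of unsigned integers; return (values, next offset)."""
    length_code = UINT_CODES[lengthsize]
    (count,) = struct.unpack_from(byte_order + length_code, data, offset)
    offset += lengthsize
    # The elements follow the length field immediately (packed trace assumed).
    layout = struct.Struct(f"{byte_order}{count}{UINT_CODES[elemsize]}")
    values = list(layout.unpack_from(data, offset))
    return values, offset + layout.size
```

<P>
An array is decoded the same way, except that the element count comes from the
size attribute of the type description instead of from the data.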
351
352<P>
353Finally the type structure may be defined by referencing a named type.
354
355<PRE><TT>
356&lt;typeref name=type_name/&gt;
357</TT></PRE>
358
359<H2>Core events</H2>
360
361<P>
362The facility named "core" is always present and contains at least the
363following event types.
364
365<PRE><TT>
366&lt;event name=facility_load&gt;
367 &lt;description&gt;Facility used in the trace&lt;/description&gt;
368 &lt;struct&gt;
369 &lt;field name="name"&gt;&lt;string/&gt;&lt;/field&gt;
370 &lt;field name="checksum"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
371 &lt;field name="id"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
372 &lt;field name="int_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
373 &lt;field name="long_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
374 &lt;field name="pointer_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
375 &lt;field name="size_t_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
376 &lt;field name="has_alignment"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
377 &lt;/struct&gt;
378&lt;/event&gt;
379
380&lt;event name=state_dump_facility_load&gt;
381 &lt;description&gt;Facility used in the trace&lt;/description&gt;
382 &lt;struct&gt;
383 &lt;field name="name"&gt;&lt;string/&gt;&lt;/field&gt;
384 &lt;field name="checksum"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
385 &lt;field name="id"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
386 &lt;field name="int_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
387 &lt;field name="long_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
388 &lt;field name="pointer_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
389 &lt;field name="size_t_size"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
390 &lt;field name="has_alignment"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
391 &lt;/struct&gt;
392&lt;/event&gt;
393
394&lt;event name=time_heartbeat&gt;
395 &lt;description&gt;System time values sent periodically to minimize cycle counter
396 drift with respect to real time clock and to detect cycle counter
397 rollovers
398 &lt;/description&gt;
399 &lt;typeref name=timestamp/&gt;
400&lt;/event&gt;
401
402&lt;type name=timestamp&gt;
403 &lt;struct&gt;
404 &lt;field name="seconds"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
405 &lt;field name="nanoseconds"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
406 &lt;field name="cycle_count"&gt;&lt;uint size=8/&gt;&lt;/field&gt;
407 &lt;/struct&gt;
408&lt;/type&gt;
409
410</TT></PRE>
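
<P>
The facility_load description above translates into a null terminated string
followed by seven 4 byte unsigned integers. The sketch below decodes such a
payload, assuming a packed trace and an ASCII facility name; FacilityLoad and
parse_facility_load are hypothetical names.

```python
import struct
from collections import namedtuple

FacilityLoad = namedtuple(
    "FacilityLoad",
    "name checksum id int_size long_size pointer_size size_t_size has_alignment",
)


def parse_facility_load(payload, byte_order="<"):
    """Decode the variable length data of a facility_load event.

    Assumptions: packed payload (no padding), ASCII facility name.
    """
    end = payload.index(b"\0")  # the string is null terminated
    name = payload[:end].decode("ascii")
    numbers = struct.unpack_from(byte_order + "7I", payload, end + 1)
    return FacilityLoad(name, *numbers)
```

<P>
Because two facilities may share a name, a table built from these events is
better keyed by the (name, checksum) pair or by the numerical id than by the
name alone.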
411
412<H2>Control files</H2>
413
414<P>
415The interrupts file reflects the content of the /proc/interrupts system file.
416It contains one event describing each interrupt. At trace start, events are
417generated describing all the current interrupts. If the assignment of
418interrupts changes later, due to devices or device drivers being activated or
419deactivated, additional events may be added to the file. Each interrupt
420event has the following structure.
421
422<PRE><TT>
423&lt;event name=interrupt&gt;
424 &lt;description&gt;Interrupt request number assignment&lt;/description&gt;
425 &lt;struct&gt;
426 &lt;field name="number"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
427 &lt;field name="count"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
428 &lt;field name="controller"&gt;&lt;string/&gt;&lt;/field&gt;
429 &lt;field name="name"&gt;&lt;string/&gt;&lt;/field&gt;
430 &lt;/struct&gt;
431&lt;/event&gt;
432</TT></PRE>
433
434<P>
435The processes file contains the list of processes already created when the
436trace starts. Each process describing event is modeled after the
437/proc/self/status system file. The number of fields in this event is
438expected to be expanded in the future to include groups, signal masks,
439opened file descriptors and address maps.
440
441<PRE><TT>
442&lt;event name=process&gt;
443 &lt;description&gt;Existing process&lt;/description&gt;
444 &lt;struct&gt;
445 &lt;field name="name"&gt;&lt;string/&gt;&lt;/field&gt;
446 &lt;field name="pid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
447 &lt;field name="ppid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
448 &lt;field name="tracer_pid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
449 &lt;field name="uid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
450 &lt;field name="euid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
451 &lt;field name="suid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
452 &lt;field name="fsuid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
453 &lt;field name="gid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
454 &lt;field name="egid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
455 &lt;field name="sgid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
456 &lt;field name="fsgid"&gt;&lt;uint size=4/&gt;&lt;/field&gt;
457 &lt;field name="state"&gt;&lt;enum size=4&gt;
458 Running WaitInterruptible WaitUninterruptible Zombie Traced Paging
459 &lt;/enum&gt;&lt;/field&gt;
460 &lt;/struct&gt;
461&lt;/event&gt;
462</TT></PRE>
463
464<H2>Facilities</H2>
465
466<P>
467Facilities define a granularity of event grouping for filtering, activation
468and compilation. Each facility costs a table entry in the kernel (name,
469checksum, event type code range), or somewhere between 20 and 30 bytes. Having
470one facility per tracing statement in the kernel would be too much (assuming
471that they eventually are routinely inserted in the kernel code and replace
472the 80000+ printk statements in some proportion). However, having a few
473facilities, up to a few tens, would make sense.
474
475<P>
476The "builtin" facility contains a small number of predefined events which must
477always exist. The "core" facility contains a small subset of OS events which
478are almost always of interest (scheduling, interrupts, faults, system calls).
479Then, specialized facilities may exist for each subsystem (network, disks,
480USB, SCSI...).
481
482
483<H2>Bookmarks</H2>
484
485<P>
486Bookmarks are user supplied information added to a trace. They contain user
487annotations attached to a time interval.
488
489<PRE><TT>
490&lt;bookmarks&gt;
491 &lt;location name=name cpu=n start_time=t end_time=t&gt;Some text&lt;/location&gt;
492 ...
493&lt;/bookmarks&gt;
494</TT></PRE>
495
496<P>
497The interval is defined using either "time=" or "start_time=" and
498"end_time=", or "cycle=" or "start_cycle=" and "end_cycle=".
499The time is in seconds with decimals up to nanoseconds and cycle counts
500are unsigned integers with a 64-bit range. The cpu attribute is optional.
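
<P>
A bookmarks file of this shape could be generated as sketched below. The
helper make_bookmarks and its input format, a list of dictionaries with times
given as (seconds, nanoseconds) pairs, are assumptions made for the example.
ElementTree always quotes attribute values, so the output is well-formed XML.

```python
import xml.etree.ElementTree as ET


def format_time(seconds, nanoseconds):
    """Render a time in seconds with nine decimals (nanosecond resolution)."""
    return f"{seconds}.{nanoseconds:09d}"


def make_bookmarks(entries):
    """Build the bookmarks XML text from a list of entry dictionaries."""
    root = ET.Element("bookmarks")
    for entry in entries:
        attrs = {
            "name": entry["name"],
            "start_time": format_time(*entry["start_time"]),
            "end_time": format_time(*entry["end_time"]),
        }
        if "cpu" in entry:  # the cpu attribute is optional
            attrs["cpu"] = str(entry["cpu"])
        location = ET.SubElement(root, "location", attrs)
        location.text = entry.get("text", "")
    return ET.tostring(root, encoding="unicode")
```

<P>
Keeping seconds and nanoseconds as separate integers avoids the rounding that
a floating point time value would introduce.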
501
502</BODY>
503</HTML>
504
505
506
507