Commit | Line | Data |
---|---|---|
2df048c2 DG |
1 | .TH LTTNG_HEALTH_CHECK 3 2012-09-19 "LTTng" "LTTng Developer Manual" |
2 | .SH NAME | |
05ec9ff7 DG |
3 | .B DEPRECATED |
4 | ||
2df048c2 DG |
5 | lttng_health_check \- Monitor health of the session daemon |
6 | .SH SYNOPSIS | |
7 | .nf | |
8 | .B #include <lttng/lttng.h> | |
9 | .sp | |
10 | .BI "int lttng_health_check(enum lttng_health_component c);" | |
11 | .fi | |
12 | ||
13 | Link with -llttng-ctl. | |
14 | .SH DESCRIPTION | |
15 | The | |
16 | .BR lttng_health_check () | |
17 | function checks the health of the session daemon, either for a specific component | |
18 | .BR c | |
19 | or for all of them. Each component represents a subsystem of the session daemon. | |
20 | Each component has health counters that are atomically incremented whenever | |
21 | execution reaches them. An even value indicates progress in the execution of the | |
22 | component. An odd value means that the code has entered a blocking state which | |
23 | is not a poll(2) wait period. | |
24 | ||
25 | Bad health means that a fatal error code path was reached or that an IPC used by | |
26 | the session daemon was blocked for more than 20 seconds (default timeout). | |
27 | Bad health is detected when one or more of the | |
28 | counters are odd. | |
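
The parity rule described above can be sketched in plain C. This is only an illustration and not the session daemon's code: the helper names health_counter_is_bad() and health_any_bad() and the plain array of counters are hypothetical, and the sketch does not model the 20 second timeout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: an odd value means the code entered a blocking state. */
static bool health_counter_is_bad(unsigned long counter)
{
	return (counter & 1UL) != 0;
}

/* Hypothetical helper: health is bad as soon as one or more counters are odd. */
static bool health_any_bad(const unsigned long *counters, size_t count)
{
	/* An empty set of counters is allowed; a missing array is not. */
	assert(counters != NULL || count == 0);

	for (size_t i = 0; i < count; i++) {
		if (health_counter_is_bad(counters[i])) {
			return true;
		}
	}
	return false;
}
```

With counters {0, 2, 4} the sketch reports good health. Changing any single value to an odd number makes it report bad health.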
29 | ||
30 | The health check mechanism of the session daemon can only be reached through | |
31 | the health socket, which is separate from the command and application | |
32 | sockets. An isolated thread serves this socket and evaluates the health | |
33 | counters only when asked to by the LTTng control library (using this | |
34 | call). Because of its simplicity, this subsystem is highly unlikely to fail. | |
35 | ||
36 | The | |
37 | .BR c | |
38 | argument can be one of the following values: | |
39 | .TP | |
40 | .BR LTTNG_HEALTH_CMD | |
41 | Command subsystem, which handles user commands coming from liblttng-ctl or | |
42 | the | |
43 | .BR lttng(1) | |
44 | command line interface. | |
45 | .TP | |
46 | .BR LTTNG_HEALTH_APP_MANAGE | |
47 | The session daemon manages application sockets in order to route client commands | |
48 | and to detect when a socket is closed, which indicates that the application has shut down. | |
49 | .TP | |
50 | .BR LTTNG_HEALTH_APP_REG | |
51 | The application registration mechanism is a vital part of | |
52 | user space tracing. Upon startup, applications instrumented with | |
53 | .BR lttng-ust(3) | |
54 | try to register with the session daemon through this subsystem. | |
55 | .TP | |
56 | .BR LTTNG_HEALTH_KERNEL | |
57 | Monitors the kernel tracer streams and the main channel of communication | |
58 | (/proc/lttng). If this component malfunctions, lttng-tools can no longer | |
59 | use the kernel tracer. | |
60 | .TP | |
61 | .BR LTTNG_HEALTH_CONSUMER | |
62 | The session daemon can spawn up to | |
63 | .BR three | |
64 | consumer daemons: one for the kernel, one for 32-bit user space and one for 64-bit user space. This subsystem monitors | |
65 | the consumer daemon(s). A bad health state means that the consumer(s) are no | |
66 | longer usable, which most likely makes tracing unusable as well. | |
67 | .TP | |
68 | .BR LTTNG_HEALTH_ALL | |
69 | Check all components. If at least one of them is in a bad state, a health check | |
70 | error is returned. | |
71 | ||
72 | .SH "RETURN VALUE" | |
73 | Returns 0 if the health is OK, or 1 if it is in a bad state. A return code of \-1 | |
74 | indicates that the control library was not able to connect to the session | |
75 | daemon health socket. | |
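
The three documented return codes can be handled as in the following sketch. The enum, classify_health_ret() and report_health() are hypothetical names introduced for illustration. The sketch takes the return value as a plain int so that it compiles without the LTTng headers.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical classification of the return codes documented above. */
enum health_status {
	HEALTH_OK,          /* lttng_health_check() returned 0 */
	HEALTH_BAD,         /* lttng_health_check() returned 1 */
	HEALTH_UNREACHABLE, /* lttng_health_check() returned -1 */
	HEALTH_UNEXPECTED   /* any other value, not documented here */
};

static enum health_status classify_health_ret(int ret)
{
	switch (ret) {
	case 0:
		return HEALTH_OK;
	case 1:
		return HEALTH_BAD;
	case -1:
		return HEALTH_UNREACHABLE;
	default:
		return HEALTH_UNEXPECTED;
	}
}

/* Print a one-line report to out and return the classified status. */
static enum health_status report_health(FILE *out, int ret)
{
	enum health_status status = classify_health_ret(ret);

	assert(out != NULL);

	switch (status) {
	case HEALTH_OK:
		fprintf(out, "session daemon health is OK\n");
		break;
	case HEALTH_BAD:
		fprintf(out, "session daemon is in a bad state\n");
		break;
	case HEALTH_UNREACHABLE:
		fprintf(out, "cannot connect to the health socket\n");
		break;
	default:
		fprintf(out, "unexpected return value %d\n", ret);
		break;
	}
	return status;
}
```

In a program that includes <lttng/lttng.h> and links with -llttng-ctl, the value passed in would be the result of the call itself, for example report_health(stderr, lttng_health_check(LTTNG_HEALTH_ALL)).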
76 | ||
77 | .SH "LIMITATIONS" | |
78 | ||
79 | With LTTNG_HEALTH_CONSUMER, you cannot know which consumer daemon has | |
80 | failed, only that either the consumer subsystem has failed or that an | |
81 | lttng-consumerd died. | |
82 | ||
83 | .SH "AUTHORS" | |
9b22d135 JG |
84 | lttng-health-check was originally written by David Goulet and is currently |
85 | maintained by Jérémie Galarneau <jeremie.galarneau@efficios.com>. |