From 2df048c2c2bc00bcb2cde50b5a420e4d840a0057 Mon Sep 17 00:00:00 2001 From: David Goulet Date: Tue, 18 Sep 2012 12:24:51 -0400 Subject: [PATCH] Add lttng_health_check(3) man page Signed-off-by: David Goulet --- doc/man/lttng-health-check.3 | 82 ++++++++++++++++++++++++++++++++++++ 1 file changed, 82 insertions(+) create mode 100644 doc/man/lttng-health-check.3 diff --git a/doc/man/lttng-health-check.3 b/doc/man/lttng-health-check.3 new file mode 100644 index 000000000..67aafd4c4 --- /dev/null +++ b/doc/man/lttng-health-check.3 @@ -0,0 +1,82 @@ +.TH LTTNG_HEALTH_CHECK 3 2012-09-19 "LTTng" "LTTng Developer Manual" +.SH NAME +lttng_health_check \- Monitor health of the session daemon +.SH SYNOPSIS +.nf +.B #include +.sp +.BI "int lttng_health_check(enum lttng_health_component c); +.fi + +Link with -llttng-ctl. +.SH DESCRIPTION +The +.BR lttng_health_check () +is used to check the session daemon health for either a specific component +.BR c +or for all of them. Each component represent a subsystem of the session daemon. +Those components are set with health counters that are atomically incremented +once reached. An even value indicates progress in the execution of the +component. An odd value means that the code has entered a blocking state which +is not a poll(7) wait period. + +A bad health is defined by a fatal error code path reached or any IPC used in +the session daemon that was blocked for more than 20 seconds (default timeout). +The condition for this bad health to be detected is that one or many of the +counters are odd. + +The health check mechanism of the session daemon can only be reached through +the health socket which is a different one from the command and the application +socket. An isolated thread serves this socket and only computes the health +counters across the code when asked by the lttng control library (using this +call). This subsystem is highly unlikely to fail due to its simplicity. + +The +.BR c +argument can be one of the following values: +.TP +.BR LTTNG_HEALTH_CMD +Command subsystem which handles user commands coming from the liblttng-ctl or +the +.BR lttng(1) +command line interface. +.TP +.BR LTTNG_HEALTH_APP_MANAGE +The session daemon manages application socket in order to route client command +and check if they get closed which indicates the application shutdown. +.TP +.BR LTTNG_HEALTH_APP_REG +The application registration mechanism is an important and vital part of for +user space tracing. Upon startup, applications instrumented with +.BR lttng-ust(3) +try to register to the session daemon through this subsystem. +.TP +.BR LTTNG_HEALTH_KERNEL +Monitor the Kernel tracer streams and main channel of communication +(/proc/lttng). If this component malfunction, the Kernel tracer is not usable +anymore by lttng-tools. +.TP +.BR LTTNG_HEALTH_CONSUMER +The session daemon can spawn up to +.BR three +consumer daemon for kernel, user space 32 and 64 bit. This subsystem monitors +the consumer daemon(s). A bad health state means that the consumer(s) are not +usable anymore hence likely making tracing not usable. +.TP +.BR LTTNG_HEALTH_ALL +Check all components. If only one of them is in a bad state, a health check +error is returned. + +.SH "RETURN VALUE" +Return 0 if the health is OK, or 1 is it's in a bad state. A return code of \-1 +indicates that the control library was not able to connect to the session +daemon health socket. + +.SH "LIMITATIONS" + +For the LTTNG_HEALTH_CONSUMER, you can not know which consumer daemon has +failed but only that either the consumer subsystem has failed or that a +lttng-consumerd died. + +.SH "AUTHORS" +Written and maintained by David Goulet . -- 2.34.1