LTTng Relay Daemon Architecture
Mathieu Desnoyers, August 2015

This document describes the object model and architecture of the relay
daemon, after the refactoring done within the commit "Fix: Relay daemon
ownership and reference counting".

We have the following object composition hierarchy:

relay connection (main.c, for sessiond/consumer)
  |
  \-> 0 or 1 session
       |
       \-> 0 or many ctf-trace
            |
            \-> 0 or many stream
                 |   |
                 |   \-> 0 or many index
                 |
                 \-------> 0 or 1 viewer stream

live connection (live.c, for client)
  |
  \-> 1 viewer session
       |
       \-> 0 or many session (actually a reference to session as created
            |                 by the relay connection)
            |
            \-> ..... (ctf-trace, stream, index, viewer stream)

There are global tables declared in lttng-relayd.h for sessions
(sessions_ht, indexed by session id), streams (relay_streams_ht, indexed
by stream handle), and viewer streams (viewer_streams_ht, indexed by
stream handle). The purpose of those tables is to allow fast lookup of
those objects using the IDs received in the communication protocols.

There is also one connection hash table per worker thread. There is one
worker thread to receive data (main.c), and one worker thread to
interact with viewer clients (live.c). Those tables are indexed by
socket file descriptor.

An RCU lookup+refcounting scheme has been introduced for all objects
except the viewer session, which is not yet covered. This scheme allows
looking up objects, or traversing an RCU linked list or hash table, in
combination with a getter on the object. The getter validates that there
is still at least one reference to the object; otherwise the lookup
behaves as if the object did not exist. This scheme is protected by a
"reflock" mutex in each object. "reflock" mutexes can be nested from the
innermost object to the outermost object. In other words, the session
reflock can nest within the ctf-trace reflock.
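
The getter half of this scheme can be sketched as follows. This is a
minimal illustration, not the relayd code: struct sketch_obj,
sketch_obj_init() and sketch_obj_get() are hypothetical names, a plain
counter stands in for the daemon's reference type, and the RCU read-side
section that would surround a real lookup is only mentioned in a comment.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical object; names are illustrative, not taken from relayd. */
struct sketch_obj {
	pthread_mutex_t reflock;	/* serializes refcount changes */
	long refcount;			/* 0: no reference left */
};

static void sketch_obj_init(struct sketch_obj *obj, long initial_refcount)
{
	assert(initial_refcount >= 0);
	pthread_mutex_init(&obj->reflock, NULL);
	obj->refcount = initial_refcount;
}

/*
 * Getter used to confirm a lookup. It takes a reference only if at
 * least one reference still exists. A caller that gets "false" must
 * act as if the object was not found. In an RCU-based implementation
 * this would run within an RCU read-side critical section, so the
 * memory stays valid while the counter is examined.
 */
static bool sketch_obj_get(struct sketch_obj *obj)
{
	bool has_ref = false;

	pthread_mutex_lock(&obj->reflock);
	if (obj->refcount > 0) {
		obj->refcount++;
		has_ref = true;
	}
	pthread_mutex_unlock(&obj->reflock);
	return has_ref;
}
```

A lookup that finds an object whose refcount is already 0 gets "false"
from the getter and leaves the counter untouched.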

The relay_connection (connection between the sessiond/consumer and the
relayd) is the outermost object of its hierarchy.

The live connection (connection between a live client and the relayd)
is the outermost object of its hierarchy.

There is also a "lock" mutex in each object. These mutexes are used to
synchronize between threads (currently the main.c relay thread and
live.c client thread) when objects are shared. Locks can be nested from
the outermost object to the innermost object. In other words, the
ctf-trace lock can nest within the session lock.

A "lock" should never nest within a "reflock".
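
The two nesting orders can be sketched side by side. The structures and
function names below are hypothetical and reduced to the mutexes plus a
couple of counters; the sketch only shows the acquisition order described
above, assuming a ctf-trace object that points back to its session.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical, stripped-down objects: only the two mutexes matter. */
struct sketch_session {
	pthread_mutex_t lock;
	pthread_mutex_t reflock;
	int touched;
};

struct sketch_trace {
	struct sketch_session *session;	/* container (outer object) */
	pthread_mutex_t lock;
	pthread_mutex_t reflock;
	int updates;
};

static void sketch_session_init(struct sketch_session *s)
{
	pthread_mutex_init(&s->lock, NULL);
	pthread_mutex_init(&s->reflock, NULL);
	s->touched = 0;
}

static void sketch_trace_init(struct sketch_trace *t,
		struct sketch_session *s)
{
	assert(s != NULL);
	t->session = s;
	pthread_mutex_init(&t->lock, NULL);
	pthread_mutex_init(&t->reflock, NULL);
	t->updates = 0;
}

/* "lock" order: outermost (session) first, innermost (ctf-trace) last. */
static void sketch_update_trace(struct sketch_trace *t)
{
	pthread_mutex_lock(&t->session->lock);
	pthread_mutex_lock(&t->lock);
	t->updates++;
	t->session->touched++;
	pthread_mutex_unlock(&t->lock);
	pthread_mutex_unlock(&t->session->lock);
}

/* "reflock" order: innermost (ctf-trace) first, outermost (session) last. */
static void sketch_ref_section(struct sketch_trace *t)
{
	pthread_mutex_lock(&t->reflock);
	pthread_mutex_lock(&t->session->reflock);
	/* Refcount changes would go here. No "lock" may be taken here. */
	pthread_mutex_unlock(&t->session->reflock);
	pthread_mutex_unlock(&t->reflock);
}
```

Because the two orders run in opposite directions, keeping "lock" out of
any "reflock" section avoids a cycle between the two kinds of mutex.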

RCU linked lists are iterated under RCU, and each list is protected by
its own mutex for modifications. Each object reached during an iteration
should be confirmed using the object "getter" to ensure its refcount is
not 0 (except in cases where the caller actually owns the object and can
therefore assume its refcount is not 0).

RCU hash tables are iterated under RCU as well. Each object reached
during an iteration should be confirmed using the object "getter" to
ensure its refcount is not 0 (except again if we have ownership and can
assume the object refcount is not 0).
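
The "confirm with the getter" rule for iterations can be sketched as
follows. This is an illustration under simplifying assumptions: a plain
singly linked list stands in for the RCU list or hash table, all names
are hypothetical, and the put shown here only decrements (a real put
would trigger release when the count reaches 0).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct sketch_stream {
	pthread_mutex_t reflock;
	long refcount;
	struct sketch_stream *next;	/* stand-in for an RCU list node */
};

static void sketch_stream_init(struct sketch_stream *s, long refcount,
		struct sketch_stream *next)
{
	pthread_mutex_init(&s->reflock, NULL);
	s->refcount = refcount;
	s->next = next;
}

static bool sketch_stream_get(struct sketch_stream *s)
{
	bool has_ref = false;

	pthread_mutex_lock(&s->reflock);
	if (s->refcount > 0) {
		s->refcount++;
		has_ref = true;
	}
	pthread_mutex_unlock(&s->reflock);
	return has_ref;
}

static void sketch_stream_put(struct sketch_stream *s)
{
	pthread_mutex_lock(&s->reflock);
	assert(s->refcount > 0);
	s->refcount--;	/* release on reaching 0 is omitted in this sketch */
	pthread_mutex_unlock(&s->reflock);
}

/* Count the streams that are still referenced, skipping dying ones. */
static size_t sketch_count_live_streams(struct sketch_stream *head)
{
	size_t live = 0;
	struct sketch_stream *s;

	/* An RCU-based version would hold the RCU read-side lock here. */
	for (s = head; s != NULL; s = s->next) {
		if (!sketch_stream_get(s))
			continue;	/* refcount is 0: act as if absent */
		live++;			/* s may be used: we hold a reference */
		sketch_stream_put(s);
	}
	/* ... and release the RCU read-side lock here. */
	return live;
}
```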

An object is created with a refcount of 1. Each getter increments the
refcount, and needs to be paired with a "put" to decrement it. A final
put on "self" (ownership) allows the refcount to reach 0, which triggers
release, and thus a free through call_rcu.
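
The create/get/put/release life cycle described above can be sketched as
follows. All names are hypothetical. The sketch frees the object
directly in its release function, whereas the scheme described in this
document defers the free through call_rcu so that concurrent RCU readers
can finish first; the release counter exists only to make the behaviour
observable.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

static int sketch_release_count;	/* demonstration only */

struct sketch_obj {
	pthread_mutex_t reflock;
	long refcount;
};

static struct sketch_obj *sketch_obj_create(void)
{
	struct sketch_obj *obj = calloc(1, sizeof(*obj));

	if (!obj)
		return NULL;
	pthread_mutex_init(&obj->reflock, NULL);
	obj->refcount = 1;	/* the creator's ("self") reference */
	return obj;
}

static void sketch_obj_release(struct sketch_obj *obj)
{
	sketch_release_count++;
	pthread_mutex_destroy(&obj->reflock);
	free(obj);	/* immediate here; deferred in an RCU-based scheme */
}

static bool sketch_obj_get(struct sketch_obj *obj)
{
	bool has_ref = false;

	pthread_mutex_lock(&obj->reflock);
	if (obj->refcount > 0) {
		obj->refcount++;
		has_ref = true;
	}
	pthread_mutex_unlock(&obj->reflock);
	return has_ref;
}

/* Every successful get, and the creation itself, needs one put. */
static void sketch_obj_put(struct sketch_obj *obj)
{
	bool last;

	pthread_mutex_lock(&obj->reflock);
	assert(obj->refcount > 0);
	last = (--obj->refcount == 0);
	pthread_mutex_unlock(&obj->reflock);
	if (last)
		sketch_obj_release(obj);
}
```

With this sketch, a create followed by one get needs two puts: the first
pairs with the get and leaves the object alive, the second drops the
creation reference and triggers release.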

In the composition scheme, we find back references from each composite
to its container. Therefore, each composite holds a reference (refcount)
on its container. This allows following pointers from e.g. viewer stream
to stream to ctf-trace to session without performing any validation,
due to transitive refcounting of those back-references.

In addition to those back references, there are a few key ownership
references held. The connection in the relay worker thread (main.c)
holds ownership on the session, and on each stream it contains. The
connection in the live worker thread (live.c) holds ownership on each
viewer stream it creates. The rest is ensured by back references from
composite to container objects. When a connection is closed, it puts all
the ownership references it is holding. This will then eventually
trigger destruction of the session, streams, and viewer streams
associated with the connection, once all the back references to them
have been put as well.
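
The interplay between back references and ownership references on
connection close can be sketched as follows. This is a single-threaded
illustration with hypothetical names: it omits the reflock, the
ctf-trace level and RCU, and uses a fixed-size array where the daemon
would use its own containers. The two counters only make the order of
destruction observable.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int sketch_sessions_freed;	/* demonstration only */
static int sketch_streams_freed;	/* demonstration only */

struct sketch_session {
	long refcount;
};

struct sketch_stream {
	long refcount;
	struct sketch_session *session;	/* back reference to container */
};

#define SKETCH_MAX_STREAMS 4

struct sketch_connection {
	struct sketch_session *session;	/* ownership reference */
	struct sketch_stream *streams[SKETCH_MAX_STREAMS];	/* ownership */
	size_t nr_streams;
};

static struct sketch_session *sketch_session_create(void)
{
	struct sketch_session *s = calloc(1, sizeof(*s));

	if (s)
		s->refcount = 1;	/* owned by the creating connection */
	return s;
}

static void sketch_session_put(struct sketch_session *s)
{
	assert(s->refcount > 0);
	if (--s->refcount == 0) {
		sketch_sessions_freed++;
		free(s);
	}
}

static struct sketch_stream *sketch_stream_create(struct sketch_session *s)
{
	struct sketch_stream *st = calloc(1, sizeof(*st));

	if (!st)
		return NULL;
	st->refcount = 1;	/* owned by the creating connection */
	s->refcount++;		/* back reference pins the container */
	st->session = s;
	return st;
}

static void sketch_stream_put(struct sketch_stream *st)
{
	assert(st->refcount > 0);
	if (--st->refcount == 0) {
		struct sketch_session *s = st->session;

		sketch_streams_freed++;
		free(st);
		sketch_session_put(s);	/* drop the back reference */
	}
}

/* Closing a connection puts every ownership reference it holds. */
static void sketch_connection_close(struct sketch_connection *conn)
{
	size_t i;

	sketch_session_put(conn->session);	/* session survives for now */
	conn->session = NULL;
	for (i = 0; i < conn->nr_streams; i++) {
		sketch_stream_put(conn->streams[i]);
		conn->streams[i] = NULL;
	}
	conn->nr_streams = 0;
}
```

In this sketch the session outlives the connection's own put on it: it
is only freed when the last stream releases its back reference.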

RCU read-side locks are now only held during iteration on RCU lists and
hash tables, and within the internals of the get (lookup) and put
functions. Those functions then use refcounting to ensure existence of
the object when returned to their caller.