From 3b8b68e73ec9b2b3cf550048046d3f7f69050688 Mon Sep 17 00:00:00 2001 From: Mathieu Desnoyers Date: Wed, 13 Jun 2012 04:26:34 -0400 Subject: [PATCH] Fix: liblttng-ust-fork deadlock * Mathieu Desnoyers (mathieu.desnoyers@efficios.com) wrote: > * Burton, Michael (mburton@ciena.com) wrote: > > Mathieu, > > > > I think there is a deadlock scenario in UST, which has been causing my problem. > > Good catch ! > > > > > sessiond is started as root: > > - creates global sockets ONLY > > - DOES NOT CREATE shm in $HOME/lttng-ust-wait- > > > > application linked against ust is run as root: > > - in lttng_ust_init constructor > > - ust_listener_thread (local_apps) > > - fails to connect to local_apps in $HOME/.lttng (as expected) > > - prev_connect_failed=1 > > - ust_unlock() > > - restart > > - wait_for_sessiond() > > --> - ust_lock() > > | - get_map_shm() > > | - get_wait_shm() > > DEADLOCK - shm_open() FAILS (not created by sessiond when run by root) > > | - fork() (trying to create shared memory itself) > > | - ust_before_fork() > > ------------> - ust_lock() > > > > > > You should be able to create this with an empty main, with no > > tracepoints. As long as sessiond is started as root so > > $HOME/lttng-ust-wait- is not created. You can also make the > > lttng-ust constructor (lttng_ust_init) wait forever and then you'll be > > able to see the deadlock in gdb without even leaving the > > lttng_ust_init constructor. > > Ah, I see. This deadlock is caused by the interaction between > [ liblttng-ust-fork ] and liblttng-ust (the fork override is > performed by [ liblttng-ust-fork ]). This can be reproduced easily with the in-tree tests: by removing the lttng-ust-apps-wait* files belonging to the user in /dev/shm, running the "tests/fork" test (with ./run) hangs. If we run "hello" first, and then the fork test, it works fine. Fixing this by keeping a nesting counter around the fork() call, so we return immediately from the pre/post fork handlers if they are overridden by liblttng-ust-fork. Reported-by: Michael Burton Signed-off-by: Mathieu Desnoyers --- liblttng-ust/lttng-ust-comm.c | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/liblttng-ust/lttng-ust-comm.c b/liblttng-ust/lttng-ust-comm.c index 5e738b06..8c864f8c 100644 --- a/liblttng-ust/lttng-ust-comm.c +++ b/liblttng-ust/lttng-ust-comm.c @@ -78,6 +78,12 @@ static sem_t constructor_wait; */ static int sem_count = { 2 }; +/* + * Counting nesting within lttng-ust. Used to ensure that calling fork() + * from liblttng-ust does not execute the pre/post fork handlers. + */ +static int __thread lttng_ust_nest_count; + /* * Info about socket and associated listener thread. */ @@ -469,7 +475,9 @@ int get_wait_shm(struct sock_info *sock_info, size_t mmap_size) * If the open failed because the file did not exist, try * creating it ourself. */ + lttng_ust_nest_count++; pid = fork(); + lttng_ust_nest_count--; if (pid > 0) { int status; @@ -1019,6 +1027,8 @@ void ust_before_fork(sigset_t *save_sigset) sigset_t all_sigs; int ret; + if (lttng_ust_nest_count) + return; /* Disable signals */ sigfillset(&all_sigs); ret = sigprocmask(SIG_BLOCK, &all_sigs, save_sigset); @@ -1044,6 +1054,8 @@ static void ust_after_fork_common(sigset_t *restore_sigset) void ust_after_fork_parent(sigset_t *restore_sigset) { + if (lttng_ust_nest_count) + return; DBG("process %d", getpid()); rcu_bp_after_fork_parent(); /* Release mutexes and reenable signals */ @@ -1061,6 +1073,8 @@ void ust_after_fork_parent(sigset_t *restore_sigset) */ void ust_after_fork_child(sigset_t *restore_sigset) { + if (lttng_ust_nest_count) + return; DBG("process %d", getpid()); /* Release urcu mutexes */ rcu_bp_after_fork_child(); -- 2.34.1