On application registration, the event notifier error accounting file
descriptors are duplicated to send the error accounting counter objects
to the application.
The duplicated file descriptors are left open until the application
unregisters.
There is one file descriptor per CPU, so on larger systems (e.g. 228-CPU
Intel or 192-CPU AMD EPYC machines), this adds up to a lot of file
descriptors when many applications are registered, which can result in
file descriptor exhaustion errors.
Moreover, the application unregistration is done from delete_ust_app(),
which is invoked from a call_rcu() worker thread, and thus only after an
RCU grace period has elapsed. This means that a steady stream of
short-lived applications could end up allocating file descriptors faster
than they are closed.
Fix this by closing those file descriptors immediately after the objects
are sent to the application, similarly to what is done for the ring
buffer streams.
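
As a rough illustration of the "send, then close right away" pattern
described above, here is a minimal standalone sketch. It does not use
the lttng-ust-ctl API: send_counter_fd() and register_app() are
hypothetical stand-ins, and plain dup()/close() stand in for
duplicating and releasing the per-CPU counter objects.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for sending one per-CPU counter object to an
 * application. Here it only checks that the descriptor is valid.
 */
static int send_counter_fd(int fd)
{
	return fcntl(fd, F_GETFD) == -1 ? -1 : 0;
}

/*
 * Hypothetical registration helper: duplicate each per-CPU descriptor,
 * send it, and close the duplicate immediately, on both the success
 * and the error path, instead of keeping it until unregistration.
 * Returns 0 on success, -1 on error.
 */
static int register_app(const int *cpu_fds, int nr_cpus)
{
	int i;

	for (i = 0; i < nr_cpus; i++) {
		int ret;
		int dup_fd = dup(cpu_fds[i]);

		if (dup_fd < 0) {
			return -1;
		}

		ret = send_counter_fd(dup_fd);
		/* Release the duplicate whether or not the send worked. */
		close(dup_fd);
		if (ret < 0) {
			return -1;
		}
	}

	return 0;
}
```

With this shape, a registration holds at most one duplicated descriptor
at any time, so the number of open descriptors no longer grows with the
number of registered applications or with the RCU grace period delay.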
Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Signed-off-by: Jérémie Galarneau <jeremie.galarneau@efficios.com>
Change-Id: Ia1bbc3ff09a20f37d069ade7e267fb043ea1ac7f
(int) app->pid,
app->name);
status = EVENT_NOTIFIER_ERROR_ACCOUNTING_STATUS_ERR;
+ lttng_ust_ctl_release_object(-1, new_counter_cpu);
goto error_send_cpu_counter_data;
}
+ lttng_ust_ctl_release_object(-1, new_counter_cpu);
}
+ lttng_ust_ctl_release_object(-1, new_counter);
app->event_notifier_group.counter = new_counter;
new_counter = nullptr;
*/
break;
}
-
- lttng_ust_ctl_release_object(-1, cpu_counters[i]);
free(cpu_counters[i]);
}