Fix: workqueue: futex wait: handle spurious futex wakeups
authorMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Wed, 22 Jun 2022 20:28:53 +0000 (16:28 -0400)
committerMathieu Desnoyers <mathieu.desnoyers@efficios.com>
Mon, 27 Jun 2022 14:30:13 +0000 (10:30 -0400)
Observed issue
==============

The workqueue thread futex_wait() returns with a workqueue->futex state
of -1, which is unexpected. In this situation, the workqueue thread is
observed to use 99% of CPU as workqueue->futex values are decremented to
very low negative values while the workqueue is empty.

This issue will cause spurious unexpected high CPU use, but will not
lead to data corruption.

Cause
=====

From futex(5):

       FUTEX_WAIT
              Returns 0 if the caller was woken up.  Note that a  wake-up  can
              also  be caused by common futex usage patterns in unrelated code
              that happened to have previously used the  futex  word's  memory
              location  (e.g., typical futex-based implementations of Pthreads
              mutexes can cause this under some conditions).  Therefore, call‐
              ers should always conservatively assume that a return value of 0
              can mean a spurious wake-up, and  use  the  futex  word's  value
              (i.e.,  the user-space synchronization scheme) to decide whether
              to continue to block or not.

Solution
========

We therefore need to validate whether the value differs from -1 in
user-space after the call to FUTEX_WAIT returns 0.

Known drawbacks
===============

None.

Signed-off-by: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Change-Id: Id024e7d3b2dab75d30fc01280fd27e5f2d8af0d1

src/workqueue.c

index e5931b5e74bfee68c3efd1ff61b53e9bc68af23f..c9f03b56fbb3ac85c58e5ef21330e03d0377d340 100644 (file)
@@ -129,18 +129,29 @@ static int set_thread_cpu_affinity(struct urcu_workqueue *workqueue __attribute_
 
 static void futex_wait(int32_t *futex)
 {
+       int ret;
+
        /* Read condition before read futex */
        cmm_smp_mb();
-       if (uatomic_read(futex) != -1)
-               return;
-       while (futex_async(futex, FUTEX_WAIT, -1, NULL, NULL, 0)) {
+       while (uatomic_read(futex) == -1) {
+               if (!futex_async(futex, FUTEX_WAIT, -1, NULL, NULL, 0)) {
+                       /*
+                        * Prior queued wakeups queued by unrelated code
+                        * using the same address can cause futex wait to
+                        * return 0 even through the futex value is still
+                        * -1 (spurious wakeups). Check the value again
+                        * in user-space to validate whether it really
+                        * differs from -1.
+                        */
+                       continue;
+               }
                switch (errno) {
-               case EWOULDBLOCK:
+               case EAGAIN:
                        /* Value already changed. */
                        return;
                case EINTR:
                        /* Retry if interrupted by signal. */
-                       break;  /* Get out of switch. */
+                       break;  /* Get out of switch. Check again. */
                default:
                        /* Unexpected error. */
                        urcu_die(errno);
This page took 0.02612 seconds and 4 git commands to generate.