Joining a thread with a specific ID

I encountered some rather undesirable behaviour of the join command. We have an application that uses multiple background threads, each doing a different RPC call. Later on in the program, we join the threads to ensure a reply is in before continuing. We do this to avoid blocking the robot while waiting for RPC calls.

However, we noticed that join thread_id doesn’t join a specific thread, but rather waits for any thread to finish.

An example program to demonstrate the problem:

def main():
    thread doThreadA():
        textmsg("Thread A started")
        sleep(1.0)
        textmsg("Thread A done")
    end

    thread doThreadB():
        textmsg("Thread B started")
        sleep(0.1)
        textmsg("Thread B done")
    end

    thread_a = run doThreadA()
    textmsg("Spawned thread A with ID ", thread_a)
    thread_b = run doThreadB()
    textmsg("Spawned thread B with ID ", thread_b)

    join thread_a
    textmsg("Joined thread A.")

    join thread_b
    textmsg("Joined thread B.")

    sleep(1)
end

The program produces the following output (tested on URSim 3.3.4.310):

[2017-03-14 15:13:33.925954] Spawned thread A with ID 1
[2017-03-14 15:13:33.926099] Spawned thread B with ID 2
[2017-03-14 15:13:33.926232] Thread A started
[2017-03-14 15:13:33.926376] Thread B started
[2017-03-14 15:13:34.025884] Thread B done
[2017-03-14 15:13:34.026116] Joined thread A.
[2017-03-14 15:13:34.026229] Joined thread B.
[2017-03-14 15:13:34.927159] Thread A done

The output shows that join thread_a returns right after thread B finishes, while thread A is still busy. The join thread_b then finishes instantly, which makes sense since thread B did indeed finish already. Only the sleep(1) at the end of the program keeps the program alive long enough to really see thread A finish.

Another thing I noticed is that thread IDs are re-used as soon as possible. This seems impractical, considering that join id is supposed to return instantly if the thread has already finished. It is quite possible that the ID has been reused by a different thread in the meantime, so the join ... will block even though the original thread that we got the ID from has already finished.
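A minimal sketch of how the ID reuse could be observed, assuming a thread's ID becomes available again once the thread has ended. The names shortTask, first_id and second_id are hypothetical, and the sleep durations are arbitrary:

```
def main():
    # Hypothetical short-lived thread; it ends after 0.1 s.
    thread shortTask():
        sleep(0.1)
    end

    first_id = run shortTask()
    textmsg("First thread ID: ", first_id)

    # Give the first thread ample time to finish before spawning again.
    sleep(0.5)

    second_id = run shortTask()
    textmsg("Second thread ID: ", second_id)

    # Keep the program alive until the second thread has ended too.
    sleep(0.5)
end
```

If both messages report the same ID, a join on first_id issued after the second run would be waiting on the second thread rather than the original one.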

We’re currently working around these issues by using global variables to keep track of the state of a thread from within the thread itself. It works, but it’s quite cumbersome and the workarounds aren’t possible if you want to run the same function in multiple threads at the same time.
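A minimal sketch of the kind of flag-based workaround described above, not the exact code we use. The flag name thread_a_done is hypothetical, and the sketch assumes that a variable declared with the global qualifier is shared between the main program and the thread:

```
def main():
    # Hypothetical completion flag, set by the thread itself.
    global thread_a_done = False

    thread doThreadA():
        textmsg("Thread A started")
        sleep(1.0)
        textmsg("Thread A done")
        # Last statement of the thread: signal completion.
        thread_a_done = True
    end

    thread_a = run doThreadA()

    # Poll the flag instead of calling join on the thread ID.
    while not thread_a_done:
        sync()
    end
    textmsg("Thread A finished.")
end
```

Because the flag is tied to one thread function, running the same function in several threads at once would need a separate flag per instance, which is exactly where this approach breaks down.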

Thanks @de-vri-es, I’ve reported the bug.

@rwi Thanks :) I’ll keep an eye on the release notes.