Boolean semaphore across MPI ranks.
At any point in time, only one thread on one rank gets access to a certain code region. This means that this particular semaphore represents a critical section across both threads and ranks in one go. It is inherently multithreading-safe.
Each boolean semaphore constructs a unique global number which is used to identify the semaphore within the whole system. When we request a lock, we send an integer message with this number to the global master through the semaphore tag. When we release it, we send the negative number back.
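The sign convention can be sketched as follows. This is a minimal illustration, not the tarch implementation: the helper names (encodeAcquire, encodeRelease, decode) are hypothetical, and the sketch assumes that semaphore numbers are strictly positive so that the sign is unambiguous.

```cpp
#include <cassert>

// Hypothetical helpers illustrating the sign convention; not part of tarch.
enum class RequestKind { Acquire, Release };

struct DecodedRequest {
  RequestKind kind;
  int         semaphoreNumber;
};

// Assumption: semaphore numbers are strictly positive.
inline int encodeAcquire(int semaphoreNumber) {
  assert(semaphoreNumber > 0);
  return semaphoreNumber;
}

// A release is the same number with a flipped sign.
inline int encodeRelease(int semaphoreNumber) {
  assert(semaphoreNumber > 0);
  return -semaphoreNumber;
}

// What the receiving side would do with an incoming integer.
inline DecodedRequest decode(int message) {
  assert(message != 0);
  return message > 0 ? DecodedRequest{RequestKind::Acquire, message}
                     : DecodedRequest{RequestKind::Release, -message};
}
```

A single integer per message is sufficient here because the sender's rank is already known to the receiver from the message envelope.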
As an MPI semaphore includes all ranks and all threads, it is more general than a "normal" tarch::multicore::BooleanSemaphore. Therefore, the MPI semaphore hosts a multicore semaphore. As long as a thread on the local rank holds the critical section, the other threads on this rank wait locally and we don't even have to go to the global master.
However, this is not the only semaphore: On the global master, we need a second semaphore to protect the global semaphore map. This map might be changed even if we are not in a critical section.
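A minimal sketch of such a protected map, assuming a plain std::mutex as stand-in for the second semaphore. The class name LockTable and its methods are hypothetical and only illustrate the idea that every access to the map is guarded, no matter whether the caller is inside a critical section.

```cpp
#include <cassert>
#include <map>
#include <mutex>

// Hypothetical stand-in for the master's semaphore map; not the tarch class.
class LockTable {
public:
  // Returns true if the semaphore was free and is now marked as taken.
  bool tryAcquire(int number) {
    assert(number > 0);
    std::lock_guard<std::mutex> guard(_mapMutex);  // protects the map itself
    bool& taken = _taken[number];  // unknown numbers are inserted as "free"
    if (taken) {
      return false;
    }
    taken = true;
    return true;
  }

  // Marks the semaphore as free again.
  void release(int number) {
    assert(number > 0);
    std::lock_guard<std::mutex> guard(_mapMutex);
    _taken[number] = false;
  }

private:
  std::mutex          _mapMutex;
  std::map<int, bool> _taken;
};
```

Note that tryAcquire() may insert a new entry, so even a failed or first-time request modifies the map. This is why the guard is needed independently of the semaphores stored in it.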
Implementation
The implementation is straightforward:
- Lock the local rank.
- Send the integer tied to this boolean semaphore to the global master. Wait for the go ahead message to return.
- The master receives the lock request in the BooleanSemaphoreService. It tries to acquire a lock. Until it is successful, it continues to call receiveDanglingMessages().
- Once the master is successful in getting the lock, it sends an integer message back to the sender. The sender is now allowed to proceed.
- Once the rank has finished its critical section, it releases the lock, which materialises in another message to the global master. This time, the send is logically non-blocking, i.e. we don't have to wait for an answer.
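From the point of view of a single rank, the steps above can be sketched as follows. This is an illustration under assumptions, not the actual implementation: the class RankSideLock is hypothetical, the two callbacks stand in for the MPI exchange with the global master, and the order of the two operations in leave() is an assumption.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

// Rank-side view of the protocol; class and callback names are hypothetical.
class RankSideLock {
public:
  using Channel = std::function<void(int)>;

  RankSideLock(int number, Channel requestAndWait, Channel notify):
    _number(number),
    _requestAndWait(std::move(requestAndWait)),
    _notify(std::move(notify)) {
    assert(number > 0);
  }

  void enter() {
    _localLock.lock();         // lock the local rank first
    _requestAndWait(_number);  // ask the master, return once the go-ahead arrived
  }

  void leave() {
    _notify(-_number);         // release on the master, no answer expected
    _localLock.unlock();       // assumption: local lock is freed last
  }

private:
  int        _number;
  Channel    _requestAndWait;
  Channel    _notify;
  std::mutex _localLock;
};

// Records the integers a rank would send for one enter/leave pair.
std::vector<int> demoMessageTrace() {
  std::vector<int> trace;
  auto record = [&trace](int message) { trace.push_back(message); };
  RankSideLock lock(5, record, record);
  lock.enter();
  lock.leave();
  return trace;
}
```

Because the local lock is taken before anything is sent, a second thread on the same rank blocks locally and produces no traffic to the master until the first thread has left.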
Multiple request handling routines (calls of acquireLock()) can be active at the same time, i.e. reside on the global master's call stack. They will be served one by one, although implicitly in reverse order as we work with the call stack. It is important to poll receiveDanglingMessages() all the time: we might wait for a lock to be freed, but the release arrives as an unexpected (aka dangling) message, which we only see if we keep polling.
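The nesting on the call stack can be illustrated with a small single-process simulation. Everything here is a sketch under assumptions: MasterSimulation, its inbox and serveLockRequest() are hypothetical stand-ins, and messages are modelled as (sender rank, signed semaphore number) pairs instead of real MPI messages.

```cpp
#include <cassert>
#include <deque>
#include <set>
#include <stdexcept>
#include <utility>
#include <vector>

// Single-process stand-in for the global master; all names are hypothetical.
class MasterSimulation {
public:
  using Message = std::pair<int, int>;  // (sender rank, signed semaphore number)

  std::deque<Message> inbox;       // messages that arrived but are not handled yet
  std::vector<int>    grantOrder;  // ranks in the order they receive the go-ahead

  // Handles at most one pending message, mimicking receiveDanglingMessages().
  void receiveDanglingMessages() {
    if (inbox.empty()) return;
    Message message = inbox.front();
    inbox.pop_front();
    assert(message.second != 0);
    if (message.second > 0) {
      serveLockRequest(message.first, message.second);
    } else {
      _taken.erase(-message.second);  // negative number: release
    }
  }

  void run() {
    while (!inbox.empty()) receiveDanglingMessages();
  }

private:
  std::set<int> _taken;

  void serveLockRequest(int rank, int number) {
    // insert() reports through .second whether the number was free
    while (!_taken.insert(number).second) {
      if (inbox.empty()) throw std::logic_error("simulation ran out of messages");
      receiveDanglingMessages();  // may nest further requests on the call stack
    }
    grantOrder.push_back(rank);   // here the real master would send the go-ahead
  }
};

// Ranks 1, 2 and 3 request semaphore 7 in this order and release it later.
std::vector<int> demoGrantOrder() {
  MasterSimulation master;
  master.inbox = {{1, 7}, {2, 7}, {3, 7}, {1, -7}, {3, -7}, {2, -7}};
  master.run();
  return master.grantOrder;
}
```

In this scenario rank 1 gets the lock immediately. The request of rank 2 then waits and polls, which pulls the request of rank 3 onto the call stack on top of it. When rank 1 releases, the innermost routine (rank 3) is served first, so the grant order is 1, 3, 2. Without the polling inside the wait loop, the release message would never be seen and the master would spin forever.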
Author: Tobias Weinzierl
Definition at line 71 of file BooleanSemaphore.h.