Dynamic Datarace Detection
1. Dynamic Datarace Detection
How do tools like ThreadSanitizer work? Consider, for example, some buggy code:
```cpp
static void ThreadBody(int &x, int &y, int &z, std::mutex &m) {
    m.lock();
    x = x + 1;
    m.unlock();

    m.lock();
    y = y + 1;
    m.unlock();

    z = z + 1; // SHOULD BE PROTECTED
}
```
Although this program does have a datarace, there is no room for the compiler to exploit undefined behavior, and the chance of concurrent threads stepping on each other's toes is low. This means some dataraces are very difficult to find without a thread sanitizer.
To sanitize, the compiler has to instrument every read and write that may target shared memory (local variables whose address never escapes can be skipped). It will also record:
- Thread launch / join.
- Mutex acquire / release.
- Atomic operations.
Each library function updates the dynamic race detection state. This state allows conditions for data races to be checked as a program runs. The library functions must be mutually exclusive to avoid data races in the race detection state itself. This leads to significant runtime overhead.
2. Vector Clock Algorithm
Suppose:

- There are $n$ threads with IDs $t \in Tid = \{1, \dots, n\}$.
- Synchronisation is done with a set of locks, $m \in Lock$.
- $Loc$ is a set of possibly shared memory locations $x$ where races can occur.
- Each thread keeps track of a logical clock: a non-negative integer incremented each time the thread releases a mutex.
- A vector clock is a mapping $V : Tid \to \mathbb{N}$, one entry per thread $t$. $V(t)$ is the logical clock for thread $t$. $VC$ is the set of all vector clocks.
- The bottom vector clock is $\bot_V = \lambda t.\ 0$.
- There is a partial order on vector clocks: $V_1 \sqsubseteq V_2 \iff \forall t.\ V_1(t) \le V_2(t)$.
- To join vector clocks: $V_1 \sqcup V_2 = \lambda t.\ \max(V_1(t), V_2(t))$.
- An increment function: $inc_t(V) = \lambda u.\ \textrm{if}\ u = t\ \textrm{then}\ V(u) + 1\ \textrm{else}\ V(u)$.
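As a concrete illustration, these operations can be sketched in C++. All names here (`VectorClock`, `bottom`, `leq`, `join`, `inc`) are illustrative, not taken from any real sanitizer's source:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

using VectorClock = std::vector<unsigned>;  // index = thread ID

// The bottom vector clock: every entry is 0.
VectorClock bottom(std::size_t num_threads) {
    return VectorClock(num_threads, 0);
}

// Partial order: v1 is below v2 iff every entry of v1 is <= v2's.
bool leq(const VectorClock &v1, const VectorClock &v2) {
    for (std::size_t t = 0; t < v1.size(); ++t)
        if (v1[t] > v2[t]) return false;
    return true;
}

// Join: entry-wise maximum of the two clocks.
VectorClock join(const VectorClock &v1, const VectorClock &v2) {
    VectorClock out(v1.size());
    for (std::size_t t = 0; t < v1.size(); ++t)
        out[t] = std::max(v1[t], v2[t]);
    return out;
}

// inc_t: bump only thread t's own entry.
void inc(VectorClock &v, std::size_t t) { v[t] += 1; }
```

Note that `leq` is only a partial order: two clocks like $\{1, 0\}$ and $\{0, 1\}$ are incomparable, which is exactly the situation where neither access is ordered before the other.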
2.1 Algorithm State
The algorithm state is given by $\sigma = (C, L, R, W)$ where:

- $C : Tid \to VC$ is a mapping of threads to their vector clocks.
- $L : Lock \to VC$ is a mapping of locks to their vector clocks.
- $R : Loc \to VC$ is a mapping of memory locations to the vector clock of the last read.
- $W : Loc \to VC$ is a mapping of memory locations to the vector clock of the last write.
2.1.1 Per-Thread Clocks
A thread's clock $C_t$ describes what thread $t$ knows about every thread's logical clock:

- $C_t(t)$ is $t$'s own logical clock, which is always positive.
- For $u \ne t$, $C_t(u) = c$ means $t$ knows $u$'s logical clock is at least $c$.

When thread $t$ acquires a mutex, $C_t$ is updated with the knowledge stored in that mutex's clock (section 2.4.2).
2.1.2 Per-Lock Clocks
A lock's clock $L_m$ is a snapshot of the clock of the thread that last released $m$, taken at the moment of the release.
2.1.3 Per-Location Clocks
A location's read clock $R_x$ records, for each thread $t$, the logical clock $R_x(t)$ that $t$ had when it last read from $x$.

A location's write clock $W_x$ records, for each thread $t$, the logical clock $W_x(t)$ that $t$ had when it last wrote to $x$.
Example
- Suppose a thread $t$ reads from $x$: $rd(t, x)$.
- Recall the partial order on vector clocks: $V_1 \sqsubseteq V_2 \iff \forall u.\ V_1(u) \le V_2(u)$.
- If $W_x \not\sqsubseteq C_t$, then a datarace is detected: some thread $u$ wrote to $x$ without $t$ knowing about it, i.e. $W_x(u) > C_t(u)$. This only happens when $t$ and $u$ have not synchronised and a write-read race occurs.
2.2 Intercepted Operations
Every read/write to/from possibly shared memory is intercepted:
- $rd(t, x)$: thread $t$ reads from location $x$.
- $wr(t, x)$: thread $t$ writes to location $x$.
Every lock acquire/release is intercepted:
- $acq(t, m)$: thread $t$ acquires mutex $m$.
- $rel(t, m)$: thread $t$ releases mutex $m$.
2.3 Initial State
Initially, we set:
- $C_t = inc_t(\bot_V)$: each thread knows its own logical clock is $1$ and thinks all other threads' logical clocks are $0$.
- $L_m = \bot_V$: no mutex has been released.
- $R_x = \bot_V$: no location has been read.
- $W_x = \bot_V$: no location has been written.
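A minimal sketch of constructing this initial state, assuming the fixed-size `std::vector<unsigned>` representation from earlier (the function name is illustrative):

```cpp
#include <cstddef>
#include <vector>

using VectorClock = std::vector<unsigned>;

// Each thread's initial clock is inc_t(bottom): 1 in its own slot
// and 0 everywhere else. Lock, read, and write clocks all start at
// bottom, i.e. all-zero vectors.
std::vector<VectorClock> initial_thread_clocks(std::size_t n) {
    std::vector<VectorClock> C(n, VectorClock(n, 0));  // all bottom
    for (std::size_t t = 0; t < n; ++t)
        C[t][t] = 1;  // each thread starts its own logical clock at 1
    return C;
}
```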
2.4 Operation Semantics
2.4.1 Shared Memory Rules
The $rd(t, x)$ operation records the read of $x$ at thread $t$'s current logical clock: $R_x := R_x[t \mapsto C_t(t)]$.

The $wr(t, x)$ operation records the write to $x$ at thread $t$'s current logical clock: $W_x := W_x[t \mapsto C_t(t)]$.
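These rules, combined with the race checks of section 2.4.3, can be sketched as hypothetical read/write handlers. Thread and location IDs are plain `int`s and all names are illustrative:

```cpp
#include <cstddef>
#include <map>
#include <stdexcept>
#include <vector>

using VectorClock = std::vector<unsigned>;

static bool leq(const VectorClock &a, const VectorClock &b) {
    for (std::size_t t = 0; t < a.size(); ++t)
        if (a[t] > b[t]) return false;
    return true;
}

struct State {
    std::map<int, VectorClock> C;  // per-thread clocks
    std::map<int, VectorClock> R;  // last-read clocks per location
    std::map<int, VectorClock> W;  // last-write clocks per location
};

// rd(t, x): a race if some write to x is not known to t; otherwise
// record the read at t's current logical clock.
void on_read(State &s, int t, int x) {
    if (!leq(s.W[x], s.C[t]))
        throw std::runtime_error("write-read race");
    s.R[x][t] = s.C[t][t];
}

// wr(t, x): a race if some earlier write or read of x is not known
// to t; otherwise record the write at t's current logical clock.
void on_write(State &s, int t, int x) {
    if (!leq(s.W[x], s.C[t]) || !leq(s.R[x], s.C[t]))
        throw std::runtime_error("write-write or read-write race");
    s.W[x][t] = s.C[t][t];
}
```

A real detector reports the race rather than throwing, but an exception keeps the sketch short.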
2.4.2 Lock Rules
The thread $t$ performing the operation updates its own clock and the lock's clock as follows:

- $acq(t, m)$: $C_t := C_t \sqcup L_m$.
- $rel(t, m)$: $L_m := C_t$, then $C_t := inc_t(C_t)$.

With $acq(t, m)$, thread $t$ learns everything the last releaser of $m$ knew. For the release:

- With $L_m := C_t$, thread $t$ makes the world aware of its logical clock and the latest values $t$ knows about other threads. When another thread acquires the lock, it will learn this information.
- With $C_t := inc_t(C_t)$, thread $t$ increments its own logical clock since it is performing a release operation.
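The lock rules can be sketched as hypothetical acquire/release handlers, again with plain `int` IDs and illustrative names:

```cpp
#include <algorithm>
#include <cstddef>
#include <map>
#include <vector>

using VectorClock = std::vector<unsigned>;

struct SyncState {
    std::map<int, VectorClock> C;  // per-thread clocks
    std::map<int, VectorClock> L;  // per-lock clocks
};

// acq(t, m): C_t := join(C_t, L_m). Thread t learns everything the
// last releaser of m knew.
void on_acquire(SyncState &s, int t, int m) {
    VectorClock &ct = s.C[t];
    const VectorClock &lm = s.L[m];
    for (std::size_t u = 0; u < ct.size() && u < lm.size(); ++u)
        ct[u] = std::max(ct[u], lm[u]);
}

// rel(t, m): L_m := C_t, then C_t := inc_t(C_t). Publish t's
// knowledge into the lock, then advance t's own logical clock so
// work after the release is distinguishable from work before it.
void on_release(SyncState &s, int t, int m) {
    s.L[m] = s.C[t];
    s.C[t][t] += 1;
}
```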
2.4.3 Datarace Rules
If a thread $t$ reads from $x$ while $W_x \not\sqsubseteq C_t$, a write-read race is reported: some thread wrote to $x$ without synchronising with $t$.

If a thread $t$ writes to $x$ while $W_x \not\sqsubseteq C_t$ or $R_x \not\sqsubseteq C_t$, a write-write or read-write race (respectively) is reported.
2.5 Correctness
Only $t$ ever knows its own current logical clock: for all $u \ne t$, $C_u(t) < C_t(t)$.

- Initially, $C_t = inc_t(\bot_V)$ so $C_t(t) = 1$ and $C_u(t) = 0$.
- Subsequently, $t$ only reveals its logical clock when it releases a lock, but immediately increments it afterwards. Thus, no other thread can ever know $t$'s current logical clock.
Suppose the algorithm reports no race when thread $t$ reads from $x$ after a write by thread $u$, i.e. the check $W_x \sqsubseteq C_t$ passes. This can only happen in two cases:

- The accesses are by the same thread. If $u = t$, then $W_x(t) > C_t(t)$ is impossible since $t$ always knows its own logical clock.
- The accesses are ordered by synchronisation. Assume wlog that there are no interleaving operations. Then:
| Operation | $C_u$ | $C_t$ | $L_m$ | $W_x$ |
|---|---|---|---|---|
| Initial | $V_u$ | $V_t$ | $\bot_V$ | $\bot_V$ |
| $wr(u, x)$ | $V_u$ | $V_t$ | $\bot_V$ | $\bot_V[u \mapsto V_u(u)]$ |
| $rel(u, m)$ | $inc_u(V_u)$ | $V_t$ | $V_u$ | $\bot_V[u \mapsto V_u(u)]$ |
| $acq(t, m)$ | $inc_u(V_u)$ | $V_t \sqcup V_u$ | $V_u$ | $\bot_V[u \mapsto V_u(u)]$ |
| $rd(t, x)$ | $inc_u(V_u)$ | $V_t \sqcup V_u$ | $V_u$ | $\bot_V[u \mapsto V_u(u)]$ |
Since $C_t = V_t \sqcup V_u$ by the time the read is checked, we know that $W_x(u) = V_u(u) \le (V_t \sqcup V_u)(u) = C_t(u)$, and $W_x$ is $0$ in every other entry, so $W_x \sqsubseteq C_t$ holds and no race is reported.
2.6 Problems
Space Inefficient - requires a vector clock, with one entry per thread, for every possible memory location. We can optimise this by only tracking who last updated a memory location, and their logical clock at the time - this is enough.
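A sketch of that "last access only" optimisation, the idea behind FastTrack-style epochs: instead of a full vector clock per location, store only the (thread, clock) pair of the most recent write. The names here are illustrative:

```cpp
#include <vector>

using VectorClock = std::vector<unsigned>;

// An epoch is a single component of a vector clock: which thread
// performed the access, and its logical clock at the time.
struct Epoch {
    unsigned thread;
    unsigned clock;
};

// The recorded access happened before the current one iff the
// current thread's clock already covers the recorded epoch.
bool epoch_leq(const Epoch &e, const VectorClock &v) {
    return e.clock <= v[e.thread];
}
```

This shrinks per-location state from $O(n)$ to $O(1)$, at the cost of a fallback to full vector clocks when, for example, several threads read a location concurrently.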