Dynamic Data Race Detection For Multithreaded Programs Computer Science Essay

Published: Last Edited:

This essay has been submitted by a student. This is not an example of the work written by our professional essay writers.

The advent of multi-core processors encourages programmers to develop multi-threaded programs to leverage the available computational powers. However, any improper synchronization among threads to manipulate share data may lead the programs to produce failures. Techniques that detect data races in multi-threaded programs dynamically can help locate real data races in programs for subsequent software evolution. However, to make such techniques practical, researchers have to solve many involving challenges, and great achievements have been gained. This report summarizes my current progress in understanding this topic. The report firstly reviews the problem of dynamic data race detection followed by discussing the merits and drawbacks of several representative existing techniques. After that, the report reviews two selective existing frameworks that may potentially reduce the development efforts if researchers choose to implement dynamic data race detectors.


Software testing includes many aspects, such as debugging, data race detection, and test case optimization. My study in this course mainly focuses on dynamic data race detection for multi-threaded programs.

Data race exists if there is a data race condition. A race condition is two or more threads access a shared variable (memory location) in an undetermined order and at least one of these accesses is a write, resulting in such a write operation to the shared variable interfering with the correct execution of the other threads. Program in Error: Reference source not found (a) shows a kind of representative data races.

The expected execution result of the program in Error: Reference source not found (a) is that the value of variable counter is 2, which differs from the actual execution result (as showed in Error: Reference source not found (b)) because of incorrect interleaving of the two threads T1 and T2. When data race occurs, the program may result in a failure. Therefore, to detect data races programs is one of useful techniques to improve the correctness of software. That is data race detection for multi-threaded programs.

To my best knowledge, researchers have proposed several algorithms for detection of data races, which can be classified into static, dynamic and hybrid. The static data race detection focuses on program analysis without actually executing the program, while the dynamic ones analyze the program executing to find data races. The static ones differ from the dynamic ones in scalability, runtime overhead and coverage of reported data races. Dynamic race detectors are scalable; but static ones are limited to small programs. However, dynamic ones usually incur higher runtime overhead [7] than static detectors. Moreover, in order to detect more data races, dynamic detectors need to schedule the interleaving order among different threads. Another difference is that static detectors may be designed to find data races covering all potential execution paths, which, on the other hands, lead to reporting of false alarms; while dynamic detectors can detect races from an execution path whenever the path contains data races(s).

Figure 1 an example of Data Race

(b) Two correct interleaving between T1 and T2

Execution order of two threads: T1 and T2

T1.increaseByOne () T2.increaseByOne ()

tmp = counter;

tmp = counter;

tmp = tmp+1;

counter = tmp;

tmp = tmp+1;

counter = tmp;

Result: counter = 1.


shared variable:

int counter = 0;

increaseByOne (){

int tmp = counter;

tmp = tmp+1;

counter = tmp;


T1 T2

tmp = counter;

tmp = tmp+1;

counter = tmp;

tmp = counter;

tmp = tmp+1;

counter = tmp;

Result: counter = 2.

T1 T2

tmp = counter;

tmp = tmp+1;

counter = tmp;

tmp = counter;

tmp = tmp+1;

counter = tmp;

Result: counter = 2.

(a) A program and an incorrect interleaving between T1 and T2.

In the following sections, I am going to explain what I have learned in the past few weeks. Section 2 and Section 3 give several techniques and framework tools respectively. Section 4 reviews my study and Section 5 presents my future work.


Researchers have proposed several algorithms to detect data races dynamically for multi-threads programs, such as Eraser [9] , Djit+ [8] , LiteRace [7] and FastTrack [2] ; and they can be classified into two categories: lockset based algorithms ([9] ) and Happen-before relation based algorithms ([2] , [7] , [8] ). These two kinds of algorithms commonly concern the following critical operations: read (rd) or write (wr) to a variable v; acquire (acq) or release (rel) a lock m; fork or join a thread t.

Lockset based algorithms generally maintain a set of locks (called lockset) for each shared variable in each execution of a multi-threaded program, monitor every critical operations, and update lockset of the variable (such as add new locks or remove existing locks). Once the lockset of a variable becomes empty, an occurrence of data race will be reported. However, some data races are benign; some of detectable data races are not reported. Hence, this kind of algorithms has both false positive and false negative problems.

Happen-before relation is a partial order of pairs of events in a multi-threaded programs or concurrent systems, proposed by Lamport [6] . Algorithms based on this relation have to maintain a vector of timestamps recording (logging) each thread’s current timestamp and other threads’ timestamps known to it and a vector of each shared variable recording the latest operation timestamps of each thread. If there is a critical operation on a variable, an event will be called to compare vectors between the involved thread and the variable to find whether the global order of involved threads is disturbed by the operation (usually write and read to a shared variable). If so, data race will be reported. This kind of algorithms may also miss reporting some data races because certain data races may only exist in certain execution paths. However, dynamic detectors never report false alarms. Hence, only false positive problems exist in this kind of algorithms.

Both categories of algorithms need to instrument the source code files or binary files (such as Java byte code files) and execute part or whole instrumented programs to detect data races based on execution strategies.

In following subsection, I am going to present several techniques I have learned in this course.

Eraser [9]

Savage et al. [9] proposed a lockset based algorithm known as Eraser. A lockset is a set of locks on a shared variable. To avoid failure illustrated in Error: Reference source not found (a), programmers heuristically put locks on shared variables; however, locks may be used incorrectly. Savage et al. firstly introduced a simple lockset algorithm based on the above heuristics and then revised it to be more practical.

Assumed discipline for simple lockset algorithm: each shared variable v is protected by a non-empty set of locks whenever a thread reads or writes tov.

wr, first thread



rd / wr

rd, new thread

rd /wr, first thread






Figure 3 a finite state machine of variable ð'£, adapted from [9] .Simple lockset algorithm: let be a set of locks on variable . Initially, contains all locks. On each access to , update to be the intersection of and , where is the set of locks held right before the access to v by thread . After updating , if becomes empty, a data race is reported.

Error: Reference source not found shows an example using the simple lockset algorithm to detect data race.

The simple lockset algorithm successfully detected a data race in Error: Reference source not found, in which variable is protected by different locks (lock_1 and lock_2) in different statements of two threads. However, the assumed environment of this simple lockset algorithm is less practical. Savage et al. discussed three problems in above algorithm:

Programmers usually put no lock on a variable when they initialize it. For example, to define and initialize an integer counter in Java, we may simply write “static int counter = 0;” which contains no lock.

For read-shared variable, no lock needs to protect it. In this situation, no data race should be reported.

A shared variable is programmed to allow only single thread to write it and multi-threads to read it.

In scene of three problems, simple lockset algorithm reports false positive. Hence, Savage et al. introduced four states to each variable: Virgin, Exclusive, Shared, and Shared-Modified and revised their simple lockset algorithm to report data race according to the state of . The state is maintained as following.

On allocating memory to , the state is denoted by Virgin (in this state, there is no write or read to yet). After value initialization, the state changes to Exclusive.

In state Exclusive, if there is only one thread accessing , state of remains Exclusive.

On the second thread’s first read to , state of changes to Shared; or the second thread’s first write to , state of changes to Shared-Modified.

Shared state can be changed to Shared-Modified only when a thread writes to .

The State of Shared-Modified is the final state of .

Error: Reference source not found illustrates the state transaction through a finite state machine.

In each state, Eraser updates and reports data race as follows:

At state of Exclusive, Eraser does not update and does not report data race.

At state of Shared, Eraser updates but does not report data race regardless of emptiness of .

At state of Shared-Modified, Eraser updates and reports data race.

In implementation, Eraser needs to instrument the original binary program so that Eraser functions can be called when critical operations executed.

Djit+ [8]

Figure 4 an example of Djit+ algorithm with two threads of Figure 2 (‘â†'’:execution order; ‘<x, y>*’ : updated values; ‘<x, y>’: values to be compared. The comparison with is omitted here for that it keeps unchanged.)

Operation Operation Comparison

<1,1> <1,1> <0,0> <0,0> <0,0> <0,0>

â†" Acq(lock_1)

<1,1> * <1,1> <0,0> <0,0> <0,0> <0,0>

â†" Acq(lock_2)

<1,1> * <1,1> <0,0> <0,0> <0,0> <0,0>

â†" Wr(v)

<1,1> <1,1> <0,0> <0,0> <1,0>~ <0,0> <1,1> <1,0> ? √

â†" Rel(lock_2)

<2,1> * <1,1> <0,0> <2,1>* <1,0> <0,0>

â†" Acq(lock_2)

<2,1> <2,1> * <0,0> <2,1> <1,0> <0,0>

â†" Wr(v)

<2,1> <2,1> <0,0> <2,1> <1,1> ~ <0,0> <2,1> <1,1> ? √

â†" Rel(lock_2)

<2,1> <2,2> * <0,0> <2,2> * <1,1> <0,0>

â†" Wr(v)

<2,1> * <2,2> <0,0> <2,2> <2,1>~ <0,0> <2,1> <2,1> ? ╳

Djit+, monitors the six critical operations and maintains one vector for each thread t, one vector for each lock m, and two vectors and for write and read of v, respectively. Each thread owns a timestamp which can only be increased after each release operation. records the latest timestamps of the thread t and others threads on acquire operations. records latest timestamps of the threads when they release the lock m. On acquire lock m, the algorithm collects the timestamp of all threads involved in , and then updates the vectors (clock) of all threads by assigning the collected timestamps to each of these vectors if the collected timestamp is latest than the corresponding timestamp kept by the vector in question. Consequently, represents the timestamp of the thread t. and record the latest time-stamps of write and read to variable v of all threads, respectively.

Specifically, every acquire of lock m performed by thread t will trigger updating of to be max {, } while every release will trigger updating of to be at position of t while other values in keep unchanged, followed by updating to be max {, }, orderly. Moreover, every read and write will trigger updating and to be + 1 and at position of t and others in and keep unchanged, respectively.

Immediately after each read to v, a comparison between and will be conducted to make sure that no write before this read. If any thread i () violates the condition , a data race is reported. Also immediately after each write, in addition to the above comparison, a second comparison between and will be conducted to make sure that no write before this read. If any thread i () violates the condition and , a data race is reported.

Error: Reference source not found is an example showing that Djit+ algorithm can successfully detect the data race illustrated in Error: Reference source not found under certain interleaving between two threads.

Djit+ is able to detect real data races. However, as mentioned above, it does not warrant detection of all data races because its ability is corresponding to coverage of execution paths.

Multi-Race [8]

Lockset based algorithms may produce false alarms while happen-before relation based ones detect a subset of data races in an execution path. Hence, proper combination of both algorithms may produce more powerful detectors. Multi-Race has done this combination.

Multi-Race applies both lockset based algorithm and Djit+ algorithm to an execution of program at the same time. It maintains a lock set, , two vectors, and , for each variable v, a vector for each thread t and a vector for each lock m (here , , , , have the same meaning as that in Eraser and Djit+ sections, respectively). At each critical operation, Multi-Race updates corresponding lockset and vectors and checks whether is empty. However, unlike Eraser, if is empty, Multi-Race further compares , , and the same way as that presented in section 2.2 and report data race accordingly. By doing this, it eliminates the false data races incurred by lockset based algorithm and also improves the performance of Djit+ algorithm by reducing the number of vector comparisons.

FastTrack [2]

Actually, FastTrack [2] is the first algorithm for dynamic data race detection of multi-threaded programs I learned. From this algorithm, I expanded my knowledge to the other dynamic data race detection algorithms (such as Djit+, Eraser, Multi-Race and Lite-Race).

FastTrack is based on happen-before relation and can be regarded as an improved version of Djit+ algorithm. Its improvement from Djit+ is in two aspects:

Record the timestamp of last write to a shared variable. Unlike read-read race, a write cannot be shared with another write. So if no data race exists in the execution before the current write, the timestamp of the last write to v is sufficient to detect a data race.

Record a switchable read information for each variable: single value or vector value. Before a read/write or after a shared write to v, single value will be used for v. Any later read operations of different threads occurs the algorithm to switch single value to vector value until a new write to v; in this case, the vector is changed back to the single value.

Flanagan et al. showed through an empirical study that only 0.1% of reads will switch single value to vector value for read information of v, which improves the performance obviously compared to Djit+ algorithm by reducing space and further reducing time.

Since there is only single value for write information of v, it can only detect data race existing in two concurrent threads and only the first data race on v. Like Djit+, FastTrack has no false positive problems but false negative problems

The implementation of FastTrack was based on a framework tool called RoadRunner Dynamic Analysis Framework for Concurrent Programs [3] . Several dynamic data race detectors were already implemented on this framework by Flanagan et al. I will give more about this framework in section 3 as well as other framework I know.

LiteRace [7]

LiteRace is also based on Happen-before relation. There is already some introduction to happen- before relation and its application in data race detection above; so I will explain LiteRace from another view point: strategies in executing instrumented programs, as well as strategies LiteRace adopted.

Dynamic data race detectors need to instrument original programs (source code or binary files) by adding caller functions in order to trigger events for each critical operation to analyze potential data races. There are some strategies to executing instrumented programs. Some detectors execute whole instrumented programs while others may execute portion of them. LiteRace is one of later detectors.

Figure 5 RoadRunner Framework






Java Programs

Event Receiver:



As I mentioned above, the number of detected data race of dynamic data race detector is corresponding to its coverage of execution paths. A dynamic data race detector may detect some data races for each executed path. However, it seems impossible to execute all paths. Then a question comes: which path should a detector adopt to execute? Certainly the ideal execution path should lead a detector to find more data races than other paths.

For some famous programs, such as Apache web server, Firefox web browser and gzip (a famous file compressor), the ideal path should be among those that do not or minor executed path based on statistics. Because these programs have been revised for many times and almost no data race exists in the most executed paths. However, for a newly developed program, nobody may know which path a detector should take.

LiteRace uses a sampling algorithm to execute portion of programs. It classifies the portion to be hot region and cold region dynamically for each thread. Following two paragraphs are a description of sampling algorithm used by LiteRace.

Instrumentation: when instrument the programs, LiteRace create two portions for each original portion. Since portion here is based on function, instrumented programs will contain two versions for each original function. One function is only instrumented by adding synchronization operations (lightly-instrumented) while the second function is instrumented by adding memory logging operations except synchronization operations (heavily-instrumented). In order to decide which instrumented function should be executed dynamically, an additional dispatch checker is added to each pair of differently instrumented functions. The dispatch checker is always called before execution of corresponding function to do some extra work and select one of two functions to be executed.

Selection of functions: for each pair of differently instrumented functions, there are two counters, frequency counter and sampling counter. Initially, frequency counter (denoted as f-c) is zero and sampling counter (denoted as s-c) is 100%. On each call to this function, f-c will be increased by one; then s-c will be compared to zero. If s-c is not zero, lightly-instrumented function will be selected; otherwise, heavily-instrumented function will be selected. At last, s-c will be updated according to f-c.

Through executing selected functions, only a small percent of variable accesses are monitored, which obviously improve the performance. According to [7] , LiteRace can detect more than 70% of data races by sampling less than 2% of variable accesses, from which we can see that strategies play a great important role in dynamic data race detection for multi-threaded programs.


Dynamic data race detector needs to instrument source code or binary code files to analyze executing programs on the fly, which usually is a burden.

However, several framework tools have been developed to light our burden, such as Calfuzzer [5] and RoadRunner [3] . I learned RoadRunner in details and did some experiments.

RoadRunner [3] is a framework designed for dynamic data race detection of concurrent programs of Java language and was also written in Java language (Eclipse environment). It instruments Java programs by adding shadow fields and methods to call events (such as memory accessing, lock synchronization, threads forking and joining and, especially method entering and quitting) and further call events-receiver which can be extended by user tools. Users can also expand it by revising or adding features to it. RoadRunner allows combination of severalalgorithms. For example, Multi-Race combined Lockset and Djit+ algorithms. Under RoadRunner, this combination can be easily implemented. Error: Reference source not found shows RoadRunner framework.

Based on this framework, C. Flanagan and S. N. Freund implemented many dynamic data race detectors, such as Eraser, Djit+, Multi-Race, and FastTrack, as well as other tools, such as Atomizer [3] .

Calfuzzer is also a java based framework for dynamic data race detection. It rewrote part of java.util and java.lang library to perform its event calling. Therefore, it instrumented the java library other than the programs. When using it, we can simply change the import statement referring to revised library.


Multi-threaded programs may fail owning to occurrence of data races. Debugging a program to find data race may be very difficult. But detection of data races is a possible way. Several algorithms have been proposed to detect it including static analysis of program codes or binary files and dynamic analysis of interleaving executing thread. Static algorithms not only miss some data races but also report false data races. Recent years, dynamic algorithms are developed much more. These dynamic algorithms are based on two concepts: lockset and Happen-before relation. Lockset based algorithms check whether a variable is protected by at least one lock on each access to this variable. Eraser is a notable lockset algorithm based tool. Algorithms based on happen-before guarantee consistence of global order from thread-local orders of involved threads by logging each access to a shared variable for each thread. Djit+ and FastTrack are based on Happen-before relation. Multi-Race is the combination of lockset and Happen-before.

Dynamic data race detector needs to instrument original programs to call its analysis modules. It is a heavy burden. However, several existing tools may reduce our burden, such as Calfuzzer and RoadRunner framework for Java language programs. In my opinion, RoadRunner, as an event stream based framework, is much simpler to understand and to implement a tool to detect data race in Java programs and it can be expand to support more events. LiteRace instruments programs to support both original and instrumented functions to be selected to execute dynamically, which reduces number of monitoring accesses to memory locations and further improve performance.


When I did some experiments on RoadRunner framework, I found an interesting thing. If I directly run the original multi-threaded program that contained data race, there were no wrong results; however, the same program was executed on RoadRunner framework, wrong results would be produced at a high rate. My supervisor told me that, in order to detect, and detect more, data races, interleaving between different threads would be scheduled which is called thread scheduling and is only one stage out of four. The four stages include test data generation, test execution, which includes thread scheduling, thread and memory access monitoring, test result analysis, which include correctness checking and reporting.

So I will focus on test data generation and thread scheduling for coming weeks. Here I can give the advantage with adoption of test data generation in dynamic data race detection. Some data races occur only under special execution paths, which, with random test data inputs, can be hardly probed. In this case, using of directed test data may lead to programs’ execution covering minor executed paths and further detect more data races. DART [4] and KLEE [1] are two references about test data generation.

Finally I expect to get a deep understanding of the area of software engineering, especially, the software testing.