FJava: Fork Join for Java

By: Ankit Agarwal and Iosef Kaver


Project Proposal

Summary:

We are going to implement a high level fork join framework for the Java programming language. Additionally, we will compare the performance of different scheduling and job stealing strategies for the framework.


Background:

Fork join model is a natural way to express independent work inherent in divide and conquer algorithms. Many divide and conquer problems can gain almost a linear speedup with the number of processors when a lightweight, efficient fork join framework is used for parallelism.

One such framework for expressing work in fork join model is Cilk in C and C++. We want to implement a framework which allows programmers similar level of functionality in Java. Additionally, we want to experiment with several scheduling and work distribution strategies. Also, we will provide a simple API that allows for error reporting and performance metrics.

Here is an example of our API:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
public class Sorter<T> implements Runnable {

  private T [] array;
  private int left;
  private int right;
  private static final int LOCAL_SORT_DIFF = 10;

  public Sorter(T [] array, int left, int right) {
    this.array = array;
    this.left = left;
    this.right = right;
  }

  @Override
  public void run() {
    if(this.right-this.left <= LOCAL_SORT_DIFF) {
      Arrays.sort(array, left, right);
      return;
    }
    int mid = partition();
    JobToken token = FJava.createJob();
    FJava.runAsync(new Sorter<T>(array, left, mid), token);
    new Sorter<T>(array, mid+1, right).run();
    FJava.sync(token);
  }

  private int partition() {
    //implement partition here.
  }
}

Challenges:

  • The dependencies in the fork-join model are usually described explicitly by the user code. We would like to figure out the dependencies at either compile or run time; an alternative would be to force the user to use framework level constructs for specifying them explicitly.
  • There usually exists high locality between the childs of a job and the creator of the child jobs. The scheduling and work stealing algorithms that we implement should exploit this characteristic of the jobs.
  • Understand and design different scheduling and work stealing strategies for several types of fork-join tasks.
  • Implementing work stealing algorithms can be interesting. How do we reduce contention? How do we minimize work imbalance?
  • We want to experiment with user provided hints for making scheduling decision. Specifically, we want to provide a mechanism for the user to indicate the characteristics of the work which could be used for making better scheduling decisions. The mechanism could either be annotations (preferable) or a function call(less preferable).
  • Generate a test suite that demonstrate the framework’s performance for real world divide and conquer tasks for different scheduling and work stealing algorithms.
  • Define a simple to use, friendly API for the framework.

Goals and deliverables:

Plan to achieve:

  • Implement a correct, fork join framework for the Java programming language.
    • We expect different results from our framework, according to the type of workload that is tested. For problems like Fibonacci, or for-loop chunking, we expect a linear speedup with the number of cores of the machine. The speedup will be slightly lower in other tasks that not have such high degrees of parallelism. We must say that the speedup might not be ideal due to communication, task creation and task stealing costs, as well as work imbalance problems that the framework may involve.
  • Implement at least 2 scheduling strategies and 2 work stealing strategies.
  • Implement code to report metrics of the framework. For instance, total job time, statistics about the spawned job execution times, statistics about the idleness of the threads, statistics about the queue length of each thread. This information can potentially help the user to tune the forkjoin properly for the machine (for example, using an optimal threshold level for stopping parallelism).
  • Compare the system to the current Java’s implementation of ForkJoinPool, and also compare it to a basic threading model that uses a regular thread pool instead of a fork join thread pool.
  • Design and implement several test suites. These we will be used to measure the performance of the framework and to compare it to the baseline models. Test suite will include at least 4 of these problems:
    • sorting
    • matrix multiplication
    • for-loop chunking (applications may include map, filter, reduce).
    • Fibonacci
    • Karatsuba

Hope to achieve:

  • Outperform Java’s implementation of ForkJoinPool under some workloads.
  • Handling exceptions: Implement code to handle and report the exceptions generated by the user supplied code.
  • Incorporate user supplied inputs for making smarter scheduling and workload distribution decisions.

Presentation:

We will do a live demo of our code. We will show the performance of the system under several workloads, and we will compare it to Java’s ForkJoinPool implementation and the regular thread pool implementation. We will show several graphs for different workloads:

  • Job completion time for different pool sizes.
  • Job completion time for different sequential threshold limits.
  • Job completion time for different scheduling and work stealing algorithms.
  • Job completion time for Java’s ForkJoinPool implementation and the regular thread pool implementation.
  • Number of steals for different scheduling and work stealing algorithms.
  • Average work time in the queue for different scheduling and work stealing algorithms.

Platform choice:

We will use the Java programming language for this project. Our idea is that the framework should perform well on any computer. The scheduling decisions are made accordingly to the underlying characteristics of the machine that is running the code.


Resources:


Schedule:

You can find the schedule on the home page.