ForkJoin Framework in Java Merge Sort

Snippet of programming code in IDE
Published on

Understanding ForkJoin Framework in Java

In the world of parallel computing, the ForkJoin framework in Java offers a powerful tool for efficiently handling tasks that can be broken down into smaller subtasks and executed concurrently. This framework is particularly useful for implementing divide-and-conquer algorithms, such as merge sort. In this article, we will delve into the concept of the ForkJoin framework and how it can be utilized to implement the merge sort algorithm in Java.

What is the ForkJoin Framework?

The ForkJoin framework was introduced in Java 7 as part of the java.util.concurrent package. It is designed to facilitate parallel processing of tasks by efficiently distributing them across multiple processing cores. At the core of the ForkJoin framework are the ForkJoinPool, ForkJoinTask, and RecursiveTask classes.

The ForkJoinPool represents a thread pool specifically tailored for executing ForkJoinTasks. These tasks, derived from the ForkJoinTask class, can be further specialized into RecursiveTask for tasks that produce a result or RecursiveAction for those that do not.

One of the key features of the ForkJoin framework is its support for work-stealing. This means that idle threads can "steal" subtasks from other threads' queues, ensuring a balanced distribution of the workload.

Applying ForkJoin to Merge Sort

Merge sort is a classic example of a divide-and-conquer algorithm. It works by recursively dividing the array into smaller subarrays, sorting them, and then merging them back together. We can leverage the ForkJoin framework to parallelize the sorting and merging of these subarrays, improving the overall performance of the merge sort algorithm.

Let's dive into a practical implementation of merge sort using the ForkJoin framework in Java.

import java.util.Arrays;
import java.util.concurrent.RecursiveTask;
import java.util.concurrent.ForkJoinPool;

public class MergeSort extends RecursiveTask<int[]> {
    private final int[] array;
    
    public MergeSort(int[] array) {
        this.array = array;
    }
    
    @Override
    protected int[] compute() {
        if (array.length <= 1) {
            return array;
        }
        
        int mid = array.length / 2;
        int[] left = Arrays.copyOfRange(array, 0, mid);
        int[] right = Arrays.copyOfRange(array, mid, array.length);

        MergeSort leftTask = new MergeSort(left);
        MergeSort rightTask = new MergeSort(right);

        invokeAll(leftTask, rightTask);

        int[] sortedLeft = leftTask.join();
        int[] sortedRight = rightTask.join();

        return merge(sortedLeft, sortedRight);
    }

    private int[] merge(int[] left, int[] right) {
        int[] merged = new int[left.length + right.length];
        int i = 0, j = 0, k = 0;
        
        while (i < left.length && j < right.length) {
            if (left[i] <= right[j]) {
                merged[k++] = left[i++];
            } else {
                merged[k++] = right[j++];
            }
        }
        
        while (i < left.length) {
            merged[k++] = left[i++];
        }
        
        while (j < right.length) {
            merged[k++] = right[j++];
        }
        
        return merged;
    }

    public static void main(String[] args) {
        int[] unsortedArray = {5, 3, 8, 6, 2, 7, 1, 4};
        ForkJoinPool pool = new ForkJoinPool();
        int[] sortedArray = pool.invoke(new MergeSort(unsortedArray));
        System.out.println(Arrays.toString(sortedArray));
    }
}

Breaking Down the Code

The MergeSort Class

  • The MergeSort class extends RecursiveTask<int[]>, indicating that it will return an array of integers.
  • In the compute method, the array is recursively divided into subarrays, and new MergeSort tasks are created for each subarray.
  • The merge method is a private utility function to merge two sorted arrays into a single sorted array.

Parallelization with ForkJoin

  • By creating new MergeSort tasks for the left and right subarrays and invoking them using invokeAll, the sorting process is parallelized.
  • The results of the subtasks are obtained using the join method, and then merged in the compute method.

Running the Merge Sort

In the main method, an unsorted array is defined, and a ForkJoinPool is used to invoke the merge sort on the entire array. The sorted array is then printed to the console.

Lessons Learned

In this article, we explored the concept of the ForkJoin framework in Java and its application in implementing the merge sort algorithm. By leveraging the parallel processing capabilities of the ForkJoin framework, we were able to optimize the sorting process of merge sort. The parallelization of merge sort using ForkJoin can lead to significant performance improvements, particularly for large arrays and computational tasks. As parallel and concurrent programming continue to gain prominence in the realm of software development, understanding and harnessing tools like the ForkJoin framework becomes increasingly valuable for building efficient and responsive applications.