Analyzing Sorting Algorithms With Java

In this blog post we’ll look at two basic sorting algorithms implemented in Java. We’ll analyze the run-time efficiency of each algorithm by plotting the time it takes to sort randomly generated arrays, and we’ll clear up some common misconceptions about how each algorithm works by implementing them ourselves.

We’ll cover the following algorithms:

  • Bubble Sort
  • Quick Sort
    • With a random pivot
    • With a median-of-three pivot

Bubble Sort


Our implementation is a standard bubble sort with a small yet important tweak. Rather than simply iterating the array until its end, moving the rightPos index one step further toward the end of the array on each iteration, we also count how many swaps were made in each pass. If a pass finishes with no swaps, the array is sorted and the algorithm stops. As a result, if the passed array is already sorted, the algorithm runs in O(n) instead of O(n^2).
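A minimal sketch of this tweaked bubble sort might look like the following; the method name and loop structure are assumptions, with rightPos as the scan index described above:

```java
// A minimal sketch of bubble sort with the early-exit tweak described above.
static void bubbleSort(int[] arr) {
    boolean swapped = true;
    int passEnd = arr.length - 1;                 // the unsorted region shrinks after every pass
    while (swapped && passEnd > 0) {
        int swaps = 0;                            // count the swaps made during this pass
        for (int rightPos = 0; rightPos < passEnd; rightPos++) {   // rightPos walks toward the end
            if (arr[rightPos] > arr[rightPos + 1]) {
                int tmp = arr[rightPos];          // swap the adjacent pair
                arr[rightPos] = arr[rightPos + 1];
                arr[rightPos + 1] = tmp;
                swaps++;
            }
        }
        swapped = swaps > 0;                      // no swaps made: the array is already sorted
        passEnd--;
    }
}
```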

Quick Sort


Quick sort works as a “divide and conquer” algorithm. It splits a given array into two parts around a chosen “pivot” element. In our implementation, Quick Sort moves all of the elements that are smaller than the pivot to its left, and all of the elements that are bigger than the pivot to its right. The algorithm stops dividing the array when it receives a sub-array of three or fewer elements.
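To make that structure concrete, here is a minimal sketch of the recursive driver; the partition routine it calls is sketched in the next section, and the way the three-element base case is finished off is an assumption:

```java
// A minimal sketch of the recursive Quick Sort driver described above.
// QSRandomPartition is sketched in the next section; either partition routine can be plugged in.
static void quickSort(int[] arr, int leftPos, int rightPos) {
    if (rightPos - leftPos + 1 <= 3) {
        // Base case (assumption): finish tiny sub-arrays with a simple pass-based sort
        for (int end = rightPos; end > leftPos; end--) {
            for (int i = leftPos; i < end; i++) {
                if (arr[i] > arr[i + 1]) {
                    int tmp = arr[i];
                    arr[i] = arr[i + 1];
                    arr[i + 1] = tmp;
                }
            }
        }
        return;
    }
    int pivotPos = QSRandomPartition(arr, leftPos, rightPos);
    quickSort(arr, leftPos, pivotPos - 1);    // left sub-array, excluding the pivot
    quickSort(arr, pivotPos + 1, rightPos);   // right sub-array, excluding the pivot
}
```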

Many computer science undergraduates, when first trying to implement the algorithm on their own, get confused by the technicalities of properly choosing the pivot and arranging all the elements around it. As you might already know, there are many ways to choose the pivot, and each may suit different requirements. The point of this post is to provide you with some boilerplate code for further testing.

Quick sort with a random pivot

First, we choose our random pivot by selecting the last element of the given array. Because we can’t assume anything about the elements of the array, choosing the last element has the same “random” effect as generating a number between 1 and the length of the array and then selecting the pivot at that location. Here, we gradually move inwards from the leftmost and rightmost indices of the array in order to swap the desired numbers. Because we need to swap the pivot element with the last swapped element, we create a checkPos variable to hold the position of the last swapped index. All of the elements to the right of checkPos will be greater than the pivot. After swapping all the required elements, a final swap positions the pivot at its designated place. The function returns the index position of the pivot. This way, Quick Sort can be run again on the left and right subarrays around the pivot (not including it, of course).
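A minimal sketch of such a partition routine might look like this; the exact control flow is an assumption based on the description above, while QSRandomPartition, checkPos, and rightPos are the names used in the text:

```java
// A minimal sketch of the last-element ("random") pivot partition described above.
static int QSRandomPartition(int[] arr, int leftPos, int rightPos) {
    int pivot = arr[rightPos];        // the pivot is the last element of the given range
    int left = leftPos;
    int right = rightPos - 1;
    int checkPos = rightPos;          // position of the last element known to be >= the pivot

    while (left <= right) {
        // move inwards from both ends, skipping elements that are already on the correct side
        while (left <= right && arr[left] < pivot) left++;
        while (left <= right && arr[right] >= pivot) { checkPos = right; right--; }
        if (left < right) {
            int tmp = arr[left];      // swap a large element on the left with a small one on the right
            arr[left] = arr[right];
            arr[right] = tmp;
            checkPos = right;         // remember where the last swapped (>= pivot) element sits
        }
    }
    int tmpPivot = arr[checkPos];     // final swap: place the pivot at its designated position
    arr[checkPos] = arr[rightPos];
    arr[rightPos] = tmpPivot;
    return checkPos;                  // everything to the right of checkPos is now >= the pivot
}
```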

Quick Sort with a median of three pivot

Here we use the median-of-three strategy to choose a pivot more efficiently. We look at the first, middle, and last elements of the array and sort them using a basic sorting algorithm. Because we only do this for three elements, fancy sorting algorithms are unnecessary; for the sake of our implementation we’ll use the Bubble Sort implemented above, but you can choose whatever algorithm you like. Then, with the three elements arranged in place, we swap the median of them with the element to the left of the rightmost element (the largest of the three we just sorted). Effectively, this gives us a chosen pivot as a starting point, with an element larger than it already in its designated place. The sorting itself is done by the same QSRandomPartition function we presented above; the only difference is that the rightPos index should be the median’s position we swapped to above.
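A minimal sketch of this setup might look like the following; the method name is an assumption, and for brevity the three elements are ordered with direct comparisons rather than the bubble sort call described above:

```java
// A minimal sketch of the median-of-three setup described above.
// Assumes the range holds more than three elements (smaller ranges are handled by the base case).
static int QSMedianOfThreePartition(int[] arr, int leftPos, int rightPos) {
    int mid = (leftPos + rightPos) / 2;

    // Order arr[leftPos] <= arr[mid] <= arr[rightPos] in place,
    // so the median of the three ends up at mid and the largest at rightPos
    if (arr[mid] < arr[leftPos])      { int t = arr[mid];      arr[mid] = arr[leftPos];      arr[leftPos] = t; }
    if (arr[rightPos] < arr[leftPos]) { int t = arr[rightPos]; arr[rightPos] = arr[leftPos]; arr[leftPos] = t; }
    if (arr[rightPos] < arr[mid])     { int t = arr[rightPos]; arr[rightPos] = arr[mid];     arr[mid] = t; }

    // Move the median next to the rightmost element, which is already >= the pivot
    int tmp = arr[mid];
    arr[mid] = arr[rightPos - 1];
    arr[rightPos - 1] = tmp;

    // Partition with the same routine as before; the pivot now sits at rightPos - 1
    return QSRandomPartition(arr, leftPos, rightPos - 1);
}
```

Because the element at rightPos is already at least as large as the pivot, it can safely be left out of the partitioned range and picked up by the right-hand recursive call.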

Plotting The Run-Time Results



(Figure 1)

As seen in figure 1, the Y axis is the time taken by each plotted algorithm to sort the array, and the X axis represents the array’s length. Each array is generated randomly and passed to the plotted sorting algorithms. The green graph represents O(nlogn), purely for comparison. The red graph is the run-time of Quick Sort (median of three pivot) and the blue graph is the run-time of Bubble Sort.
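The timings behind these plots can be gathered with a small harness along these lines; the array sizes, value range, and CSV-style output are assumptions, and it assumes the bubbleSort and quickSort methods sketched above live in the same class:

```java
import java.util.Random;

// A minimal sketch of a timing harness for the plots.
public class SortTimings {
    public static void main(String[] args) {
        Random random = new Random();
        for (int n = 1000; n <= 50000; n += 1000) {
            int[] data = random.ints(n, 0, 1_000_000).toArray();   // freshly generated random array
            int[] copy = data.clone();                              // identical input for the second algorithm

            long start = System.nanoTime();
            bubbleSort(data);
            long bubbleNanos = System.nanoTime() - start;

            start = System.nanoTime();
            quickSort(copy, 0, copy.length - 1);
            long quickNanos = System.nanoTime() - start;

            // one row per array length, ready to be plotted
            System.out.println(n + "," + bubbleNanos + "," + quickNanos);
        }
    }
}
```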

As we should already know, Bubble Sort has an average-case performance of O(n^2), as is clearly seen above. The blue graph curves upward quadratically as expected, whereas the red graph is almost identical to the green O(nlogn) curve, which matches the average-case performance of Quick Sort.


(Figure 2)

As seen in figure 2, the axes represent the same metrics. However, in these graphs the generated arrays are already sorted. The green graph is the same as in figure 1, representing O(nlogn) for comparison. The blue graph, which is barely visible, represents Bubble Sort. Thanks to the performance tweak we implemented, the graph sits well below O(nlogn); it is safe to say that our addition made the algorithm linear for sorted arrays. The red graph represents Quick Sort (median of three pivot). Here, the graph looks something like O(n^2), which is its worst-case performance.


(Figure 3)

In figure 3, the axes once again represent the same metrics as in figures 1 and 2. Here, as in figure 1, the arrays are generated randomly. The green graph is O(nlogn). The red graph is Quick Sort with a random pivot, and the blue graph is Quick Sort with a median of three pivot. The two are extremely similar, as the median of three strategy offers no real benefit for randomly generated arrays.

 
