So You Want to Learn Quicksort: Step-by-Step Guide to Basic, Lomuto & Hoare Partitioning.

I don’t have a Bachelor’s degree in computer science so I haven’t had any formal algorithm or data structure training. However, I am trying to remedy that by reading books and solving algorithmic questions on online platforms. One question in Hackerrank needed a sort implementation, which I couldn’t do without the help of AI, so I decided to learn about sort algorithms. I searched and found that quicksort has widespread usage, so I decided to learn it. However it was just too hard for me to grasp what was going on, so like a good programmer I decided to divide the problem into smaller parts and solve those :P.

The quicksort algorithm asks: what if we had an algorithm that takes a number (called the pivot) and divides our array into two parts, the ones smaller than the pivot and the ones larger than it? If an algorithm does this, we now have two arrays of numbers that we know don’t need to be compared. We could do the very same to those two arrays, then we keep doing this with the subarrays that are getting smaller and smaller in size, et voila! The resulting array is sorted. This is the essence of the quicksort algorithm.

The algorithm that partitions our arrays into subarrays that are all bigger or smaller than a certain number is called a partitioning algorithm. Most educational materials on quicksort conflate Lomuto’s and Hoare’s partitioning algorithms with the quicksort algorithm itself, so we could easily lose sight of the big picture. I have written a C quicksort program that is inefficient, yet easy to understand.

Let’s learn about the silliest partitioning algorithm possible; then we can easily learn quicksort, even when it uses the right partitioning algorithms.

Basic Partitioning

In the following code, we’ll partition an array of numbers into those bigger than the pivot and those smaller than the pivot in the silliest way: we go through all the numbers in the array one by one and put each number in one of the three temporary arrays we make — one temporary array for numbers bigger than the pivot, one for those smaller than the pivot, and one for those equal to it.

#include<stdio.h>
#include<stdlib.h>
#define MAX_INTS 100

// Counts from zero
void printArray(int arr[], int place){
    for(int i=0;i<=place;i++){
        printf("%d,",arr[i]);
    }
    puts("");
}

// Counts from zero
// Partitions everything around the last array element (_high_ is the pivot) for elements **low** to high (not zero to high), returns the last array's (pivot's) new placement in the partitioned array.
int basicPartition(int arr[], int low, int high){
    //nE/L/H == number of equals/lows/highs
    //idx is defined at a function scope here and used later to place the elements in their correct position
    int idx=low,nE=0,nL=0,nH=0;
    int pivot=arr[high];
    int Ls[MAX_INTS]={},Hs[MAX_INTS]={},Ps[MAX_INTS]={};

    // Element enumeration
    // Here we place the numbers into temporary high, low, or equal arrays. 
    for(int i=low;i<=high;i++){
        if(arr[i]>pivot){
            Hs[nH]=arr[i];
            nH++;
        } else if (arr[i]<pivot){
            Ls[nL]=arr[i];
            nL++;
        } else if (arr[i] == pivot){
            Ps[nE]=arr[i];
            nE++;
        }
    }

    // Place from temp to arr
    for(int i=0; i<nL;i++,idx++){
        arr[idx]=Ls[i];
    }
    for(int i=0; i<nE;i++,idx++){
        arr[idx]=Ps[i];
    }
    for(int i=0; i<nH;i++,idx++){
        arr[idx]=Hs[i];
    }

    // return the new position of the pivot 
    return low+nL+nE-1;
}

// Counts from zero
int quickSort(int arr[], int low, int high){
    if(low>=high){
        return 0;
    }
    int piv=basicPartition(arr, low, high);
    quickSort(arr, low, piv-1);
    quickSort(arr, piv+1, high);
    return 0;
}

int main(void){
    int arr[] = {12,899,93,789,1000,95,4,56,11,18,12,180,43,93};
    int n = sizeof(arr)/sizeof(int);
    /* printArray(arr,n-1); */
    /* int res = basicPartition(arr, 0, n-1); */
    int res = quickSort(arr, 0, n-1);
    printArray(arr,n-1);
    return EXIT_SUCCESS;
}

If I were to draw a picture it would look like this:

The problem with this implementation is that it uses O(n) in its partitioning. How can we improve that? Maybe instead of keeping everything in a temporary array of equal size, we could do in-place replacement? That’s what the Lomuto’s partitioning algorithm does. Let’s learn that.

Lomuto’s Partitioning Algorithm

So far we made a simple partitioning algorithm, but it left much to be desired in its memory usage. If we want to use less memory, how about we swap the elements instead of keeping them in temporary arrays? That’s what Lomuto’s algorithm does. Imagine two little soldiers that we hire for sorting. The first one is named i, and the second one is named j. Our little guy i’s job is to protect the weak. He creates a firm boundary and all the elements smaller than the pivot line up before him. Our adventurous fella j’s job is to foray into the unknown, find those weaker than or equal to the pivot, and tell them they can find refuge if they line up before i. So here is how the algorithm works:

the last element of the array is chosen as our pivot
i stands at low-1, behind where the array even begins.
j stands at low, where the array begins.
j marches forward, on its way forward, it finds an element that is lower than or equal to the pivot, once it does, it stops to let i know.
j and i cooperate, i moves one step forward to mark the new boundary, the element at the new boundary is swapped with the weak/equal element.
j keeps doing this and marching on until it reaches arr[high] which is our pivot.
When j is at one step behind arr[high], we already know that arr[high] is going to be equal to the pivot since the last element is the pivot itself, so we swap arr[i+1] with arr[high] manually without doing a condition check.
At the end of the process, we return i+1, which is where our boundary (i.e. our pivot’s position in array) is after this one final swap.

Intuition: In Lomuto’s algorithm we use i as a boundary, and everything before i is smaller than or equal to the pivot. So at each step if arr[j] is smaller than the pivot we increase our boundary and move the smaller numbers inside our boundary. Now, since everything before our pivot is smaller than or equal to it and everything after that is bigger than it, our pivot is in fact in its sorted position by the definition of sort.

How does this relate to quicksort? Using Lomuto’s algorithm, you have an algorithm that efficiently partitions an array, such that one number of your array (the pivot) is already in its sorted position, and you have two separate arrays before and after the pivot, in a way that they don’t need to be compared with each other since we already know some of them are always bigger than the others. Isn’t that great? If we now ignore the pivot which is now in its sorted position and keep partitioning for arrays that come before and after it, we’ll have a fully sorted array.

Here’s my implementation of quicksort using Lomuto’s partitioning algorithm. Try not to repeat the mistakes I made during writing this code:

I used i=-1 instead of i=low-1
I swapped arr[high] and arr[i] instead of arr[high] and arr[i+1] at the last step. The boundary is at i+1, since we’re not doing that i++ which is inside our loop. Everything up to i+1 is lower than or equal to the pivot which is always at arr[i+1] at the end of the algorithm.

#include<stdio.h>
#include<stdlib.h>
#define MAX_INTS 100

// Counts from zero
void printArray(int arr[], int place){
    for(int i=0;i<=place;i++){
        printf("%d,",arr[i]);
    }
    puts("");
}

void arrSwapInt(int arr[], int a, int b){
    int temp;
    temp = arr[b];
    arr[b] = arr[a];
    arr[a] = temp;
}

// Counts from zero
int lomutoPartition(int arr[], int low, int high){
    int i=low-1;
    int piv=arr[high];
    for(int j=low;j<high;j++){
        if(arr[j]<=piv){
            i++;
            arrSwapInt(arr,i,j);
        }
    }

    arrSwapInt(arr,high,i+1);
    return i+1;
}

// Counts from zero
int quickSort(int arr[], int low, int high){
    if(low>=high){
        return 0;
    }
    int piv=lomutoPartition(arr, low, high);
    quickSort(arr, low, piv-1);
    quickSort(arr, piv+1, high);
    return 0;
}

int main(void){
    int arr[] = {12,899,93,789,1000,95,4,56,11,18,12,180,43,93};
    int n = sizeof(arr)/sizeof(int);
    int res = quickSort(arr, 0, n-1);
    printArray(arr,n-1);
    return EXIT_SUCCESS;
}

Hoare’s Partitioning Algorithm

Now let’s learn Hoare’s partitioning algorithm.

In Lomuto’s our soldiers i and j had different jobs, now they have the same job description. The only difference between i and j is that i starts one step behind the beginning of the array and j starts from one step after the end of the array. They take rounds to march on and stop where they find elements that shouldn’t be where they are. That is, first i marches towards the end of the array and stops if it sees any element that is bigger than the pivot, and then j marches towards the beginning of the array and stops if it sees any element that is smaller than the pivot. Once they’re both stopped, we can very astutely move two elements to their correct place with one swap and that’s what we do. Our soldiers i and j both keep moving until they face each other (i==j) or get past one another (i>j). That is when they know they’re done.

In short, in Hoare’s algorithm, you move from both sides to the other, stop at the wrong elements from each side, and swap them. Note that just like the two algorithms above, at the end of the Hoare’s algorithm it is guaranteed that all items left of the returned j are smaller than the pivot, and all items right of the retuned j are bigger than the pivot. But unlike those two, the pivot itself doesn’t end up at the returned j and it won’t be at its sorted position. That is, while Hoare’s partitioning algorithm returns the boundary of our partitioned arrays (j), our pivot is not at j (even though we can move it with one extra swap since we already know the boundary, we don’t do this since at the end of the recursion each element will be in its sorted position and this is unnecessary). So this time in the quicksort algorithm, we don’t do quicksort on (low,piv-1) and (piv+1,high) which ignores the element at piv, instead we do (low,piv) (piv+1,high) which includes the element at piv.

Try not to repeat the mistakes I made while writing this code:

I forgot to put the = in the loop break condition, so my code would get stuck in a loop.
I forgot to make the pivot equal to arr[low] and made it equal to arr[0].

#include<stdio.h>
#include<stdlib.h>
#define MAX_INTS 100

// Counts from zero
void printArray(int arr[], int place){
    for(int i=0;i<=place;i++){
        printf("%d,",arr[i]);
    }
    puts("");
}

void arrSwapInt(int arr[], int a, int b){
    int temp;
    temp = arr[b];
    arr[b] = arr[a];
    arr[a] = temp;
}

// Counts from zero
// Returns piv_index instead of where the pivot is
// pivot is not in its sorted position
int hoarePartition(int arr[], int low, int high){
    // low, not zero!
    int piv=arr[low];
    int i=low-1,j=high+1;
    while(1){
        do {
            i++;
        } while (arr[i]<piv);
        do {
            j--;
        } while (arr[j]>piv);
        // if you don't put the equal, for 11,4,18,13 it will get stuck in a loop since the j that is returned will be 0 instead of 1.
        // since the pivot index is the same as the one that was called, it will get stuck in a loop, unless you break when both i and j are equal
        // breaking on j==i makes sense, since if i==j then we are already done comparing boundaries and at the correct pivot index.
        if(i>=j){
            break;
        }
        arrSwapInt(arr, i, j);
    }
    return j;
}

// Counts from zero
int quickSort(int arr[], int low, int high){
    if(low>=high){
        return 0;
    }
    int piv_index=hoarePartition(arr, low, high);

    // Different from Lomuto's and basic partitioning
    quickSort(arr, low, piv_index);
    quickSort(arr, piv_index+1, high);
    return 0;
}

int main(void){
    int arr[] = {12,899,93,789,1000,95,4,56,11,18,12,180,43,93};
    /* int arr[] = {11,4,18,13}; */
    int n = sizeof(arr)/sizeof(int);
    /* int res = hoarePartition(arr, 0, n-1); */
    int res = quickSort(arr, 0, n-1);
    puts("Sorted array:");
    printArray(arr,n-1);
    return EXIT_SUCCESS;
}

Extra Tips

Pivots: In Lomuto’s algorithm, the last element (arr[high]) must be our pivot, because during the quicksort call on (low,piv-1) and (piv+1,high) we assume that the pivot is in its sorted position (that is, at the end of the boundary, where we have done our last swap), so we must choose the last element as our pivot. But if we need another array element to be our pivot, we can always swap it with the last element before we’re calling the algorithms and make the chosen element the last element. But in Hoare’s all we need is an arbitrary number that, ideally, is larger that half of the numbers and smaller than the other half, but we can choose any number really.
Best case and worse case: Running the quicksort algorithm on an already sorted array will have a time complexity of O(n^2). So we usually choose a random element as our pivot (and if it is Lomuto’s, we need to swap it and make it the last element).
Stability of the sort algorithms: The basic partitioning algorithm is stable. But neither Lomuto’s nor Hoare’s algorithms are stable, meaning that for equal items (to which we have attached extra data) the order could be changed. That is {(1,red),(2,blue),(1,blue)} could become {(1,blue), (1,red), (2,blue)}. This would be really bad if we were sorting data in a database with timestamps for example.

Conclusion

Partition Method	Pivot Choice	Extra Memory	Pivot Position After Partition	Stability	Average Time	Worst-Case Time
Basic (temp arrays)	Doesn’t matter	O(n)	Pivot(s) end up in the middle of the concatenated lows-equals-highs block	Yes	O(n log n)	O(n²)
Lomuto	Must be the last element	O(1)	Pivot placed at its final sorted index (i+1)	No	O(n log n)	O(n²)
Hoare	Doesn’t matter but usually the first element	O(1)	Pivot not guaranteed in final position; partitions around boundary j	No	O(n log n)	O(n²)

Did you see what we did for sorting? There is much to be learned from what we did here. We used Divide and Conquer to solve our problem. At each section we only had an algorithm that could do a single sort (a partitioning algorithm can put at most one element in its sorted position) but we used this algorithm - which can’t be seen as a sorting algorithm - 1. multiple times and 2. on smaller and smaller subsets of our problem, so at the end sorting happened efficiently. Any time you’re thinking about a problem that can’t be solved efficiently, maybe you could think about single items and then use D&C to solve the problem.

TODO

I asked AI what I should learn next and here are its opinions:

You never mention the importance of pivot choice on expected running time or practical performance (cache behavior, branch prediction, etc.). Real-world quicksort typically uses “median-of-three” or introspective quicksort (Introsort) that switches to heapsort if recursion gets too deep.
Neither Lomuto’s nor Hoare’s implementations do tail recursion elimination. In the worst case (unbalanced splits), your recursion depth is O(n) and you risk a stack overflow. Best practice is to recurse on the smaller subarray first, then iterate on the larger one.
Discuss three-way partitioning for many-duplicate inputs. And I think I should:
Add pictures of our valiant soldiers i and j performing their duties in Lomuoto’s and Hoare’s algorithms.