
Heap Sort Algorithm

Whether it is for the role of software developer, coding engineer, software engineer, or any such position in the IT industry, heap sort is an essential part of the technical interview prep. 

In fact, it’s almost as if its primary use is cracking job interviews! It is rarely used in real-world scenarios, despite being one of the most interesting sorting algorithms. 

In this article, we’ll discuss:

  • What Is Heap Sort?
  • Applications of Heap Sort
  • How Does Heap Sort Work?
  • Heap Sort Algorithm
  • Heap Sort Pseudocode
  • Heap Sort Code
  • Heap Sort Complexities
  • Strengths and Weaknesses of Heap Sort
  • FAQs on Heap Sort

What Is Heap Sort? 

To understand how heap sort works, we first need to understand some basic concepts related to binary heaps. Feel free to skip them if you are already familiar with these concepts.

Binary Heap

A heap is a tree-based data structure in which all the tree nodes are in a particular order, such that the tree satisfies the heap properties (that is, there is a specific parent-child relationship that is followed throughout the tree). 

A heap data structure where the tree is a complete binary tree is referred to as a binary heap.

A complete binary tree is a binary tree in which all levels except the bottom-most level are completely filled, and all nodes in the bottom-most level are as far left as possible. (The last level may or may not be completely filled.)

A full binary tree is a binary tree where every node has 0 or 2 children.

Properties of a Binary Heap

1. They are complete binary trees: This means all levels are totally filled (except maybe the last level), and the nodes in the last level are as far left as possible. This property makes arrays a suitable data structure for storing binary heaps. 

We can easily calculate the indices of a node’s children. For a parent at index i, the left child is at index 2*i+1, and the right child is at index 2*i+2 (for indices that start with 0). Similarly, for a child at index i, its parent is at index (i-1)/2, using integer division.
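As a minimal sketch of this index arithmetic for a 0-based array (the helper names leftChild, rightChild, and parent are illustrative and not part of the article's code):

```cpp
#include <cstddef>

// Index arithmetic for a binary heap stored in a 0-based array.
// The function names are illustrative assumptions.
constexpr std::size_t leftChild(std::size_t i)  { return 2 * i + 1; }
constexpr std::size_t rightChild(std::size_t i) { return 2 * i + 2; }

// Only meaningful for i > 0: the root (index 0) has no parent.
constexpr std::size_t parent(std::size_t i)     { return (i - 1) / 2; }

// Compile-time spot checks: the children of index 1 are 3 and 4,
// and both of them map back to parent 1.
static_assert(leftChild(1) == 3 && rightChild(1) == 4, "child indices");
static_assert(parent(3) == 1 && parent(4) == 1, "parent index");
```

Note that the parent formula relies on integer division, so both children of a node map back to the same parent index.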

2. Heaps are typically of two types — max heap and min heap: In a max heap, the value of a node is always greater than or equal to the value of each of its children. Conversely, in a min heap, the value of a parent is always <= the value of each of its children.


3. In a max heap, the element at the root will always be the maximum. In a min heap, the element at the root will always be the minimum. The heap sort algorithm takes advantage of this property to sort an array using heaps. 
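To make the max heap property concrete, here is a small sketch that checks it on an array. The name isMaxHeap is a hypothetical helper; for the default comparison, the standard library's std::is_heap performs the same check.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Returns true when every parent is >= each of its children,
// i.e. the array satisfies the max heap property.
// isMaxHeap is a hypothetical helper name used for illustration.
bool isMaxHeap(const std::vector<int>& a) {
    for (std::size_t child = 1; child < a.size(); ++child) {
        std::size_t parent = (child - 1) / 2;
        if (a[parent] < a[child]) return false;  // child outranks its parent
    }
    return true;
}
```

For example, {91, 21, 77, 15, 6, 46} passes the check, while {77, 15, 91, 21, 6, 46} fails because 91 at index 2 is greater than its parent 77 at index 0.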

Heap Sort Definition

Heap sort is an efficient comparison-based sorting algorithm that creates a heap from the input array and then sorts the array by taking advantage of a heap's properties.

Please keep in mind, since the heap is a tree-based data structure, this also means that the knowledge of arrays, trees, binary trees, and heaps is key to understanding the heap sort algorithm.

Heapify Method

Before going into the workings of heap sort, we’ll visualize the array as a complete binary tree. Next, we turn it into a max heap by using a process we call heapification. 

The brilliance of heapification lies in the fact that if both subtrees of a node are already MaxHeaps, only that node can violate the heap property, and fixing it makes the whole tree rooted at that node a MaxHeap. One way to implement this idea would be:

  1. Start at the bottom of the tree.
  2. Iterate through all the nodes as we travel to the top.
  3. At each step, ensure that the node and all its children form a valid max heap. 

If we successfully do that, we will have transformed the whole binary tree into a valid MaxHeap after processing all the nodes. One way to optimize this process is by ignoring all the leaf nodes since they don't have any children:

  1. Go to the rightmost node in the second bottom-most layer, which has any children.
  2. Process that rightmost node to make sure it forms a MaxHeap with its children
  3. Traverse to the node to its left and repeat the process. 
  4. At the end of the layer, we jump to the rightmost node of the layer above it. 

This journey ends when we eventually reach the topmost node and process it.

Let’s see this in more detail:

  1. Compare the value of the node with the value stored in the child nodes. If the parent's value is greater than or equal to each of the values stored in the child nodes, do nothing.
  2. If the value of a child node is greater than that of the parent node, swap the values between the parent and child node. If both the child nodes have a higher value than the parent, swap the parent’s value with the child with the greater value.
  3. Now, for the child node that got updated, we repeat steps 1 and 2. 

If this sounds like a recursive method, that's because it is! We keep calling this method recursively for the child node that got updated until we reach a stage where that node is either a leaf or has children whose values are all less than or equal to its own.

You might have wondered why we decided to traverse bottom to top and not top to bottom. That's because steps 1-3 for heapifying a node only work if the child nodes are heapified already. 

At the end of this process, a max heap is fully formed. We can also make a min heap simply by changing the condition to “parent value should be <= each of its children’s values” (swap values if the condition isn’t met). 
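As a sketch of that change, the routine below mirrors the heapify function from the Heap Sort Code section of this article with both comparisons reversed. The names minHeapify and buildMinHeap are illustrative assumptions, not part of the article's code.

```cpp
#include <utility>

// Min heap counterpart of heapify: the only change from the max heap
// version is the direction of the two comparisons.
// minHeapify is an illustrative name.
void minHeapify(int array[], int sizeHeap, int parentIndex) {
    int smaller = parentIndex;
    int leftChildIndex = 2 * parentIndex + 1;
    int rightChildIndex = 2 * parentIndex + 2;

    if (leftChildIndex < sizeHeap && array[leftChildIndex] < array[smaller])
        smaller = leftChildIndex;
    if (rightChildIndex < sizeHeap && array[rightChildIndex] < array[smaller])
        smaller = rightChildIndex;

    // Swap and continue downward only if a child is smaller than the parent.
    if (smaller != parentIndex) {
        std::swap(array[parentIndex], array[smaller]);
        minHeapify(array, sizeHeap, smaller);
    }
}

// Bottom-up construction: process every non-leaf node, last one first.
void buildMinHeap(int array[], int sizeArray) {
    for (int i = sizeArray / 2 - 1; i >= 0; --i)
        minHeapify(array, sizeArray, i);
}
```

After buildMinHeap runs, array[0] holds the minimum element.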

Have a look at the following example:
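As one way to follow the process, the sketch below builds a max heap from the array {77, 15, 91, 21, 6, 46} (the same input used in the code later in this article) and records the array after each non-leaf node is processed. buildMaxHeapTrace is a hypothetical helper written for this walkthrough.

```cpp
#include <utility>
#include <vector>

// Sift the value at parentIndex down until the subtree rooted there
// is a max heap (assumes both child subtrees already are).
void heapify(std::vector<int>& a, int sizeHeap, int parentIndex) {
    int larger = parentIndex;
    int left = 2 * parentIndex + 1;
    int right = 2 * parentIndex + 2;
    if (left < sizeHeap && a[left] > a[larger]) larger = left;
    if (right < sizeHeap && a[right] > a[larger]) larger = right;
    if (larger != parentIndex) {
        std::swap(a[parentIndex], a[larger]);
        heapify(a, sizeHeap, larger);
    }
}

// Builds a max heap bottom-up and records the array after each
// non-leaf node has been processed (hypothetical helper).
std::vector<std::vector<int>> buildMaxHeapTrace(std::vector<int> a) {
    std::vector<std::vector<int>> snapshots;
    int n = static_cast<int>(a.size());
    for (int i = n / 2 - 1; i >= 0; --i) {
        heapify(a, n, i);
        snapshots.push_back(a);
    }
    return snapshots;
}
```

For the input above, the three snapshots are:

  • After index 2: 77 15 91 21 6 46 (91 is already greater than its only child, 46)
  • After index 1: 77 21 91 15 6 46 (15 is swapped with its larger child, 21)
  • After index 0: 91 21 77 15 6 46 (77 is swapped with 91, and 77 then stays above 46)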

When sorting in-place, a max heap can be used to sort the array in ascending order, and a min heap can be used to sort the array in descending order. If sorting doesn’t have to be in-place, we can use an auxiliary array to place the extracted element from the heap’s top in its correct position, whether we use a min heap or a max heap for the sorting. 

But even when sorting is not the aim, a min/max heap in itself is a useful construction. The root element of a max heap always contains the maximum element, and that of a min heap always has the minimum element. This quality of heaps can come in handy when we want to extract only the largest or smallest element from an array without keeping the remaining items in the sorted order. 

Applications of Heap Sort 

  • Implementation of priority queues
  • Security systems
  • Embedded systems (for example, Linux Kernel)

Because algorithms like merge sort and quicksort are usually faster in practice, heap sort has limited usage. Heaps themselves, however, are extensively used for problems like getting the largest or smallest elements in an array, sorting an almost sorted array, etc.
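As an illustration of the "largest elements" use case, here is a minimal sketch that keeps the k largest values seen so far in a min heap built on std::priority_queue. The function name kLargest is an assumption made for this example.

```cpp
#include <cstddef>
#include <functional>
#include <queue>
#include <vector>

// Returns the k largest values of `values` in ascending order.
// A min heap of size k holds the best candidates seen so far; its root
// is the smallest of them and is evicted when a larger value arrives.
// kLargest is an illustrative helper name.
std::vector<int> kLargest(const std::vector<int>& values, std::size_t k) {
    std::priority_queue<int, std::vector<int>, std::greater<int>> minHeap;
    for (int v : values) {
        if (minHeap.size() < k) {
            minHeap.push(v);
        } else if (k > 0 && v > minHeap.top()) {
            minHeap.pop();
            minHeap.push(v);
        }
    }
    // Drain the heap: values come out smallest first.
    std::vector<int> result;
    while (!minHeap.empty()) {
        result.push_back(minHeap.top());
        minHeap.pop();
    }
    return result;
}
```

For {77, 15, 91, 21, 6, 46} and k = 3, this returns 46, 77, 91. Because the heap never grows beyond k elements, the work is O(N log k) rather than the O(N log N) of a full sort.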

How Does Heap Sort Work?

Now that we’ve learned how to create a heap from an array using the heapify method, we will look into using the heap to sort the array.

After the heap formation using the heapify method, the sorting is done by:

  1. Swapping the root element with the last element of the array and decreasing the length of the heap array by one. In heap representation, it is equivalent to swapping the root with the bottommost and rightmost leaf and then deleting the leaf.
  2. Restoring heap properties (reheapification) after each deletion, where we need to apply the heapify method only on the root node as the subtree heaps will still have their heap properties intact at the beginning of the process.
  3. Repeating this process of root removal, its storage in the position of the highest index value used by the heap, and heap length decrement, until every element in the array is sorted.

On a max heap, this process will sort the array in ascending order. On a min heap, it will sort in descending order.

This process can be best illustrated using an example:
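As a sketch of these steps, the code below starts from the max heap {91, 21, 77, 15, 6, 46} and records the array after every swap-and-reheapify round. extractionTrace is a hypothetical helper written for this walkthrough.

```cpp
#include <utility>
#include <vector>

// Sift the value at parentIndex down within the first sizeHeap elements.
void heapify(std::vector<int>& a, int sizeHeap, int parentIndex) {
    int larger = parentIndex;
    int left = 2 * parentIndex + 1;
    int right = 2 * parentIndex + 2;
    if (left < sizeHeap && a[left] > a[larger]) larger = left;
    if (right < sizeHeap && a[right] > a[larger]) larger = right;
    if (larger != parentIndex) {
        std::swap(a[parentIndex], a[larger]);
        heapify(a, sizeHeap, larger);
    }
}

// Runs the extraction phase on an array that is already a max heap and
// records the array after each round (hypothetical helper).
std::vector<std::vector<int>> extractionTrace(std::vector<int> heap) {
    std::vector<std::vector<int>> snapshots;
    for (int last = static_cast<int>(heap.size()) - 1; last >= 1; --last) {
        std::swap(heap[0], heap[last]);  // current maximum moves to its final slot
        heapify(heap, last, 0);          // heap now occupies indices 0..last-1
        snapshots.push_back(heap);
    }
    return snapshots;
}
```

The five rounds produce the following arrays, where the bar separates the shrinking heap from the sorted tail:

  • Round 1: 77 21 46 15 6 | 91
  • Round 2: 46 21 6 15 | 77 91
  • Round 3: 21 15 6 | 46 77 91
  • Round 4: 15 6 | 21 46 77 91
  • Round 5: 6 | 15 21 46 77 91

Round 5 shows that a two-element max heap (15 6) still needs one final swap before the array is in ascending order.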

The process ends when the heap size reaches 1. A two-element max heap still needs one final swap, because its root holds the larger of the two values.

So basically, the heap sort algorithm has two parts that repeat as long as heap size >= 2:

  • Making sure the currently unsorted elements of the array form a heap (a full build the first time, then a single heapify call on the root).
  • Swapping the root element with the last element of the heap (rightmost leaf node) and reducing heap size by 1.

Heap Sort Algorithm

Here’s the algorithm for heap sort:

Step 1: Build Heap. Build a heap from the input data. Build a max heap to sort in increasing order, build a min heap to sort in decreasing order.

Step 2: Swap Root. Swap the root element with the last item of the heap. 

Step 3: Reduce Heap Size. Reduce the size of the heap by 1.

Step 4: Re-Heapify. Heapify the remaining elements into a heap of the new heap size by calling heapify on the root node.

Step 5: Repeat. Repeat steps 2, 3, and 4 as long as the size of the heap is greater than 1.

Each time, the last array position is discarded from the heap once it contains the right element. The process is repeated until all the input array elements are sorted. This happens when the heap size is reduced to 1, since the single remaining element is then already in its final position at index 0.
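The same steps can also be sketched with the standard library's heap algorithms, which may help when comparing a hand-written version against a known-good one. std::make_heap and std::pop_heap are real functions from <algorithm>; the wrapper name heapSortStd is an assumption made for this example.

```cpp
#include <algorithm>
#include <vector>

// Heap sort expressed with the standard heap algorithms.
// heapSortStd is an illustrative name.
void heapSortStd(std::vector<int>& values) {
    // Step 1: arrange the whole range as a max heap.
    std::make_heap(values.begin(), values.end());

    // Steps 2-5: pop_heap moves the root to the last position of the
    // current heap range and restores the heap property on the rest.
    for (auto end = values.end(); end - values.begin() > 1; --end)
        std::pop_heap(values.begin(), end);
}
```

The loop is equivalent to a single call to std::sort_heap(values.begin(), values.end()); it is spelled out here only to line up with the numbered steps.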

Heap Sort Pseudocode

Following is the pseudocode for heap sort. Please have a look and try to implement this in a programming language of your choice.


Input: Array A, size N
heapSort()
For all non-leaf elements (i=N/2-1;i>=0;i--)
      Heapify the subtree rooted at A[i]
Initialize indexEnd=N-1
While indexEnd>=1
      Swap(A[0],A[indexEnd])
      indexEnd=indexEnd-1
      Apply heapify on the root node, considering the array from A[0] to A[indexEnd]
Output the sorted array A
end heapSort()

Heap Sort Code

We have implemented the heap sort algorithm to sort in ascending order in C++:


The Heap Sort Program
#include <iostream>
#include <utility>
  using namespace std;
  
  void heapify(int array[], int sizeHeap, int parentIndex) 
  {
    // Establishing relationship between indices of a node and indices of 
    // its left and right children 
    int larger = parentIndex;
    int leftChildIndex = 2 * parentIndex + 1;
    int rightChildIndex = 2 * parentIndex + 2;
    
    // Making sure the parent is greater than or equal to its left and right 
    // children
    if (leftChildIndex < sizeHeap && array[leftChildIndex] > array[larger])
      larger = leftChildIndex;
  
    if (rightChildIndex < sizeHeap && array[rightChildIndex] > array[larger])
      larger = rightChildIndex;
  
    // Swap and heapify if parent/root is not largest
    if (larger != parentIndex) 
    {
      swap(array[parentIndex], array[larger]);
      heapify(array, sizeHeap, larger);
    }
  }
  
 
  void heapSort(int array[], int sizeArray) 
  {
  	
    // Creating max heap, iterating over all non-leaf indices, since leaf 
    // indices don't have children to check for

    for (int nonleafNodeIndex = sizeArray / 2 - 1; nonleafNodeIndex >= 0; nonleafNodeIndex--)
      heapify(array, sizeArray, nonleafNodeIndex);
  
    // Root element of the heap is swapped with the last heap index, and 
    // the heap shrinks by one each time. The last swap happens when the 
    // heap holds 2 elements (last heap index is 1)

    for (int lastHeapIndex = sizeArray - 1; lastHeapIndex >= 1; lastHeapIndex--) 
    {
      swap(array[0], array[lastHeapIndex]);
  
      // Heapifying root element so that the highest element is again at the 
      // root
      heapify(array, lastHeapIndex, 0);
    }
    
  }
  
 
  int main() 
  {
    int array[] = {77, 15, 91, 21, 6, 46};
    
    int sizeArray = sizeof(array) / sizeof(array[0]);
    
    heapSort(array, sizeArray);
  
    for (int i = 0; i < sizeArray; ++i)
      cout << array[i] << " ";
  }

Output: 6 15 21 46 77 91

Heap Sort Complexity

Heap Sort Time Complexity

The time complexity of heap sort is non-quadratic and comes out the same in the best, worst and average cases: O(nlogn)

Let’s see how.

(Note: The following sections are based on working with MaxHeaps)

Time Complexity of the Heapify Method

The heapify method is run on a node whose child nodes are already heapified. 

The worst-case run time will be experienced when the heapify method is run on a node that is smaller than all of its descendants. This means the node has to be swapped down through every level of its subtree until it reaches the leaf level. So the worst-case run time will be a function of the height of the subtree, h. 

Thus, the worst-case time complexity of each heapify method invocation is O(h). This height h is not a constant. At the bottom of the tree, h is 0, and at the top of the tree, h is approximately log2(N). 

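The time complexity for calling the heapify method for all the nodes of the tree (from bottom to top) follows from two facts: a heap of N nodes has at most ⌈N/2^(h+1)⌉ nodes at height h, and heapifying each of them costs O(h):

$$T(N) = \sum_{h=0}^{\lg N} \left\lceil \frac{N}{2^{h+1}} \right\rceil O(h) = O\left(N \sum_{h=0}^{\lg N} \frac{h}{2^{h}}\right) = O\left(N \sum_{h=0}^{\infty} \frac{h}{2^{h}}\right)$$

Taking advantage of the properties of Big-O notation, in the last step, we raised the upper limit of the summation from lg(N) to ∞. This will help us simplify the calculation. We’ll do so with the help of known mathematical properties involving the summation of numeric expressions from 0 to ∞.

We will use the following mathematical property, which holds for |x| < 1:

$$\sum_{k=0}^{\infty} k x^{k} = \frac{x}{(1-x)^{2}}$$

We can notice that in our equation, we can use the above property by replacing x with 1/2. So, our equation now becomes:

$$T(N) = O\left(N \cdot \frac{1/2}{(1 - 1/2)^{2}}\right) = O(2N) = O(N)$$

Thus, the first step of heap sort, which is building a heap out of a randomly arranged array, can be done in O(N).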

Time Complexity of Getting a Sorted Array Out of a Max Heap

This step involves swapping the leftmost value in the array with the rightmost value in the array occupied by the heap and reheapification of the new smaller heap. 

Swapping the max element with the bottom level rightmost element and reducing the size of the heap can be done in constant time, O(1).

Now, let’s discuss reheapification. In the worst case, the new value at the root position will have to be swapped log(N) times to be sent to the bottom of the heap to achieve a MaxHeap once again. So each reheapification after the extraction costs O(logN). 

We will be performing this extraction N-1 times, so the total time complexity of getting a sorted array out of a MaxHeap is O(N*log(N)).

Total Time Complexity of Heap Sort

The total time complexity of heap sort can be calculated as:

Time for creating a MaxHeap + Time for getting a sorted array out of a MaxHeap

=O(N) +O(Nlog(N))

=O(Nlog(N))

Heap Sort Space Complexity

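Heap sort works in place: apart from a few index variables, it needs no auxiliary storage, so its space complexity is O(1). The recursive heapify shown above can additionally use up to O(logN) call-stack space unless it is written as a loop.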
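A recursive heapify can use call-stack space proportional to the height of the heap. As a sketch of how to avoid that, the version below replaces the recursion with a loop while keeping the same comparisons. The names heapifyIterative and heapSortIterative are assumptions made for this example.

```cpp
#include <utility>

// Loop-based heapify: same comparisons as the recursive version, but the
// descent is a while loop, so no call-stack frames accumulate.
// heapifyIterative is an illustrative name.
void heapifyIterative(int array[], int sizeHeap, int parentIndex) {
    while (true) {
        int larger = parentIndex;
        int leftChildIndex = 2 * parentIndex + 1;
        int rightChildIndex = 2 * parentIndex + 2;

        if (leftChildIndex < sizeHeap && array[leftChildIndex] > array[larger])
            larger = leftChildIndex;
        if (rightChildIndex < sizeHeap && array[rightChildIndex] > array[larger])
            larger = rightChildIndex;

        if (larger == parentIndex) return;  // subtree is already a max heap
        std::swap(array[parentIndex], array[larger]);
        parentIndex = larger;               // continue from the child we swapped into
    }
}

// Ascending heap sort built on the loop-based heapify.
void heapSortIterative(int array[], int sizeArray) {
    for (int i = sizeArray / 2 - 1; i >= 0; --i)
        heapifyIterative(array, sizeArray, i);
    for (int last = sizeArray - 1; last >= 1; --last) {
        std::swap(array[0], array[last]);
        heapifyIterative(array, last, 0);
    }
}
```

With this variant, the only extra memory is a handful of integer variables, regardless of the array size.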

Strengths of Heap Sort

  • No quadratic worst-case run time.
  • It is an in-place sorting algorithm and performs sorting in O(1) space complexity. 
  • Compared to quicksort, it has a better worst-case time complexity — O(nlog n).
    The best-case complexity is the same for both quick sort and heap sort — O(nlog n).
  • Unlike merge sort, it does not require extra space.
  • The input data being completely or almost sorted doesn’t make the complexities suffer.
  • The average-case complexity is the same as that of merge sort and quicksort.

Weaknesses of Heap Sort

  • Heap sort is typically not stable, since the operations on the heap can change the relative order of items with equal keys.
  • If the input array is very large and doesn’t fit into memory, and partitioning the array is faster than maintaining the heap, heap sort isn’t a good option. In such cases, something like merge sort or bucket sort, where parts of the array can be processed separately and in parallel, works best. 

Heap Sort FAQs

Question 1: Does the heap data structure have to be binary-tree-based?

No, a heap does not always need to be a binary tree. But in heap sort, we use arrays to represent the heap. We can easily calculate and track the relationship between a parent index, its left child index, and right child index for a binary heap using the array. And a binary heap has to be binary-tree-based.

Question 2: Can heap sort be made stable?

While heap sort is typically not stable, it can be made stable by taking into account the position of the elements with the same value. During heapification, treat the element towards the right as greater than the element towards the left, and your sorting will be stable. 
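One way to apply this idea is to pair every key with its original position and let the position break ties. The sketch below does that with std::pair, whose built-in comparison is lexicographic, and the standard heap algorithms. Record and stableHeapSort are hypothetical names, and the approach assumes O(N) extra space for the index pairs is acceptable.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct Record {
    int key;
    std::string label;
};

// Stable heap sort by key: ties are broken by original position, so the
// record that started further right compares as greater.
// Record and stableHeapSort are hypothetical names for this sketch.
std::vector<Record> stableHeapSort(const std::vector<Record>& input) {
    // Pair each key with its original index; std::pair compares
    // lexicographically, which gives exactly the tie-break we need.
    std::vector<std::pair<int, std::size_t>> keyed;
    for (std::size_t i = 0; i < input.size(); ++i)
        keyed.push_back({input[i].key, i});

    std::make_heap(keyed.begin(), keyed.end());
    std::sort_heap(keyed.begin(), keyed.end());

    // Rebuild the records in sorted order.
    std::vector<Record> sorted;
    for (const auto& entry : keyed)
        sorted.push_back(input[entry.second]);
    return sorted;
}
```

For the input (2, a), (1, b), (2, c), (1, d), the result is b, d, a, c: records with equal keys keep their original relative order. The trade-off is that the sort is no longer in place.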

Question 3: Why are arrays used to visualize and implement binary heaps?

Storing and accessing values in an array is faster and less complicated than using a more complex data structure. One of the main advantages of using more complex data structures is the use of methods provided by the standard library for common operations related to the data structure, e.g., push() and pop() methods for a stack. 

However, storing a complete binary tree in an array still allows us to perform all operations relevant to the tree with much ease. We can find the left child, right child, parent node, root, and the last element of a tree with basic arithmetic operations on the index of the current node or the variable maintaining the size of the tree.

Question 4: How much time does it take to find the maximum and minimum element in a max heap?

The maximum element is present at the root and can be found in O(1) time. The minimum element will be present in the leaf nodes, and all leaf nodes have to be checked to find the minimum element. Hence, the minimum element can be found in O(n) time.
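As a small sketch of the leaf scan: in a 0-based array of size n, the leaves occupy indices n/2 through n-1, so only that half needs to be examined. The function name minOfMaxHeap is an assumption made for this example, and it expects a non-empty heap.

```cpp
#include <algorithm>
#include <vector>

// Smallest value in a non-empty max heap stored in a vector.
// Leaves occupy indices size/2 .. size-1, so only that half is scanned.
// minOfMaxHeap is an illustrative name.
int minOfMaxHeap(const std::vector<int>& heap) {
    return *std::min_element(heap.begin() + heap.size() / 2, heap.end());
}
```

For the max heap {91, 21, 77, 15, 6, 46}, the leaves are 15, 6, and 46, and the function returns 6. The scan still touches about n/2 elements, which is why the cost remains O(n).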

----------

Article contributed by Tanya Shrivastava