Interview Kickstart has enabled over 3500 engineers to uplevel.

For tech interviews, an understanding of sorting algorithms is crucial at any level. When we think about interview questions on sorting algorithms, topological sort (aka topo sort) is an important topic, as it can help solve complicated interview questions with ease.

If you’re preparing for a technical interview — for the role of software engineer, coding engineer, software developer, or other such positions — you’ll benefit from understanding topological sort and practicing topological sorting questions.

In this article, we’ll discuss:

- What Is Topological Sort?
- Kahn’s Algorithm
- Cyclic Graphs of Topological Sort Algorithm
- Applications of Topological Sort
- How Does Topological Sort Work?
- Topological Sort Algorithm
- Topological Sort Pseudocode
- Topological Sort Code
- Topological Sort Complexities
- Strengths and Weaknesses of Topological Sort
- FAQs on Topological Sort

Often, in life and work, the order in which we do things is important. Every task requires completing a few prerequisite tasks first.

For example, to eat food, we need to buy groceries, prep the ingredients, cook the food, clean the plates, and finally serve the food on the plates before we can eat it. We can’t prep the ingredients before we buy them. But we can clean the plates first or buy groceries first as they have no tasks that must be done before they can be attempted.

This is the essence of the topological sort. We have a list of tasks, where some tasks have a condition that some other tasks must occur before them. Such tasks can be visualized as a graph with directed edges.

Going back to the example:

Here, the nodes in the directed graph represent the tasks, and the directed edges between nodes tell us which task has to be done before the other. With the tasks and dependencies represented as a directed graph, we can use topological sort and find all the possible valid ways to complete a task.

Topological sorting is a linear ordering defined for vertices of a directed acyclic graph (DAG). For every directed edge (u,v), vertex u comes before vertex v in the topologically sorted order. This means that topological sorting for a cyclic graph is not possible. We discuss the reasons for this later in the article.

Topological sorting is used mainly when tasks or items have to be ordered, where some tasks or items have to occur before others can. For example:

- Scheduling jobs
- Course arrangement in educational institutions
- Compile-time build dependencies

Now that we understand the concept of topological sorting, let us look at how we can implement it using an algorithm. We’ll also see how it works with the help of an example.

Kahn’s algorithm takes a directed acyclic graph and sorts its nodes in topological order. It works by keeping track of the number of incoming edges into each node, also called in-degree. It repeatedly:

- Finds nodes with no incoming edge, that is, nodes with zero in-degree (no dependency)
- Stores the nodes with zero in-degree
- Removes all outgoing edges from the nodes with zero in-degree
- Decrements the in-degree for each node after a connected edge removal

(Steps 3 and 4 can be performed simultaneously.)

This process is done until no element with zero in-degree can be found. This can happen when the topological sorting is complete and also when a cycle is encountered. Therefore, the algorithm checks if the result of topological sorting contains the same number of elements that were supposed to be sorted:

- If the numbers match, then no cycle was encountered, and we print the sorted order.
- If a cycle was encountered, the topological sorting was not possible. This is conveyed by returning null or printing a message.

Example:

Here, “In” = indegree

**Order of removal: **First 0, then 1 and 2, then 3, and finally, 4 and 5.

**Topological order:** 0,1,2,3,4,5

**Note:** This is not the only possible topological order.

For example, 0,1,3,4,5,2 is also a valid topological order.

If we had no clothes to go out, and we needed to go out to get clothes, we would be in a cycle where one task depends on the other, and no task independent of a prerequisite exists. Therefore, we can’t begin!

This is why topological sorting doesn’t work for cyclic graphs. In any graph where there’s a cycle, there comes a time where there’s no node left that is devoid of dependencies.

To understand the process more concretely, let’s take an example:

Edges: { 0, 1 }, { 0, 2 }, { 1, 3}, {1, 4), { 3, 4}, { 3, 5 }

In this example, we start with the node with in-degree zero, remove that node and its edges, add it to our sorted list.

Then, we decrement the in-degrees of the nodes it was connected to and repeat all these steps until all elements are sorted. Note that there may exist more than one valid topological ordering in an acyclic graph, as is the case in this example as well.

Now that we’ve looked at Kahn’s Algorithm for topological sorting in detail with an example, we can condense it as the topological sort algorithm:

**Step 0:** Find indegree for all nodes.

**Step 1:** Identify a node with no incoming edges.

**Step 2:** Remove the node from the graph and add it to the ordering.

**Step 3:** Remove the node’s out-going edges from the graph.

**Step 4:** Decrement the in-degree where connected edges were deleted.

**Step 5: **Repeat Steps 1 to 4 till there are no nodes left with zero in-degree.

**Step 6: **Check if all elements are present in the sorted order.

**Step 7:** If the result of Step 6 is true, we have the sorted order. Else no topological ordering exists.

**Step 8:** Exit.

Now we can represent the topological sort algorithm as pseudocode:

topologicalSort()

For(vertex=0; vertex<inputSize; vertex++)

Find indegree[vertex]

while(node with in-degree zero exists)

{

Find vertex U with in-degree = 0

Remove U and all its edges (U, V) from the graph.

For vertices where edges connected to them were removed.

in-degree[vertex]=in-degree[vertex]-1

)

if(elements sorted = all elements)

Return or Print nodes in topologically sorted order

Else

Return null or Print no topological ordering exists

end topologicalSort()

Now that we have looked at the algorithm and pseudocode, let us see how it is implemented in C++.

// Topological Sort C++ Code

#include <iostream>

#include <vector>

using namespace std;

// Structure to store an edge of a graph

struct Edge {

int from, to;

};

// A class that represents a graph

class Graph

{

public:

// Each element of an adjacency list contains a node and every other node it points to.

vector<vector<int>> adjacencyList;

// To store indegree of a node

vector<int> indegree;

// Constructing a graph

Graph(vector<Edge> const &Edges, int allNodes)

{

// Resizing the vector to hold all nodes

adjacencyList.resize(allNodes);

// Initializing indegree of all nodes to zero

indegree.assign(allNodes, 0);

// Adding directed edges to the graph start node-->end node

for (auto &edge: Edges)

{

// Adding an edge from start node to end node

adjacencyList[edge.from].push_back(edge.to);

// Incrementing in-degree of end node by 1

indegree[edge.to]++;

}

}

};

bool topologicalSort(Graph const &inputGraph, vector<int> &sorted, int allNodes)

{

vector<int> indegree = inputGraph.indegree;

// Storing all the nodes with no incoming edges

vector<int> zeroIndegreeNodes;

for (int i = 0; i < allNodes; i++)

{

if (indegree[i]==0) {

zeroIndegreeNodes.push_back(i);

}

}

while (!zeroIndegreeNodes.empty())

{

// Deleting fromNode from zeroIndegreeNodes

int fromNode = zeroIndegreeNodes.back();

zeroIndegreeNodes.pop_back();

// Pushing fromNode into sorted

sorted.push_back(fromNode);

for (int toNode: inputGraph.adjacencyList[fromNode])

{

// Deleting from the graph the edge from fromNode to toNode

indegree[toNode] -= 1;

// If the updated in-degree of toNode is now zero, inserting toNode into zeroIndegreeNodes

if (indegree[toNode]==0) {

zeroIndegreeNodes.push_back(toNode);

}

}

}

// A cycle was encountered if the sorting is done for all zero in-degree nodes in the loop above, but not all input nodes have been sorted

if( (int) sorted.size()!=allNodes)

{

return false;

}

return true;

}

int main()

{

// Count of all the nodes in the graph

int allNodes = 6;

// All the edges of the graph

vector<Edge> edges =

{

{ 0, 1 }, { 0, 2 }, { 1, 3}, {1, 4}, { 3, 4}, { 3, 5 }

};

// A vector to store elements in the sorted order

vector<int> sorted;

// Building a graph, given the number of nodes and all it edges

Graph inputGraph(edges, allNodes);

// Topological sorting. Printing the topologically sorted order if one exists.

if (topologicalSort(inputGraph, sorted, allNodes))

{

for (int i;i< (int) sorted.size();i++)

{

cout << sorted[i] << " ";

}

}

else

{

cout << "Topological sorting is not possible as the graph isn't acyclic";

}

return 0;

}

The output is 0 2 1 3 5 4, which is another valid topological ordering of the directed acyclic graph we discussed in the example earlier. Nodes (2,1) and (5, 4) reach in-degree zero at the same time; therefore, they can be written in any order relative to each other. So, 0 | 1,2 | 3 | 4,5 is just as valid as 0 | 2,1 | 3 | 5,4. Of course, there are even more valid topological orders possible in this case.

Let’s look at the time and space complexity of topological sorting. We’ll also explain each complexity in detail.

The time complexity of topological sort using Kahn’s algorithm is **O(V+E),** where V = Vertices, E = Edges.

To find out the time complexity of the algorithm, let's try to find the time complexity of each of the steps:

- To determine the in-degree of each node, we will have to iterate through all the edges of the graph. So the time complexity of this step is
**O(E)**. - Next, look for nodes with in-degrees equal to 0. This will require us to iterate through the entire array that stores the in-degrees of each node. The size of this array is equal to V. So, the time complexity of this step is
**O(V)**. - For each node with an in-degree equal to zero, we remove it from the graph and decrement the in-degrees of the nodes it connected to. In the entire run of the algorithm, the number of times we have to perform the decrement operation is equal to the number of edges. So it will take
**O(E)**time. - Whenever we decrement the in-degree of a node, we check if the node has achieved in-degree = 0 status. When this happens, we move it to the stack. In the entire run of the algorithm, it can occur at max (V-1) times. So the run time of this operation is
**O(V).** - In the end, we compare the size of the resultant topologically sorted array and the size of the graph to ensure that it was not a cyclic graph. This operation is done in
**O(1)**time.

So, adding all the individual run times, the time complexity of topological sort is **O(V+E)**.

This is the best case, worst case, and average-case time complexity of this algorithm since we perform the same steps regardless of how the elements are organized.

Auxiliary space required is** O(V).**

- We have to create one array to store the in-degrees of all the nodes. This will require
**O(V)**space. - We have to store the nodes with in-degree = 0 in a data structure (stack or queue) when we remove them from the graph. In the worst case, the data structure will store all the graph nodes, requiring
**O(V)**space. - Finally, we need an array to store all the nodes in sorted order. This will naturally require
**O(V)**space.

Adding all three, we arrive at the space complexity of **O(V)**.

Strengths:

- Requires linear time and linear space to perform.
- Effective in detecting cyclic dependencies.
- Can efficiently find feedback loops that should not exist in a combinational circuit.
- Can be used to find the shortest path between two nodes in a DAG in linear time.

Weaknesses:

- Topological sort is not possible for a graph that is not directed and acyclic.

**Question 1: What happens when we try to topologically sort a graph with nodes not connected by any edges?**

Imagine a task that does not have any prerequisites and is not a blocker for any other tasks. It can be performed at any time regardless of whether other tasks have been completed or not. Thus, it is valid for these nodes to appear in any position in a topologically sorted array. It will depend on the implementation of the algorithm where they eventually end up.

**Question 2: Why is it so that there can be multiple correct results for a topological sort?**

As already illustrated in the previous question, the position of nodes in the final topologically sorted array only needs to satisfy one criterion — for each node, all its dependencies should be found to its left. This condition can be fulfilled by arranging the nodes in multiple ways, as long as we are not placing a node to the right of a dependent node.

If you're having trouble understanding this idea, imagine a graph that has N nodes but only one directed edge connecting node A to node B. To topologically sort the nodes, we can arrange the N nodes in any way while only taking care of the constraint that A should lie to the left of B.

If you’re looking for guidance and help with getting started, sign up for our free webinar. As pioneers in the field of technical interview prep, we have trained thousands of software engineers to crack the toughest coding interviews and land jobs at their dream companies, such as Google, Facebook, Apple, Netflix, Amazon, and more!

----------

*Article contributed by Tanya Shrivastava*