There are n courses to take, they are referred to by numbers from 0 to n-1. Some of them have prerequisites, e.g. courses A and B must be completed before course C can be taken (in other words, course C depends on A and B).
Find and return an ordering in which all the given courses can be taken while satisfying all the prerequisites. If there exists more than one such ordering, any one of them would be a correct answer. If no such ordering exists, return a special value: [-1].
Input: n=4, prerequisites=[[1, 0], [2, 0], [3, 1], [3, 2]]
Output: [0, 1, 2, 3]
According to the input,
● course 0 must be done before both 1 and 2,
● courses 1 and 2 must be done before course 3.
There are two orderings in which one can take all four courses: [0, 1, 2, 3] and [0, 2, 1, 3]. Both are correct answers.
Function accepts two arguments: the number of courses n and the list of prerequisites.
Prerequisites are given in the form of a two-dimensional array (or a list of lists) of integers. Each inner array has exactly two elements, so it is essentially a list of pairs. Each pair [X, Y] represents one prerequisite: course Y has to be completed before X can be taken (X depends on Y, in other words).
Function must return an array (list) of integers.
If all given courses can be taken while satisfying all given prerequisites, the returned array must contain a possible ordering (if more than one such ordering exists, any one must be returned). Otherwise the function must return an array (list) with one element -1 in it.
• 1
• 0
nodes’ in-degree. They have the same time and space complexity in the big-O notation in terms of number of courses and number of prerequisites.
Both algorithms need a directed graph to work with; let us build one. Courses become graph nodes. Prerequisites become directed edges: for each “course A must be completed before B can be taken” we add an edge from A->B. In other words, an incoming edge means that another course must be completed before this one.
For example, if n=4 and prerequisites=[[1, 0], [2, 0], [3, 1], [3, 2]], the graph would look like this:
Having built such a graph, now a correct answer is a topological ordering of the nodes. The two sample solutions discussed below are essentially two different implementations of the topological sort algorithm.
Topological ordering doesn’t exist if the graph has a cycle. Both sample solutions detect cycles in their respective ways and return the special value in that case.
Topological sort implementation via DFS goes like this:
● Run top level DFS (iterate through all nodes and run DFS from each unvisited one) on the graph.
● Remember the order in which the nodes were “finished with” by DFS.
● Return the reverse of that order.
That’s what dfs_solution.cpp does.
It uses a standard DFS approach for detecting cycles: as we run DFS, if we encounter an edge from “current” node to one that is visited but isn’t finished with, that means there is a cycle.
O(n + e) where n is the number of courses/nodes and e is the number of prerequisites/edges.
Top level DFS takes O(n + e) time. On top of that we 1) build the graph from input data (which takes O(n + e) time) and 2) reverse the list list of n nodes in the end (another O(n)).
O(n + e).
Graph that we build takes O(n + e) space, and the DFS stack takes up to O(n + e), too.
O(n + e).
Input takes O(e) space, output takes O(n), and the auxiliary space used is O(n + e).
In-degree of a node in a directed graph is the number of edges that end in that node. This approach to topological sorting is based on the observation that any node with zero in-degree can be readily added to the topological ordering. Same observation in terms of the courses and dependencies: we can always start taking courses from one that has no dependencies.
So we first add any node with zero in-degree to the “answer” list. Then we “erase” that node from the graph [1]. As a result, any courses that depended on the one that we removed now don’t (and if that was their last dependency, they now have none—so zero in-degree—and can be added to the “answer” in the next round).
We repeat that (picking any node with zero in-degree, adding it to the “answer” list, removing it from the graph) until we run out of nodes with zero in-degree, and we are done. “in_degree” list is used to locate nodes with zero in-degree efficiently (saving us a loop over the remaining nodes in order to find one with zero in-degree any time we need one).
Cycle detection. If we run out of nodes with zero in-degree while some nodes still remain in the graph, it means that remaining nodes all have incoming edges. If we visually imagine such a (portion of a) graph, we can see that it necessarily has a cycle.
[1] While conceptually we remove a node and edges that start from it from the graph, in the actual sample solution modification of “edges” list happens to be unnecessary because of how “in_degree” and “nodes_with_zero_in_degree” lists are used.
O(n + e) where n is the number of courses/nodes and e is the number of prerequisites/edges.
Building the graph from input data takes O(n + e) time.
The main part of the solution consists of a FOR loop inside of a WHILE loop. Body of the WHILE loop is executed n times at worst while the body of the FOR loop - total e times at worst (even though the FOR loop itself is “attempted” up to n times). So O(n + e) total.
The intermediate data structures for representing the graph and for tracking in-degree of nodes take O(n + e) space.
O(n + e).
Input takes O(e) space, output takes O(n), and the auxiliary space used is O(n + e).