About usWhy usInstructorsReviewsCostFAQContactBlogRegister for Webinar

Course Schedule Problem

Course Schedule


There are n courses to take, they are referred to by numbers from 0 to n-1. Some of them have prerequisites, e.g. courses A and B must be completed before course C can be taken (in other words, course C depends on A and B).


Find and return an ordering in which all the given courses can be taken while satisfying all the prerequisites. If there exists more than one such ordering, any one of them would be a correct answer. If no such ordering exists, return a special value: [-1].


Example

Input: n=4, prerequisites=[[1, 0], [2, 0], [3, 1], [3, 2]]

Output: [0, 1, 2, 3]


According to the input, 

● course 0 must be done before both 1 and 2,

● courses 1 and 2 must be done before course 3.

There are two orderings in which one can take all four courses: [0, 1, 2, 3] and [0, 2, 1, 3]. Both are correct answers.


Notes

Function accepts two arguments: the number of courses n and the list of prerequisites.

Prerequisites are given in the form of a two-dimensional array (or a list of lists) of integers. Each inner array has exactly two elements, so it is essentially a list of pairs. Each pair [X, Y] represents one prerequisite: course Y has to be completed before X can be taken (X depends on Y, in other words).


Function must return an array (list) of integers.

If all given courses can be taken while satisfying all given prerequisites, the returned array must contain a possible ordering (if more than one such ordering exists, any one must be returned). Otherwise the function must return an array (list) with one element -1 in it.


Constraints:

• 1 <= n <= 10000

• 0 <= number of prerequisites <= 10000


Solution

nodes’ in-degree. They have the same time and space complexity in the big-O notation in terms of number of courses and number of prerequisites.


Both algorithms need a directed graph to work with; let us build one. Courses become graph nodes. Prerequisites become directed edges: for each “course A must be completed before B can be taken” we add an edge from A->B. In other words, an incoming edge means that another course must be completed before this one.

For example, if n=4 and prerequisites=[[1, 0], [2, 0], [3, 1], [3, 2]], the graph would look like this:

 

Having built such a graph, now a correct answer is a topological ordering of the nodes. The two sample solutions discussed below are essentially two different implementations of the topological sort algorithm.

Topological ordering doesn’t exist if the graph has a cycle. Both sample solutions detect cycles in their respective ways and return the special value in that case.


1) dfs_solution

Topological sort implementation via DFS goes like this:

● Run top level DFS (iterate through all nodes and run DFS from each unvisited one) on the graph.

● Remember the order in which the nodes were “finished with” by DFS.

● Return the reverse of that order.


That’s what dfs_solution.cpp does.

It uses a standard DFS approach for detecting cycles: as we run DFS, if we encounter an edge from “current” node to one that is visited but isn’t finished with, that means there is a cycle.


Time Complexity:

O(n + e) where n is the number of courses/nodes and e is the number of prerequisites/edges.

Top level DFS takes O(n + e) time. On top of that we 1) build the graph from input data (which takes O(n + e) time) and 2) reverse the list list of n nodes in the end (another O(n)).


Auxiliary Space Used:

O(n + e).

Graph that we build takes O(n + e) space, and the DFS stack takes up to O(n + e), too.


Space Complexity:

O(n + e).

Input takes O(e) space, output takes O(n), and the auxiliary space used is O(n + e).


// -------- START --------

// This will store a list of outgoing edges for every graph node.
vector> edges;

// A graph node may be in one of three states during a DFS run:
const int NEVER_VISITED = 0, VISITED = 1, FINISHED = 2;

// This will store those states for all nodes.
vector visited;

// This will store the ordering of nodes to be returned.
vector answer;

// Runs DFS from current_node. Returns false if a cycle is detected.
bool dfs(int current_node){
    visited[current_node] = VISITED;

    for(int connected_node: edges[current_node])
    {
        if(visited[connected_node] == VISITED)
            return false; // Cycle detected.

        if(visited[connected_node] == NEVER_VISITED)
        {
            if(!dfs(connected_node))
                return false; // Cycle detected while exploring connected_node's children.
        }
    }

    visited[current_node] = FINISHED;
    answer.push_back(current_node);

    return true; // current_node and its children do not form a cycle.
}

vector course_schedule(int n, vector > prerequisites)
{
    answer.clear();

    // Initialize the graph.
    edges.clear();
    visited.clear();

    for(int i = 0; i < n; i++)
    {
        edges.push_back(vector()); // Initial state: no edges.
        visited.push_back(0); // Initial state: no node is visited.
    }

    // Populate edges from the input data.
    for(vector prereq: prerequisites)
    {
        edges[prereq[1]].push_back(prereq[0]);
    }

    // Top level DFS: call DFS on each unvisited node in the graph.
    for(int i = 0; i < n; i++)
    {
        if(visited[i] == NEVER_VISITED)
        {
            if(!dfs(i))
                return {-1};
        }
    }

    reverse(answer.begin(), answer.end());
    return answer;
}

// -------- END --------
	

2) in_degree_tracking_solution

In-degree of a node in a directed graph is the number of edges that end in that node. This approach to topological sorting is based on the observation that any node with zero in-degree can be readily added to the topological ordering. Same observation in terms of the courses and dependencies: we can always start taking courses from one that has no dependencies.


So we first add any node with zero in-degree to the “answer” list. Then we “erase” that node from the graph [1]. As a result, any courses that depended on the one that we removed now don’t (and if that was their last dependency, they now have none—so zero in-degree—and can be added to the “answer” in the next round).


We repeat that (picking any node with zero in-degree, adding it to the “answer” list, removing it from the graph) until we run out of nodes with zero in-degree, and we are done. “in_degree” list is used to locate nodes with zero in-degree efficiently (saving us a loop over the remaining nodes in order to find one with zero in-degree any time we need one).


Cycle detection. If we run out of nodes with zero in-degree while some nodes still remain in the graph, it means that remaining nodes all have incoming edges. If we visually imagine such a (portion of a) graph, we can see that it necessarily has a cycle.


[1] While conceptually we remove a node and edges that start from it from the graph, in the actual sample solution modification of “edges” list happens to be unnecessary because of how “in_degree” and “nodes_with_zero_in_degree” lists are used.


Time Complexity:

O(n + e) where n is the number of courses/nodes and e is the number of prerequisites/edges.

Building the graph from input data takes O(n + e) time.

The main part of the solution consists of a FOR loop inside of a WHILE loop. Body of the WHILE loop is executed n times at worst while the body of the FOR loop - total e times at worst (even though the FOR loop itself is “attempted” up to n times). So O(n + e) total.


Auxiliary Space Used:

The intermediate data structures for representing the graph and for tracking in-degree of nodes take O(n + e) space.


Space Complexity:

O(n + e).

Input takes O(e) space, output takes O(n), and the auxiliary space used is O(n + e).


# -------- START --------

def course_schedule(n, prerequisites):
    # In-degrees for all n nodes:
    in_degree = [0 for _ in range(n)]

    # For each node we store a list of nodes it is connected to:
    edges = [list() for _ in range(n)]

    for dependent_course, dependency in prerequisites:
        edges[dependency].append(dependent_course)
        in_degree[dependent_course] += 1

    nodes_with_zero_in_degree = [i for i in range(n) if in_degree[i] == 0]
    answer = []
    while nodes_with_zero_in_degree:
        curr = nodes_with_zero_in_degree.pop()
        answer.append(curr)

        for connected_node in edges[curr]:
            in_degree[connected_node] -= 1
            if in_degree[connected_node] == 0:
                nodes_with_zero_in_degree.append(connected_node)

        # Even though we have removed curr node from the graph,
        # edges[curr] list doesn't need to be emptied here:
        # it will never be looked at again.

    if len(answer) == n:
        return answer
    else:
        # We ran out of nodes with zero in-degree while some nodes still remain
        # in the graph. It means that remaining portion of the graph forms a cycle.
        # No course ordering is possible.
        return [-1]

# -------- END --------