About usWhy usInstructorsReviewsCostFAQContactBlogRegister for Webinar

Depth-first Search (DFS) Algorithm

If you are preparing for a technical interview, you must know the basic graph traversal techniques, as they are quite popular in software engineering technical interviews. In this article, we will discuss one such graph traversal algorithm — depth-first search (DFS).

Many real-world problems are modeled with graphs. Hence, it is expected of software engineers and developers to have good command over graph traversal algorithms such as DFS and BFS. Graph traversal algorithms are the backbone of many other easy and complex graph algorithms. 

This article will cover the following DFS concepts:

  • What Is Graph Traversal?
  • What Is DFS?
  • Applications of DFS 
  • How Does DFS Work?
  • DFS Algorithm
  • DFS Example
  • Different Types of Edges in DFS
  • DFS Pseudocode
  • DFS Code
  • DFS Complexities 
  • DFS for Disconnected Graph
  • DFS vs. BFS
  • FAQs on DFS

What Is Graph Traversal?

Traversing a graph means visiting all vertices and edges of a graph at least once in a well-defined order. The order in which the algorithm visits the nodes is important. Different problems may require different orders of traversals.

What Is DFS?

Depth-first search (DFS) is a recursive algorithm for traversing a graph. It uses the idea of exhaustive search —  it will keep moving deeper into the graph until that particular path is entirely exhausted (in other words, a dead end is found). 

It is used to solve many interesting problems, such as finding a path in a maze, detecting and breaking deadlocks in operating systems, and many others. 

Applications of DFS 

Here are a few (among many) applications of DFS:

  • Finding whether two nodes present in a graph belong to the same component
  • Checking if the graph is bipartite
  • Finding the topological ordering of the vertices of a graph
  • Finding the strongly connected components of a graph
  • Finding the spanning tree of an unweighted graph
  • Finding bridges in graphs
  • Finding articulation points in graphs
  • Finding a path in a maze and similar puzzles
  • Analyzing networks and mapping source-destination route

How Does DFS Work?

DFS starts traversing the graph from a starting node (provided as input to the algorithm). It chooses a path starting from this node, and goes as deep as it can. After exhausting that path, it backtracks until it finds another unexplored path and then explores it recursively. 

It keeps recursing until the complete graph has been explored or until the node we were looking for is found.

DFS Algorithm

  • Step 1: Create an empty boolean array representing the nodes’ state (visited or unvisited). Initially, all nodes are unvisited, so the entire array is initialized with false.
  • Step 2: Choose a starting node. Then initiate a recursive function with the starting node as a parameter. 
  • Step 3: Inside the recursive function, mark the node from the parameter as visited. After that, loop through the neighbors and call the same recursive function with all unvisited neighbors as parameters. 

DFS Example

We’ll use the following graph for illustration: 

DFS Example


We start from node 0, mark it visited, and then travel towards node 1.


On reaching node 1, we see it’s unvisited — we mark it visited and move towards its child node 4.

Node 4 is also unvisited; so, we mark it visited. 

As node 4 does not have any child, we backtrack to node 1.

Node 1 does not have any child left to explore (in other words, we have exhausted node 1). Therefore, we backtrack to node 0.

Node 0 is not yet completely explored — we travel to its child, node 2, and mark it visited. 

Node 2 does not have any neighbors — we backtrack to node 0.

Node 0 has a neighbor node 3, which is not visited. So, we explore it.

Node 3 has a neighbor, node 2, but since it's already visited; so, we don’t go over it again and backtrack to node 0. Node 0 does not have any unvisited neighbors left.  

In this manner, we will visit the complete graph using DFS. 

DFS Tree of a Graph

If we remove all edges other than the green ones, we will get a tree — this is called the DFS tree for this graph, and all the green edges are the only edges of that tree. The image below shows the DFS tree for the above graph.

Different Types of Edges in DFS

We can divide the graph edges into the following kinds:

  • Tree edge: An edge that is present in the DFS tree after applying DFS on the graph. 
  • Forward edge: An edge that is not present in the DFS tree after applying DFS, and it connects a node “u” to one of its successors.
  • Back edge: An edge that is not present in the DFS tree after applying DFS, and it connects a node “u” to one of its ancestors. Its presence indicates a cycle in directed graphs.
  • Cross edge: An edge that is not present in the DFS tree after applying DFS, and it connects a node “u” to another node, where there is no parent-child relationship between the two nodes.

Consider the below-directed graph for a better understanding. Applying DFS on it will give the following order: 

0 1 2 3 4

You can see the four types of edges in the following figure:

DFS Pseudocode

The DFS pseudocode is very short and concise. Go ahead and implement it in any programming language of your choice.

The graph is stored in G, and the starting node is u.

Recursive Pseudocode: 


DFS (Graph g, node u):
        mark u as visited
        for neighbors adj_node of u in Graph G:
            if adj_node is not visited:
                DFS (G, adj_node)

Iterative Pseudocode: 


DFS(G, u): 
      let St be stack
      Push u in the stack
      mark u as visited.
      while ( St is not empty)
          v  =  Node at the top of stack
          remove the node from stack
         for all neighbors adj_node of v in Graph G:
            if adj_node is not visited :
                     mark adj_node as visited
                     push adj_node in stack

Note: Recursive and iterative DFS might traverse the graphs in different but correct orders. 

DFS Code With Example

We are using C++ to implement the recursive and iterative DFS algorithm. Try and implement it on your own before seeing the code!

Recursive Code


#include 
#include 

using namespace std;

// DFS function for node u and graph adj
void dfs(vector adj[], int u, vector& visited)
{
    // Mark u visited
    visited[u] = true;
    cout << u << " ";

    // Travelling over all the adjacent/neighboring nodes
    for (int i = 0; i < adj[u].size(); i++)
    {
        // v is the adjacent node
        int v = adj[u][i];

        // If v has not been visited before
        // then perform DFS on it again
        if (visited[v] == false)
        {
            dfs(adj, v, visited);
        }
    }
}

// Driver Code
int main()
{
    // Number of edges
    int n = 5;

    // Adjacency List for a Directed Graph
    vector adj[n];
    adj[0] = {1, 2, 3};
    adj[1] = {4};
    adj[2] = {};
    adj[3] = {2};
    adj[4] = {};

    // Initialize the visited array with false
    vector visited(n, false);

    cout << "DFS traversal for given graph: ";

    // Let’s start the DFS from node 0
    dfs(adj, 0, visited);

    return 0;
}

Iterative Code


#include
#include
#include

using namespace std;

// Function to do iterative dfs over graph stores in adj
void dfs(vector adj[], int startingNode, vector& visited)
{
    // Initializing stack
    stack st;

    // Pushing Starting Node in stack and marking it visited
    st.push(startingNode);
    visited[startingNode] = true;

    while (!st.empty())
    {
        // popping the next vertex to be visited from stack
        int u = st.top();
        st.pop();
        cout << u << " ";

        // Iterating over the neighbors of u
        // and pushing unvisited one in stack
        // and marking them as visited
        for (int i = 0; i < adj[u].size(); i++)
        {
            int v = adj[u][i];
            if (visited[v] == false)
            {
                st.push(v);
                visited[v] = true;
            }
        }
    }
}

// Driver Code
int main()
{
    // Number of edges
    int n = 5;

    // Adjacency List for Directed Graph
    vector adj[n];
    adj[0] = {1, 2, 3};
    adj[1] = {4};
    adj[2] = {};
    adj[3] = {2};
    adj[4] = {};

    // Initialize the visited array with false
    vector visited(n, false);

    cout << "DFS traversal for given graph: ";

    // Let starting node is 0
    dfs(adj, 0, visited);

    return 0;
}

Output:

Output for recursive code:

DFS traversal for given graph: 0 1 4 2 3

Output for iterative code:

DFS traversal for given graph: 0 3 2 1 4

Note: It might seem like one of the two codes is wrong — but both are absolutely correct as both algorithms traversed the graph in a depth-first order. 

DFS Complexities

Time Complexity 

The time complexity for the DFS algorithm depends on the data structure in which the graph is stored. 

If we use an adjacency list to represent the graph, its time complexity ends up as O(V+E), where V is the number of vertices in the graph and E is the number of edges in the graph.

  • Proof: If you look carefully, you will observe that our algorithm travels over each node and each edge exactly once.

If we use the adjacency matrix to represent the graph, its complexity equals O(V2), where V is the number of vertices in the graph.

  • Proof: Here too, we travel over each edge and each node exactly once. But to find the next neighboring node, we need to traverse over all other nodes. 

Special case: When the graph is a tree, the number of edges E is V - 1. In this case, the time complexity becomes O(V) if we use an adjacency list to represent the tree.

Space Complexity

The DFS algorithm’s space complexity is O(V), excluding the memory consumed by the graph representation, where V is the number of nodes in the graph.

Note: Graph representation takes O(V+E) memory when using adjacency list and O(V2) memory when using adjacency matrix

  • Proof: We need to allocate memory only for marking the nodes that have been visited. As there are V nodes, we would require O(V) memory. 

Furthermore, the depth of the recursion tree also matters as it consumes stack memory. The maximum depth for the recursion tree is the longest path from the starting node to any other node in its connected component, which can be O(V) in the worst-case.

Special case: If we know beforehand that graph is a tree, we can implement DFS without using O(V) memory for the array to keep track of the visited vertices. However, recursion will still consume some memory.
We can do this by keeping track of the parent of the current node being explored. And when searching for an unvisited neighbor node, we will skip the parent node (as we already would have visited that node). 

Code for DFS Function for a Tree in C++


// DFS function for node u and tree adj
// We can assume that the parent of the root is -1, a hypothetical node
void dfs(vector adj[], int u, int parent)
{
    cout << u << " ";

    // Travelling over all the adjacent/neighboring nodes
    for (int i = 0; i < adj[u].size(); i++)
    {
        // v is the adjacent node
        int v = adj[u][i];

        // If v is not the parent of u, 
        // only then will we apply DFS on v
        if (v != parent)
        {
            dfs(adj, v, u);
        }
    }
}

DFS for Disconnected Graphs

In our implementation above, if the graph is disconnected, the code will not visit nodes that are disconnected from the starting node. 

For example, consider the below graph:

If we apply the above implementation of DFS on this graph (in which we have considered node 0 as the starting node), we will end up not visiting node 3 and node 4 as they are not in the same connected component as node 0.

So, what can we do to make our code work?

We’ll assume every node to be a starting node and apply DFS only on those nodes that have not been previously visited.

DFS Code for Disconnected Graphs


#include
#include

using namespace std;

// Function to do Depth-First Search on graph stores in adj
void dfs(vector adj[], int u, vector& visited)
{
    // Mark u visited
    visited[u] = true;
    cout << u << " ";

    // Travelling over all the adjacent/neighboring nodes
    for (int i = 0; i < adj[u].size(); i++)
    {
        // v is the adjacent node
        int v = adj[u][i];

        // If v has not been visited before
        // then perform DFS on it again
        if (visited[v] == false)
        {
            dfs(adj, v, visited);
        }
    }
}

// Driver Code
int main()
{
    // Number of edges
    int n = 5;

    // Adjacency List for a Directed Graph
    vector adj[n];
    adj[0] = {1, 2};
    adj[1] = {};
    adj[2] = {};
    adj[3] = {3};
    adj[4] = {};

    // Initialize the visited array with false
    vector visited(n, false);

    cout << "DFS traversal for given graph:" << endl;

    // For disconnected graphs
    for (int i = 0; i < n; i++)
    {
        // Let’s start a DFS from every which has not been visited yet
        if (visited[i] == false)
        {
            dfs(adj, i, visited);
            cout << endl;
        }
    }

    return 0;
}

Output: 

DFS traversal for given graph: 

0 1 2 

3 4

BFS vs. DFS

Following are a few differences between BFS and DFS:

To learn more, check out our article on breadth-first search.

FAQs on DFS

Question 1: How to detect a cycle in a directed graph using DFS?

Answer: We need to check if there exists a back edge in a graph. As we discussed earlier, the back edge is an edge from the graph that is not present in the DFS tree after applying DFS, and it connects a node “u” to one of its ancestors. We can keep track of the active nodes in another array while running the DFS. Those nodes that are placed on the stack but have not been removed are considered active. 

During DFS, when a new node is visited, all of its ancestors should be active. Also, all of the active nodes must be the new node’s ancestors. So, whenever we explore any new node, if we find an edge directed from the new node to any of the active nodes, we can conclude that the given graph has a cycle. 

Proof: Suppose the back edge connects node “u” with node “v” and it goes from “u” to “v.” Without loss of generality, also suppose that “v” was discovered before “u”. As “v” and “u” are both active, there must be a path in the DFS tree from “v” to “u.” That path only uses tree edges. On the other hand, we just discovered another path from “u” to “v” using a back edge. So, in this graph, there exists a path that starts from “u” and comes back to “u.” Hence, it must be a cycle. 

Code snippet in C++ for DFS function for finding cycle in directed graph:


// DFS function to find cycle in directed graph stored in adj
// Function returns true if directed graph contains a cycle
// Else it return false
bool dfs(vector adj[], int u, vector& visited, vector& active)
{
    // Mark u visited and active
    visited[u] = true;
    active[u] = true;

    // Travelling over all the adjacent/neighboring nodes
    for (int i = 0; i < adj[u].size(); i++)
    {
        // v is the adjacent node
        int v = adj[u][i];

        // If v is in recursion stack, we have found a cycle
        // Also, there exists a backedge from u to v
        if (active[v] == true)
        {
            return true;
        }

        // If v has not been visited before then perform
        // DFS on it again and check for cycle
        if (visited[v] == false && dfs(adj, v, visited, active) == true)
        {
            return true;
        }
    }

    // Mark u as inactive as it will pop out of recursion stack
    active[u] = false;
    return false;
}

Question 2: How to detect a cycle in an undirected graph using DFS?

Answer: An undirected graph is considered acyclic if and only if it’s a tree. So, the problem boils down to finding out if the given graph is a tree. To check this, we can start a DFS from any node, and for every node, keep track of its parent node in the DFS tree. If during DFS we visit a node twice (excluding the parent), we can conclude that the given undirected graph is cyclic. 

Question 3: Why does DFS use a “stack” data structure? Can it use a queue data structure?

Answer: DFS explores any path as long as possible. When it hits a dead end, it backtracks to the just-previous state. So, we need a data structure that can give us the most recent element (or call). The stack serves this purpose — Last In First Out (LIFO). The queue cannot be used to implement DFS as it is based on the opposite concept — First In First Out (FIFO).

Question 4: Can we use DFS to solve the shortest path distance between two nodes in unweighted graphs?

Answer: No. 

Let’s take an example: from the starting node, if node “u” is visited before node “v,” there’s no guarantee that u is closer to the starting node. Therefore, DFS cannot be used to find the shortest distance between two nodes in polynomial time.

Are You Ready to Nail Your Next Coding Interview?

Problems based on graphs and its traversal feature in almost every coding interview for software developers. If you’re looking for guidance and help to nail these questions and more, sign up for our free webinar

As pioneers in the field of technical interview prep, we have trained thousands of software engineers to crack the toughest coding interviews and land jobs at their dream companies, such as Google, Facebook, Apple, Netflix, Amazon, and more!

Sign up now!

---------

Article contributed by Divyansh Gupta