Interview Kickstart has enabled over 3500 engineers to uplevel.

Learning the basics of graphs and graph theory is key for coding engineers at any stage in their career. But, graph theory can be intimidating. Even if you are an experienced software developer preparing for a technical interview for your dream job, graph theory can make you anxious.

In this article, we will cover the basics to help you with your prep. We’ll discuss:

- Introduction to Graph Theory
- The History of Graph Theory
- Real-world Applications of Graph Theory
- Types of Graph
- What Is a Tree?
- Tree vs. Graph
- Some Key Concepts
- Graph Theory Data Structures
- Strengths and Weaknesses of Graphs
- Examples of Tech Interview Problems/Questions on Graph Theory
- Graph FAQs

Graph theory is a branch of mathematics that deals with a network of points that are connected via lines. Given a set of nodes that are connected in some way via edges, Graph theory is essentially the study of their relationships. Nodes and edges are a great abstraction for many real-life problems. Examples include problems involving arrangement, networking, and optimization relevant to city layouts, social media reach, navigation, and political campaigns. We’ll start by becoming acquainted with some important concepts in graph theory.

A graph is a data structure that contains a finite set of vertices/nodes and a set of directed or undirected edges connecting them. An edge is represented in the format (U, V). Here, the edge connects U and V. U is the source node, and V is the destination node.

- If the edges are directed, it’s called a directed graph or digraph. If they aren’t directed, it's called an undirected graph.
- A graph with no loops and any two nodes connected by at most a single edge is called a simple graph. A simple graph may be connected or disconnected. When two nodes are connected by more than one edge, it’s called a multigraph. Unless specified, we can assume a graph is a simple graph.
- If each node is connected to all other nodes via an edge, the graph is called a complete graph. Hence if we know the number of nodes present in a complete graph, we also know its general structure.

For a directed graph, the degree of a node refers to the number of edges coming into or going out from a node. The number of incoming edges is called in-degree, and the number of outgoing edges is called out-degree. For an undirected graph, the degree of a node again refers to the number of edges associated with a node. However, in an undirected graph, the edges are bidirectional, and all edges are counted towards the degree. We can then conclude a few points based on this definition:

- In a connected simple graph, all the nodes will have a degree of at least 1.
- In a complete graph having N nodes, the degree of each node will be N-1.
- To the degree of a node that is a part of a self-loop, the loop contributes +2 to the degree of the node.

Path refers to a sequence of edges that connect a sequence of nodes. A path may be a single edge connecting two nodes or multiple edges that are connected through multiple nodes.

- If a path always exists from any point to any other point in a graph, it’s called a connected graph.
- If a path starts and ends at the same node, but no edge was traversed more than once, it’s called a circuit. A circuit is hence a closed path.
- A circuit where each edge is traversed exactly once and every node is traversed as well is referred to as an Eulerian circuit. The graph, in this case, is known as an Eulerian graph, which means the graph has all nodes with only even degrees and is also a connected graph.
- A path that visits every node once with no repeat visits is called a Hamiltonian path. A circuit that visits every node once with no repeat visits is called a Hamiltonian circuit, and being a circuit, it starts and ends at the same node.

Graph theory’s origins have been traced to be in the 18th century, around the year 1735. It started as recreational puzzles in mathematics that gained traction, became important, and have now evolved into a valuable area of mathematical research.

Leonhard Euler, a Swiss mathematician, physicist, astronomer, geographer, logician, and engineer, proved the solution for one such old puzzle, the Königsberg bridge problem. This Euler’s proof also proved what is now known as the first theorem of graph theory. Many mark this proof as the origin of graph theory.

While Euler successfully proved his result, techniques, and tests that would help him establish his assertion in a robust mathematical way weren’t developed well. So for decades, there was little to no progress in graph theory. Now, with the tools and techniques we have available in modern times, we’re really starting to explore the full potential of graph theory.

- Navigation and route optimization (GPS)
- Molecular and atomic study
- Network security
- Social networking: Friend recommendations
- Tracking the spread of diseases like COVID
- Social Sciences Tracking and Surveys
- Sequencing DNA
- Course Scheduling

Graphs can be separated into types based on various factors. Understanding the type of graph a graph problem should be represented by and understanding the graph’s properties are crucial to solving it. We’ll now discuss some of the major types of graphs.

**Directed Graphs: **In directed graphs, all edges are directed. This means that we can traverse in only one direction. So if node A connects node B through a directed edge from A to B, we can traverse from A to B but not B to A.

**Undirected Graphs: **In undirected graphs, all edges are bidirectional. So, if node A and node B are connected, we can traverse from A to B as well as B to A. Straight lines, not arrows, represent the edges. Undirected graphs can be used to represent mutual/bidirectional relationships such as bidirectional roads in a city map with roads represented as edges and locations represented as nodes.

In the above figure, for an undirected graph, each node could represent different cities, and the edges show the bidirectional roads. In the directed graph shown above, we can represent not only bidirectional roads but one-way roads as well, since here, we use directed edges instead of bidirectional edges.

**Weighted Graphs: **A weight is a numerical value that can be associated with an edge. These weights associated with the edges represent real criteria like the importance, cost, quantity, popularity, and distance involved in that edge traversal. Both directed and undirected graphs can be weighted. We can represent tolls on each of the roads using weights. We can also represent the distance or time required to travel between two locations as weights. GPS navigation systems use weighted graphs to do these calculations and suggest the optimized paths.

**Unweighted Graphs: **If the edges do not carry weights, as we saw in the plain examples of directed and undirected graphs, the graph is said to be unweighted. We can use unweighted graphs when the properties of the route or edge (barring their directional nature) are not relevant. In the case of unweighted graphs, we focus on which nodes are connected, and we may care about the directional or cyclic nature of the connections.

**Cyclic Graphs: **If a graph contains one or more cycles, it is deemed a cyclic graph. In other words, a cyclic graph contains at least one node, which has a path that connects it back to itself. If the cyclic graph contains only one cycle, it is called a unicyclic graph.

**Acyclic Graphs: **An acyclic graph, on the other hand, has no cycles. Since there are zero cycles, no node has a path to be traced back to itself. An undirected graph that’s acyclic is a forest graph.

A directed graph with no directed cycles is called a directed acyclic graph. By directed cycles, we mean that following the directions in the graph will never form a closed loop. So in the figure shown above, the first directed graph is acyclic, and the second directed graph is cyclic. Note that just the direction of one edge has been reversed to create the cyclic graph out of the acyclic graph. Just that one change leads to a directed cycle being formed in the graph.

To connect the concept of a tree and graphs, if a connected undirected graph is also acyclic, it is a tree graph. We’ll discuss both trees and forests more in the coming sections. We’ve just answered the question: “When is a graph a tree?”. We’ll now look at what this means in more detail.

We’ve just learned that graphs and trees are related to each other as data structures and that a connected undirected acyclic graph is what we call a tree. Let us unpack what that means. A graph is a tree if it:

- Does not contain a cycle
- Is undirected
- Is connected (for each node in it, there exists a path from that node to any other node present in the graph)

Trees are used to represent hierarchical structures in graphical forms. A tree is an undirected graph unless mentioned otherwise. A directed graph, however, can be labeled a directed tree graph (polytree) if:

- It is also an acyclic graph.
- The underlying undirected graph (obtained by converting all edges into undirected edges) of this directed acyclic graph (DAG) is a tree.

**Tree plot (diagram):**

**Tree graph theory: **

- The node at the top of the tree is called a root node. (a)
- The edges of a tree graph are appropriately called branches, and the vertices, as usual, are called nodes. (abcdefg)
- Any sub-node of a given node is called a child node, and the concerned node is called the parent node of the subnode. (bce are child nodes of their parent node a)
- Child nodes of the same parent are called sibling nodes. Sibling nodes are at the same hierarchical level. (bce are sibling nodes)
- Nodes with different parents but which are at the same hierarchical level are called cousin nodes. (g and d, g and f are cousin nodes)
- The nodes at the lowermost level have no child nodes and are termed leaf nodes. (gdf)
- If a tree has N nodes, it has N-1 edges; this is because to have even one more edge as an undirected connected graph, one would have to create a cycle, which means the graph will cease to be a tree.

As we’ve seen, a tree is a specific type of graph. Anything true for a graph would be true for a tree, but trees would be a special case of graphs. So let us see where the differences lie.

Now that the basics are covered, let’s look at some important concepts related to graph theory and trees: subgraph**, forest, spanning tree, circuit rank, and handshaking lemma. **

A sub-graph is a graph all of whose nodes and edges are contained the same way, in a larger graph, which may have additional nodes and/or edges as well. For example:

An undirected acyclic graph is called a forest graph. In other words, a forest is an undirected graph whose connected components or sub-graphs are trees. The image of a forest may look like multiple sub-graphs/trees, but it represents just a single disconnected graph. We can also say that a forest is a disjoint collection of trees.

Let S be a sub-graph of a connected graph G. S is called a spanning tree of G if it’s a tree that contains all the vertices of graph G. For an undirected graph G’, if S’ is a subgraph and contains all vertices of G’ covered using the minimum possible number of edges, S’ is a spanning tree of G’. If there are N nodes in a graph, its spanning tree will always have N-1 edges. A graph may have multiple spanning trees, but if the graph is a disconnected graph, it will not have a spanning tree at all.

The circuit rank of an undirected graph refers to the minimum number of edges that must be deleted from the graph to break all its cycles and get a tree or a forest.

For a connected graph G consisting of N nodes and E edges, a spanning tree S of graph G will contain N-1 edges. So the number of edges to be deleted to get a spanning tree would be the number of edges-(N-1) or E-N+1, which is the circuit rank of graph G.

Handshaking lemma states that any finite undirected graph will always have an even number of nodes with an odd degree. So for a finite undirected graph, the statement: “Odd number of edges touch (go into/out of) this node” will always be true only for an even number of nodes in the undirected graph.

We’ll now look at some data structures used in graphs for representation.

As the name indicates, an adjacency matrix is a matrix that helps us find if any two specific nodes are adjacent, i.e., directly connected via an edge. Given any graph G with N nodes, its adjacency matrix A is a two-dimensional array of size N x N.

The adjacency matrix A[N][N] can be filled by going to each position A[i][j] in it, seeing if nodes i and j have a direct edge connecting them and storing a value appropriately indicating the result at A[i][j].

**Example:**

**The adjacency matrix of the graph: **

For simple graphs, adjacency matrix A is a Boolean matrix. In that case, if i and j are connected via a direct edge, A[i][j] stores 1 (edge contributes +1 and loop contributes +2). If they aren’t connected, it stores 0. For weighted graphs, if a direct edge is found, the value stored is the weight of that edge. If an edge isn’t found in the case of weighted graphs, it’ll contain a specific value that can clearly indicate that. A very large or negative number can be used for this purpose, so it can’t match with and hence, doesn’t get confused with a weight stored of the same value.

An adjacency list represents a graph as an array of linked lists. It works by storing the neighboring nodes for each node. An adjacency list is created by making an array of size N, where N is the number of nodes in the graph. And each element A[i] of that array is a linked list containing all the nodes that are directly connected to the node with index i. For weighted graphs, the linked list will additionally store the edge weight along with the neighboring nodes.

**Example: **

Edges list represents a graph as a list of edges. In the edges list, all the edges are stored inside a linked list. Each object holds two nodes (U, V), indicating that they are directly connected by an edge. For weighted graphs, each object will additionally store the weight of the edge between U and V along with the nodes U and V. If the graphs have many nodes but few edges, it can be a more compact, clearer representation.

**Example:**

**Strengths:**

- Graphs provide well-developed mathematical models for reasoning for complicated large-scale relationships that can be represented using nodes and edges.
- Most efficient data structures and algorithms with major impact make use of graphs in some way.

**Weaknesses:**

- Multiple graphs are possible for a particular network, and it’s difficult to model multiple relationships.
- Some algorithms using graphs are complex and can be slow.

- Traveling salesman problem
- Seven bridges of Königsberg problem
- Map-coloring problem
- Four-colour map problem
- Route optimization problem
- For a given fixed chess configuration with you having only a queen and the opponent having only a pawn, figure out the minimum number of moves in which the queen can kill the pawn.
- Find if a cycle is present in a given undirected graph.
- Convert a sorted list into a binary search tree.

To learn more, visit ourinterview questionsandproblemspages.

**Question 1: When can a directed graph be a tree?**

A tree must be an undirected graph. However, if a directed graph is also acyclic (is a DAG), and its underlying undirected graph is a tree, the graph will be called a directed tree or polytree.

**Question 2: What is the order and size of a graph?**

The number of nodes is known as the order of the graph, and the number of edges in the graph represents the size of a graph.

If you’re looking for guidance and help to nail your next technical interview, sign up for our free webinar. As pioneers in the field of technical interview prep, we have trained thousands of software engineers to crack the toughest coding interviews and land their dream jobs at Google, Facebook, Apple, Netflix, Amazon, and other Tier-1 tech companies.

Join our webinar to learn more!

**----------**

*Article contributed by Tanya Shrivastava*