Jan 9, 2023

Undirected Graph Cycle Detection, Part I: Double-Path Finder Depth-First Search Algorithm

In an undirected graph edges can be traversed in the forward direction and in the reverse direction; how can then we detect cycles in such a graph? In this post we are going to discuss one method to solve this problem.

Icebreaker: if we were to include nodes in the digital representation of the number 6, would we find a cycle?

Background

Cycle detection in an undirected graph is presented in LeetCode's Redundant Connection problem. In this problem, we are given an undirected graph with n nodes, where n runs from 1 to n, and n edges, and we are asked to remove one edge so that the resulting graph is a tree with n nodes. In addition, the given graph resulted originally from a tree that is connected and has no cycles. Find below the examples shown in the problem:

In example 1 the output is [2, 3] because it is the edge responsible for forming a cycle; similarly, in example 2 the output is [1, 4] because it is the edge responsible for closing a cycle.

This is an interesting problem: the problem does not mention that the given graph is an undirected cyclic graph; however, if we write out some more example graphs using the givens of the problem, we'll be able to notice a common pattern (side-note: when solving problems, a whiteboard can be very useful, as you can draw example inputs for a given problem to better understand the scope of the problem and what is being asked; the computer's keyboard alone may not be enough!!!). Let's take a look at some example graphs below constructed using the givens:

From the graphs, we can notice that, when the number of nodes is equal to the number of edges, it does not matter how we arrange the edges, there will always be at least one cycle in the graph. For example, for graphs #1, #3, and #5, there is a cycle between nodes [1, 2, 5], [1, 2, 3, 4], and [4, 2, 3], respectively.

The problem asks to remove one edge so that the resulting graph is a tree of n nodes (please note that removing an edge means removing the line connecting two nodes, AND NOT the nodes themselves). For graph #1 above, if we remove either edge [4, 3] or edge[2, 3], we will end up with two components (or two trees), where one of the components will contain a cycle, as shown in the below image:

Hence, for graph #1, we need to remove any of edges [1, 5], [2, 1] or [5, 2] for the resulting graph to be a tree with no cycles. Similarly, for graph #3, we need to remove any of the edges that form a cycle, that is, any of [1, 4], [1, 2], [2, 3], or [4, 3]. Therefore, to solve the problem, we need to able to detect the cycle in the given graph!!!

One strategy to detect cycles in undirected graphs is a double-path finder depth-first search (dpf dfs) approach. Basically, if we choose any two nodes on the graph, where one node is the source, and the other node is the target, if there is more than one path for getting from source to target, then the graph contains a cycle. For example, consider graph #6 presented before: if we want to go from node 3 to node 1, there is only one path to do so, but if we want to go from node 3 to node 5, there are two different paths–either passing through node 2, or a direct path to node 5. Hence, the graph contains a cycle that includes at least the edge [3, 5]. Notice that not all edges in the graph form a cycle, and that is why we are going to use depth-first search to find at least one edge at which a cycle is formed. For graph #6, edges [3, 5], [5, 2], and [2, 3] form a cycle.

The double-path finder depth-first search approach runs as follows:

Build a graph adding one edge at a time using the provided array of edges.
Prior to adding an edge, run a dfs algorithm on the graph using the edge to see if there is a path that connects the source node to the target node from the given edge; if such a path exists, then we will have two paths that lead from source to target: the path that we just found via dfs, plus the direct path that will be formed by adding the given edge; which means that the given edge closes a cycle in the graph and, as a result, the graph contains a cycle.

The below image shows the dpf dfs approach using an example graph:

From the image, we start with n unconnected components, which, in this case, are 8 nodes (n = 8), as well as n edges, to be used for building the graph (step 0). Then we add edges to the graph one edge at a time. The goal of the approach is that, given edge [a, b], we want to know whether there already exists a path in the graph connecting nodes a and b (via dfs) before adding edge [a, b] to the graph. For such a path to exist, it is first required that both a and b have at least one edge each (or, equivalently, one neighbor). Notice that from step 1 to step 4, none of the nodes to be added have neighbors (edges). Take step 3 as an example: before adding edge [5, 6], none of nodes 5 and 6 have neighbors (no connections), therefore we do not run dfs on edge [5, 6] before adding edge [5, 6] to the graph.

When the nodes of a given edge each have neighbor(s) in the graph, we run dfs on the edge before adding the edge to the graph. Take step 5 as an example: each of nodes 2 and 4 in edge [2, 4] have one neighbor, so we run dfs using node 2 as the source, and node 4 as the target.

If, in the dfs process, we find a path that connects the nodes from the given edge, subsequently adding the given edge will close a cycle on the graph. If no path is found, we simply add the edge to the graph. For example, in step 6, running dfs of edge [2, 5] before adding the [2, 5] edge results in no path found connecting nodes 2 and 5; then we add edge [2, 5] to the graph. However, on step 8, running dfs of [5, 7] results in a path found that connects nodes 5 and 7, namely, the path passing through node 6; hence adding edge [5, 7] will result in a cycle being closed at that edge.

Ok, let's now put all of this into code.

Coded Solution

The code implementation is presented below:

function findRedundantConnection(edges) {
  const max_nodes = 1000;
  const graph = new Array(max_nodes + 1).fill(0).map(() => []);
  let visited;

  for (let [u, v] of edges) {
    visited = new Set();
    if (graph[u].length && graph[v].length && dfs(u, v)) {
      return [u, v];
    }
    graph[u].push(v);
    graph[v].push(u);
  }

  function dfs(src, target) {
    if (!visited.has(src) {
      visited.add(src);
      if (src === target) return true;
      for (let neighbor of graph[src]) {
        if (dfs(neighbor, target)) return true;
      }
    }
  }
};

We use an array to represent the graph (line 3), where the indices of the array represent the vertices (or nodes) of the graph, and the array values represent the nodes' neighbors, initialized as empty arrays. The maximum number of nodes allowed in the problem is 1000, so we use an array of length 1001 to cover all possible cases.

Prior to building the edges of the graph (lines 11 and 12), we first check whether we can run a dfs algorithm on the given edge (line 8); the dfs algorithm returns true if it finds a path that starts at src, running through src's neighbor or src neighbor's neighbor and ends at target. Otherwise, the dfs algorithm returns false. We also initialize a new set at the start of every edge iteration (line 7) to be used by the dfs algorithm.

The problem guarantees that the graph contains at least one cycle; hence, at some point, the dfs algorithm will return true and, at that point, we return the given edge as the final answer as that edge forms a cycle and can be removed.

Time Complexity and Space Complexity

Notice that, in the event of one cycle in the graph found at the last edge of the graph, the dfs function will be called for every node on the graph at least once and, for every node, the node's neighbors will be visited at most once. This gives us an overall time complexity of O(V + E), where V is the number of nodes, and E is the number of edges.

The length of the array used to represent the graph will be equal to the number of nodes, taking O(V) space and, for every node, the node's neighbors are added to an array, taking O(2E) space (undirected graphs contain bidirectional edges). Hence, the overall space complexity for the algorithm is O(V + 2E), which reduces to O(V + E).

Summary

One way of detecting cycles in an undirected graph is through depth-first search (dfs). For a given [source, target] edge, the dfs can be tuned to detect whether a path between the source node and target node exists in the graph before adding the [source, target] edge to the graph. If such a path exists, a cycle has been found.

Answer to the icebreaker question: yes, the image represents an undirected graph, with a cycle formed among nodes [1, 2, 3, 4].