Disjoint Set Union (Union Find)

#algorithms

Introduction#

Disjoint Set Union (DSU) is a data structure which keeps track of elements partitioned in non overlapping subsets (disjoint sets).

It has the following operations:

  1. Create a set from new element
  2. Find the representative of an element in a set
  3. Combine any two set

Structure#

Each set have one representative element. A set can be visualized as a tree with their root element as the representative and each element pointing towards their parent (the root will point to itself). Hence, to find the representative we just need go up the tree.

We will need to maintain an parent array that will store the parent of each index (element). The representative can be identified if parent[i] = i.

To improve the efficiency of operations, we also maintain a size/rank array that stores the size/rank of each set (tree).

Code#

class DSU {
 private:
  vector<int> parent, sz;

 public:
  void makeSet(int node) {
    parent[node] = node;
  }

  DSU(int n) {
    parent.resize(n), sz.resize(n, 1);
    for (int i = 0; i < n; i++) {
      makeSet(i);
    }
  }

  int findRep(int node) {
    if (parent[node] == node) return node;
    return parent[node] = findRep(parent[node]);  // path compression
  }

  void merge(int a, int b) {
    a = findRep(a), b = findRep(b);
    if (a != b) {
      int sa = sz[a], sb = sz[b];
      if (sa > sb) swap(sa, sb);
      parent[a] = b;  // union by size (bigger will be the representative of merged set)
      sz[b] += sz[a]; // only need to maintain the size of representative
    }
  }
};

After both path compression and union by size optimizations, final amortized time complexity of queries is O(α(n)), where α(n) is the inverse Ackermann function, which grows very slowly (doesn't exceed 4 for all reasonable n).

Applications#

  1. Krushkal's algorithm to find the minimum spanning tree for a connected weighted graph
  2. Painting subarrays offline