What is union-find and what problems does it solve?

Union-find, also called disjoint set union (DSU), is a data structure that tracks a partition of a set of elements into disjoint groups. It supports two operations efficiently: find (which group does this element belong to) and union (merge the groups of two elements). It is the standard choice for incremental connectivity problems, where edges are only ever added: detecting when two nodes are connected, counting connected components, and detecting cycles in undirected graphs.

What are path compression and union by rank, and why do they matter?

Path compression makes the find operation amortize to near O(1) by flattening the tree: when finding the root of an element, update all nodes along the path to point directly to the root. Union by rank (or union by size) keeps the tree shallow by always attaching the smaller tree under the root of the larger tree. Together, they give O(alpha(n)) amortized time per operation, where alpha is the inverse Ackermann function, effectively constant for all practical inputs. Candidates who implement union-find without these optimizations typically lose points in FAANG interviews.
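As a rough illustration of the two optimizations, here is a minimal sketch that assumes elements are labeled 0 to n-1. The function names (make_sets, find, union) are illustrative and not taken from any particular library. It uses union by size and an iterative, two-pass form of path compression.

```python
def make_sets(n):
    """Return (parent, size) arrays for n singleton sets labeled 0..n-1."""
    return list(range(n)), [1] * n


def find(parent, x):
    """Return the root of x and point every node on the path at that root."""
    root = x
    while parent[root] != root:        # first pass: locate the root
        root = parent[root]
    while parent[x] != root:           # second pass: path compression
        parent[x], x = root, parent[x]
    return root


def union(parent, size, a, b):
    """Merge the sets holding a and b. Return False if already merged."""
    ra, rb = find(parent, a), find(parent, b)
    if ra == rb:
        return False
    if size[ra] < size[rb]:            # union by size: smaller under larger
        ra, rb = rb, ra
    parent[rb] = ra
    size[ra] += size[rb]
    return True


parent, size = make_sets(5)
union(parent, size, 0, 1)
union(parent, size, 3, 4)
print(find(parent, 0) == find(parent, 1))  # True
print(find(parent, 1) == find(parent, 3))  # False
```

The iterative find avoids deep recursion on long chains; the recursive one-liner is equally valid when union by rank or size keeps the trees shallow.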

When should I use union-find versus BFS or DFS for graph problems?

Use union-find when edges are added dynamically and you need to answer connectivity queries online (after each edge addition). Use BFS or DFS when edges are all known upfront and you are doing a single traversal. Union-find is also preferred for Kruskal's minimum spanning tree algorithm because it processes edges in sorted order and needs to detect cycles incrementally. BFS and DFS handle dynamic edge insertion poorly, because each connectivity query after a new edge can require a fresh O(V + E) traversal. Note that standard union-find supports only insertions; it cannot undo a union when an edge is removed.

What are the most common union-find problem categories in interviews?

Based on real interview reports in LeakCode's database, the most common union-find categories are: counting connected components in a graph, detecting cycles in an undirected graph, minimum spanning tree via Kruskal's algorithm, number of islands or grid connectivity problems that update dynamically, redundant connection detection, and accounts merge problems where you group related identifiers. Grid connectivity and cycle detection together represent the majority of reported union-find questions.

How does LeakCode help with union-find prep?

LeakCode aggregates 51,000+ real interview reports from sources like 1Point3Acres, Blind, LeetCode Discuss, and Glassdoor. You can filter union-find questions by company to see which specific problem types each company asks. Google and Meta lean toward harder graph problems that combine union-find with other techniques, while Amazon tends to ask more straightforward connectivity and cycle detection variants.

Union-Find (Disjoint Set) Interview Questions: Complete Guide (2026)

What union-find interview questions actually test, which problem categories appear most at top companies, and how LeakCode's data from 51,000+ reports helps you prepare at the right depth.

Why union-find appears in FAANG interviews

Union-find occupies a specific niche in algorithm interviews: it is the most efficient known solution for dynamic connectivity problems, and implementing it correctly from scratch (with path compression and union by rank) demonstrates fluency with pointer-style data structure manipulation, amortized analysis awareness, and graph reasoning.

In LeakCode's database of 51,000+ real interview reports, union-find questions are more specialized than graph traversal questions but appear consistently in mid and senior-level coding rounds at Google, Meta, and Amazon. The data shows that candidates who implement union-find with both optimizations pass at significantly higher rates than those who implement a naive version, even when both pass all test cases.

A particularly common interview pattern is using union-find inside a larger problem (grid processing, account merging, network monitoring) where the candidate must recognize that connectivity is the core subproblem before they can select the right data structure.

Main union-find problem categories

Connected components counting

Count the number of distinct connected components in a graph or grid. Initialize each node as its own component. For each edge, union the two endpoints. The number of unique roots at the end equals the component count. The problem is also solvable with BFS or DFS, but union-find is more efficient when edges are added incrementally. Commonly asked at Amazon and Microsoft.
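The counting procedure above might look like the following sketch. It assumes nodes are labeled 0 to n-1 and edges arrive as (a, b) pairs; count_components is a hypothetical name chosen for illustration. Instead of counting unique roots at the end, it decrements a counter on every successful merge, which gives the same answer.

```python
def count_components(n, edges):
    """Count connected components among nodes 0..n-1 given undirected edges."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving, a form of compression
            x = parent[x]
        return x

    count = n                               # every node starts as its own component
    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                        # already connected, nothing to merge
        if size[ra] < size[rb]:             # union by size
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]
        count -= 1                          # two components became one
    return count


print(count_components(5, [(0, 1), (1, 2), (3, 4)]))  # 2
```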

Cycle detection in undirected graphs

Detect whether adding an edge creates a cycle. If both endpoints of a new edge already belong to the same component (find returns the same root), adding the edge would create a cycle. Used in Kruskal's MST algorithm to skip edges that would form cycles. Appears in redundant connection problems, network design questions, and dependency graph analysis.
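A minimal sketch of this check, in the style of a redundant connection problem, is shown below. It assumes nodes are labeled 0 to n-1 and that edges are processed in the order given; find_redundant_edge is an illustrative name, not a standard API.

```python
def find_redundant_edge(n, edges):
    """Return the first edge that closes a cycle, or None if there is none."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):
        if parent[x] != x:
            parent[x] = find(parent[x])     # recursive path compression
        return parent[x]

    for a, b in edges:
        ra, rb = find(a), find(b)
        if ra == rb:
            return (a, b)                   # same root: this edge forms a cycle
        if rank[ra] < rank[rb]:             # union by rank
            ra, rb = rb, ra
        parent[rb] = ra
        if rank[ra] == rank[rb]:
            rank[ra] += 1
    return None


print(find_redundant_edge(4, [(0, 1), (1, 2), (2, 0), (2, 3)]))  # (2, 0)
```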

Minimum spanning tree (Kruskal's algorithm)

Sort edges by weight. For each edge in order, union the two endpoints if they are not already connected. In a connected graph with n vertices, the first n-1 edges accepted this way form the MST. Union-find makes the cycle check O(alpha(n)) amortized per edge, so the O(E log E) sort dominates the running time. Understanding both Kruskal's (edge-centric, union-find) and Prim's (vertex-centric, heap) and knowing when each is preferred is expected in senior-level graph rounds.
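One way to sketch Kruskal's algorithm is shown below. The input format, a list of (weight, u, v) tuples with vertices labeled 0 to n-1, is an assumption made for this example, as is the function name kruskal. If the graph is disconnected, the result is a minimum spanning forest with fewer than n-1 edges.

```python
def kruskal(n, edges):
    """Return (total_weight, chosen_edges) for edges given as (weight, u, v)."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    total, chosen = 0, []
    for w, u, v in sorted(edges):           # lightest edges first
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                        # would form a cycle, skip
        if size[ru] < size[rv]:             # union by size
            ru, rv = rv, ru
        parent[rv] = ru
        size[ru] += size[rv]
        total += w
        chosen.append((u, v, w))
        if len(chosen) == n - 1:            # spanning tree is complete
            break
    return total, chosen


edges = [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)]
print(kruskal(4, edges))  # (6, [(0, 1, 1), (2, 3, 2), (1, 2, 3)])
```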

Dynamic grid connectivity

Problems where land or connections are added to a grid dynamically and you must track connected regions after each addition. Number of islands II is the canonical example: cells are added one by one and you must report the island count after each addition. Union-find handles this in amortized O(alpha(n)) per addition, versus O(n) in the number of grid cells for a full BFS after each step. Heavily reported in Google and Meta hard rounds.
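The sketch below illustrates the idea for a Number of Islands II style problem. It assumes positions are (row, col) pairs on a rows x cols grid that starts as all water, and that repeated positions leave the count unchanged; the function name num_islands_after_each is hypothetical. Only land cells are stored, keyed by a flattened cell index.

```python
def num_islands_after_each(rows, cols, positions):
    """Return the island count after each (row, col) cell is turned into land."""
    parent = {}                             # flattened cell index -> parent index
    size = {}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    count = 0
    result = []
    for r, c in positions:
        cell = r * cols + c
        if cell not in parent:              # skip cells that are already land
            parent[cell] = cell
            size[cell] = 1
            count += 1                      # new land starts as its own island
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                nr, nc = r + dr, c + dc
                if not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                neighbor = nr * cols + nc
                if neighbor not in parent:
                    continue                # neighbor is still water
                ra, rb = find(cell), find(neighbor)
                if ra != rb:
                    if size[ra] < size[rb]: # union by size
                        ra, rb = rb, ra
                    parent[rb] = ra
                    size[ra] += size[rb]
                    count -= 1              # two islands merged into one
        result.append(count)
    return result


print(num_islands_after_each(3, 3, [(0, 0), (0, 1), (1, 2), (2, 1), (1, 1)]))
# [1, 1, 2, 3, 1]
```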

Entity grouping and account merging

Group accounts or identities that share a common attribute (same email address, same phone number) into unified records. For each shared attribute, union the associated accounts. The connected components at the end are the merged accounts. This disguises union-find as a data processing problem and is a common Amazon and Meta interview variant that tests whether candidates recognize the underlying graph structure.
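As an illustration, the sketch below merges accounts that share an email address. The input shape, a list of [name, email, email, ...] lists, is an assumption borrowed from the common form of this problem, and merge_accounts is an illustrative name. Accounts with no emails are dropped by this sketch, and the order of the merged records is not specified.

```python
from collections import defaultdict


def merge_accounts(accounts):
    """Merge accounts sharing any email. Returns [name, sorted emails...] lists."""
    n = len(accounts)
    parent = list(range(n))
    size = [1] * n

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra == rb:
            return
        if size[ra] < size[rb]:             # union by size
            ra, rb = rb, ra
        parent[rb] = ra
        size[ra] += size[rb]

    owner = {}                              # email -> first account index seen
    for i, (_, *emails) in enumerate(accounts):
        for email in emails:
            if email in owner:
                union(i, owner[email])      # shared email links two accounts
            else:
                owner[email] = i

    groups = defaultdict(set)               # root account index -> its emails
    for email, i in owner.items():
        groups[find(i)].add(email)
    return [[accounts[root][0]] + sorted(emails) for root, emails in groups.items()]


accounts = [
    ["John", "a@x.com", "b@x.com"],
    ["John", "b@x.com", "c@x.com"],
    ["Mary", "m@x.com"],
    ["John", "d@x.com"],
]
for record in sorted(merge_accounts(accounts)):
    print(record)
# ['John', 'a@x.com', 'b@x.com', 'c@x.com']
# ['John', 'd@x.com']
# ['Mary', 'm@x.com']
```

Running union-find over account indices rather than over email strings keeps the parent structure a plain list; the email-to-owner dictionary does the string lookups.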

How to implement union-find correctly

A minimal correct implementation requires three components: an array mapping each element to its parent, a rank or size array for union by rank, and the two operations with their optimizations applied.

1. Path compression in find. When finding the root, set every node along the path to point directly to the root. The recursive one-liner is parent[x] = find(parent[x]). This flattens the tree for future queries.
2. Union by rank or size. When merging two trees, always attach the root with smaller rank or size under the root with larger rank or size. This keeps trees shallow and prevents the worst-case O(n) find.
3. Track component count if needed. Initialize a count variable equal to n. Decrement it every time a union actually merges two distinct components (i.e., find(a) != find(b) before the union). Return this count for component counting problems.
4. State the amortized complexity. With both optimizations, each operation runs in O(alpha(n)) amortized time, which is effectively O(1). Stating this explicitly shows familiarity with the algorithm beyond basic implementation.
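Putting the steps above together, a class-based version might look like the following sketch. The class name UnionFind, the connected helper, and the boolean return value of union are illustrative choices rather than a fixed convention, and elements are assumed to be labeled 0 to n-1.

```python
class UnionFind:
    """Disjoint sets over 0..n-1 with path compression and union by rank."""

    def __init__(self, n):
        self.parent = list(range(n))   # each element starts as its own root
        self.rank = [0] * n            # upper bound on the height of each tree
        self.count = n                 # current number of components

    def find(self, x):
        if self.parent[x] != x:
            self.parent[x] = self.find(self.parent[x])   # path compression
        return self.parent[x]

    def union(self, a, b):
        """Merge the sets of a and b. Return True only if a merge happened."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.rank[ra] < self.rank[rb]:   # attach lower rank under higher
            ra, rb = rb, ra
        self.parent[rb] = ra
        if self.rank[ra] == self.rank[rb]:
            self.rank[ra] += 1
        self.count -= 1                     # two components became one
        return True

    def connected(self, a, b):
        return self.find(a) == self.find(b)


uf = UnionFind(6)
uf.union(0, 1)
uf.union(1, 2)
uf.union(3, 4)
print(uf.count)            # 3
print(uf.connected(0, 2))  # True
print(uf.connected(2, 3))  # False
```

Returning a boolean from union is a small design choice that lets callers detect cycles or count merges without calling find twice themselves.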

Companies that ask union-find questions

Browse real union-find interview questions from these companies on LeakCode:

Google, Meta, Amazon, Microsoft, Uber, Twitter, LinkedIn, Netflix

Search Union-Find on LeakCode

LeakCode has 51,000+ real interview questions from 2,000+ companies. Filter by company to see which union-find problem categories your target company asks most often.


How LeakCode helps with union-find prep

LeakCode indexes real candidate-reported questions from 1Point3Acres, Blind, LeetCode Discuss, Glassdoor, and Reddit. Filtering by company shows you the distribution of union-find problem types your target company uses, so you know whether to focus on pure connectivity problems or disguised graph union-find variants.