Graphs
The core idea
A graph is nodes and edges. A binary tree is a graph that happens to be acyclic and rooted. A 2-D grid is a graph where each cell has up-to-four neighbours. Once you see these as graphs, the same two algorithms solve almost everything in this section: BFS and DFS.
BFS visits all neighbours of s before any neighbour-of-neighbour. It finds the shortest path in an unweighted graph. DFS dives deep, backtracking on a dead end. It is the natural tool for connectivity, cycle detection, and topological sort.
Three extensions you must know:
- Visited set / colour state: every graph traversal needs to remember what it has seen, or it loops forever.
- Topological sort: order nodes so every edge points forward. Built on DFS with three colours (white, grey, black) or BFS with in-degree counts (Kahn's algorithm).
- Multi-source BFS: seed the queue with multiple starts. Used when a process spreads from several origins simultaneously.
When to reach for it
Trigger phrases:
- "grid of 1s and 0s", "islands", "rooms"
- "graph of courses / prerequisites"
- "shortest path / fewest steps"
- "can you reach ... from ..."
- "dependency cycle"
- "spread / infect / fill"
Grid problems disguise their graph nature; every cell is a node, every adjacency is an edge.
How to approach
- Identify the nodes and edges explicitly. Write them down.
- Decide: shortest-path or just-reachability? Shortest → BFS. Reachability or order → DFS.
- Pick the visited representation. For grids, often the grid itself (overwrite cells). For named nodes, a set.
- For weighted shortest paths, BFS is wrong; you need Dijkstra. None of the 105 problems here demand that; recognise the limit anyway.
Real-world applications
- Social networks: friend-of-friend, recommendation, community detection.
- Routing: package delivery, network packet routes, train schedules.
- Build systems: bazel, gradle, make. Topological sort of targets.
- Dependency installers: pip, npm, apt. Cycle detection on packages.
- Image segmentation: flood fill, connected components.
- Spreadsheets: cell recalculation order is topological sort over the formula DAG.
In the AI-integrated workspace
Graph problems are where AI agents most often confuse BFS and DFS. The pattern: the spec asks for shortest path, the agent writes DFS, the code finds a path but not the shortest one. Always check: "What guarantees shortest?".
The second common error is forgetting to mark visited at the right moment. There are two valid timings: mark on enqueue (BFS, correct) or mark on dequeue (BFS, wrong; you'll enqueue the same node multiple times and inflate the queue). DFS recurses with mark-on-entry. Generated code routinely picks the wrong moment.
The third: in topological sort, generated code that computes in-degree but never enqueues the zero-in-degree nodes, or never decrements correctly. A topological sort review asks: "Is the final result the same length as the input node list?". If not, there is a cycle, which itself is a meaningful answer.
In agentic workflows for backend services, "spreading state across a network of services" is almost always a graph traversal. When you direct an agent to write retry / fan-out / dependency-resolution code, frame it in BFS / DFS terms; the agent's output will be cleaner.
Problems in this cluster
Easy: flood-fill. Medium: number-of-islands, max-area-of-island, clone-graph, rotting-oranges, pacific-atlantic-water-flow, course-schedule, course-schedule-ii, detect-cycle. Hard: word-ladder.
Start with flood-fill, then graduate to number-of-islands (the deep article is the template for every connected-components problem you will ever solve).