max-area-of-island
Difficulty: Medium · Topic: Graphs · Practice it on parikshan: /practice/max-area-of-island/start
The story
The satellite-imagery startup from number-of-islands ships its first customer report and the insurance underwriter writes back with a follow-up: "We don't just want how many islands. We want the biggest one." Storm surge, evacuation planning, and parametric insurance payouts all depend on the size of the largest contiguous landmass in a flood map, not its count. The product team forwards the spec to engineering and the ticket lands on your desk.
This is number-of-islands with one substitution: instead of counting components, sum the cells inside each component and take the max. The change to the algorithm is two lines. The change in mindset is the whole point.
What's actually being asked
Given an r × c grid of '0' (water) and '1' (land), find the area (number of cells) of the largest 4-connected island. If there is no land at all, return 0.
The naive approach
Same as the previous problem: a nested loop comparing every cell against every other cell to test connectivity is O((r·c)^2). Pointless. The graph traversal solves it in linear time.
The key insight
In number-of-islands the outer loop maintained a counter count and incremented it once per island discovered. Here you replace count with best, run a traversal that returns the size of the island it just walked, and update best = max(best, area) after each traversal.
That swap is the whole article. Every grid problem in this family has the same shape:
- Outer loop over cells.
- When you hit unseen land, walk its component.
- Record one number about that component.
Number of islands: record 1 per island. Max area: record cell count. Sum of perimeters: record the perimeter. The skeleton never changes; only the per-island accumulator changes.
Step-by-step approach
- Allocate
seen = [[False] * c for _ in range(r)]andbest = 0. - For each cell
(i, j)row-major:- If
grid[i][j] == '1'andnot seen[i][j]:- BFS from
(i, j). Start a deque with(i, j), markseen[i][j] = True, setarea = 0. - While the deque has cells, pop
(x, y), incrementarea, examine the four cardinal neighbours; for each in-bounds neighbour that is'1'and not yet seen, mark and enqueue. - After the BFS,
best = max(best, area).
- BFS from
- If
- Print
best.
Worked example
0 1 0 0 0
1 1 1 0 0
0 0 1 1 1
0 0 0 1 0
- Row-major scan reaches
(0, 1). BFS walks(0,1) -> (1,1) -> (1,0), (1,2) -> (2,2) -> (2,3) -> (2,4), (3,3). Area = 8.best = 8. - No other unseen land. Output
8.
Complexity
- Time: O(r · c). Every cell is enqueued at most once.
- Space: O(r · c) for
seenplus the BFS frontier in the worst case (a 200x200 fully connected blob).
Common pitfalls
- Marking on pop instead of on enqueue: a cell can be enqueued through two different rotten-orange-style neighbours and counted twice, inflating
area. Markseenthe instant you append to the deque. - Initialising
bestto a sentinel: amax([])on an all-water grid throws. Startbest = 0, update inside the loop, the all-water case naturally returns0. - Recursive DFS on a 200x200 monochrome grid: 40,000 cells in one component blows Python's default recursion limit. Use the iterative deque version.
- Reading the row as a list of characters vs a string:
grid[i][j]works on both, but slicing semantics differ. Be consistent.
Where this shows up in the enterprise
Same map of customers as number-of-islands, with the metric shifted from "how many" to "how big". Concrete examples:
- NASA AID and Red Cross flood maps: largest contiguous flooded region drives the response tier (regional vs federal emergency).
- Maxar and Planet Labs deforestation reports: largest single clear-cut block, not the number of clear-cuts, is what regulators publish.
- Cognex and Keyence wafer inspection: largest defect cluster determines whether the die is scrap or salvage.
- Cloudflare attack-traffic clustering: the size of the dominant connected component in a
(source, time)event grid quantifies the largest single botnet's footprint, which sizes the WAF rule update. - Medical imaging at GE Healthcare and Siemens Healthineers: largest connected tumour mass is a staging input, not just the count.
- Geofence analytics in delivery and ride-share: the largest contiguous "demand region" is the centroid candidate for a new warehouse or driver hub.
- Game level analytics: largest reachable region in a procedurally generated map flags accidentally unreachable rooms.
The pattern "report a maximum over connected components" is everywhere segmentation masks are produced. The instinct that "count vs. size is a one-line edit" is what makes this a 30-second problem instead of a 30-minute one.
Variants you'll see elsewhere
number-of-islands: count instead of size.surrounded-regions: instead of summing, classify each component by whether it touches the border.largest-island-after-flip: same traversal, plus a post-process that asks "if I flipped one water cell, what is the new max?". Solved with DSU.island-perimeter: traverse, accumulate edges to water instead of cells.count-sub-islands: traverse a candidate island and verify every cell is also land in a second grid.
All of them share the same outer scan plus inner traversal plus per-component accumulator. Once you internalise that template, you can hand-write any of these in under five minutes.
How parikshan turns this into a learning loop
This problem is where parikshan's repeat-the-template-with-a-twist drill earns its keep. The recommended path:
- Practice mode for
number-of-islandsuntil you can do it from blank with no hints. - Practice mode for
max-area-of-island, no AI. The only thing you should change from the previous solution is the inner accumulator and the outerbest = max(...). - Exam mode for both, back to back. The clock will reward the engineer who notices the structural identity and rewards muscle memory.
The AI chat in practice mode is excellent for asking "what would change if I instead wanted the perimeter?" or "if the grid wrapped around, what would I change?". Treat the AI as a Socratic partner here, not a code generator.
In the AI-integrated workspace
When you ask an agent to write max-area-of-island, you will usually receive a copy of its previous number-of-islands solution with one variable renamed. That is correct. Your review should:
- Confirm the per-island accumulator is in the right place: counting cells inside the BFS pop loop is correct; counting outside is a stale-frontier bug.
- Confirm
bestinitialises to0, notfloat('-inf'): the empty-grid case must print0. - Push back on a recursive DFS: on the largest grids it crashes. Demand iterative BFS or sys.setrecursionlimit + iterative DFS.
- Inspect the
seensemantics: a copy ofgridmutated in place is fine if the spec permits, but a separateseenmatrix is safer when the grid is referenced elsewhere.
The bigger lesson, and the lesson that compounds, is the meta-skill: when a new spec arrives, ask "is this a known template with a swapped accumulator?". Half of all grid problems in your career will be exactly that. The senior engineer who names the template in five seconds ships in five minutes; the junior engineer who treats every spec as new spends an hour reinventing it. parikshan trains the first kind.