lowest-common-ancestor-bst

Difficulty: Medium · Topic: Trees · Practice it on parikshan: /practice/lowest-common-ancestor-bst/start

The story

A multi-tenant SaaS billing platform runs an org-chart-like tree of accounts: the parent company at the root, subsidiaries below, business units under those, and individual cost centres as leaves. The tree is kept as a BST keyed by account id (where the id encodes the hierarchical placement). When the billing engine needs to consolidate two cost centres' invoices, it has to find the deepest common parent account: the most specific scope that "owns" both. Below that scope, billing is separate; at or above it, the invoice consolidates. The engine asks this question hundreds of times per second, so the answer needs to be O(log n) per call, not O(n).

That is lowest-common-ancestor-bst. Once you see this problem in this framing, you will recognise it in every hierarchical access-control question, every git "common ancestor of two commits" prompt, every taxonomy that asks "what is the most specific class that contains both of these items?".

What's actually being asked

Given a binary search tree and two distinct values p and q, both guaranteed to be present, return the value at their lowest common ancestor: the deepest node that has both as descendants (a node is its own descendant).

The naive approach

Treat the tree as a generic binary tree. Recurse from the root: at each node, recurse left and right; if both sides return a non-null result, this node is the LCA. That works for any binary tree in O(n) time. Correct, but it ignores everything that makes this a BST. On a balanced BST of n = 10⁴ nodes, O(n) is 10⁴ operations; an O(log n) algorithm does ~14. For a billing engine doing hundreds of these per second, that ratio matters.

The key insight

In a BST, every node cur partitions the rest of the tree into two halves: everything in cur.left is < cur.val, everything in cur.right is > cur.val. So when you compare cur.val to the two targets p and q, exactly one of three cases holds:

Both p and q are less than cur.val. The LCA lies in the left subtree. Descend left.
Both p and q are greater than cur.val. The LCA lies in the right subtree. Descend right.
cur.val lies between p and q (inclusive of either). Then cur is the LCA. Going either direction would lose one of the two targets.

The walk is a single chain from root to LCA, no recursion needed at all, no backtracking. The first time the search range straddles cur.val, that node is exactly the LCA. The path is at most O(h) long: O(log n) for a balanced BST, O(n) for a skewed one.

Step-by-step approach

Parse the level-order array into a tree, parse p and q from the second line.
Sort so lo = min(p, q), hi = max(p, q).
Start cur = root.
Loop:
- If cur.val < lo, descend right: cur = cur.right.
- Else if cur.val > hi, descend left: cur = cur.left.
- Otherwise (lo <= cur.val <= hi), print cur.val and exit.

No recursion. Constant memory. No stack overflow regardless of input shape.

Worked example

Input:
[6, 2, 8, 0, 4, 7, 9, null, null, 3, 5]
p = 2, q = 8

           6
          / \
         2   8
        / \ / \
       0  4 7  9
         / \
        3   5

lo = 2, hi = 8
cur = 6:  lo=2 <= 6 <= 8=hi -> LCA is 6.

Example where one target is an ancestor of the other:

Input:
[6, 2, 8, 0, 4, 7, 9, null, null, 3, 5]
p = 2, q = 4

lo = 2, hi = 4
cur = 6:  6 > hi=4, descend left.
cur = 2:  lo=2 <= 2 <= 4=hi -> LCA is 2.

The algorithm correctly returns 2 because 2 itself is one of the targets and an ancestor of the other; that satisfies "the deepest node with both as descendants" (a node is its own descendant).

Complexity

Time: O(h). O(log n) on a balanced BST; O(n) on a degenerate chain.
Space: O(1). The walk holds one pointer.

This is the rare case where the constraint structure (BST ordering) buys an algorithmic win, not just a constant-factor speedup.

Alternative for general binary trees

If the tree is not a BST, the BST trick is unavailable. The general LCA recurrence is the one mentioned in the naive section: recurse into both subtrees, and the LCA is the deepest node where both sides report a hit. Another option is the in-order index pair approach: do an in-order walk, mark when you first see p and q, and the LCA is the node visited between them whose subtree contains both. Both general approaches are O(n). Mention this distinction because LCA-of-general-binary-tree is a different problem on parikshan with a different problem id.

Common pitfalls

Strict comparison instead of inclusive: writing lo < cur.val < hi (strict) misses the case where cur.val == p or cur.val == q. Use lo <= cur.val <= hi (inclusive).
Forgetting to sort p and q: if you do cur.val < p without sorting, you have to keep two comparisons in mind. Sorting once at the start removes a class of bugs.
Recursing when you do not need to: the BST version is a single descent, no recursion. Writing a recursive helper here is correct but wasteful.
Applying the BST trick to a non-BST: the algorithm is only correct because of the BST ordering. On a generic binary tree (lowest-common-ancestor-binary-tree on most judges) it returns garbage.

Where this shows up in the enterprise

Git's git merge-base: the underlying algorithm finds the LCA in a commit DAG; for linear histories it reduces to the BST case.
Filesystem permissions: NFS and POSIX ACLs walk the directory tree to find the most specific common parent of two files when computing inheritable permissions.
Cloud IAM hierarchies: AWS Organisations, Google Cloud Resource Manager, and Azure Management Groups all run LCA queries against their account trees to compute consolidated billing and policy scope.
PostgreSQL ltree extension: ltree is a hierarchical label type indexed as a BST-like structure; LCA queries are a core operation.
MongoDB document hierarchies: $graphLookup with $lt and $gt filters can compute LCA over a tree-shaped reference graph.
DNS delegation: AWS Route 53 and Cloudflare DNS resolve the most-specific common authoritative zone for two queries.
CDN routing: prefix tries (effectively BSTs over bit positions) compute LCA when consolidating overlapping prefixes.
TypeScript compiler: type-system LCA queries (the common supertype of two types) traverse the inheritance tree under similar rules.
XGBoost and LightGBM: decision-tree interpretability tools find the LCA of two input paths to explain "what shared decision led to both predictions".
Reddit and Hacker News: comment threads use LCA to identify the deepest shared parent when a user replies to multiple messages.

The pattern is universal: any hierarchy where two items need to share a scope has somebody running LCA.

Variants you'll see elsewhere

lowest-common-ancestor-binary-tree: general binary tree, the recursion-both-sides variant.
lowest-common-ancestor-of-deepest-leaves: LCA of the deepest leaves; combines depth tracking with LCA.
find-distance-in-binary-tree: distance between two nodes; uses LCA as an intermediate step.
kth-ancestor-of-a-tree-node: precompute ancestors for O(log n) ancestor queries; related but a different data structure (binary lifting).
path-to-node: stash the root-to-node path; another way to compute LCA by intersecting two paths.

How parikshan turns this into a learning loop

After you read this article and click through to the practice link, the parikshan flow integrates four layers:

The editor runs your code against public tests and per-session dynamic private tests. The dynamic suite includes a 10⁴-node skewed BST to verify the iterative form, plus shapes where one target is an ancestor of the other (the inclusive comparison is the only way to clear them).
Hints are graduated. The first hint points at the BST ordering. The third gives the loop skeleton. Each hint costs integrity.
The AI training assistant offers Explain, Review, Find Bugs, Add Comments, Optimise at small costs in exam mode; free in practice.
The dispute flow runs after the verdict. The common mistake is candidates who solve the general binary-tree LCA and ignore the BST property; the reviewer can confirm the simpler algorithm is the expected one.

Read the concept, attempt in practice mode, take hints when stuck, then redo in exam mode. When you can write the iterative descent in under three minutes without touching AI, the pattern is in your fingers.

In the AI-integrated workspace

In your future job, you will not write lca_bst by hand. An agent will. Your job is to:

Recognise that LCA on a BST is a single descent with three branching cases, no recursion, no extra memory.
Tell the agent precisely: "iterate from the root; descend right if cur.val < lo, descend left if cur.val > hi, otherwise return cur.val. Use inclusive comparisons so an ancestor that equals a target also matches".
Read the agent's code and reject it if you see the general-binary-tree LCA recursion. That works but wastes the BST property; it is O(n) where O(log n) is available.
Stress-test on the edge cases the agent will silently miss: one target is the root, one target is an ancestor of the other, the two targets are leaves on opposite sides, a degenerate left-only chain.

Candidates who can name "this is BST-LCA, single descent on the ordering" within five seconds are the ones who direct AI agents productively. Candidates who cannot recognise the pattern end up approving the slower general-tree recursion, which works on the sample and silently misses the latency budget on production traffic. parikshan is built to train the first kind.