parikshan

validate-bst

Difficulty: Medium  ·  Topic: Trees  ·  Practice it on parikshan: /practice/validate-bst/start

The story

A migration script at a financial data provider moves a transaction history from a legacy system into PostgreSQL. The legacy system stored records in a homemade tree structure that claims to be a binary search tree, indexed by transaction id. Before the migration trusts the structure (and skips a full sort during the import, saving hours on a 10⁸-row dataset), the script has to verify the claim. A single record out of BST order anywhere in the tree means the structure is corrupt and the import has to fall back to a full sort. The verification has to run in one pass per tree, no extra memory beyond the recursion stack.

That is validate-bst. Once you see this problem in this framing, you will recognise it in every "is this index actually valid?" scenario: database integrity checks, decision-tree validators, configuration linters that assert ordered hierarchies.

What's actually being asked

Decide whether a binary tree is a valid binary search tree under the strict definition: for every node, all values in its left subtree are strictly less than the node's value, and all values in its right subtree are strictly greater. Equal values are not allowed anywhere.

The naive approach

Check neighbours. The seductive but wrong solution is "at each node, verify left.val < node.val < right.val and recurse". This is the most popular bug in BST validation. It is necessary but not sufficient.

Counterexample: [5, 1, 4, null, null, 3, 6].

     5
    / \
   1   4
      / \
     3   6

At node 5: left child 1 < 5, right child 4. Wait, 4 < 5, the local "node.val < right.val" check fails here. Adjust the counterexample to one where the local check passes but the global fails:

     5
    / \
   1   6
      / \
     3   8

At node 5: left 1 < 5, right 6 > 5. Local check passes. At node 6: left 3 < 6, right 8 > 6. Local check passes. But node 3 is in the right subtree of 5, and 3 < 5. The global BST invariant is violated. The local check missed it.

The lesson: a node's BST constraint depends on every ancestor on the path to the root, not just its immediate parent. The local check can never see that.

The Trees primer explicitly debunks the neighbour check; any "validate BST" code that does not pass bounds is wrong.

The key insight

Approach 1: Bounds passed down. Each node inherits a half-open interval (lo, hi) from above. The root starts with (-inf, +inf). The left child of a node n inherits (lo, n.val); the right child inherits (n.val, hi). At each node, check lo < node.val < hi. Any violation fails the whole tree. This is the textbook recursion-with-extra-parameter pattern.

Approach 2: In-order traversal. The in-order traversal of a valid BST is strictly increasing. So walk the tree in-order, track the previously visited value, and assert each new value is strictly greater than the previous. Any non-increase fails the tree.

Both are O(n) and both work. The in-order approach generalises cleanly because the iterative form has a small stack and natural short-circuit. The bounds-passed-down approach generalises to BSTs with extra constraints (size limits per subtree, value-range filters) more readily.

The reference solution on parikshan uses the iterative in-order form to avoid Python's recursion limit on a 10⁴-node right-skewed chain.

Step-by-step approach

  1. Parse the level-order array into a tree. If empty, print true and exit.
  2. Initialise stack = [], cur = root, prev = None.
  3. While cur is not None or the stack is non-empty:
    • Push the left spine: while cur is not None, push cur and cur = cur.left.
    • Pop one node into cur.
    • If prev is not None and cur.val <= prev, print false and exit.
    • Set prev = cur.val. Set cur = cur.right.
  4. After the loop, print true.

This visits each node exactly once. Time O(n), space O(h) for the stack.

Worked example

Valid example:
Input: [2, 1, 3]

     2
    / \
   1   3

In-order: 1, 2, 3. Strictly increasing. Answer: true.
Invalid example:
Input: [5, 1, 4, null, null, 3, 6]

     5
    / \
   1   4
      / \
     3   6

In-order: 1, 5, 3, 4, 6. After 5 comes 3, which is not greater than 5.
Answer: false.

Complexity

  • Time: O(n). Each node enters and leaves the stack once.
  • Space: O(h). The stack holds at most one ancestor chain at a time.

Common pitfalls

  • Neighbour check only (the canonical bug, debunked above).
  • Equal values: the spec is strict. A node equal to its child is invalid. Use <= as the violation predicate, not <.
  • Integer-bound overflow with bounds-passed-down: if you use INT_MIN and INT_MAX as initial bounds, a node with value INT_MIN is mishandled. Use None or language-specific infinities, and treat None as "no constraint".
  • Stack overflow on a skewed BST: a 10⁴-node right-only chain overflows Python's default recursion limit. The iterative in-order avoids this entirely.
  • Empty tree confusion: the empty tree is a valid BST. Print true.

Where this shows up in the enterprise

  • PostgreSQL B-tree health checks: amcheck extension runs an in-order-style walk and asserts the BST ordering at every page. Production deployments run it weekly.
  • MySQL InnoDB: the storage engine's CHECK TABLE performs an analogous structural check on every secondary index.
  • MongoDB: db.collection.validate() walks indexes and verifies BST ordering on each.
  • CockroachDB and TiDB: range trees and meta-region trees are validated on every range split or merge.
  • TypeScript compiler: ordered union and intersection trees are validated for canonical ordering during incremental builds.
  • XGBoost and LightGBM: serialised models are validated to confirm the split-value ordering at every node before scoring is allowed.
  • AWS Route 53 and Cloudflare DNS: zone hierarchies are validated for ordered delegation before propagation.
  • Redis sorted sets: skip-list integrity is checked with a forward sweep that is structurally identical to in-order BST validation.
  • Java's TreeMap, C++ std::map: debug-mode assertions walk the tree in-order to verify the red-black invariant.

The pattern is universal: any ordered tree structure that powers an index has somebody running validate-bst against it on a schedule.

Variants you'll see elsewhere

  • recover-binary-search-tree: exactly two nodes are out of place; find and swap them. Uses the in-order traversal trick to spot the violators.
  • kth-smallest-element-in-bst: in-order halt at the kth visit.
  • range-sum-of-bst: same descent with the bounds-passed-down pattern as filter.
  • convert-sorted-array-to-bst: the inverse problem.
  • is-valid-avl-tree: validate BST plus balance, combining this problem with balanced-binary-tree.

How parikshan turns this into a learning loop

After you read this article and click through to the practice link, the parikshan flow integrates four layers:

  1. The editor runs your code against public tests and per-session dynamic private tests. The dynamic suite is heavy on the "local-check passes, global fails" trap; the canonical [5,1,4,null,null,3,6] shape is included along with deeper variants. The classic 10⁴-node right-skewed chain catches recursive solutions that ignore the recursion limit.
  2. Hints are graduated. The first hint asks what property the in-order traversal has. The third gives the iterative skeleton. Each hint costs integrity.
  3. The AI training assistant offers Explain, Review, Find Bugs, Add Comments, Optimise at small costs in exam mode. In practice mode it is free.
  4. The dispute flow runs after the verdict. The common dispute on validate-bst is candidates who used the local-check approach and are convinced their code is correct because they pass two of three public tests. The reviewer can walk through the counterexample.

Read the concept, attempt in practice mode, take hints when stuck, then redo in exam mode. When you can write either correct approach in under five minutes without touching AI, the pattern is in your fingers.

In the AI-integrated workspace

In your future job, you will not write is_valid_bst by hand. An agent will. Your job is to:

  1. Recognise that the BST property is global, not local, and that the only two correct approaches are bounds-passed-down or in-order traversal.
  2. Tell the agent precisely: "use bounds-passed-down with None for unbounded, OR iterative in-order tracking prev. Do NOT do a local-neighbour check".
  3. Read the agent's code and reject it immediately if you see left.val < node.val < right.val as the only check. That is the canonical bug, not a candidate solution.
  4. Stress-test on the edge cases the agent will silently miss: the empty tree, a single node, equal values at parent and child (should be invalid), the [5,1,4,null,null,3,6] global-violation shape.

Candidates who can name "BST validation needs global context, not local" within five seconds are the ones who direct AI agents productively. Candidates who cannot recognise the pattern end up approving the broken local-check code that ships, runs in production, and silently corrupts an index that later causes incident pages. parikshan is built to train the first kind.

validate-bst