first-bad-version

Difficulty: Easy · Topic: Binary Search · Practice it on parikshan: /practice/first-bad-version/start

The story

Linux kernel maintainers tag releases by number. When a regression report comes in pointing at the running 6.7.13 kernel, the maintainer suspects some earlier patch in the series broke it. Bisecting by hand across 12,000 commits is unthinkable; git bisect does it in around 14 probes. Each probe is a kernel build, a boot, and a manual reproduction step. The cost of the predicate is enormous; the cost of the search is "as few probes as possible".

first-bad-version is the textbook name for that pattern. The interesting thing is that the array is gone. You have only a predicate is_bad(v) and a range [1, n]. The same Binary Search template you used in search-insert-position carries directly over.

What's actually being asked

You manage a product with n versions numbered 1..n. From some version firstBad onward, every version is bad. Find that firstBad using as few calls as possible to the predicate is_bad(v).

In the parikshan formulation, firstBad is given directly on the input so the verifier can check your answer; you must still derive it by binary search, treating is_bad as a function whose internals you do not exploit. The point of the problem is the search loop, not the value.

The naive approach

Scan v = 1, 2, 3, ... and return the first v for which is_bad(v) is True. O(n) predicate calls. For n = 1,000,000,000 (one billion versions, which is the Bezos-scale framing Amazon's spec actually targets) that is one billion predicate evaluations. Each predicate call in the real world is a build, boot, and test. You will not get to the answer.

The key insight

The predicate is_bad(v) is monotone over the integers: False on [1, firstBad - 1] and True on [firstBad, n]. That is the same predicate shape as nums[mid] >= target from search-insert-position. The fact that the predicate is a function call instead of an array lookup changes nothing about the structure of the search.

This generalisation is the whole point of binary search as a topic. The array was never the thing. The thing is the boundary of a monotone predicate over an indexed range.

Step-by-step approach

def first_bad(n, is_bad):
    lo, hi = 1, n + 1                  # half-open: hi is one past the last legal version
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):                # predicate True → answer is at mid or to the left
            hi = mid
        else:                          # predicate False → answer is strictly to the right
            lo = mid + 1
    return lo                          # the boundary

The loop invariant in English: the first bad version lies in [lo, hi). The problem guarantees firstBad <= n, so the answer is always strictly less than n + 1, and the loop terminates with lo pointing at it.

You can also write this with hi = n (closed-style) and an inclusive predicate; the half-open form is preferred because it composes with search-insert-position and sqrt-x.

Worked example

n = 5, firstBad = 4. The predicate behaviour: is_bad(1..3) False, is_bad(4..5) True.

lo=1, hi=6. mid=3, is_bad(3)=False → lo=4.
lo=4, hi=6. mid=5, is_bad(5)=True → hi=5.
lo=4, hi=5. mid=4, is_bad(4)=True → hi=4.
lo=4, hi=4 → exit. Return 4. ✓

Three predicate calls instead of four. For n = 1,000,000,000, the maximum is about 30 calls. That is the difference between a 30-minute manual bisect and a one-second one.

Complexity

Time: O(log n) predicate calls. The work per call is whatever the predicate costs; the search adds no overhead.
Space: O(1).

For n = 2^31 - 1 (the worst case allowed) the loop runs at most 31 times.

Common pitfalls

lo + hi overflow. Python handles arbitrary integers, but a C, Java, Rust, or Go port of this code with n near INT_MAX overflows. Use mid = lo + (hi - lo) // 2. The parikshan integrity engine will flag a Python solution that ignores this, because the same engineer is expected to port the code to a typed language tomorrow.
Calling the predicate twice per iteration. The naive version reads if is_bad(mid - 1) and is_bad(mid). That doubles the cost. One call per iteration is enough.
Off-by-one on hi. With hi = n (closed form) the loop must be while lo < hi, not while lo <= hi. With hi = n + 1 (half-open) the loop is also while lo < hi. The trap is mixing one form's hi with the other form's loop condition.
Returning hi. After the loop, lo == hi. Returning either is correct, but the loop invariant was phrased about lo, so use lo for readability.

Where this shows up in the enterprise

git bisect: bisects commits between a known-good and a known-bad commit. The "predicate" is "does this commit reproduce the bug?".
Cloudflare deployment rollout: smallest percent rollout at which an error rate spikes; binary search the rollout fraction.
CockroachDB / Vitess schema migrations: smallest commit hash that triggers the migration regression. A git bisect against a CI suite.
Apple production builds (Xcode bisect): bisecting a regression across Xcode toolchain versions.
Datadog deployment markers: locating the first build at which a latency p99 alert fired.
Stripe pricing experiments: smallest price at which conversion crosses a threshold; the experiment platform binary-searches the response.
AWS auto-scaling group sizing: smallest instance count that keeps p95 latency under a target. Predicate is "did the load test fit?".

The signal in all of these: a monotone predicate over an integer range, and a predicate that is expensive to evaluate. The expense is what makes binary search a 30x to 1000x win.

Variants you'll see elsewhere

search-insert-position: same template, predicate is array-indexed.
binary-search: same template, predicate is nums[mid] == target with the equality check at the end.
sqrt-x: same template, predicate is mid * mid <= x on an arithmetic range.
last-good-version / peak-element: same template, predicate flipped.
kth-smallest-in-a-sorted-matrix: binary-search the answer space; predicate counts how many cells are at most mid.

This is what "the template that does not lie" buys you: every variant collapses to the same loop with a different predicate.

How parikshan turns this into a learning loop

first-bad-version is the problem that turns the binary-search template from "an algorithm I learned" into "a tool I reach for". Recommended sequence on parikshan:

Solve binary-search and search-insert-position first; that locks the half-open invariant.
Solve this one in practice mode, AI off. Write the predicate as a separate function and binary-search it. Resist using firstBad directly; the discipline is the lesson.
The dynamic private tests vary n from 1 up to 10^9. A solution that scans linearly will TLE on the big ones, even though the predicate is cheap. The TLE is the lesson.
Turn AI on in practice mode and ask it to Explain. Listen for the agent stating the loop invariant in one sentence. If it does, the agent has understood the problem; if it does not, you have caught the agent generating plausible code on autopilot.

The integrity score weighs which features you used. A submission that earned AC with no AI is a different signal from one that earned AC after three hints; both count, but a human reviewer (and a future recruiter dashboard) sees the difference.

In the AI-integrated workspace

Real production "find the first X where ..." problems almost always have an expensive predicate. The agent will routinely write a linear scan when a binary search is available, because the linear scan is shorter and equally correct on the sample inputs. Your spotting move is to ask "is the predicate monotone?" If yes, the right answer is binary search regardless of how short the linear version looks.

Three verification questions per agent-generated bisect:

"What is the predicate, in English?" Demand a one-sentence answer.
"Is it monotone?" If the agent cannot defend monotonicity, the search is unsafe; the answer might not even exist.
"How many predicate evaluations does the search make in the worst case?" The answer should be ceil(log2(range)).

A senior engineer who can run that ritual unprompted is the one who turns a 30-minute regression hunt into a 30-second one. parikshan's training mode is built to make that ritual reflexive.