gas-station

Difficulty: Medium · Topic: Greedy · Practice it on parikshan: /practice/gas-station/start

The story

You run the fleet routing layer at Ola Electric. The fleet circulates on a fixed clockwise loop of charging stations, picking up exactly gas[i] kWh at station i and burning cost[i] kWh to reach the next station. Drivers start their shifts with an empty pack. Dispatch needs to know one thing every morning: where in the loop should the driver start so that the pack never runs dry on a full circuit, or, if no such start exists, that drivers need to plug in mid-route.

That is gas-station. The exact same shape appears in Amazon's drone-delivery scheduling (where battery, not fuel, is the cycle), in telecom power planning (which cell-site in a microgrid ring should be the active source), and in CDN failover (which point of presence should be primary so propagation never lags). The problem is a circular surplus walk: find the unique start where running surplus stays non-negative.

What's actually being asked

Two arrays of length n, gas[i] and cost[i]. The route is circular: station n-1 connects back to station 0. Find the smallest index s such that starting at s with an empty tank, you can complete one full clockwise lap without the tank going negative. Return -1 if impossible. If a valid start exists, it is unique.

The naive approach

Try every starting station, simulate a lap, check the tank. That is O(n^2): n candidates times n steps per lap. For n = 10^5 that is 10^10 operations, far past the time limit. The naive answer is correct and useless, the familiar pattern.

The simulation can short-circuit: as soon as the tank dips below zero, stop. That helps on average inputs but does nothing on adversarial ones where every start is valid up to the last step.

The key insight

Here is the brilliant observation that turns this from O(n^2) to O(n).

Claim: if starting at station A, the tank first goes negative at station B, then no station between A and B (inclusive of A, exclusive of B + 1) can be a valid start.

Proof. Suppose, for contradiction, some station M strictly between A and B were a valid start. When you start at A, you arrive at M with some non-negative running surplus T(M) (it was non-negative at M, because the tank only first went negative at B > M). From M to B, the run subtracts the cost minus gas over those stations. Starting fresh at M means arriving at B with tank exactly tank(A → B) - T(M) <= tank(A → B) < 0. So starting at M also dips below zero by B. Contradiction.

The intuition: starting later in a path that ran out can only mean less gas in your tank at the failure point, never more, because everything before the start contributed only non-negative surplus to the failed run. If A cannot make it past B, no station in [A, B] can either.

This is a textbook exchange argument: assume an alternate optimal differs from your greedy choice, show the difference makes things worse (or no better), conclude your greedy choice is at least as good. Whenever a greedy algorithm survives this kind of argument, it is provably correct, not just empirically observed.

The total feasibility check

A separate question: does any valid start exist? The answer is yes iff sum(gas) >= sum(cost). If total gas is less than total cost, you simply do not have enough fuel for a lap, period. Otherwise, by the pigeonhole logic of the exchange argument, exactly one start works. The two checks combine cleanly: scan once, maintain total and tank, reset the candidate when tank dips negative.

Step-by-step approach

Initialise total = 0, tank = 0, start = 0.
For i from 0 to n - 1:
- Let surplus = gas[i] - cost[i].
- total += surplus; tank += surplus.
- If tank < 0: set start = i + 1, reset tank = 0.
If total < 0, print -1. Else print start.

One pass. Two scalars. O(n) time, O(1) memory.

Worked example

gas  = [1, 2, 3, 4, 5]
cost = [3, 4, 5, 1, 2]
n = 5

Walk:

i=0: surplus = 1-3 = -2. total=-2, tank=-2. tank<0 → start=1, tank=0.
i=1: surplus = 2-4 = -2. total=-4, tank=-2. tank<0 → start=2, tank=0.
i=2: surplus = 3-5 = -2. total=-6, tank=-2. tank<0 → start=3, tank=0.
i=3: surplus = 4-1 = +3. total=-3, tank=3.
i=4: surplus = 5-2 = +3. total=0, tank=6.

End: total = 0 (>= 0), start = 3. Output: 3. Matches.

Verify by simulation from station 3: tank = 0+4-1=3 → 3+5-2=6 → 6+1-3=4 → 4+2-4=2 → 2+3-5=0. Full circuit, exactly.

Complexity

Time: O(n), one pass.
Space: O(1) extra, three scalars.

Optimal. You cannot answer in less than one read of every station.

Common pitfalls

Two-pass solution: some implementations first sum the totals, then scan again to find the start. Correct but slower by a constant; one pass with total accumulated in the same loop is cleaner.
Off-by-one on the reset: setting start = i instead of i + 1 is a classic bug. The dip happened at i, so the next candidate must be i + 1.
Forgetting the total check: the algorithm returns the index it last reset to, even when no valid start exists. Without the total < 0 → -1 check, you would emit a wrong answer.
Confusing tank with total: total decides feasibility (across the whole loop), tank decides where to reset (within a window). They evolve identically but mean different things.

Where this shows up in the enterprise

Ola Electric / BluSmart: dispatch start point for fixed-loop EV fleets to minimise mid-route charging.
Amazon Prime Air: drone delivery loops where battery is the cycle and the optimal launch point is the start that keeps charge non-negative.
AWS Route 53 / Cloudflare DNS failover: which point of presence to designate primary in a ring of regional anycast nodes, so propagation never lags.
Reliance Jio / Airtel telecom microgrids: cell-site power rotation when sites share solar panels in a circular relay.
HFT cycle scheduling: which exchange in a ring of market-makers should hold inventory first, so the running position never goes negative.
Container ship bunker planning: which port to refuel at when crossing a circular route, same surplus walk on fuel reserve.
Database replication: in a ring of replicas, which one's WAL pointer should be the canonical clock so lag never goes negative.

The deeper recurrence: any circular constraint with additive surplus can be reduced to this scan. Spotting that abstraction in a problem disguised as something else is the whole skill.

Variants you'll see elsewhere

gas-station-ii: candidates with refuel cost per stop; turns greedy into a heap-based selection. Different problem despite the name.
circular-array-loop: similar circular scan, different invariant.
find-minimum-in-rotated-sorted-array: not greedy, but the same "circular structure" intuition for spotting the inflection point.
maximum-subarray (Kadane's): the linear cousin of the same reset trick, when running sum goes negative, drop it and restart.

gas-station and Kadane's algorithm are the two most beautiful examples of "reset the candidate when it goes negative" in interview lore. Knowing both forces you to internalise the pattern.

How parikshan turns this into a learning loop

The dynamic private tests cover edge cases that catch shortcuts: n = 1 with gas[0] == cost[0] (valid start is 0), all-equal sequences, adversarial inputs where the valid start is n - 1. The hint ladder leads you to the total + reset insight in steps; if you cannot derive it on your own, hint 3 names the reset rule explicitly at a 15-point integrity cost. Practice mode is for the AI assistant to walk you through the exchange argument, that is its single most useful service. Exam mode is where you write the eight lines from memory in two minutes flat. Aim for the latter.

In the AI-integrated workspace

Agents handle this problem competently because it is well-known, but the failure mode is subtle. The generated code often produces:

The correct algorithm with the wrong reset offset (start = i instead of i + 1).
The correct algorithm with no proof of correctness in the docstring, and that is the dangerous case. Greedy code without a proof is a guess, and on gas-station the guess happens to work, but generated code on related variants ("find the first start that lets you reach 80% of the loop") will produce confidently wrong answers if it learned the algorithm by pattern-matching alone.

Your senior-engineer move: insist that the agent include the exchange argument in the docstring. If it cannot articulate why skipping forward past a failure is safe, then the algorithm is unverified, and you have no business shipping it. The proof is the deliverable. The code is the by-product.