combination-sum
Difficulty: Medium · Topic: Backtracking · Practice it on parikshan: /practice/combination-sum/start
The story
You build the checkout engine at Razorpay. A merchant wants every way a customer could pay 2,500 rupees using a fixed menu of UPI denominations, say 100, 200, 500, and 2000, with no limit on how many notes of each denomination they tap. The risk team uses this list to compute friction per payment plan; the UX team uses it to pick the default suggestion. The fundamental question: enumerate every multiset of values from a given candidate set that sums to a target.
That is combination-sum. The same shape appears in tax accounting (every way to allocate a deductible across allowable buckets), in cap-table modelling (every way to issue shares from a class pool to total an investment), and in genomics (every way to combine reagent doses to a target concentration). The defining twist is reuse is allowed, a single candidate can appear any number of times in one combination, and that is what separates it from plain subsets.
What's actually being asked
Given a set of k distinct positive integers and a positive target, output every combination of values whose sum equals target. Each value may be used any number of times. Combinations are unordered multisets; print each as a non-decreasing sequence, with the combinations themselves in lex order.
The naive approach
You could enumerate every multiset up to some max length and filter the ones summing to target. With target = 30 and candidates as small as 1, the length could be 30, and the search space is k^30, utterly intractable. The naive answer is correct in principle and useless in practice.
The right approach is backtracking with two prunes: stop when remaining hits zero (success), stop when the current candidate already exceeds remaining (dead end). The combination of monotone candidate order and the early break is what makes the search tight.
The canonical backtracking skeleton
def backtrack(state):
if is_solution(state):
record(state)
return
for choice in choices(state):
if prune(state, choice):
continue # or break, when the prune is monotone
apply(state, choice)
backtrack(state)
undo(state, choice)
For combination-sum:
- state:
(path, start, remaining). - is_solution:
remaining == 0. - choices: every index
iin[start, k). - prune:
cand[i] > remaining. Becausecandis sorted ascending, every later index is also too large,break, notcontinue. - apply:
path.append(cand[i]). - recurse:
dfs(i, remaining - cand[i]), notei, noti + 1. - undo:
path.pop().
That i versus i + 1 is the entire "reuse allowed" twist. In subsets we passed i + 1 to prevent revisiting; here we pass i because revisiting is exactly what we want. The monotone start still prevents the same multiset from being emitted in two different orders.
The key insight
Why pass i and not i + 1? Because we want [2, 2, 3] but not [2, 3, 2]. The start index forbids ever going backwards, so once we move from 2 to 3, we cannot return to 2. That gives every multiset a single canonical non-decreasing representation, which is exactly one path in the DFS. No duplicates, no post-sort, lex order for free.
Why prune with break and not continue? Because the candidates are sorted ascending. If cand[i] > remaining, every cand[j] for j > i is also > remaining. Skipping with continue would still be correct but wasteful; break exits the loop immediately. On the stress-large test (cand = [2..8], target = 30) that single character difference is the gap between AC and TLE.
Step-by-step approach
- Sort
candascending. - Define
dfs(start, remaining)carrying a globalpath. - Base: if
remaining == 0, appendlist(path)to the output and return. - Loop
ifromstarttok - 1. Ifcand[i] > remaining,break. Else pushcand[i], calldfs(i, remaining - cand[i]), pop. - Call
dfs(0, target). Print the count, then every combination.
Worked example
cand = [2, 3, 6, 7], target = 7
Sorted: [2, 3, 6, 7]. DFS unfolds:
- start=0, rem=7, path=[]. Try
i=0(val 2): push, recursestart=0, rem=5.- Try
i=0(val 2): push, recursestart=0, rem=3.- Try
i=0(val 2): push, recursestart=0, rem=1. Loop:cand[0]=2 > 1, break. Backtrack, pop. - Try
i=1(val 3): push, recursestart=1, rem=0. Record[2, 2, 3]. Pop. - Try
i=2(val 6):6 > 3, break.
- Try
- Pop. Try
i=1(val 3): push, recursestart=1, rem=2. Loop:cand[1]=3 > 2, break. Pop. - Try
i=2(val 6):6 > 5, break.
- Try
- Pop. Try
i=1(val 3): push, recursestart=1, rem=4. Tryi=1(val 3): push, recurserem=1, break. Pop. Tryi=2(val 6):6 > 4, break. Pop. - Try
i=2(val 6): push, recurserem=1, break. Pop. - Try
i=3(val 7): push, recurserem=0. Record[7]. Pop.
Output: 2 followed by 2 2 3 and 7. Matches the spec.
Complexity
The branching factor is bounded by k and the depth by target / min(cand). In the worst case the recursion explores O(k^(target / min(cand))) paths, with the prune dramatically reducing the constant. For the problem's bounds (k <= 9, target <= 30) the search is small enough to be near-instant.
- Time: O(k^d) in the worst case where
d = target / min(cand); in practice the prune makes it far smaller. - Space: O(d) for the recursion stack plus the output.
Common pitfalls
- Passing
i + 1instead ofi: kills the reuse semantics; you getsubsetsinstead. - Forgetting to sort: the prune relies on monotone candidates. Without sorting, you cannot
breakoncand[i] > remaining, you would have tocontinueand lose the optimisation. - Using
continueinstead ofbreak: correct but slow. The stress test exposes this. - Recording without copy: same
out.append(list(path))rule as every backtracking problem. - Off-by-one on
remaining: if you allowremaining < 0paths and check at the top of the recursion, you do extra work. Better: check at the choice point (cand[i] > remaining).
Where this shows up in the enterprise
- Razorpay / Stripe / PayPal: every way to assemble a payout from a fixed set of payment-instrument denominations.
- Splitwise / Pleo: every way to break an expense into a fixed list of allowable per-person buckets.
- Snowflake billing: every combination of credit packs that totals a quoted price.
- Pharma compounding: every way to mix doses from a fixed list of vials to reach a prescribed concentration.
- Compiler autotuning: every combination of repeated optimisation flags whose total cost-budget matches the configured ceiling.
- 3D printing: every multiset of filament cartridges whose total length matches a print job.
- Recipe optimisation: every multiset of ingredient packs whose total weight matches a target (subscription-meal companies use this for cart packing).
When the candidate set grows large or the target balloons, this morphs into a DP problem (coin-change, coin-change-ii) because counting solutions becomes more useful than enumerating them.
Variants you'll see elsewhere
combination-sum-ii: candidates may repeat, but each candidate can only be used once per combination. Passi + 1and add a skip-duplicate rule.combination-sum-iii: candidates are fixed (1..9), use each at most once, target a specific count and sum.combination-sum-iv: count the number of compositions where order matters. Drop backtracking, switch to DP, the result counts permutations, not multisets.coin-change: minimum coin count to maketarget. Same candidate idea, but optimisation not enumeration, DP wins.coin-change-ii: count multisets (no order). DP withstart-index style outer loop.
All five are the same combinatorial shape with different counting questions. Recognising which one the spec is asking is half the battle.
How parikshan turns this into a learning loop
The dynamic private tests vary candidate sets and targets so that the break versus continue distinction shows up: a continue-only solution passes the small public tests and times out on the stress private. The graduated hints (hint 3 explicitly names the break prune at a 15-point integrity cost) are tuned to nudge without giving away the trick. The dispute flow exists for the edge case where you believe the judge mis-orders a tied combination, though in practice the spec's lex order is well-defined and the judge is correct. Run the problem in practice mode, take hints only when stuck for genuinely five minutes, then redo in exam mode at full speed.
In the AI-integrated workspace
Agents are exceptionally good at combination-sum because it is a textbook problem. They are exceptionally bad at three things on it:
- Choosing
breakversuscontinue. Generated code defaults tocontinuebecause it is the "safer" looking option. On the stress test that decision is a TLE. - Recursing with
iversusi + 1. This is a 50/50 coin flip in generated code if you do not specify in the prompt. The spec is "reuse allowed", and the right call isi. Read the recursion and verify. - Sorting the input. Without it, the prune is wrong and the order is wrong. Generated code sometimes sorts the output afterwards "to satisfy lex order", that is a smell. The right place to sort is once, up front.
The senior-engineer move is to name the pattern ("combination-sum with reuse, sort and break") and then read three lines of the agent's code: the sort, the loop's break, and the recursion's i. If those three are right, the rest is mechanical.