generate-parentheses
Difficulty: Medium · Topic: Stack · Practice it on parikshan: /practice/generate-parentheses/start
The story
A team at JetBrains writing the IntelliJ refactoring engine wants to enumerate all syntactically valid ways to wrap a sequence of n function calls in matched parentheses for a "rewrite to applicative form" suggestion. The IDE shows the user every valid grouping and asks them to pick. They give you n and want every well-formed string of n open and n close parens, in lexicographic order. The count for n = 4 is 14 (the fourth Catalan number); for n = 10 it is 16,796. Manageable for n up to 10.
That is generate-parentheses. It is the canonical "constrained generation" problem, and its solution is a backtracking recursion that uses the stack metaphor implicitly through the call stack and the matching invariant.
What's actually being asked
Given an integer n, output the count C_n of well-formed parentheses strings of n open and n close parens, followed by the strings themselves in lexicographic order.
C_n is the n-th Catalan number: 1, 2, 5, 14, 42, 132, ... for n = 1, 2, 3, 4, 5, 6, ....
The naive approach
Enumerate all 2^(2n) binary strings of length 2n, treat 0 as ( and 1 as ), and filter by validity. O(4^n) work and most candidates are invalid. For n = 10, that is 4^10 = 10^6 and 16,796 valid strings, passable, but the filter discards 98% of work.
The key insight
A partial string is viable (can be extended to a valid one) iff:
- The number of
(used so far,o, satisfieso <= n. - The number of
)used so far,c, satisfiesc <= o.
The second condition is the stack invariant: every prefix must have at least as many opens as closes. Equivalent: at every point you can think of a virtual stack of unmatched opens; the stack never goes negative, and at the end it must be exactly empty (which means o == c == n).
The recursion: from a partial string with o opens and c closes,
- If
o == nandc == n, emit the string. - If
o < n, recurse with one more open:o+1, c. - If
c < o, recurse with one more close:o, c+1.
The two branches together prune all invalid prefixes. The result is exactly C_n strings, in lexicographic order if you emit the open branch before the close branch (because ( is lexicographically less than )).
How the stack maps to the recursion
The "stack of unmatched opens" is implicit:
o - cis the current stack size.o < ncorresponds to "stack can grow" (push).c < ocorresponds to "stack is non-empty, can pop" (pop and match).
You never need a literal stack. The recursion's call stack is the stack of decisions; the invariant o >= c keeps the bracket stack non-negative at every step. This is the cleanest example of "the structure of the problem is the structure of the algorithm" in the stack chapter.
Step-by-step approach
results = [],path = [](a char list, used as the in-progress string).- Define
def gen(o, c):- If
len(path) == 2 * n: append''.join(path)toresults; return. - If
o < n: append'('topath; callgen(o + 1, c); pop frompath. - If
c < o: append')'topath; callgen(o, c + 1); pop frompath.
- If
- Call
gen(0, 0). - Print
len(results). Then print each string on its own line.
Because the open branch fires before the close branch and the recursion is depth-first, the strings emerge in lexicographic order naturally. No sort needed.
Worked example
n = 2
The recursion tree:
gen(0,0) -- path: ""
gen(1,0) -- path: "("
gen(2,0) -- path: "(("
gen(2,1) -- path: "(()"
gen(2,2) EMIT -- "(())"
gen(1,1) -- path: "()"
gen(2,1) -- path: "()("
gen(2,2) EMIT -- "()()"
Output:
2
(())
()()
For n = 3, the five Catalan strings are: ((())), (()()), (())(), ()(()), ()()().
Complexity
- Time: O(4^n / sqrt(n)), the size of the output, since each valid string is one unit of work and the Catalan numbers grow as
~4^n / (n * sqrt(n)). - Space: O(n) for the recursion stack (plus the output, which is part of the answer not the algorithm's working memory).
For n <= 10, this is fast: C_10 = 16,796.
Common pitfalls
- Generating all
2^(2n)strings and filtering: works, but does 60x more work than backtracking onn = 10. Worse for largern. - Using a literal stack in the recursion: redundant; the call stack and the
o, ccounters already encode the stack's state. - Emitting strings out of lexicographic order: if you recurse on
')'before'(', the order is wrong. Always recurse on'('first. - Mutating a shared list without restoring on return: classic backtracking bug. Always pop after the recursive call.
- Forgetting the
path.append/path.poppairing: leaves stale state in subsequent siblings.
Where this shows up in the enterprise
- IDE refactoring suggestions (IntelliJ, VS Code, Eclipse): enumerating valid groupings during a "rewrite to ..." refactor.
- Compiler test-case generation: building syntactically valid expressions of bounded size to stress-test parsers.
- Property-based testing (Hypothesis, QuickCheck): shrinking generators for grouped expressions.
- SAT solver test harnesses: generating valid CNF substructures.
- DNA / RNA secondary-structure prediction: matching base pairs map to valid bracket strings; enumerating candidate folds.
- Chemistry tools (RDKit): SMILES branch enumeration uses similar stack-balance generation.
- Grammars for fuzzing (LLVM libFuzzer): structured input generation that respects nesting.
The generalisation is constrained tree-enumeration with a balance invariant. Wherever it shows up, backtracking with a "running balance" counter is the right tool.
Variants you'll see elsewhere
letter-combinations-phone: pure enumeration without a balance constraint.subsets,permutations,combinations: same backtracking skeleton with different prune rules.restore-ip-addresses: backtracking with a length-and-value constraint.palindrome-partitioning: same skeleton, predicate is "is this prefix a palindrome?".n-queens: backtracking with multiple parallel constraints (column, diagonals).
How parikshan turns this into a learning loop
The dynamic private tests on parikshan check:
n = 1(one string:()).n = 2(two strings; ordering matters).n = 10(the upper limit; tests that you do not over-generate and time out).- Stress on counting: the first line of output must be
C_n, and miscounting is a common bug. - Strict lexicographic ordering: any out-of-order string fails the byte-exact check.
The integrity-scored AI sparring partner is the right place, in practice mode, to verbalise why the o >= c invariant suffices before exam mode. Once you can say "any prefix violating that invariant cannot be extended to a valid string", the recursion writes itself.
In the AI-integrated workspace
An agent generating parenthesis-enumeration code might use either backtracking or BFS over a state graph. Both work for small n. Your review:
- Is the algorithm using two counters (open used, close used) or a literal stack? The counters are cleaner; ask the agent to refactor if it built a stack.
- Is the recursion depth-first with open before close, guaranteeing lex order without a sort?
- Are the counters incremented before the recursive call and decremented (or restored via slice) after?
- Is the base case
len(path) == 2 * n, or some incorrect proxy?
In production, code that enumerates a constrained language is rare but high-stakes (fuzzing, code generation, refactoring). The agent's correctness on this kind of code is brittle. A senior reviewer should be able to read the recursion and immediately spot the missing path.pop() or the swapped branch order. parikshan trains that reviewer.
generate-parentheses is the moment the stack metaphor meets backtracking. The stack is the shape of the problem; backtracking is the algorithm that explores valid shapes. Master both here and the rest of the backtracking canon follows.