parikshan

generate-parentheses

Difficulty: Medium  ·  Topic: Stack  ·  Practice it on parikshan: /practice/generate-parentheses/start

The story

A team at JetBrains writing the IntelliJ refactoring engine wants to enumerate all syntactically valid ways to wrap a sequence of n function calls in matched parentheses for a "rewrite to applicative form" suggestion. The IDE shows the user every valid grouping and asks them to pick. They give you n and want every well-formed string of n open and n close parens, in lexicographic order. The count for n = 4 is 14 (the fourth Catalan number); for n = 10 it is 16,796. Manageable for n up to 10.

That is generate-parentheses. It is the canonical "constrained generation" problem, and its solution is a backtracking recursion that uses the stack metaphor implicitly through the call stack and the matching invariant.

What's actually being asked

Given an integer n, output the count C_n of well-formed parentheses strings of n open and n close parens, followed by the strings themselves in lexicographic order.

C_n is the n-th Catalan number: 1, 2, 5, 14, 42, 132, ... for n = 1, 2, 3, 4, 5, 6, ....

The naive approach

Enumerate all 2^(2n) binary strings of length 2n, treat 0 as ( and 1 as ), and filter by validity. O(4^n) work and most candidates are invalid. For n = 10, that is 4^10 = 10^6 and 16,796 valid strings, passable, but the filter discards 98% of work.

The key insight

A partial string is viable (can be extended to a valid one) iff:

  • The number of ( used so far, o, satisfies o <= n.
  • The number of ) used so far, c, satisfies c <= o.

The second condition is the stack invariant: every prefix must have at least as many opens as closes. Equivalent: at every point you can think of a virtual stack of unmatched opens; the stack never goes negative, and at the end it must be exactly empty (which means o == c == n).

The recursion: from a partial string with o opens and c closes,

  • If o == n and c == n, emit the string.
  • If o < n, recurse with one more open: o+1, c.
  • If c < o, recurse with one more close: o, c+1.

The two branches together prune all invalid prefixes. The result is exactly C_n strings, in lexicographic order if you emit the open branch before the close branch (because ( is lexicographically less than )).

How the stack maps to the recursion

The "stack of unmatched opens" is implicit:

  • o - c is the current stack size.
  • o < n corresponds to "stack can grow" (push).
  • c < o corresponds to "stack is non-empty, can pop" (pop and match).

You never need a literal stack. The recursion's call stack is the stack of decisions; the invariant o >= c keeps the bracket stack non-negative at every step. This is the cleanest example of "the structure of the problem is the structure of the algorithm" in the stack chapter.

Step-by-step approach

  1. results = [], path = [] (a char list, used as the in-progress string).
  2. Define def gen(o, c):
    • If len(path) == 2 * n: append ''.join(path) to results; return.
    • If o < n: append '(' to path; call gen(o + 1, c); pop from path.
    • If c < o: append ')' to path; call gen(o, c + 1); pop from path.
  3. Call gen(0, 0).
  4. Print len(results). Then print each string on its own line.

Because the open branch fires before the close branch and the recursion is depth-first, the strings emerge in lexicographic order naturally. No sort needed.

Worked example

n = 2

The recursion tree:

gen(0,0)               -- path: ""
  gen(1,0)             -- path: "("
    gen(2,0)           -- path: "(("
      gen(2,1)         -- path: "(()"
        gen(2,2) EMIT  -- "(())"
    gen(1,1)           -- path: "()"
      gen(2,1)         -- path: "()("
        gen(2,2) EMIT  -- "()()"

Output:

2
(())
()()

For n = 3, the five Catalan strings are: ((())), (()()), (())(), ()(()), ()()().

Complexity

  • Time: O(4^n / sqrt(n)), the size of the output, since each valid string is one unit of work and the Catalan numbers grow as ~4^n / (n * sqrt(n)).
  • Space: O(n) for the recursion stack (plus the output, which is part of the answer not the algorithm's working memory).

For n <= 10, this is fast: C_10 = 16,796.

Common pitfalls

  • Generating all 2^(2n) strings and filtering: works, but does 60x more work than backtracking on n = 10. Worse for larger n.
  • Using a literal stack in the recursion: redundant; the call stack and the o, c counters already encode the stack's state.
  • Emitting strings out of lexicographic order: if you recurse on ')' before '(', the order is wrong. Always recurse on '(' first.
  • Mutating a shared list without restoring on return: classic backtracking bug. Always pop after the recursive call.
  • Forgetting the path.append / path.pop pairing: leaves stale state in subsequent siblings.

Where this shows up in the enterprise

  • IDE refactoring suggestions (IntelliJ, VS Code, Eclipse): enumerating valid groupings during a "rewrite to ..." refactor.
  • Compiler test-case generation: building syntactically valid expressions of bounded size to stress-test parsers.
  • Property-based testing (Hypothesis, QuickCheck): shrinking generators for grouped expressions.
  • SAT solver test harnesses: generating valid CNF substructures.
  • DNA / RNA secondary-structure prediction: matching base pairs map to valid bracket strings; enumerating candidate folds.
  • Chemistry tools (RDKit): SMILES branch enumeration uses similar stack-balance generation.
  • Grammars for fuzzing (LLVM libFuzzer): structured input generation that respects nesting.

The generalisation is constrained tree-enumeration with a balance invariant. Wherever it shows up, backtracking with a "running balance" counter is the right tool.

Variants you'll see elsewhere

  • letter-combinations-phone: pure enumeration without a balance constraint.
  • subsets, permutations, combinations: same backtracking skeleton with different prune rules.
  • restore-ip-addresses: backtracking with a length-and-value constraint.
  • palindrome-partitioning: same skeleton, predicate is "is this prefix a palindrome?".
  • n-queens: backtracking with multiple parallel constraints (column, diagonals).

How parikshan turns this into a learning loop

The dynamic private tests on parikshan check:

  • n = 1 (one string: ()).
  • n = 2 (two strings; ordering matters).
  • n = 10 (the upper limit; tests that you do not over-generate and time out).
  • Stress on counting: the first line of output must be C_n, and miscounting is a common bug.
  • Strict lexicographic ordering: any out-of-order string fails the byte-exact check.

The integrity-scored AI sparring partner is the right place, in practice mode, to verbalise why the o >= c invariant suffices before exam mode. Once you can say "any prefix violating that invariant cannot be extended to a valid string", the recursion writes itself.

In the AI-integrated workspace

An agent generating parenthesis-enumeration code might use either backtracking or BFS over a state graph. Both work for small n. Your review:

  1. Is the algorithm using two counters (open used, close used) or a literal stack? The counters are cleaner; ask the agent to refactor if it built a stack.
  2. Is the recursion depth-first with open before close, guaranteeing lex order without a sort?
  3. Are the counters incremented before the recursive call and decremented (or restored via slice) after?
  4. Is the base case len(path) == 2 * n, or some incorrect proxy?

In production, code that enumerates a constrained language is rare but high-stakes (fuzzing, code generation, refactoring). The agent's correctness on this kind of code is brittle. A senior reviewer should be able to read the recursion and immediately spot the missing path.pop() or the swapped branch order. parikshan trains that reviewer.

generate-parentheses is the moment the stack metaphor meets backtracking. The stack is the shape of the problem; backtracking is the algorithm that explores valid shapes. Master both here and the rest of the backtracking canon follows.