happy-number

Difficulty: Easy · Topic: Arrays & Hashing · Practice it on parikshan: /practice/happy-number/start

The story

A reliability engineer at AWS Lambda is tuning a deterministic retry-jitter function. Customer code occasionally enters retry storms in which the same numeric state appears over and over, signalling a non-converging loop. The internal probe runs a synthetic iterative transformation on a state value, watches for either convergence to a sentinel (success) or repetition (cycle detected, fail fast). The synthetic test happens to be: replace the value with the sum of squares of its decimal digits. If it eventually reaches 1, the state was healthy. If it enters a cycle that does not contain 1, the state was sick.

That is happy-number. Strip away the metaphor and what you have is the cleanest possible cycle-detection drill. It teaches you two things that recur everywhere in systems engineering: how to detect a cycle in an iterated function, and how to do so without unbounded memory.

What's actually being asked

A positive integer is happy if the iterated map f(x) = sum of squares of decimal digits of x eventually reaches 1. If instead it enters a cycle that does not contain 1, the number is not happy. Given n, return true if happy, otherwise false.

The naive approach

Iterate forever and hope to see 1. That is wrong: for unhappy numbers the iteration never reaches 1 and the function never returns.

A first improvement: iterate up to "some big number of times" and return false if we have not seen 1. That makes the bug a tunable. Bad engineering.

A second naive variant: build a set of every value we have produced. Each iteration, check membership; if 1, return true; if already in the set, return false; otherwise add and continue. O(small) iterations, O(small) memory. This works.

The key insight

The iterated function visits a bounded state space. Any number with d digits maps to a value at most 81 * d. After one iteration, any 10-digit input drops below 810; after another, below 244; from there on, the iteration is bounded above by a small constant. So whatever happens, reaching 1 or hitting a cycle, happens within a few dozen steps.

That gives you two algorithms:

Hash set: store every produced value, detect repetition in O(1) lookup.
Floyd's tortoise-and-hare: maintain a slow and fast pointer through the iteration. If a cycle exists, they meet inside it; if the iteration reaches 1, the fast pointer notices first. Constant extra space.

Both are correct. The set version is shorter to write; the tortoise-and-hare version teaches the cycle-detection idiom you will reuse for linked-list cycles, state-machine loop detection, and game-of-life-style simulators.

Step-by-step approach

def is_happy(n):
    def step(x):
        s = 0
        while x:
            s += (x % 10) ** 2
            x //= 10
        return s
    seen = set()
    while n != 1 and n not in seen:
        seen.add(n)
        n = step(n)
    return n == 1

Floyd's variant:

def is_happy(n):
    def step(x):
        s = 0
        while x:
            s += (x % 10) ** 2
            x //= 10
        return s
    slow = n
    fast = step(n)
    while fast != 1 and slow != fast:
        slow = step(slow)
        fast = step(step(fast))
    return fast == 1

Worked example

n = 19

Trace:

step	value	sum-of-squares trace
0	19	1^2 + 9^2 = 1 + 81 = 82
1	82	8^2 + 2^2 = 64 + 4 = 68
2	68	6^2 + 8^2 = 36 + 64 = 100
3	100	1^2 + 0^2 + 0^2 = 1
4	1	hit sentinel, return `true`

Trace for n = 2:

step	value	next
0	2	4
1	4	16
2	16	37
3	37	58
4	58	89
5	89	145
6	145	42
7	42	20
8	20	4

4 is already in the set. Cycle detected. Return false. The cycle is the eight-element loop {4, 16, 37, 58, 89, 145, 42, 20}, which is the only cycle the digit-square map admits on positive integers under 1000.

Complexity

Time: O(log n) per step (decomposing an integer into digits), times O(1) steps (bounded state space). Overall O(log n) per call.
Space: O(log n) for the hash-set version (bounded by the state-space depth times log of the max value, which is small). O(1) for Floyd's variant.

Naive "iterate up to K" with a magic K was wrong in principle. The bounded-state-space argument turns the algorithm into a constant-time computation in practice.

Common pitfalls

Forgetting termination: a naive while n != 1 loops forever on unhappy numbers. You need either the set or the tortoise-and-hare to terminate.
Computing digits incorrectly: forgetting the integer division x //= 10 after squaring causes an infinite loop on the first iteration. Off-by-one in the digit decomposition is a classic bug.
Using floating-point: square roots, decimal conversions, anything that touches float is wrong. Stick to integer arithmetic.
Reseeding n = 1 checks incorrectly: n = 1 is happy by definition (already at the fixed point). Off-by-one in the initialisation skips this.
Storing the full sequence in a list: works but uses O(steps) memory rather than O(distinct states); makes the algorithm slightly slower for large inputs.
Implementing Floyd's with step(fast) instead of step(step(fast)): that is just two copies of the slow pointer, not a true hare. The cycle is never caught.

Where this shows up in the enterprise

AWS Lambda retry-storm detection: the story above; the canary tests that a state-iterating customer workload eventually converges or repeats.
Linked-list cycle detection (Cloudflare configuration graph): at edge POPs, configuration objects reference each other; a misconfiguration creates a cycle that Floyd's tortoise-and-hare detects in O(1) memory before the dispatcher loops forever.
Stripe Radar rules-engine guard: ensures the rule-rewriting pipeline does not enter a cycle when rules reference each other; same cycle-detection primitive.
Garbage collectors at JetBrains / Oracle JVM: cycle detection in reference graphs is the core of mark-and-sweep + cycle collectors.
Solidity / Ethereum static analysers: contracts must not be able to mutually call into recursive infinite loops; cycle detection over the call graph is the gate.
Compiler optimisation passes at LLVM: detect when a transformation has reached a fixed point so the pass can stop. Same iterate-until-stable structure.
GitHub Actions workflow validation: workflows triggering each other must be acyclic; the validator does a cycle-detection over the workflow graph.
Distributed-system snapshot algorithms: variants of tortoise-and-hare appear in failure-detection of token-passing rings (e.g., Chord, Akka clusters).

The skill the problem teaches, "detect a cycle in an iterated function without unbounded memory", shows up anywhere two parts of a system can reference each other.

Variants you'll see elsewhere

linked-list-cycle: same Floyd's algorithm, applied to .next pointers.
find-the-duplicate-number: array-as-function; Floyd's pinpoints the duplicate in O(1) extra space.
circular-array-loop: cycle detection with sign and length constraints.
cycle-in-directed-graph: DFS with three-colour marking; different algorithm, same goal.
nth-ugly-number: not a cycle problem, but a similar iterate-until-target structure.
power-of-two and power-of-three: numeric iterations that terminate by structural argument; bound-the-state-space reasoning.

How parikshan turns this into a learning loop

Read this article, then click into practice. The editor runs your code against public tests plus per-session dynamic private tests generated fresh each attempt, so memorising a gist fails. Hints are graduated: free first hint, small penalty for the second, larger for the third. The AI training assistant in exam mode offers Explain, Review, Find Bugs, Add Comments, Optimise, each with a small integrity cost so you spend AI deliberately. In practice mode AI chat is free, the right place to debate set vs tortoise-and-hare and to trace the cycle by hand. After the verdict, the dispute flow sends contested results to a human reviewer in one click. Attempt in practice mode first, then redo in exam mode untouched. When you can write both variants in five minutes and explain the bounded-state argument, the pattern is in your fingers.

In the AI-integrated workspace

An AI agent will write the set-based solution by default; it is short and obviously correct. The failure modes that escape review are:

Magic-number iteration cap: agents sometimes write for _ in range(1000) and call it a day. This is wrong in principle; the loop should terminate by structural argument, not by guess. Push back.
Forgetting to terminate on cycle: a while n != 1 without a seen-set or Floyd's pointer is an infinite loop on unhappy numbers. Verify the termination condition.
For variants like find-the-duplicate-number: agents will copy the set version and silently use O(n) memory when the problem demands O(1). Demand Floyd's.
Floating-point sneaks in via math.sqrt or **0.5: never; the algorithm is integer-only.

A senior engineer reads "iterated function, detect convergence or cycle" and immediately maps to Floyd's tortoise-and-hare. The candidate who can do that productively turns cycle-detection into a one-line review comment. The candidate who cannot ships infinite-loop bugs to production. parikshan trains the first kind.