number-of-1-bits

Difficulty: Easy · Topic: Bit Manipulation · Practice it on parikshan: /practice/number-of-1-bits/start

The story

A network operations engineer at Cloudflare is debugging a bloom filter that the WAF uses to fingerprint suspicious request signatures at the edge. Each request's fingerprint is a 64-bit integer derived from a few hash functions, and the filter's false-positive rate depends on the average number of set bits in the bitmap. To compute that average over a billion requests per second across 300 data centres, the production code needs a popcount routine that runs in tens of nanoseconds. The team picks Brian Kernighan's n & (n - 1) because it iterates exactly once per set bit, not once per total bit, and it fits inside a single hot-path register.

The same trick runs inside GitHub team-permissions checks, AWS IAM policy evaluation, PostgreSQL bitmap-heap scans, Redis bitcount, gzip's Huffman builder, and every CRC chip on every network card you have ever touched. Popcount is one of the most-used primitives in computing, and the algorithm fits in three lines.

What's actually being asked

Given an unsigned integer n (think of it as a 32-bit unsigned binary representation), return the number of 1 bits in its binary form. The count is variously called the Hamming weight, the population count, or the popcount.

Examples:

n = 11 → 3. Binary 1011, three set bits.
n = 128 → 1. Binary 10000000, one set bit.
n = 0 → 0. No bits set.
n = 2^32 - 1 = 4294967295 → 32. All bits set.
n = 7 → 3. Binary 111.

The naive approach

Walk the bits one by one. Look at the lowest bit with n & 1, accumulate it, then right-shift n and repeat until n == 0.

def popcount_naive(n):
    count = 0
    while n > 0:
        count += n & 1
        n >>= 1
    return count

Correct, O(b) where b is the bit-width. For a 32-bit integer this is at most 32 iterations, already fast. But there is a strictly better algorithm that runs in O(popcount), which for sparse numbers like 2^31 finishes in one iteration instead of 32.

The key insight

Brian Kernighan's trick: the expression n & (n - 1) clears the lowest set bit of n. Apply it in a loop and count the iterations.

def popcount(n):
    count = 0
    while n:
        n &= n - 1
        count += 1
    return count

The loop runs exactly once per set bit. For n = 0 it runs zero times. For n = 2^31 it runs once. For n = 2^32 - 1 it runs 32 times. Always O(popcount(n)) <= b, which beats the naive walk on every sparse input.

Why does `n & (n - 1)` clear the lowest set bit?

This is the proof you should be able to write on a whiteboard. Let the lowest set bit of n sit at position k. Then n looks like

n      = ...X 1 0 0 ... 0       (k zeros to the right of the 1 at position k)

Subtracting 1 from n borrows through the trailing zeros. The bit at position k becomes 0, and every position below it flips from 0 to 1:

n - 1  = ...X 0 1 1 ... 1       (k ones now to the right of the 0 at position k)

Now AND them bit-by-bit. The high bits ...X match in both and survive. At position k, 1 & 0 = 0. Below position k, every 0 & 1 = 0. Result:

n & (n-1) = ...X 0 0 0 ... 0

The bit at position k is gone, the bits below it are still zero, the bits above it are unchanged. Net effect: exactly one set bit was cleared, and not just any one, the lowest one. Apply it in a loop, count the iterations, you get the popcount.

This proof is the entire reason the trick works and the reason it iterates once per set bit. Internalise it; it is the foundational identity behind Fenwick trees, lowest-bit isolation (n & -n), and several other classic bit primitives.

Step-by-step approach

Read n from input.
Initialise count = 0.
While n > 0:
- n = n & (n - 1) (clear the lowest set bit).
- count += 1 (record that one more set bit existed).
Print count.

That is the full reference implementation. As a sanity-check, Python's built-in bin(n).count('1') produces the same answer in idiomatic one-liner form, and on Python 3.10+ int.bit_count() is a native function. In production at Cloudflare or PostgreSQL, the CPU's POPCNT instruction does it in a single cycle. Knowing the loop matters for languages where the intrinsic is unavailable and for understanding what the intrinsic does.

Worked example

n = 11. Binary 1011.

Iteration 1: n = 1011, n - 1 = 1010, n & (n - 1) = 1010. count = 1.
Iteration 2: n = 1010, n - 1 = 1001, n & (n - 1) = 1000. count = 2.
Iteration 3: n = 1000, n - 1 = 0111, n & (n - 1) = 0000. count = 3.
Loop exits. Return 3. ✓

Notice how each step strips off exactly the lowest 1 bit and leaves the rest untouched. The trace is the proof you can give to the agent that generated your code.

Sparse case: n = 2147483648 (which is 2^31, binary 1000...0 with 31 zeros).

Iteration 1: n - 1 = 0111...1. n & (n - 1) = 0. count = 1.
Loop exits. Return 1. ✓

The naive bit-walk would have taken 32 iterations. Kernighan finishes in one.

Complexity

Time: O(popcount(n)) for Kernighan; O(b) for the naive walk, where b is the bit-width.
Space: O(1) for both. Only the accumulator.

For 32-bit inputs both finish in at most 32 iterations. Kernighan wins on average across realistic distributions because real-world integers are sparse; a typical IAM permission mask might have 3 set bits, not 32.

Common pitfalls

Forgetting that n == 0 runs the loop zero times. Correct by construction in the Kernighan form. Several wrong "fixes" add an extra count = 1 initialiser that breaks the n = 0 case.
Signed-vs-unsigned right shift. Not an issue in Python (integers are arbitrary-precision and the bit-pattern matches the mathematical value). In Java use >>> (logical shift), not >> (arithmetic shift), or you will count phantom 1s in the sign bit. In C, shifting a signed negative is implementation-defined.
Overflow in n - 1 when n = 0. In unsigned arithmetic on bounded types, 0 - 1 wraps to all-ones, which then ANDs with 0 to give 0. The while (n) guard prevents the loop from executing, so it never matters.
Using bin(n).count('1') in a hot loop. Allocates a Python string of length up to 34 per call. Fine for one-off calls, slow for tight inner loops. Prefer int.bit_count() on Python 3.10+ or the Kernighan form on older versions.
Forgetting that some languages name it differently. Java has Integer.bitCount, C++ has __builtin_popcount, Rust has u32::count_ones, x86 has POPCNT. Recognise the name in code review.

Where this shows up in the enterprise

AWS IAM and Discord role masks: each permission is a bit; the user's permission set is one integer. Popcount answers "how many permissions does this user have?"
Linux file permissions and systemd capability bits: rwx packed into 9 bits; popcount counts granted rights.
GitHub team and repository permission bitmaps: aggregated permissions across teams use bitwise OR and popcount for reporting.
Cloudflare and Akamai bloom filters at the edge: filter quality depends on the popcount of the bitmap; production health metrics include the running popcount.
gzip and brotli compression: every codec is a bit twiddler. Popcount is a constant inside Huffman code-length validation and CRC-32 lookup-table construction.
PostgreSQL bitmap heap scans: when multiple WHERE clauses are AND-ed via bitmaps, the planner uses popcount to estimate the result-set size.
CRC and ECC chips on every Ethernet card: hardware popcount is part of the syndrome calculation; the standard is named Hamming weight for exactly this reason.
Chess and Go engines (Stockfish, KataGo): bitboards represent the entire board in 64 bits; popcount answers "how many pawns does white have?" in one CPU instruction.
Bioinformatics k-mer counting at Illumina and 10x Genomics: 2-bit-per-base sequence representations use popcount to compute Hamming distance between reads at chip speed.
Cryptographic hash side-channels: detecting biased outputs by checking that popcount of the digest is near the expected mean.

The pattern is universal: any time data is packed as bits and you need to know how many flags are set, you want popcount, and the Kernighan trick is the algorithm.

Variants you'll see elsewhere

counting-bits: popcount of every integer from 0 to n. DP recurrence: dp[i] = dp[i & (i-1)] + 1 reuses Kernighan inside.
hamming-distance: popcount of x ^ y. Two operations: one XOR, one popcount.
single-number: XOR every value; the survivor is the unique element. The same family of bit identities.
reverse-bits: also bit-twiddling but with a different primitive (mask-and-shift or table lookup).
power-of-two: n > 0 and (n & (n - 1)) == 0. Same primitive: a power of two has exactly one set bit.
fenwick-tree: uses i & -i to isolate the lowest set bit. Cousin identity to Kernighan, same proof structure.

Once you have n & (n - 1) and n & -n in your hands, every bit-manipulation primitive becomes a one-line application.

How parikshan turns this into a learning loop

This is one of two entry points to the bit-manipulation cluster (the other is single-number). The integrated loop is built to make these primitives muscle memory.

Read this article. Read the proof of n & (n - 1) twice; the proof is the trick.
Solve in practice mode with the AI chat off. Before writing code, trace n = 11 step by step on paper. Watch the lowest 1 disappear each iteration. The trace is the algorithm.
Submit. The bank's dynamic private tests include n = 0, n = 1, n = 2^31 (sparse), n = 2^32 - 1 (dense), and the alternating-bit pattern 0xAAAAAAAA. The sparse and dense extremes catch off-by-one and signed-shift bugs.
AI training: Optimise if you wrote the naive bit walk. The assistant should suggest Kernighan or int.bit_count(). Each request costs a slice of integrity score, so spend deliberately.
Move to single-number in exam mode. Same cluster, different identity (XOR cancellation). Solving it cold confirms you have absorbed the cluster, not memorised one trick.

Dispute is rare on bit problems; verdicts are unambiguous. The exception: if you believe your O(popcount) solution should not TLE, dispute it. A human reviewer can re-examine the limit on the bank's reference.

In the AI-integrated workspace

Bit-twiddling is the place where AI agents produce the most elegant or the most broken code, with no middle ground. Verifying it requires reading every operator.

The classic failure modes for popcount:

Arithmetic right shift instead of logical. The agent writes >> in Java and silently produces 1s in the sign-bit position. In Python this doesn't happen because integers are unbounded, but a port to Java or C breaks the test.
Forgetting n = 0. The agent writes do { ... } while (n) which executes once even when n == 0. Returns 1 instead of 0.
Naive bit walk presented as "popcount". Works but is strictly worse than Kernighan. In a 1-billion-row pipeline at Cloudflare, the constant factor matters.
Lookup-table optimisation without justifying memory. The agent generates a 256-byte lookup table for byte popcount; great for hot loops, overkill for one-shot use.

The 2027 engineer's discipline for bit primitives:

State the identity in English before the agent codes. "n & (n - 1) clears the lowest set bit. Loop and count until n == 0."
Tell the agent to implement Kernighan. Not "popcount"; spell out the identity.
Verify on n = 11 and n = 0 by hand. Always. The proof is short enough to do mentally.
Check the shift operator in the target language. >> in Python and Java means different things on negative inputs.
Ask whether the platform has an intrinsic (int.bit_count, Integer.bitCount, __builtin_popcount, POPCNT) and prefer it for production.

The discipline turns a probabilistic answer into a deterministic, verified one. parikshan's bit-manipulation problems are short by design; the algorithms are tiny, but the proofs are the point. When you walk into a code review at Cloudflare's edge team or PostgreSQL's storage team, you will direct the AI agent rather than approve its first bit-twiddle.