single-number
Difficulty: Easy · Topic: Bit Manipulation · Practice it on parikshan: /practice/single-number/start
The story
A satellite communications team at SpaceX's Starlink is debugging an error-correction layer. Every payload byte is transmitted along with its parity bit, a single extra bit such that the XOR of the byte plus the parity equals zero. When the ground station XORs everything together, the result tells them whether any bit flipped in transit. The same identity (x XOR x = 0) sits at the core of every error-correcting code on every satellite link, every ECC memory module, every QR code, every CD-ROM you have ever held. The byte arrives doubled by the parity; the lone bit that doesn't pair up is the error signature.
The same identity solves the single-number problem in one pass, zero auxiliary memory, no hash map, no sort. Once you see XOR as "pairs cancel", you will see it inside CRC32, inside zobrist hashing, inside one-time pads, inside every protocol that does light error detection without a heavy checksum.
What's actually being asked
You are given a non-empty array of integers. Every element appears exactly twice except for one element, which appears exactly once. Find and return that single element.
The constraints are tight: linear time, constant extra space. No hash map, no sort, no sidecar array.
Examples:
[2, 2, 1]→1.[4, 1, 2, 1, 2]→4.[99]→99(the only element is automatically the unique one).[1, 1, 0, 2, 2]→0(the unique value can be zero).[-1, -1, 7]→7(negatives are fine).
The naive approach
Hash-map count. Walk the array, increment a count per value, return the key with count 1.
def single_naive(nums):
seen = {}
for x in nums:
seen[x] = seen.get(x, 0) + 1
for k, v in seen.items():
if v == 1: return k
Correct, O(n) time, O(n) space. The problem explicitly bans the extra space.
Sort and pair. Sort the array, walk it two at a time, return the first index where the pair breaks.
def single_sort(nums):
nums.sort()
for i in range(0, len(nums) - 1, 2):
if nums[i] != nums[i + 1]:
return nums[i]
return nums[-1]
Correct, O(n log n) time, O(1) space if you sort in place. Better on memory but slower than the optimal solution.
The optimal is O(n) time, O(1) space, and it fits in one line.
The key insight
XOR has three properties that, taken together, solve this problem:
- Self-inverse:
x XOR x = 0for any integerx. Every value that appears twice contributes its own pair, andv XOR v = 0. - Identity element:
x XOR 0 = x. Zero is the neutral element. - Commutative and associative: the order in which you XOR a list of values does not matter. You can reorder the array however you like and the running XOR is the same.
Combine the three. Start an accumulator acc = 0. XOR every element of the array into acc. By commutativity, you can pretend the array was sorted so that all the duplicate pairs sit next to each other. By self-inverse, each pair cancels to 0. By identity, the lone value XORed with 0 is itself. The accumulator ends up holding the unique value.
The "everything cancels" proof
Let the array contain pairs (p_1, p_1), (p_2, p_2), ..., (p_k, p_k) and the unique element u. By commutativity:
XOR(nums) = (p_1 XOR p_1) XOR (p_2 XOR p_2) XOR ... XOR (p_k XOR p_k) XOR u
= 0 XOR 0 XOR ... XOR 0 XOR u
= u
That is the entire proof. No induction, no case analysis. The three identities above are enough.
Step-by-step approach
- Read
nand the arraynums. - Initialise
acc = 0. - For each
xinnums:acc ^= x. - Print
acc.
The whole algorithm is three lines. There is no auxiliary data structure of size dependent on n.
def single_number(nums):
acc = 0
for x in nums:
acc ^= x
return acc
Worked example
nums = [4, 1, 2, 1, 2].
Trace acc after each XOR. Use 4-bit binary for clarity:
| step | x | x binary | acc before | acc after |
|---|---|---|---|---|
| 1 | 4 | 0100 | 0000 | 0100 |
| 2 | 1 | 0001 | 0100 | 0101 |
| 3 | 2 | 0010 | 0101 | 0111 |
| 4 | 1 | 0001 | 0111 | 0110 |
| 5 | 2 | 0010 | 0110 | 0100 |
Return acc = 4. ✓
Watch what happened. After step 3 the accumulator was 0111 (the XOR of 4, 1, 2). After step 4, XOR-ing 1 removed it (the 1 bit cleared). After step 5, XOR-ing 2 removed it (the 2 bit cleared). What survived is precisely the bits of 4, the unique element.
The same trace works if you reorder the array. Try [1, 1, 2, 2, 4]: the first two XORs produce 0, the next two XORs produce 0, and the final XOR with 4 produces 4. Order does not matter because XOR is commutative.
Complexity
- Time: O(n). Single pass through the array.
- Space: O(1). Only the accumulator and a loop variable.
For the constraint n ≤ 10^5, this finishes in microseconds.
This is the theoretically optimal complexity: you must read every element at least once (otherwise the unique one might be the one you skipped), so O(n) is a lower bound on time, and O(1) is the smallest possible extra space.
Common pitfalls
- Initialising
acctonums[0]and starting the loop at index 1. This works but is fragile whenn = 1because the loop is empty. Initialising to0and looping the whole array is cleaner. - Using
|(OR) or&(AND) instead of^(XOR). Neither has the self-inverse property.5 | 5 = 5, not0. Pick the operator deliberately. - Assuming the array is sorted. It need not be; XOR doesn't care about order. Don't waste time sorting.
- Assuming positive integers only. XOR works on the underlying bit pattern, so negatives behave correctly in Python. In Java or C, signed integers XOR identically; the bit-pattern interpretation is what matters.
- Treating the unique value as something special. It might be zero.
[1, 1, 0]returns0. XOR handles this without any special case. - Reaching for a hash map first. The problem explicitly says constant space. If you submit the hash-map solution it may pass on a permissive judge, but in interviews and in real systems the XOR solution is the expected one.
Where this shows up in the enterprise
- Parity bits in ECC RAM (every server-class memory module): one extra bit per byte such that the XOR of the bits is zero. Reads check for the lone flipped bit.
- CRC32 in TCP/IP, Ethernet, ZIP, PNG: a longer XOR-based check, but the underlying primitive is the same identity.
- QR codes and barcodes (every checkout scanner in the world): Reed-Solomon error correction uses XOR-based polynomial arithmetic.
- Satellite links at Starlink and Iridium, deep-space Voyager links at NASA/JPL: heavy error-correcting codes built on XOR identities to recover from bit flips caused by cosmic radiation.
- One-time pad cryptography: ciphertext = plaintext XOR key. Decryption is the same operation, by the self-inverse identity.
- Zobrist hashing in chess engines (Stockfish, Leela Chess Zero): board hash is the XOR of per-piece-per-square random keys; moves update the hash with two XORs.
- Bloom-filter-style fingerprinting at Cloudflare and Akamai: collision detection sometimes uses XOR rolling hashes for streaming inputs.
- Distributed checksum at Amazon S3 and Google Cloud Storage: object-level checksums are combined via XOR-friendly operations for fast incremental updates.
- RAID-5 parity in every data centre: one drive's content is the XOR of all the others; lose any single drive and you reconstruct it with one XOR pass.
The pattern is universal: any time pairs of values combine such that two of the same thing cancel, XOR is the operator. Single-number is the smallest possible problem in that family.
Variants you'll see elsewhere
single-number-ii: every element appears three times except one, which appears once. XOR alone no longer works (three copies ofxXOR tox, not0). Use bitcount mod 3 or a two-state automaton on bits.single-number-iii: every element appears twice except for two unique elements. XOR all to getu1 XOR u2. Find any set bit in the result; that bit differs betweenu1andu2. Partition the array by that bit and XOR each partition separately.missing-number: array of sizencontains a subset of{0, 1, ..., n}missing exactly one. XOR all values with all indices; the missing one survives.find-the-duplicate-number: the dual case. Floyd's cycle detection, or sum-based, or bitcount-based.xor-of-array-elements: appears in segment-tree problems and competitive programming. Same identity, larger scale.
Once XOR cancellation clicks, this entire family becomes one technique with different counting structures.
How parikshan turns this into a learning loop
This is one of two entry points to the bit-manipulation cluster (the other is number-of-1-bits). The integrated loop is built to make these identities reflexive.
- Read this article. Read the "everything cancels" proof twice; the proof is the trick.
- Solve in practice mode with the AI chat off. Before writing code, trace
[4, 1, 2, 1, 2]step by step on paper. Watch each bit cancel. The trace is the proof. - Submit. The bank's dynamic private tests include the unique-is-zero case, a single-element array, negative integers, and a 10^5-element stress test. The unique-is-zero case is the silent assassin; a solution that hard-codes "the unique value is non-zero" will fail here.
- AI training: Explain if you wrote the hash-map version and aren't sure why XOR works. Each request costs a slice of integrity score, so spend it deliberately.
- Move to
single-number-iiorsingle-number-iiiin practice mode. The XOR identity escalates to bitcount-mod-3 (for triples) and two-pass partitioning (for two uniques). Solving them cold confirms you have absorbed the cluster.
Dispute is rare on bit problems; the verdict is unambiguous on counting. The exception: if you believe your O(n) solution should not TLE on the stress test, dispute it.
In the AI-integrated workspace
XOR identities are the cluster where AI agents misclassify the problem most often. The classic failure modes:
- Reaches for a hash map. The agent writes the O(n) time, O(n) space solution because that's what training data shows first. It passes the tests, ignores the "constant space" constraint. Reading the spec before approving the code is the entire fix.
- Sorts the array. The agent writes the O(n log n) sort-and-pair solution. Slower than necessary, but at least constant space.
- Uses
|instead of^. The agent confuses bitwise operators.acc |= xproduces "the OR of all elements", which is rarely useful and never the unique element. - Initialises
acctonums[0]and skips the first element. Off-by-one in the loop bound; usually still correct but breaks onn = 1.
The 2027 engineer's discipline for XOR problems:
- State the identity in English before the agent codes. "
x XOR x = 0andx XOR 0 = x. Every duplicate pair cancels; the unique element survives." - Tell the agent to implement that XOR scan. Not "find the single number"; spell out the identity.
- Verify on
[4, 1, 2, 1, 2]by hand. Always. Five XORs, fits on a sticky note. - Reject hash-map solutions immediately if the spec requires constant space. This is the spec read that separates engineers who direct AI agents from engineers who approve their first guess.
- Stress on edge cases. Single element, unique is zero, negatives, a 10^5-element array with the unique at the start, the middle, and the end.
The discipline turns a probabilistic answer into a verified one. parikshan's bit-manipulation problems are short by design; the algorithms are tiny, but the identities are the point. When you walk into a code review at Cloudflare's edge team, a satellite link at SpaceX, or an ECC validation at SK hynix, you will recognise x XOR x = 0 in five seconds and direct the AI agent toward the right solution.