parikshan

valid-palindrome

Difficulty: Easy  ·  Topic: Two Pointers  ·  Practice it on parikshan: /practice/valid-palindrome/start

The story

Stripe's risk engine receives merchant descriptors that need a sanity check before they hit the customer's statement. One specific rule the compliance team flagged: certain refund identifiers, after stripping punctuation and case, should be palindromic, because that is the fingerprint produced by a partner gateway in Manila. A non-palindromic identifier means the gateway re-encoded the descriptor, which means the chain of custody is broken. So an engineer is asked: given the raw descriptor, return true if its cleaned alphanumeric form reads the same forwards and backwards.

That is valid-palindrome. Once you see the pattern, you start seeing it in every "is this canonical form symmetric?" check across the industry: DNA palindromic sites in a Twist Bioscience pipeline, balanced ticker codes in a Bloomberg feed, mirror-symmetric file hashes in a deduplication store.

What's actually being asked

Given a string s, drop every character that is not in [A-Za-z0-9], lowercase the rest, and decide if the resulting sequence equals its own reverse. Print true or false. The empty cleaned sequence counts as a palindrome.

The naive approach

Build a cleaned copy, then compare it to its reverse. In Python that is two lines:

cleaned = [c.lower() for c in s if c.isalnum()]
print('true' if cleaned == cleaned[::-1] else 'false')

This is O(n) time and O(n) memory. It works, the judge accepts it, and you should not feel bad about writing it. The "naive" part is only the memory. For 200,000 characters that is fine. For a streaming sensor pushing gigabytes through the same logic in a Kafka consumer at Confluent, the extra allocation is the difference between staying inside the JVM heap and triggering a GC pause that breaks SLO.

The key insight

A palindrome is symmetric. Two pointers, one at each end, walking inward, can verify symmetry without allocating anything. The skip and compare logic is the converging-pointers flavour of two pointers: at every step you ask "do these two ends match?" and shrink the window by one on each side.

The only new wrinkle versus reverse-string is that you have to skip over non-alphanumeric characters as you go. The pointer that hit a punctuation mark moves on its own until it finds a real character; only then does the comparison happen.

Step-by-step approach

  1. Initialise l = 0, r = len(s) - 1.
  2. While l < r:
    • While l < r and not s[l].isalnum(), increment l.
    • While l < r and not s[r].isalnum(), decrement r.
    • If s[l].lower() != s[r].lower(), print false and return.
    • Otherwise increment l and decrement r.
  3. If the loop exits cleanly, print true.

Worked example

s = "A man, a plan, a canal: Panama"
steplrs[l]s[r]action
1029'A''a'match, both inward
2128' ''m'skip space, advance l
3228'm''m'match, both inward
4327'a''a'match, both inward
...............(continues mirroring)

Every comparison matches, the pointers cross, output is true. Now try "race a car": at some point l points to 'e' and r points to 'c', mismatch, output is false.

Complexity

  • Time: O(n). Each pointer moves at most n steps in total.
  • Space: O(1) for the two-pointer version, O(n) for the filter-then-compare version.

The two-pointer version is the right answer in a memory-bound context. The filter-then-compare version is the right answer when readability beats the constant-factor memory win.

Common pitfalls

  • Forgetting digits. isalnum covers letters and digits. If you hand-roll a check with just a-z and A-Z, the input "0P" will be cleaned to "0p" instead of being rejected, and you will return the wrong answer.
  • Comparing case-sensitively. 'A' != 'a' so you must lowercase before the comparison. Lowercasing both sides separately is fine.
  • Forgetting the l < r guard inside the skip loops. If the entire string is punctuation, the inner while runs forever without that bound.
  • Building the cleaned string in one language and using its raw indexing semantics. Unicode strings in Go and Rust are not random-access by byte for non-ASCII characters. Test inputs here are ASCII only, but the lesson generalises.

Where this shows up in the enterprise

  • Stripe / Adyen merchant descriptor checks: canonicalise the descriptor (strip punctuation, lowercase) and verify a structural property before storing.
  • Twist Bioscience / Illumina: DNA reverse-complement palindromes mark restriction enzyme sites; the same skip-and-compare pattern detects them on raw sequence strings.
  • Bloomberg ticker normalisation: rolling through symbol identifiers to detect mirrored ETF/ETN pairs.
  • Datadog log dedupe: hash the canonical form of a log line (lowercased, stripped) and test for self-symmetry to flag synthetic test traffic that uses palindromic markers.
  • Google Maps Places API: a category of automated abuse uses palindromic place names; the abuse classifier runs the same check on normalised names.
  • Cloudflare bot management: certain bot families emit palindromic user-agent fragments, and the rule engine canonicalises before testing.
  • GitHub repository name moderation: the trust and safety pipeline normalises a repo name and checks structural properties before allowing it.

The shape is identical everywhere: normalise to a canonical form, then check a symmetry property in linear time with O(1) memory.

Variants you'll see elsewhere

  • reverse-string is the same converging pointers without the skip logic.
  • valid-palindrome-ii lets you delete at most one character. The two-pointer solution remains, but on a mismatch you try two sub-checks.
  • longest-palindromic-substring flips the question: instead of yes/no on the whole string, find the longest palindromic window. Expand-around-centre is the standard approach.
  • palindromic-substrings counts them all using the same centre-expansion idea.
  • valid-anagram is the cousin that asks about multiset equality, not symmetry.

How parikshan turns this into a learning loop

After you read this article and click through to the practice link, the parikshan flow integrates four layers:

  1. The editor runs your code against public tests for instant feedback, and against per-session dynamic private tests (generated freshly per session) so memorising a known input is useless. Common synthetic cases here include strings of pure punctuation, single characters, and mixed-case unicode lookalikes.
  2. Hints are graduated. The first hint nudges you toward the two sub-problems (filter, compare). The third gives you the full converging-pointer skeleton. Each level docks integrity points; the score is parikshan's measure of how independently you reached the answer.
  3. The AI training assistant is a sparring partner in exam mode: Explain, Review, Find Bugs, Add Comments, Optimise. Each call costs integrity, so you pick carefully. In practice mode the AI chat is free.
  4. The dispute flow runs after the verdict. If you genuinely believe the judge is wrong, one click escalates to a human reviewer.

Read the concept here, attempt the problem in practice mode first, accept the integrity hit only when you truly need a hint, then redo the problem in exam mode with no help. When you can write the two-pointer version in under three minutes with no AI calls, the pattern is in your fingers.

In the AI-integrated workspace

In your future job, you will not hand-roll palindrome checks. An agent will. Your job is to:

  1. Recognise that the problem is "canonicalise then check symmetry". That re-framing is what lets you trust the answer in a Stripe-style compliance setting.
  2. Tell the agent precisely: "two pointers, skip non-alphanumeric on each side, lowercase compare". Not "make it a palindrome check", that hides the corner cases.
  3. Read the agent's code and verify three things: the skip loops are bounded by l < r, the comparison is case-insensitive, and digits are included in the alphanumeric set.
  4. Push the edge cases the agent will silently miss: empty string, all-punctuation, the "0P" digit-versus-letter trap.

The skill that compounds is recognising symmetry-on-canonical-form as a pattern. Candidates who can name it within five seconds are the ones who turn AI agents into leverage. Candidates who can't end up approving regex monstrosities because the sample test passed. parikshan is built to train the first kind.