longest-palindromic-substring
Difficulty: Medium · Topic: 1-D Dynamic Programming · Practice it on parikshan: /practice/longest-palindromic-substring/start
The story
A bioinformatics pipeline at Illumina scans short reads for palindromic motifs, which are biological markers for inverted-repeat regions where DNA folds back on itself. Each read is a string of base characters, and the analyst needs the longest contiguous palindromic substring in each read. The same query runs across millions of reads per sample.
That query is longest-palindromic-substring. Beyond genomics it appears in spell-correction palindrome filters at Google and Microsoft Word, in domain-name palindrome detection at Cloudflare, in compressed-string detection in HuggingFace tokenisers, and in the canonical 'expand around centre' family of string algorithms taught alongside Manacher's algorithm.
The interview punchline: there are two clean solutions. Expand-around-centre is simpler, ties to two pointers, and runs in O(n²) time with O(1) space. A full 2-D DP also exists; both are interview-frequent. This article walks both, then notes Manacher's O(n) algorithm for completeness.
What's actually being asked
Given a string s, return the longest contiguous substring of s that is a palindrome (reads the same forwards and backwards). If multiple substrings tie for the maximum length, return the lexicographically smallest one.
A contiguous substring is a slice s[i..j]; "subsequence" is a different problem.
Examples:
s = "babad"→"aba". Both"aba"and"bab"are length-3 palindromes;"aba"is lex-smaller.s = "cbbd"→"bb".s = "a"→"a". Single character.s = ""→"". Empty.s = "forgeeksskeegfor"→"geeksskeeg".
The naive approach
Enumerate every (i, j), slice s[i:j+1], check equality with its reverse, track the longest. O(n³). At |s| = 1000 that is 10⁹ operations: TLE.
def naive(s):
best = ""
n = len(s)
for i in range(n):
for j in range(i, n):
t = s[i:j+1]
if t == t[::-1] and (len(t) > len(best) or (len(t) == len(best) and t < best)):
best = t
return best
Correct, slow, never ship it.
The key insight
Every palindromic substring has a centre. For odd-length palindromes, the centre is a single character. For even-length palindromes, the centre is the gap between two adjacent characters. So there are exactly 2n - 1 possible centres in a string of length n.
For each centre, expand outward with two pointers l and r while s[l] == s[r] and the indices remain in bounds. The total work is O(n²): each centre might in the worst case expand to half the string.
# Expand-around-centre recurrence (operational definition):
# For each centre c in 0..n-1 (odd length):
# l = r = c
# while l >= 0 and r < n and s[l] == s[r]:
# record (l, r) as palindrome
# l -= 1; r += 1
#
# For each gap-centre c in 0..n-2 (even length):
# l = c; r = c + 1
# while l >= 0 and r < n and s[l] == s[r]:
# record (l, r) as palindrome
# l -= 1; r += 1
#
# Track the longest. Tie-break lexicographically at the end.
This is technically not "1-D DP" in the array sense, but it is the same DP idea collapsed: each centre's expansion reuses the result of the immediately smaller palindrome at the same centre (s[l+1..r-1] is a palindrome iff s[l..r] was, given s[l] == s[r]). The recurrence is implicit in the loop.
The key insight: explicit 2-D DP
State: dp[i][j] is True iff s[i..j] is a palindrome. The recurrence is:
# 2-D DP recurrence:
# dp[i][i] = True (length 1)
# dp[i][i+1] = (s[i] == s[i+1]) (length 2)
# dp[i][j] = (s[i] == s[j]) AND dp[i+1][j-1] (length >= 3)
#
# Track the longest (i, j) with dp[i][j] = True.
The four DP questions:
- State:
dp[i][j]= whethers[i..j]is a palindrome. - Base case:
dp[i][i] = Trueanddp[i][i+1] = (s[i] == s[i+1]). - Transition:
dp[i][j] = (s[i] == s[j]) and dp[i+1][j-1]forj - i >= 2. - Answer: among all
(i, j)withdp[i][j] = True, the longest; lex-tie-break by substring.
This is O(n²) time and O(n²) space. Worse memory than expand-around-centre, same time complexity, but the recurrence is explicit and useful to teach.
Step-by-step approach
def longest_palindrome(s):
n = len(s)
if n == 0:
return ""
max_len = 1
starts = [0]
for i in range(n):
# Odd-length palindromes centred at i
l, r = i, i
while l >= 0 and r < n and s[l] == s[r]:
l -= 1
r += 1
length = r - l - 1
if length > max_len:
max_len = length
starts = [l + 1]
elif length == max_len:
starts.append(l + 1)
# Even-length palindromes centred between i and i+1
l, r = i, i + 1
while l >= 0 and r < n and s[l] == s[r]:
l -= 1
r += 1
length = r - l - 1
if length >= 1:
if length > max_len:
max_len = length
starts = [l + 1]
elif length == max_len:
starts.append(l + 1)
return min(s[st:st + max_len] for st in starts)
O(n²) time, O(1) extra space (the starts list holds at most a handful of positions in practice). At |s| = 1000 that is 10⁶ operations: sub-millisecond.
Worked example
s = "babad"
Centres:
- i=0 ('b'), odd: l=0 r=0 match → l=-1 r=1. Palindrome
s[0:1] = "b", length 1. - i=0, even: l=0 r=1, s[0]='b', s[1]='a', no match. Length 0.
- i=1 ('a'), odd: l=1 r=1 match → l=0 r=2, s[0]='b', s[2]='b' match → l=-1 r=3. Palindrome
s[0:3] = "bab", length 3. - i=1, even: l=1 r=2, s[1]='a', s[2]='b', no match. Length 0.
- i=2 ('b'), odd: l=2 r=2 match → l=1 r=3, s[1]='a', s[3]='a' match → l=0 r=4, s[0]='b', s[4]='d', no match. Palindrome
s[1:4] = "aba", length 3. - i=2, even: l=2 r=3, s[2]='b', s[3]='a', no match. Length 0.
- i=3 ('a'), odd: length 1.
- i=3, even: no match. Length 0.
- i=4 ('d'), odd: length 1.
Length-3 palindromes start at 0 ("bab") and 1 ("aba"). Lex-smaller is "aba". Return "aba". Verified.
Complexity
- Naive O(n³): TLE at
|s| = 1000. - Expand-around-centre: O(n²) time, O(1) extra space. Ship this.
- 2-D DP: O(n²) time, O(n²) space. Useful for teaching, wasteful for production.
- Manacher's algorithm: O(n) time using palindrome-mirror reuse; the gold-standard for
|s| >= 10⁵. Out of scope for this article but worth knowing.
Common pitfalls
- Slicing inside the expand loop:
s[l:r+1] == s[l:r+1][::-1]inside the inner loop is O(n) per check, turning the algorithm into O(n³) and re-introducing the TLE. Compare characters by index; slice only once at the end per candidate. - Forgetting the even-length case: only handling odd-centred palindromes misses
"abba"entirely. Both odd and even centres are required. - Off-by-one on
r - l - 1: after the while loop,landrhave moved one step past the palindrome bounds. The palindrome iss[l+1..r-1], of lengthr - l - 1. Drawing this once and never forgetting is the whole pitfall. - Wrong tie-break: returning any palindrome of max length without lexicographic comparison. The spec is explicit; lex-smallest wins.
- Empty string handling:
s = ""should return"". If your code unconditionally indexess[0], it crashes. Guard the empty case.
Where this shows up in the enterprise
- Illumina, Oxford Nanopore, 23andMe: palindromic-motif detection in DNA reads for inverted-repeat regions, restriction-site mapping, and hairpin-loop structure prediction.
- Spell-check (Google search corrections, Microsoft Word suggestions): palindrome detection feeds heuristic features that influence suggestion ranking.
- Cloudflare domain analytics: detect palindromic vanity domains for traffic-pattern grouping.
- Stripe fraud detection: palindromic patterns in transaction IDs (and other identifiers) are weak signals in fraud-model features.
- HuggingFace tokeniser audits: palindromic substrings in vocabulary entries indicate symmetric merge artefacts.
- Compression and pattern-mining: palindromic substring lengths feed entropy estimators in some compression pipelines.
The pattern is universal whenever structure-preserving symmetry matters: any time the algorithm asks "where is the largest mirror-symmetric block", this is the tool.
Variants you'll see elsewhere
palindromic-substrings: count all palindromic substrings (same expansion, just increment a counter at each successful expansion).longest-palindromic-subsequence: subsequence, not substring. 2-D DP withdp[i][j]= LPS ofs[i..j], recurrencedp[i][j] = dp[i+1][j-1] + 2ifs[i] == s[j]elsemax(dp[i+1][j], dp[i][j-1]).palindrome-partitioning: partitionsinto the fewest substrings each of which is a palindrome. Combines this palindrome check with a 1-D DP on partitions.palindrome-pairs: given a list of words, find pairs whose concatenation is a palindrome.manachers-algorithm: O(n) palindrome radii using mirror reuse.valid-palindrome/palindrome-number: simpler check-only variants.
How parikshan turns this into a learning loop
This problem is the bridge between 1-D DP and 2-D DP. It uses 1-D-style centre expansion but conceptually relies on the 2-D recurrence.
- Read this article and state both recurrences before writing code. The 2-D DP
dp[i][j] = (s[i] == s[j]) AND dp[i+1][j-1]is the conceptual core; the expand-around-centre is the operational shortcut. - Practice mode with no AI: write expand-around-centre. Verify on
"babad" → "aba". The lex tie-break is the easiest place to get bug-tripped; the practice-mode dynamic tests include"aacabdkacaa"and similar tie-rich inputs. - AI training: Explain. Ask the assistant "why does expand-around-centre work?" The good answer mentions the implicit recurrence: a palindrome at radius
rplus matching outer characters becomes a palindrome at radiusr+1. If the assistant hand-waves, distrust its claims about other DP problems. - AI training: Find Bugs. Submit code that slices inside the inner loop. The assistant should flag the O(n³) regression. This is the most common Python-specific bug in this problem family.
- AI training: Optimise. Once correct, ask for Manacher's. The assistant likely produces it; verify the answer matches your O(n²) on
"abacaba"and"forgeeksskeegfor". - Exam mode: redo cold. Bank adversarial tests include
""(empty), single-character, all-same"aaaaa", and length-1000 inputs that force the algorithm to do real work.
In the AI-integrated workspace
This is one of the most common interview problems where AI agents produce code that almost works.
- The slicing bug: an agent writes
if s[l:r+1] == s[l:r+1][::-1]inside the expand loop. Code is correct but O(n³); the sample test passes; the stress test on|s| = 1000TLEs. - The missing even-centre bug: an agent writes only the odd-centre expansion. Code passes
"babad"and"cbbd"; fails on"abba"(returns"a"instead of"abba"). - The off-by-one in length: writes
length = r - l + 1after the while loop instead ofr - l - 1. Returns lengths that are wrong by 2. The bug shows up as palindromes that are too long. - The lex tie-break ignored: the agent returns any palindrome of max length. On
"babad"it might return"bab"instead of"aba". Sample tests sometimes accept both, masking the bug.
The 2027 engineer's discipline:
- Name the algorithm in five seconds: "expand around centre, O(n²), both odd and even centres, track (start, length) not slices". That single phrase steers the agent away from every failure mode above.
- State the off-by-one once: after the while loop, the palindrome bounds are
s[l+1..r-1]of lengthr - l - 1. Repeat this aloud before code generation. - Verify on
"abba". The answer must be"abba"(even-centre catcher). - Verify on
"babad". The answer must be"aba"(lex tie-break catcher). - Verify on
"". The answer must be""(empty-input catcher). - Ask explicitly for character-by-index comparison, not slicing. This single instruction prevents the O(n³) regression that nearly every default agent ships.
A candidate who can articulate "expand around centre, O(n²), watch the even centres and the lex tie-break" within seconds is the engineer who directs AI to a correct, fast, two-dozen-line solution. Without that articulation, the agent ships code that passes the sample inputs and quietly breaks in production on "abba", "", or |s| = 1000.