Blind75 - Group Anagrams (Hash Map & Visited)
Group Anagrams
Given an array of strings, group the anagrams together — words that become identical when their characters are rearranged.
This post covers the journey from basic list manipulation to efficient hash map usage in Python.
First Approach: Bruteforce with visited
My initial idea was to compare every word one by one.
However, using pop() while iterating over a list causes a critical index corruption problem.
To solve this, I introduced a visited list.
Core Principle
- Leave the original list untouched
- Track “has this word already been grouped?” using
True / False.
Key Takeaway
Instead of directly deleting elements from a list,
using a visited array to record state
allows safe control over the entire dataset without index errors.
Revised Code (O(N^2) Version)
from typing import List
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
answer = []
n = len(strs)
visited = [False] * n
for i in range(n):
if visited[i]: # Skip already processed words
continue
trial = [strs[i]]
visited[i] = True
for j in range(i + 1, n):
# Not yet visited and sorted result matches -> anagram!
if not visited[j] and sorted(strs[i]) == sorted(strs[j]):
trial.append(strs[j])
visited[j] = True
answer.append(trial)
return answer
The Downside: Time Complexity
The code above is logically correct, but leaves much to be desired in terms of performance.
Time Complexity
O(N^2 · K log K)
- Outer loop:
N - Inner loop:
N - String sorting:
K log K
With just 10,000 entries,
this requires over 100 million operations.
In real coding tests, this is very likely to result in Time Limit Exceeded (TLE).
Optimization: Using a Hash Map (Dictionary)
Eliminate comparison operations entirely
by using the sorted version of each word as a Key (bucket name).
The Art of Classification
A Hash Map is the optimal structure for instantly classifying data based on a specific “characteristic”.
Pythonic Way
Using defaultdict(list),
a new empty list is automatically created whenever a new Key is encountered,
making the code extremely concise.
Final Optimized Code (O(N · K log K))
from collections import defaultdict
from typing import List
class Solution:
def groupAnagrams(self, strs: List[str]) -> List[List[str]]:
# Use sorted string as Key, original string list as Value
groups = defaultdict(list)
for s in strs:
# "eat", "tea", "ate" -> all sort to "aet" (unique Key)
key = "".join(sorted(s))
groups[key].append(s)
return list(groups.values())
Summary & Reflection
The Discovery of the visited Array
- Extremely useful for tracking state without modifying the list
- A key strategy for preventing index corruption
The Efficiency of Hash Maps
- Hash maps are the most powerful tool for “grouping” problems
- Can improve time complexity from O(N^2) to O(N) level
- A core optimization pattern for real coding tests
Conclusion
Bruteforce → State Management → Hash Map Optimization
Through this process, I went beyond simple implementation
and gained a deep understanding of “the importance of choosing the right data structure”.
Leave a comment