Arrays & Hashing
Trade memory for speed.
Almost every interview problem starts life as an array. The first superpower you learn is the hash map: a structure that answers “have I seen this?” and “where?” in constant time. Master it and a whole class of O(n²) brute-force problems collapses to a single pass.
01Why brute force dies at scale
Most array problems have an obvious solution: compare everything against everything.
def has_pair(nums, target):
for i in range(len(nums)):
for j in range(i + 1, len(nums)):
if nums[i] + nums[j] == target:
return True
return False
Two nested loops over n items is O(n²) — for n = 100,000 that's ~10 billion comparisons. Interviewers hand you these constraints on purpose: when you see n <= 10^5, an O(n²) approach is a fail signal, and the question becomes what structure removes the inner loop?
Rule of thumb: ~10⁷–10⁸ simple operations per second is what you can afford. n² at n = 10⁵ is 10¹⁰. Dead.
02The hash map: O(1) memory of the past
A Python dict (hash map) and set (hash set) give you average O(1) insert and lookup. The mental model: instead of re-scanning the past, you remember it.
seen = {} # value -> index
for i, x in enumerate(nums):
if target - x in seen: # O(1) — "did my partner already pass by?"
return [seen[target - x], i]
seen[x] = i
The inner loop became a dictionary lookup. That's the entire trick — and it's worth roughly a third of all easy/medium array interview questions.
How it works underneath: the key is hashed to a bucket index. Equal keys always land in the same bucket, so lookup is one hash + one (short) bucket scan. This is also why keys must be immutable (str, int, tuple — never list).
03Counting: frequency tables
The second hash-map workhorse is the frequency table — count occurrences, then reason about the counts.
counts = {}
for ch in s:
counts[ch] = counts.get(ch, 0) + 1
Anagram checks, majority elements, top-k-frequent, ransom-note problems: all of them are "build counts, then compare/scan counts." Python's collections.Counter does this in one line — fine to use in interviews, but be ready to write the loop yourself.
04Canonical keys: make equivalent things collide
Sometimes the trick isn't remembering values — it's choosing what to use as the key. Design a key so that all "equivalent" items map to the same one:
| Problem | Canonical key |
|---|---|
| Group anagrams | "".join(sorted(word)) |
| Points on same line | slope as reduced fraction |
| Seen board states | tuple(tuple(row) for row in grid) |
groups = {}
for w in words:
groups.setdefault("".join(sorted(w)), []).append(w)
This idea — normalize, then bucket — is the same one behind cache keys and database sharding. It transfers straight into your system design rounds.
05Prefix accumulation: precompute the past
When a problem asks about ranges or everything except me, hash maps step aside for running accumulators:
prefix = [0]
for x in nums:
prefix.append(prefix[-1] + x)
# sum of nums[i..j] == prefix[j+1] - prefix[i], in O(1)
One O(n) precomputation buys unlimited O(1) range queries. The multiplication version of this (prefix × suffix products) solves "product of array except self" with no division. File the principle: precompute aggregates of the past so each position answers in O(1).
✦Prove it — checkpoint quiz
Your input has n up to 10⁵ and your solution is O(n²). Roughly how many operations is that?
Why can't a Python list be a dict key?
Group-anagrams uses sorted(word) as a dict key. What's the underlying principle?
A problem asks for the sum of nums[i..j] for many (i, j) queries. Best preprocessing?
⚒Now forge it
Reading is scaffolding. These drills are the building. Easiest first: