week 1 · lesson

Arrays & Hashing

Trade memory for speed.

Almost every interview problem starts life as an array. The first superpower you learn is the hash map: a structure that answers “have I seen this?” and “where?” in constant time. Master it and a whole class of O(n²) brute-force problems collapses to a single pass.

01Why brute force dies at scale

Most array problems have an obvious solution: compare everything against everything.

def has_pair(nums, target):
    for i in range(len(nums)):
        for j in range(i + 1, len(nums)):
            if nums[i] + nums[j] == target:
                return True
    return False

Two nested loops over n items is O(n²) — for n = 100,000 that's ~10 billion comparisons. Interviewers hand you these constraints on purpose: when you see n <= 10^5, an O(n²) approach is a fail signal, and the question becomes what structure removes the inner loop?

Rule of thumb: ~10⁷–10⁸ simple operations per second is what you can afford. n² at n = 10⁵ is 10¹⁰. Dead.

02The hash map: O(1) memory of the past

A Python dict (hash map) and set (hash set) give you average O(1) insert and lookup. The mental model: instead of re-scanning the past, you remember it.

seen = {}          # value -> index
for i, x in enumerate(nums):
    if target - x in seen:   # O(1) — "did my partner already pass by?"
        return [seen[target - x], i]
    seen[x] = i

The inner loop became a dictionary lookup. That's the entire trick — and it's worth roughly a third of all easy/medium array interview questions.

How it works underneath: the key is hashed to a bucket index. Equal keys always land in the same bucket, so lookup is one hash + one (short) bucket scan. This is also why keys must be immutable (str, int, tuple — never list).

03Counting: frequency tables

The second hash-map workhorse is the frequency table — count occurrences, then reason about the counts.

counts = {}
for ch in s:
    counts[ch] = counts.get(ch, 0) + 1

Anagram checks, majority elements, top-k-frequent, ransom-note problems: all of them are "build counts, then compare/scan counts." Python's collections.Counter does this in one line — fine to use in interviews, but be ready to write the loop yourself.

04Canonical keys: make equivalent things collide

Sometimes the trick isn't remembering values — it's choosing what to use as the key. Design a key so that all "equivalent" items map to the same one:

Problem	Canonical key
Group anagrams	`"".join(sorted(word))`
Points on same line	slope as reduced fraction
Seen board states	`tuple(tuple(row) for row in grid)`

groups = {}
for w in words:
    groups.setdefault("".join(sorted(w)), []).append(w)

This idea — normalize, then bucket — is the same one behind cache keys and database sharding. It transfers straight into your system design rounds.

05Prefix accumulation: precompute the past

When a problem asks about ranges or everything except me, hash maps step aside for running accumulators:

prefix = [0]
for x in nums:
    prefix.append(prefix[-1] + x)
# sum of nums[i..j] == prefix[j+1] - prefix[i], in O(1)

One O(n) precomputation buys unlimited O(1) range queries. The multiplication version of this (prefix × suffix products) solves "product of array except self" with no division. File the principle: precompute aggregates of the past so each position answers in O(1).

✦Prove it — checkpoint quiz

Your input has n up to 10⁵ and your solution is O(n²). Roughly how many operations is that?

Why can't a Python list be a dict key?

Group-anagrams uses sorted(word) as a dict key. What's the underlying principle?

A problem asks for the sum of nums[i..j] for many (i, j) queries. Best preprocessing?

⚒Now forge it

Reading is scaffolding. These drills are the building. Easiest first:

·Pair With Targeteasy ·Spot the Repeateasy ·Same Letters, Different Ordereasy ·Anagram Cliquesmedium ·Everyone Else's Productmedium