Big-O & Complexity
Count the work before you write the code.
Before any pattern, you need a yardstick. Big-O is how engineers count work without running the code — and how interviewers silently tell you which solution they expect. Learn to read a constraint like n <= 10^5 and know, before writing a line, that nested loops are dead on arrival.
01What Big-O actually measures
Big-O answers one question: as the input grows, how fast does the work grow? Not "how many milliseconds" — machines differ, languages differ. It measures shape.
def find_max(nums):
best = nums[0]
for x in nums: # touches each element once
if x > best:
best = x
return best
Double the list, double the work: that's linear, O(n). Big-O deliberately throws away constants and lower-order terms — O(3n + 7) is just O(n) — because at scale only the dominant term matters. An O(n) pass with a clunky constant will still demolish a sleek O(n²) at n = 100,000.
Principle: Big-O is about growth, not speed. A slow O(n) beats a fast O(n²) once n gets large enough — and interviews always assume n gets large.
02The common classes, in raw op counts
Make these concrete. Here's roughly how many operations each class costs at two sizes:
| Class | n = 1,000 | n = 100,000 | Typical source |
|---|---|---|---|
| O(1) | 1 | 1 | hash lookup, array index |
| O(log n) | ~10 | ~17 | binary search, halving |
| O(n) | 1,000 | 100,000 | one pass |
| O(n log n) | ~10,000 | ~1.7 million | good sorts |
| O(n²) | 1 million | 10 billion | nested loops |
| O(2ⁿ) | astronomical | forget it | trying all subsets |
A typical judge allows roughly 10⁷–10⁸ simple operations per run. Read the table against that budget: O(n log n) at n = 10⁵ is comfortable; O(n²) at the same n is a hundred times over budget.
Spotting the class is mostly pattern-matching: a loop over n is O(n); a loop inside a loop is O(n²); something that halves its problem each step is O(log n); sort-then-scan is O(n log n).
Principle: keep the ops table in your head. Complexity stops being abstract the moment you attach real numbers to it.
03Constraints are the interviewer talking to you
Constraints aren't legal fine print — they're a hint about the intended solution. Work backwards from the ~10⁸ ops budget:
| Constraint | Budget says | Likely intent |
|---|---|---|
n <= 20 | even 2ⁿ ≈ 10⁶ is fine | brute force, backtracking, bitmask |
n <= 500 | n³ ≈ 10⁸ squeaks by | DP with three loops |
n <= 5,000 | n² ≈ 2.5 × 10⁷ is fine | O(n²) is intended — don't overthink |
n <= 10^5 | n² is 10¹⁰ — dead | O(n log n) or O(n): sort, heap, hashing, two pointers |
n <= 10^8 | even O(n) is tight | O(log n) or O(1): math, binary search |
This works in reverse too: if your idea is O(n²) and the constraint is 10⁵, you don't need to run it to know it fails. Say so out loud in the interview — "n is 10⁵ so quadratic won't fly, I need to remove the inner loop" — that single sentence signals more seniority than ten minutes of silent coding.
Principle: read the constraints first. They tell you the target complexity before you've thought of a single approach.
04Space complexity: the other axis
Time gets the spotlight, but interviewers always follow up with "and how much memory?" Same notation, counting extra allocation as input grows:
def find_max(nums): # O(1) space — one variable, any input size
best = nums[0]
...
def with_copy(nums):
seen = set(nums) # O(n) space — grows with the input
...
Conventions worth knowing:
- The input itself doesn't count — only what you allocate on top.
- The output usually doesn't count either (you can't avoid producing it).
- Recursion isn't free: every nested call holds a stack frame, so a recursion of depth n is O(n) space even with zero data structures.
Many problems are really a time-space negotiation: a hash set finds duplicates in O(n) time and O(n) space; sorting first finds them in O(n log n) time and O(1) space. Neither is "right" — naming the trade-off is the point.
Principle: every data structure you allocate is a purchase. Know what you bought and what you paid.
05Amortized cost: expensive sometimes, cheap on average
Some operations are occasionally expensive but cheap averaged over a sequence. Python's list.append is the classic: usually O(1), but when the underlying array fills up, it allocates a bigger one and copies everything — O(n) for that one call.
Why do we still call append O(1)? Because the array roughly doubles each time. Copies of size 1, 2, 4, 8, … n sum to less than 2n — so n appends cost O(n) total, i.e. O(1) amortized each.
You'll lean on this constantly without ceremony:
list.append— O(1) amortized- dict/set insert — O(1) average (hashing) and amortized (resizes)
- a queue built from two stacks — each element moved at most twice, O(1) amortized per op
For interview purposes, one sentence suffices: "append is amortized O(1) — occasional resizes, but the doubling spreads the cost." Knowing the word and the why is enough.
Principle: judge an operation by its cost over the whole sequence, not its worst single call.
✦Prove it — checkpoint quiz
A problem states n <= 5,000. Your clean solution is O(n²). What do you do?
Why does Big-O drop constants — why is O(3n + 7) just O(n)?
A recursive function makes one nested call per element, depth n, and allocates no data structures. Its space complexity is:
Python's list.append is called O(1) amortized. What does that mean?
n <= 20 in the constraints most strongly suggests:
⚒Now forge it
Reading is scaffolding. These drills are the building. Easiest first: