Intervals
Sort once, sweep once — chaos becomes a timeline.
Calendars, server bookings, CPU reservations, ad slots — intervals are how software talks about time. Raw interval problems look combinatorial (compare every pair?), but nearly all of them fall to one master move: sort by one endpoint, then sweep left to right, carrying just enough state to make each decision locally. Get the overlap condition exactly right, learn the three sweep variants, and this whole topic becomes mechanical.
01The overlap condition, derived — not memorized
Two intervals a = [a.start, a.end] and b = [b.start, b.end]. When do they overlap?
Don't enumerate the four overlap pictures (partial left, partial right, contained, containing) — you'll fumble one under pressure. Instead, ask the opposite question: when do they miss each other? Only two ways:
aends beforebbegins:a.end < b.startbends beforeabegins:b.end < a.start
Negate the miss:
overlap = a_start <= b_end and b_start <= a_end
Two comparisons cover every overlap shape, including full containment. The boundary detail interviewers probe: do [1, 3] and [3, 5] overlap? With <= they touch at a point — merging usually says yes (they fuse into [1, 5]), while meeting rooms usually say no (a meeting ending at 3 frees the room for one starting at 3). State your boundary convention out loud before coding. Half the wrong answers in this topic are off-by-one boundary disputes.
02Sort-then-sweep: the master move
Unsorted intervals force you to compare every pair — O(n²). Sorting by start time buys you a powerful local guarantee:
After sorting, an interval can only overlap its neighbors in the sweep. If interval
idoesn't reach intervali+1, it can't reach anything after it either.
That turns a global tangle into a single left-to-right pass with O(1) state. The canonical form — merging:
def merge(intervals):
intervals.sort() # by start
out = []
for s, e in intervals:
if out and s <= out[-1][1]: # overlaps the last merged block?
out[-1][1] = max(out[-1][1], e) # extend (max handles containment)
else:
out.append([s, e])
return out
The max is the line people botch: [1, 10] followed by [2, 3] must not shrink the block to end at 3. Total cost O(n log n) — the sort dominates, the sweep is O(n).
03Three sweeps, one skeleton
Almost every interval interview question is one of three sweep variants. Same sort, different bookkeeping:
| Pattern | Question | State carried | Decision per interval |
|---|---|---|---|
| Merge | "Combine overlapping ranges" | last merged block | extend or start new block |
| Erase / keep | "Min removals so none overlap" | end of last kept interval | conflict ⇒ drop the one ending later |
| Count rooms | "Max simultaneous overlap" | heap of active end times | reuse a freed room or open one |
The erase variant hides a greedy proof worth saying aloud: when two intervals conflict, always keep the one that ends earlier — it leaves the most space for the future, so it can never be a worse choice. (Sorting by end time makes this automatic.)
The rooms variant is where intervals meet last lesson's heaps: sweep by start time, keep a min-heap of end times of in-flight meetings. Before seating a meeting, pop every end time <= its start (those rooms are free). The peak heap size is your answer — and the same sweep is literally how autoscalers size server pools against booked load.
04Reading the problem: which endpoint to sort by
A pre-coding checklist that prevents most interval bugs:
- Closed or half-open? Decide whether
[1, 3]and[3, 5]touch, and which way the problem wants it. Look at the examples — they always tell you. - Sort by start or end? Merging and counting sweep by start (you process events as they begin). Greedy keep/erase sorts by end (earliest finisher is the safest keep).
- What's my carried state? One block? One end time? A heap? If you can't name it in one phrase, you don't have the invariant yet.
- Empty input and single interval — the classic forgotten edges.
And one structural cousin to file away: if queries arrive offline (all known up front), you can sort the queries too and sweep both lists together, feeding a heap — turning "for each query, scan all intervals" O(n·q) into O((n+q) log n). That trick closes out this topic's hard problem.
✦Prove it — checkpoint quiz
Which single condition tests whether [a1, a2] and [b1, b2] overlap (closed intervals)?
While merging sorted intervals, the current interval overlaps the last merged block. What's the correct update to the block's end?
Two intervals conflict and you may erase one. Which do you keep, and why?
Minimum meeting rooms via sweep + min-heap of end times: what is the answer when the sweep finishes?
⚒Now forge it
Reading is scaffolding. These drills are the building. Easiest first: