Notice Period
week 4 · lesson

Intervals

Sort once, sweep once — chaos becomes a timeline.

Calendars, server bookings, CPU reservations, ad slots — intervals are how software talks about time. Raw interval problems look combinatorial (compare every pair?), but nearly all of them fall to one master move: sort by one endpoint, then sweep left to right, carrying just enough state to make each decision locally. Get the overlap condition exactly right, learn the three sweep variants, and this whole topic becomes mechanical.

01The overlap condition, derived — not memorized

Two intervals a = [a.start, a.end] and b = [b.start, b.end]. When do they overlap?

Don't enumerate the four overlap pictures (partial left, partial right, contained, containing) — you'll fumble one under pressure. Instead, ask the opposite question: when do they miss each other? Only two ways:

  • a ends before b begins: a.end < b.start
  • b ends before a begins: b.end < a.start

Negate the miss:

overlap = a_start <= b_end and b_start <= a_end

Two comparisons cover every overlap shape, including full containment. The boundary detail interviewers probe: do [1, 3] and [3, 5] overlap? With <= they touch at a point — merging usually says yes (they fuse into [1, 5]), while meeting rooms usually say no (a meeting ending at 3 frees the room for one starting at 3). State your boundary convention out loud before coding. Half the wrong answers in this topic are off-by-one boundary disputes.

02Sort-then-sweep: the master move

Unsorted intervals force you to compare every pair — O(n²). Sorting by start time buys you a powerful local guarantee:

After sorting, an interval can only overlap its neighbors in the sweep. If interval i doesn't reach interval i+1, it can't reach anything after it either.

That turns a global tangle into a single left-to-right pass with O(1) state. The canonical form — merging:

def merge(intervals):
    intervals.sort()                      # by start
    out = []
    for s, e in intervals:
        if out and s <= out[-1][1]:       # overlaps the last merged block?
            out[-1][1] = max(out[-1][1], e)   # extend (max handles containment)
        else:
            out.append([s, e])
    return out

The max is the line people botch: [1, 10] followed by [2, 3] must not shrink the block to end at 3. Total cost O(n log n) — the sort dominates, the sweep is O(n).

03Three sweeps, one skeleton

Almost every interval interview question is one of three sweep variants. Same sort, different bookkeeping:

PatternQuestionState carriedDecision per interval
Merge"Combine overlapping ranges"last merged blockextend or start new block
Erase / keep"Min removals so none overlap"end of last kept intervalconflict ⇒ drop the one ending later
Count rooms"Max simultaneous overlap"heap of active end timesreuse a freed room or open one

The erase variant hides a greedy proof worth saying aloud: when two intervals conflict, always keep the one that ends earlier — it leaves the most space for the future, so it can never be a worse choice. (Sorting by end time makes this automatic.)

The rooms variant is where intervals meet last lesson's heaps: sweep by start time, keep a min-heap of end times of in-flight meetings. Before seating a meeting, pop every end time <= its start (those rooms are free). The peak heap size is your answer — and the same sweep is literally how autoscalers size server pools against booked load.

04Reading the problem: which endpoint to sort by

A pre-coding checklist that prevents most interval bugs:

  1. Closed or half-open? Decide whether [1, 3] and [3, 5] touch, and which way the problem wants it. Look at the examples — they always tell you.
  2. Sort by start or end? Merging and counting sweep by start (you process events as they begin). Greedy keep/erase sorts by end (earliest finisher is the safest keep).
  3. What's my carried state? One block? One end time? A heap? If you can't name it in one phrase, you don't have the invariant yet.
  4. Empty input and single interval — the classic forgotten edges.

And one structural cousin to file away: if queries arrive offline (all known up front), you can sort the queries too and sweep both lists together, feeding a heap — turning "for each query, scan all intervals" O(n·q) into O((n+q) log n). That trick closes out this topic's hard problem.

Prove it — checkpoint quiz

1

Which single condition tests whether [a1, a2] and [b1, b2] overlap (closed intervals)?

2

While merging sorted intervals, the current interval overlaps the last merged block. What's the correct update to the block's end?

3

Two intervals conflict and you may erase one. Which do you keep, and why?

4

Minimum meeting rooms via sweep + min-heap of end times: what is the answer when the sweep finishes?

Now forge it

Reading is scaffolding. These drills are the building. Easiest first: