Tree Traversal: BFS & DFS
Pick the walk that matches the question.
Knowing trees isn't the same as knowing how to walk them. There are two fundamentally different walks — breadth-first with a queue, depth-first with recursion — and three flavors of DFS that each exist for a reason. Interviewers rarely ask 'traverse this tree'; they ask questions whose answers fall out of choosing the right walk. This lesson is about reading the question and picking the walk on purpose.
01BFS: a queue and one loop
Breadth-first search visits the tree level by level: the root, then both its children, then all four grandchildren. The engine is a queue — first node in, first node out — and the pattern is worth memorizing as a unit:
from collections import deque
def bfs_levels(root):
if not root:
return []
out, queue = [], deque([root])
while queue:
level = []
for _ in range(len(queue)): # freeze the level size NOW
node = queue.popleft()
level.append(node.val)
if node.left: queue.append(node.left)
if node.right: queue.append(node.right)
out.append(level)
return out
The load-bearing line is for _ in range(len(queue)). At the top of each outer iteration, the queue holds exactly one level. Snapshotting its length before the loop lets you drain that level while the next one accumulates behind it — that's how you know where one level ends and the next begins. Forget the snapshot and the levels smear together.
Everything "level-flavored" is this loop with a different body: per-level averages, zigzag order, rightmost node per level, first level containing X.
Seed the queue with root 1. Everything BFS will ever visit flows through this queue.
queue = deque([root])while queue:node = queue.popleft()visit(node)for child in (node.left, node.right):if child:queue.append(child)
02DFS three ways — and what each is for
Depth-first search dives down one branch before backtracking. The three orders differ only in when you process the node relative to its children, and each timing serves a different job:
def dfs(node):
if not node: return
# pre-order: process node HERE — before children
dfs(node.left)
# in-order: process node HERE — between children
dfs(node.right)
# post-order: process node HERE — after children
| Order | Timing | Built for |
|---|---|---|
| Pre-order | Parent before children | Copying/serializing a tree, anything where the parent's context must exist before the child's (passing constraints down) |
| In-order | Left, me, right | BSTs — emits values in sorted order; k-th smallest, validation |
| Post-order | Children before parent | Anything computed from the children: heights, sizes, deleting safely, "evaluate the subtree" |
The way to remember it isn't memorization, it's causality: who needs whose answer first? Serialization needs the parent written before the children (pre). A node's height needs the children's heights first (post). Sorted output from a BST needs left exhausted before me (in).
03When BFS beats DFS (and the price it pays)
Both walks visit every node, both are O(n) time. The choice is about what arrives first and what memory you burn.
Reach for BFS when:
- The answer is phrased in levels or distance: level-order output, minimum depth, "nearest", "first row where...". BFS visits nodes in increasing distance from the root, so the first hit is automatically the closest — you can stop early. DFS would have to explore everything to be sure.
- You want the shallowest anything. A DFS might wander 10,000 nodes down the left spine while the answer sits at depth 2 on the right.
Reach for DFS when:
- The answer is built from subtrees upward (heights, sums, diameters) — that's post-order by nature.
- You need path context from root to node (path sums, ancestor constraints) — recursion's call stack carries it for free.
The memory bill. DFS holds one root-to-leaf path: O(h). BFS holds an entire level: O(w). On a balanced tree the bottom level has ~n/2 nodes, so BFS can cost O(n) memory while DFS costs O(log n). On a degenerate, vine-shaped tree it flips: DFS's stack is O(n) (hello, recursion limit) while BFS holds one node at a time. Name the shape of the tree before naming the cost — interviewers love that precision.
04Carry state down, merge answers up
Most non-trivial tree problems are one of two information flows — and identifying the flow tells you what your recursion's signature looks like.
Carry state down (parameters): the node needs to know something about its ancestors.
def count_good(node, best_so_far):
if not node: return 0
good = 1 if node.val >= best_so_far else 0
best = max(best_so_far, node.val)
return good + count_good(node.left, best) + count_good(node.right, best)
The accumulated path knowledge rides down as an argument. Validate-BST is the same shape: (low, high) bounds tighten as they descend.
Merge answers up (return values): the node's answer is a function of its children's answers — max depth, subtree sums, diameter. That's the post-order combine step from the trees lesson.
And often: both at once. Path-sum problems carry the running total down and report success up. Max-path-sum returns the best downward path to the parent while updating a global best that combines both children. When a problem feels tangled, untangle it with one question: what flows down as parameters, what flows up as return values? Write those two signatures and the code mostly writes itself.
✦Prove it — checkpoint quiz
In the BFS level loop, why snapshot len(queue) before the inner for-loop?
You need every node's subtree height. Which traversal order does the computation force on you?
Why is BFS the better default for 'minimum depth of the tree'?
On a complete (bushy) binary tree with n nodes, the peak memory of BFS vs DFS is roughly:
In 'count nodes that are >= every ancestor on their path', what flows down and what flows up?
⚒Now forge it
Reading is scaffolding. These drills are the building. Easiest first: