Token bucket vs sliding window: the classic API gateway problem, with a Mermaid diagram you can sketch in 15 minutes.
Rate limiters protect your API from abuse, runaway clients, and accidental DDoS from a buggy cron job. Every system design interview eventually asks: how do you throttle fairly at scale?
Requirements (say these first)
Functional: cap requests per user/IP/API key over a time window; return 429 Too Many Requests when exceeded
Non-functional: low latency on the hot path; distributed (many API gateway instances); configurable limits per tier
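To make the functional requirement concrete, here is a minimal single-process sketch of a sliding-window log limiter. The class name `SlidingWindowLimiter`, the `check` method, and the key `"user-1"` are illustrative assumptions, and timestamps are passed in explicitly so the behavior is easy to trace. It is not a production design.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` requests per key in any `window_s`-second span."""

    def __init__(self, limit: int, window_s: float) -> None:
        self.limit = limit
        self.window_s = window_s
        # key -> timestamps of accepted requests, oldest first
        self._hits = defaultdict(deque)

    def check(self, key: str, now: float) -> int:
        """Return an HTTP status code: 200 if accepted, 429 if over the limit."""
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and hits[0] <= now - self.window_s:
            hits.popleft()
        if len(hits) >= self.limit:
            return 429  # Too Many Requests
        hits.append(now)
        return 200


# Hypothetical usage: 3 requests per second for one user.
limiter = SlidingWindowLimiter(limit=3, window_s=1.0)
statuses = [limiter.check("user-1", now=t) for t in (0.0, 0.1, 0.2, 0.3, 1.05)]
print(statuses)
```

The weakness to name for this variant is memory: it stores one timestamp per accepted request, so the cost per key grows with the limit.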
Gateway instances can't each keep their own counter: with a 100 req/s limit enforced per node, a user could reach 1000 req/s by fanning requests across 10 nodes. Centralize state in Redis (or use a gossip/coordinated counter with careful consistency trade-offs).
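The fan-out problem can be shown with a small simulation. This sketch assumes a fixed-window counter and uses a plain Python dict as a stand-in for Redis. `FixedWindowCounter`, `accepted`, and the 100 req/s limit are hypothetical names and values chosen for illustration. With a real Redis, the increment would typically be an atomic `INCR` on the bucket key plus an `EXPIRE`.

```python
from collections import defaultdict


class FixedWindowCounter:
    """Fixed-window counter over a dict-like store (a stand-in for Redis)."""

    def __init__(self, store, limit: int, window_s: int) -> None:
        self.store = store
        self.limit = limit
        self.window_s = window_s

    def allow(self, key: str, now: float) -> bool:
        # One counter per key per window, e.g. "user-1:0" for the first second.
        bucket = f"{key}:{int(now // self.window_s)}"
        self.store[bucket] += 1  # with Redis: INCR on the bucket key, then EXPIRE
        return self.store[bucket] <= self.limit


def accepted(nodes, requests: int) -> int:
    """Round-robin one user's requests across nodes, all within the same second."""
    return sum(
        nodes[i % len(nodes)].allow("user-1", now=0.5) for i in range(requests)
    )


LIMIT = 100  # assumed limit: 100 requests per second

# Ten nodes, each with a private store: the limit applies per node.
local_nodes = [FixedWindowCounter(defaultdict(int), LIMIT, 1) for _ in range(10)]

# Ten nodes sharing one store: the limit applies globally.
shared_store = defaultdict(int)
shared_nodes = [FixedWindowCounter(shared_store, LIMIT, 1) for _ in range(10)]

local_accepted = accepted(local_nodes, 2000)
shared_accepted = accepted(shared_nodes, 2000)
print(local_accepted, shared_accepted)
```

With private counters the user gets ten times the intended limit; with a shared store the limit holds regardless of how many nodes serve the traffic.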
What to say in the room
Clarify who is limited (user, IP, API key) and what counts (read vs write)
Pick an algorithm and name its weakness
Put Redis on the diagram — interviewers want to see shared state
Mention the Retry-After header on 429 responses, and idempotency for write endpoints so throttled clients can retry safely
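As one way to "pick an algorithm and name its weakness", here is a minimal token bucket sketch that also computes a value suitable for a Retry-After header (whole seconds). The `TokenBucket` class, its field names, and the example rates are assumptions for illustration. A distributed version would keep `tokens` and `updated_at` in shared state such as Redis rather than in process memory.

```python
import math
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TokenBucket:
    capacity: float     # maximum burst size, in tokens
    refill_rate: float  # tokens added per second
    tokens: float       # tokens currently available
    updated_at: float   # timestamp of the last refill

    def take(self, now: float) -> Tuple[bool, int]:
        """Try to spend one token. Return (allowed, retry_after_seconds)."""
        # Refill lazily based on elapsed time, capped at capacity.
        elapsed = max(0.0, now - self.updated_at)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True, 0
        # Seconds until one full token is available, rounded up for Retry-After.
        wait = (1.0 - self.tokens) / self.refill_rate
        return False, math.ceil(wait)


# Hypothetical limits: burst of 2, refilling one token every 2 seconds.
bucket = TokenBucket(capacity=2, refill_rate=0.5, tokens=2, updated_at=0.0)
results = [bucket.take(now=t) for t in (0.0, 0.0, 0.0, 2.0)]
print(results)
```

The weakness to name here is burstiness: a client with a full bucket can send `capacity` requests at once, so downstream services must tolerate that spike.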
The diagram + one algorithm + Redis is usually enough for a 35-minute system design slot.