Interactive Systems Explainers

Retry Storm Playground

See how retries can amplify overload and destabilize distributed systems.

What to Notice

Why retries can become dangerous

A retry feels helpful in isolation. Under stress, many retries arrive at once and compete with the original traffic.
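A rough way to put a number on this is to count how many attempts one original request turns into. The sketch below assumes every attempt fails independently with the same probability and that the client retries a fixed number of times; the function name and both parameters are illustrative, not part of the playground.

```python
def expected_attempts(failure_prob: float, max_retries: int) -> float:
    """Average number of attempts sent per original request.

    Assumption (hypothetical model): each attempt fails independently with
    probability failure_prob, and the client retries up to max_retries times.
    """
    # Attempt k (k = 0 is the original) is only sent if the k earlier
    # attempts all failed, which happens with probability failure_prob ** k.
    return sum(failure_prob ** k for k in range(max_retries + 1))


for p in (0.01, 0.5, 0.9):
    attempts = expected_attempts(p, max_retries=3)
    print(f"failure rate {p:.2f}: {attempts:.2f} attempts per request")
```

With three retries, a healthy service sees almost no extra traffic, while a service failing 90% of the time receives more than three times the original load, exactly when it can least afford it.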

Timeout amplification

Shorter timeouts fail faster, but the server may still be working on the request the client just abandoned. Each failure can create more work before the old work has drained.
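To make the effect concrete, the sketch below counts how many copies of a single request reach the server before the first copy finishes. It assumes the client resends the instant a timeout expires and that the server never cancels abandoned work; the function and parameter names are hypothetical.

```python
def copies_in_flight(service_ms: int, timeout_ms: int, max_attempts: int) -> int:
    """Copies of one request the server receives before the first one finishes.

    Assumptions (hypothetical model): the client resends the moment a timeout
    expires, and the server keeps processing attempts the client gave up on.
    """
    # Attempts start at 0, timeout_ms, 2 * timeout_ms, ... for as long as
    # no response has arrived, i.e. while the start time is below service_ms.
    started = -(-service_ms // timeout_ms)  # ceiling division on integers
    return min(max_attempts, started)


# A slow server needs 1000 ms per request; the client allows up to 4 attempts.
for timeout_ms in (2000, 1000, 500, 250):
    copies = copies_in_flight(1000, timeout_ms, max_attempts=4)
    print(f"timeout {timeout_ms} ms: {copies} copies of the same request")
```

Under these assumptions, halving the timeout below the actual service time doubles the work the server receives for one request, until the attempt limit caps it.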

Positive feedback loops

More load creates more waiting. More waiting creates more retries. The loop can accelerate suddenly.
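The loop can be sketched as a tiny tick-by-tick model. It is a simplification made up for illustration, not the playground's own simulation: the service handles a fixed number of requests per tick, everything above that fails, and each failure comes back as a fixed number of retries on the next tick. All names are hypothetical.

```python
def offered_load(new_per_tick: float, capacity: float,
                 retries_per_failure: float, ticks: int) -> list[float]:
    """Total requests (new traffic plus retries) hitting the service each tick.

    Assumption (hypothetical model): requests above capacity fail outright and
    each failure returns as retries_per_failure requests on the next tick.
    """
    history = []
    retries = 0.0
    for _ in range(ticks):
        offered = new_per_tick + retries
        failed = max(0.0, offered - capacity)   # work the service could not handle
        retries = retries_per_failure * failed  # feedback into the next tick
        history.append(offered)
    return history


# Same 10% overload, two retry policies.
print(offered_load(110, 100, retries_per_failure=0.5, ticks=6))
print(offered_load(110, 100, retries_per_failure=2.0, ticks=6))
```

In this model, a retry ratio below one lets the load settle at a higher but stable level. A ratio above one, which can happen when several layers each retry, makes the offered load grow faster on every tick.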

Why systems collapse

Once the service is buried, recovery lags behind the drop in demand. Retry traffic keeps circling even after new traffic calms down.
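The lag can be estimated with a small model, shown here as a sketch under invented assumptions: a short spike of new traffic is followed by a calm period below capacity, and every failed request returns as retries on the next tick. The function name, parameters, and numbers are hypothetical.

```python
def ticks_to_recover(spike, calm, capacity, spike_ticks,
                     retries_per_failure=1.0, limit=200):
    """Calm ticks needed until no request fails, or None if it never happens.

    Assumption (hypothetical model): requests above capacity fail and each
    failure returns as retries_per_failure requests on the next tick.
    """
    retries = 0.0
    for tick in range(spike_ticks + limit):
        new = spike if tick < spike_ticks else calm
        offered = new + retries
        failed = max(0.0, offered - capacity)
        retries = retries_per_failure * failed
        if tick >= spike_ticks and failed == 0:
            return tick - spike_ticks + 1  # count calm ticks, starting at 1
    return None  # still failing after `limit` calm ticks


# A 3-tick spike of 150 requests per tick, then 80 per tick, capacity 100.
print(ticks_to_recover(150, 80, 100, spike_ticks=3, retries_per_failure=1.0))
print(ticks_to_recover(150, 80, 100, spike_ticks=3, retries_per_failure=2.0))
```

With one retry per failure, the service in this model needs several calm ticks to work off the retries left over from a short spike. With two retries per failure, it never recovers within the limit, even though new traffic is well below capacity.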
