The physics of AI compute

Tracking the AI Compute Bottlenecks

Memory bandwidth, interconnect, packaging, and power: where physics gates AI compute.

A modern GPU peaks at 1,979 TFLOPS in FP8. When decoding at batch size 1, a 70B-parameter model uses ~3% of that peak. For the other 97%, the tensor cores sit idle, waiting on memory. This site is about that gap.
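
The gap can be sanity-checked with a roofline-style estimate. The sketch below is a minimal illustration, not the site's own model. The function name `decode_utilization` and all default inputs are assumptions made for this example: FP8 weights at 1 byte per parameter, about 2 FLOPs per parameter per generated token, and a memory bandwidth of 3.35 TB/s (roughly an H100 SXM-class part). It ignores KV-cache traffic, activations, and kernel overheads. Under these simplified assumptions the result lands below the ~3% quoted above; the exact figure depends on precision, bandwidth, and what counts as useful work, so treat it as an order-of-magnitude check.

```python
def decode_utilization(params, bytes_per_param, mem_bw_bytes_s, peak_flops, batch=1):
    """Roofline-style estimate of the fraction of peak FLOPS used per decode step.

    All inputs are illustrative assumptions, not measured values.
    """
    # Every decode step streams the full weight set from memory once;
    # the same weights are reused for every sequence in the batch.
    step_time_mem = params * bytes_per_param / mem_bw_bytes_s

    # Roughly 2 FLOPs (one multiply, one add) per parameter per token.
    step_flops = 2 * params * batch
    step_time_compute = step_flops / peak_flops

    # The slower of the two limits sets the step time.
    step_time = max(step_time_mem, step_time_compute)
    achieved_flops = step_flops / step_time
    return achieved_flops / peak_flops


if __name__ == "__main__":
    # Assumed hardware and model numbers (hypothetical inputs for illustration).
    PEAK_FP8 = 1979e12   # FLOP/s, the peak quoted in the text
    MEM_BW = 3.35e12     # bytes/s, assumed HBM bandwidth
    PARAMS = 70e9        # 70B-parameter model
    for batch in (1, 8, 64, 512):
        u = decode_utilization(PARAMS, 1, MEM_BW, PEAK_FP8, batch=batch)
        print(f"batch={batch:4d}  utilization={u:.2%}")
```

Raising `batch` in this model increases utilization linearly until the compute limit takes over, which is why batching is the usual first lever against the memory wall.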

Primer

A guided tour from corpus to silicon. See how data flows and where it hits the wall.

7 stages · ~7 minutes

Simulate

Try the bottleneck simulator. Change your stack, see where the bottleneck moves.

Interactive