The physics of AI compute
Tracking the AI Compute Bottlenecks
Memory bandwidth, interconnect, packaging, and power: where physics gates AI compute.
A modern GPU peaks at 1,979 TFLOPS in FP8. In decode at batch = 1, a 70B model uses ~3% of that. The other 97% is tensor cores waiting for memory. This site is about that gap.
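A rough way to see where a utilization figure like this comes from is a back-of-envelope roofline estimate. The sketch below is illustrative only and rests on assumptions that are not stated in the text above: an HBM bandwidth of 3.35 TB/s, FP8 weights at 1 byte per parameter, about 2 FLOPs per parameter per generated token, and no KV-cache or activation traffic. The function name `decode_utilization` and its parameters are hypothetical.

```python
def decode_utilization(params, bytes_per_param, mem_bw_bytes_s, peak_flops, batch=1):
    """Upper bound on the fraction of peak FLOPS reached in memory-bound decode."""
    # Each decode step has to stream every weight from memory once.
    step_time_s = params * bytes_per_param / mem_bw_bytes_s
    # Roughly 2 FLOPs (one multiply, one add) per parameter per generated token.
    flops_per_step = 2 * params * batch
    achieved_flops = flops_per_step / step_time_s
    # Utilization cannot exceed the compute peak, so cap at 1.0.
    return min(achieved_flops / peak_flops, 1.0)


PEAK_FP8 = 1979e12   # FLOP/s, the peak quoted above
HBM_BW = 3.35e12     # bytes/s, ASSUMED memory bandwidth (not from the text)

util = decode_utilization(params=70e9, bytes_per_param=1,
                          mem_bw_bytes_s=HBM_BW, peak_flops=PEAK_FP8)
print(f"batch=1 utilization bound: {util:.2%}")
```

With these particular inputs the estimate comes out below the ~3% quoted above. Different assumptions about bandwidth, precision, or what counts as useful work will move the number, but not the conclusion: decode at batch = 1 is bound by memory traffic rather than arithmetic. Raising `batch` scales the bound linearly until the compute peak is reached, which is why batching is the usual first lever.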