Llama 3 70B · FP8 · 1,024 × NVIDIA · High · Steady-state · 0.8 overlap
compute-bound · 2.0M tok/s · 8 d
Tensor cores
You're at 45% of peak FP8.
FP8 utilization: 45%
Ridge AI
Above the ridge — compute is the gate, not HBM.
AI vs ridge: 7389.2 / 591 FLOP / byte
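The ridge value appears to be the usual roofline ratio of peak compute to memory bandwidth. A minimal sketch, assuming the H100 SXM dense FP8 peak of 1,979 TFLOP/s (a datasheet figure, not stated on this page) and the 3.35 TB/s HBM bandwidth shown in the system readout; the function names are illustrative only:

```python
# Roofline ridge point: the arithmetic intensity (FLOP/byte) at which
# a kernel moves from memory-bound to compute-bound.
PEAK_FP8_FLOPS = 1979e12  # assumption: H100 SXM dense FP8 peak, FLOP/s
HBM_BANDWIDTH = 3.35e12   # bytes/s, from the HBM readout on this page


def ridge_point(peak_flops: float, bandwidth: float) -> float:
    """Return the ridge arithmetic intensity in FLOP per byte."""
    return peak_flops / bandwidth


def bound(arithmetic_intensity: float, ridge: float) -> str:
    """Classify a workload relative to the ridge."""
    return "compute-bound" if arithmetic_intensity > ridge else "memory-bound"


ridge = ridge_point(PEAK_FP8_FLOPS, HBM_BANDWIDTH)
workload_ai = 7389.2  # arithmetic intensity reported by the dashboard
print(round(ridge), bound(workload_ai, ridge))
```

Under these assumptions the ratio rounds to the 591 FLOP / byte shown, and 7389.2 sits well above it.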
Time to train
Estimated wall time at the configured MFU + parallelism.
Days: 8
Bottleneck: tensor compute. Compute is the gate — closer to peak silicon utilization.
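One way to sanity-check the wall-time estimate is the common 6·N·D approximation for training FLOPs. This is a sketch under stated assumptions, not the dashboard's own formula: MFU is taken to be the 45% FP8 utilization, the per-GPU peak is the 1,979 TFLOP/s dense FP8 datasheet figure, the token budget is backed out from the displayed 2.0M tok/s over 8 days, and the 0.8 overlap setting is not modeled.

```python
SECONDS_PER_DAY = 86_400


def sustained_flops(n_gpus: int, peak_flops_per_gpu: float, mfu: float) -> float:
    """Cluster-wide FLOP/s actually delivered at a given MFU."""
    return n_gpus * peak_flops_per_gpu * mfu


def train_days(n_params: float, n_tokens: float, n_gpus: int,
               peak_flops_per_gpu: float, mfu: float) -> float:
    """Wall time in days using the 6*N*D training-FLOPs approximation."""
    total_flops = 6 * n_params * n_tokens
    return total_flops / sustained_flops(n_gpus, peak_flops_per_gpu, mfu) / SECONDS_PER_DAY


# Assumption: token budget implied by the displayed throughput and duration.
implied_tokens = 2.0e6 * 8 * SECONDS_PER_DAY

days = train_days(
    n_params=70e9,               # Llama 3 70B, rounded
    n_tokens=implied_tokens,
    n_gpus=1024,
    peak_flops_per_gpu=1979e12,  # assumption: H100 SXM dense FP8 peak
    mfu=0.45,                    # assumption: MFU equals the FP8 utilization shown
)
print(f"{days:.1f} days")
```

With these inputs the estimate lands a little under the displayed 8 days, which is plausible given rounding and the unmodeled overlap factor.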
Inside the system
where the bits are right now
HBM 0.12 / 3.35 TB/s
SMs 59 / 132 active
NVLink idle
reading 70 GB of weights per token · SMs starving. The HBM pipes run full and amber; the tensor cores sit mostly dark. This is the memory wall in pixel form.
What it costs
dollars, rates, and counterfactuals
greenfield on-prem · 3-yr amortized
GPUs · $30.7M
Interconnect · $4.6M
Chassis & CPUs · $5.1M
Power infra · $7.2M
Cooling · $4.3M
Real estate · $15.6M
Fabric & storage · $5.4M
Total capex
$72.9M
3-yr TCO: $98.4M
1,024 × NVIDIA H100 SXM
$ / GPU-hour
$3.66/hr
3-yr amortized
incl. power + ops
$ / M tokens
$0.52 / M tok
decode at batch=1
continuous batching: ~$0.03
≈ 0.97 Gulfstream G700 ($75M each)
≈ 3 Manhattan penthouses ($25M each)
capex indicative · power $0.08/kWh · 4%/yr ops · PUE 1.45 · greenfield on-prem · batch=1 decode (continuous batching drops $/M tokens ~20×)
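The cost figures above can be cross-checked with simple arithmetic. A rough sketch, assuming the $/GPU-hour rate is the 3-yr TCO spread over every GPU for every hour of three 365-day years (full utilization is an assumption, not something the page states):

```python
# Capex segments in millions of USD, as listed in the breakdown.
CAPEX_MUSD = {
    "GPUs": 30.7,
    "Interconnect": 4.6,
    "Chassis & CPUs": 5.1,
    "Power infra": 7.2,
    "Cooling": 4.3,
    "Real estate": 15.6,
    "Fabric & storage": 5.4,
}
TCO_3YR_USD = 98.4e6
N_GPUS = 1024
HOURS_3YR = 3 * 365 * 24  # assumption: every hour counted, no downtime

total_capex_musd = sum(CAPEX_MUSD.values())
usd_per_gpu_hour = TCO_3YR_USD / (N_GPUS * HOURS_3YR)

# The footnote says continuous batching drops $/M tokens by roughly 20x.
usd_per_mtok_batch1 = 0.52
usd_per_mtok_batched = usd_per_mtok_batch1 / 20

print(f"capex ${total_capex_musd:.1f}M, ${usd_per_gpu_hour:.2f}/GPU-hr, "
      f"~${usd_per_mtok_batched:.2f}/M tok batched")
```

The segments sum to the $72.9M total, and the amortized rate rounds to the $3.66/hr shown.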
Where it lives
watts, floor space, cooling
1 GPU
NVIDIA H100 SXM
700 W
80 GB HBM
1 tray
8 GPUs · HGX board
5.6 kW
NVIDIA × 8
1 rack
NVL-class, 72 GPUs
50 kW
liquid-cooled
Your cluster
15 racks
1,024 GPUs · 717 kW
IT load
Site
Building + substation
1.04 MW
PUE 1.45 · ~825 sqft
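The GPU-to-site power chain can be reproduced step by step. A minimal sketch, assuming (as the numbers on this page suggest) that IT load counts GPU power only and that facility power is IT load times PUE:

```python
import math

GPU_WATTS = 700      # per H100 SXM, from the GPU card
GPUS_PER_TRAY = 8    # HGX board
GPUS_PER_RACK = 72   # NVL-class rack
N_GPUS = 1024
PUE = 1.45

tray_kw = GPU_WATTS * GPUS_PER_TRAY / 1000
rack_kw = GPU_WATTS * GPUS_PER_RACK / 1000
racks = math.ceil(N_GPUS / GPUS_PER_RACK)  # a partially filled rack still counts
it_load_kw = GPU_WATTS * N_GPUS / 1000     # assumption: GPUs only, no CPU/network draw
facility_mw = it_load_kw * PUE / 1000

print(f"tray {tray_kw} kW, rack {rack_kw:.1f} kW, {racks} racks, "
      f"IT {it_load_kw:.0f} kW, site {facility_mw:.2f} MW")
```

This reproduces the 5.6 kW tray, roughly 50 kW rack, 15 racks, 717 kW IT load, and 1.04 MW site figures.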
IT load: 717 kW ≈ 611 US homes' continuous draw
Total facility: 1.04 MW ≈ peak draw of a Walmart Supercenter
Annual energy: 9.1 GWh/yr ≈ 0.23× a small college campus
Heat rejected: 1.04 MW ≈ 1,642 gal/day evaporated at cooling tower
Training run energy: 201 MWh over 8 days · ≈ 0.001% of Iceland's annual consumption
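The energy rows follow from the facility power and elapsed time. A rough sketch, assuming constant draw at the 1.04 MW facility figure and exactly 8 days for the run (the page's unrounded duration is not shown):

```python
FACILITY_MW = 1.04
IT_LOAD_KW = 717
HOURS_PER_YEAR = 8760
RUN_DAYS = 8  # assumption: the displayed "8d" taken as exactly 8 days

annual_gwh = FACILITY_MW * HOURS_PER_YEAR / 1000
run_mwh = FACILITY_MW * 24 * RUN_DAYS

# Continuous per-home draw implied by the "611 US homes" comparison.
kw_per_home = IT_LOAD_KW / 611

print(f"{annual_gwh:.1f} GWh/yr, {run_mwh:.0f} MWh per run, "
      f"{kw_per_home:.2f} kW per home")
```

The annual figure matches the 9.1 GWh/yr shown. The run energy comes out slightly under the displayed 201 MWh, which would be consistent with a run length a little over 8 days before rounding.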
that's training.