Question 1

How do I size a GPU cluster?

Accepted Answer

Start from the GPU count your workload needs (from training-time or serving requirements), then derive the physical and network build. Nodes = GPUs ÷ GPUs-per-node (typically 8), rounded up. Racks = nodes ÷ nodes-per-rack (limited by power density), rounded up. IT power = GPUs × per-GPU power + nodes × per-node overhead, and facility power = IT power × datacenter PUE. The network is NVLink within nodes and InfiniBand or Ethernet between them, with enough bisection bandwidth for collective operations. This calculator computes the nodes, racks, IT and facility power, power per rack, and approximate bisection bandwidth from your GPU count and topology choices.
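The arithmetic above can be sketched in a few lines of Python. This is an illustration, not the calculator's actual code: the function name, the field names, and the defaults (700 W per GPU, 2,500 W per-node overhead, 4 nodes per rack, PUE 1.25) are assumptions chosen for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class ClusterBuild:
    nodes: int
    racks: int
    it_power_kw: float
    facility_power_kw: float
    power_per_rack_kw: float


def size_cluster(gpus, gpus_per_node=8, nodes_per_rack=4,
                 gpu_watts=700.0, node_overhead_watts=2500.0, pue=1.25):
    """Derive the physical build from a GPU count (all defaults are assumptions)."""
    # Round up: a partially filled node or rack still has to be installed.
    nodes = math.ceil(gpus / gpus_per_node)
    racks = math.ceil(nodes / nodes_per_rack)
    # IT power = GPU draw plus the per-node overhead of every node.
    it_watts = gpus * gpu_watts + nodes * node_overhead_watts
    # Draw of one fully populated rack, the figure to compare with the density limit.
    rack_watts = nodes_per_rack * (gpus_per_node * gpu_watts + node_overhead_watts)
    return ClusterBuild(
        nodes=nodes,
        racks=racks,
        it_power_kw=it_watts / 1000,
        facility_power_kw=it_watts * pue / 1000,
        power_per_rack_kw=rack_watts / 1000,
    )


build = size_cluster(1024)
print(build)
```

With these assumed inputs, 1,024 GPUs come out as 128 nodes in 32 racks at about 32 kW per rack. Changing `nodes_per_rack` or `node_overhead_watts` shows how quickly the per-rack figure moves.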

Question 2

Why is power the limiting factor for AI clusters?

Accepted Answer

Because modern AI GPUs draw 700–1,000W+ each, a node of eight draws roughly 6–10kW once CPUs, NICs and fans are included, and a rack of several nodes draws 30–70kW or more, versus the ~5–15kW a traditional datacenter rack was built for. So AI clusters are constrained by how much power and cooling can be delivered to each rack, not by physical space. This is why AI datacenters often run fewer servers per rack than legacy facilities and require liquid cooling, and why power provisioning, not floor area, dominates the build. This calculator reports power per rack so you can check it against your facility's density limit.

Question 3

What is the difference between NVLink and InfiniBand in a cluster?

Accepted Answer

NVLink (and NVSwitch) is the high-bandwidth, all-to-all interconnect within a node. It connects the 8 GPUs in a server at hundreds of gigabytes per second per GPU (terabytes per second in aggregate), forming an NVLink domain where any GPU can reach any other at full speed. InfiniBand (or high-speed Ethernet/RoCE) connects nodes to each other through a network of switches, at lower per-GPU bandwidth than NVLink but spanning the whole cluster. The two-tier hierarchy, with fast NVLink inside the node and scalable InfiniBand outside it, is the standard AI-cluster topology, and this calculator reflects the NVLink domain size and inter-node NICs.

Question 4

What is bisection bandwidth and why does it matter?

Accepted Answer

Bisection bandwidth is the total bandwidth available across the worst-case split of the network into two halves — effectively how much data can cross the middle at once. It matters because distributed training relies on collective operations (especially all-reduce of gradients) that move data across the whole cluster every step; if the bisection bandwidth is too low, those collectives bottleneck and adding GPUs stops improving training speed. A full fat-tree provides full bisection bandwidth (every node can communicate at full rate simultaneously). This calculator estimates the bisection bandwidth from the per-GPU NIC rate and GPU count.
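A first-order version of that estimate can be written as a one-line formula. The sketch below assumes one NIC per GPU and that half of the GPUs send to the other half at full NIC rate. The function name, the 400 Gb/s default, and the `oversubscription` parameter are illustrative assumptions, and the calculator's exact formula may differ.

```python
def bisection_bandwidth_gbps(gpus, nic_gbps_per_gpu=400.0, oversubscription=1.0):
    """First-order bisection bandwidth in Gb/s (one direction).

    Assumes one NIC per GPU. With oversubscription=1.0 the fat-tree is
    non-blocking, so half of the GPUs can send to the other half at full rate.
    """
    if oversubscription < 1.0:
        raise ValueError("oversubscription must be >= 1.0")
    return (gpus / 2) * nic_gbps_per_gpu / oversubscription


full = bisection_bandwidth_gbps(1024)                           # non-blocking
tapered = bisection_bandwidth_gbps(1024, oversubscription=2.0)  # 2:1 taper
print(f"{full / 1000:.1f} Tb/s full, {tapered / 1000:.1f} Tb/s at 2:1")
```

Under these assumptions a 1,024-GPU cluster with 400 Gb/s NICs has about 205 Tb/s of bisection bandwidth, and a 2:1 oversubscribed tree halves that.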

Question 5

How many GPUs go in a node and a rack?

Accepted Answer

A standard AI server node holds 8 GPUs (connected by NVLink), though some designs use 4 or 16. Racks are limited by power density: at 6–10kW per node, a rack provisioned for 30–40kW holds roughly 3–6 nodes, while high-density liquid-cooled racks (up to 100kW+) hold more. So GPUs per rack is typically 16–64 depending on the GPU power and the facility's rack power limit. That is far fewer servers than a legacy rack held, because of the power draw. This calculator lets you set GPUs-per-node and nodes-per-rack to match your design and reports the resulting power per rack.
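To see how a rack power limit turns into a node count, here is a minimal sketch. It considers power only (not rack units, cooling, or cabling), and the per-GPU and per-node overhead wattages are assumed values for illustration.

```python
import math


def nodes_per_rack(rack_limit_kw, gpus_per_node=8, gpu_watts=700.0,
                   node_overhead_watts=2500.0):
    """Whole nodes that fit under a rack power limit (power-bound only)."""
    node_kw = (gpus_per_node * gpu_watts + node_overhead_watts) / 1000
    return math.floor(rack_limit_kw / node_kw)


for limit_kw in (15, 40, 100):
    n = nodes_per_rack(limit_kw)
    print(f"{limit_kw} kW rack: {n} nodes, {n * 8} GPUs")
```

With an assumed 8.1 kW node, a 15 kW legacy rack fits a single node, a 40 kW rack fits four, and a 100 kW rack is power-bound at twelve, although physical space would usually cap it lower.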

Question 6

What is a fat-tree topology?

Accepted Answer

A fat-tree is a multi-level switch network where bandwidth doesn't shrink as you go up toward the root — the 'branches' get 'fatter' to maintain full bisection bandwidth, so any node can communicate with any other at full rate simultaneously. It's the standard topology for AI training clusters because collective operations need uniform, non-blocking bandwidth across all nodes. Building a full (non-blocking) fat-tree requires many switches and cables; some clusters use a slightly oversubscribed tree to save cost where the workload tolerates it. This calculator assumes full bisection for its bandwidth estimate.
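The cost of a non-blocking fat-tree can be made concrete with the standard k-ary fat-tree counts: built from identical switches with k ports, two tiers support k²/2 endpoints and three tiers support k³/4. The sketch below is illustrative; the function name is hypothetical, and it treats each endpoint as one NIC port (one per GPU under the one-NIC-per-GPU assumption).

```python
def fat_tree_capacity(radix, tiers):
    """Max endpoints and switch count of a non-blocking fat-tree built from
    identical switches with `radix` ports."""
    k = radix
    if k % 2:
        raise ValueError("radix must be even")
    if tiers == 2:
        # k leaf switches with k/2 down and k/2 up ports each, plus k/2 spines.
        return {"endpoints": k * k // 2, "switches": k + k // 2}
    if tiers == 3:
        # k pods of k/2 edge + k/2 aggregation switches, plus (k/2)^2 cores.
        return {"endpoints": k ** 3 // 4, "switches": 5 * k * k // 4}
    raise ValueError("only 2 or 3 tiers are modelled")


two_tier = fat_tree_capacity(64, tiers=2)
three_tier = fat_tree_capacity(64, tiers=3)
print(two_tier, three_tier)
```

With 64-port switches, two tiers top out at 2,048 endpoints using 96 switches, so a larger cluster needs a third tier (or oversubscription), which is where the switch and cable count climbs steeply.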

Question 7

How does interconnect affect training scaling efficiency?

Accepted Answer

Profoundly. Distributed training synchronizes gradients across all GPUs every step via all-reduce, so the time spent communicating competes with time computing. If the interconnect is fast enough (high bisection bandwidth, low latency), the communication overlaps with computation and the cluster scales near-linearly — more GPUs really do mean proportionally faster training. If it's under-provisioned, communication dominates and adding GPUs yields diminishing returns (low scaling efficiency, which shows up as low MFU). This is why AI clusters invest heavily in NVLink and InfiniBand, and why network design is as important as GPU count.
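A rough model makes the trade-off visible. The sketch below uses the bandwidth term of a ring all-reduce, in which each GPU sends and receives 2(N−1)/N times the gradient payload, and a simple overlap model for efficiency. It is a simplification under stated assumptions: it ignores latency, the NVLink tier, and hierarchical collectives, and the function names, the 14 GB payload, and the 1-second compute time are made up for the example.

```python
def ring_allreduce_seconds(payload_bytes, gpus, link_gbps):
    """Bandwidth term of a ring all-reduce; per-hop latency is ignored."""
    link_bytes_per_s = link_gbps * 1e9 / 8
    return 2 * (gpus - 1) / gpus * payload_bytes / link_bytes_per_s


def scaling_efficiency(compute_s, comm_s, overlap=1.0):
    """Share of step time spent computing. `overlap` in [0, 1] is the fraction
    of communication that can be hidden behind computation."""
    hidden = min(overlap * comm_s, compute_s)
    exposed = comm_s - hidden
    return compute_s / (compute_s + exposed)


payload = 14e9      # assumed: 7e9 gradients at 2 bytes each
compute_s = 1.0     # assumed compute time per step
for gbps in (400, 100):
    comm_s = ring_allreduce_seconds(payload, gpus=1024, link_gbps=gbps)
    eff = scaling_efficiency(compute_s, comm_s, overlap=1.0)
    print(f"{gbps} Gb/s: comm {comm_s:.2f} s, efficiency {eff:.0%}")
```

In this toy setup the faster link keeps communication shorter than computation, so it can be fully hidden, while the slower link leaves more than a second of exposed communication per step even with perfect overlap.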

Question 8

How much power does a large GPU cluster need?

Accepted Answer

Substantial: IT power ≈ GPUs × per-GPU power + nodes × per-node overhead (CPUs, NICs, fans), and facility power is IT power multiplied by the datacenter PUE. A 1,024-GPU H100 cluster pulls roughly 1MW IT (about 1.2–1.3MW at the facility once PUE is applied); a 16,384-GPU frontier cluster needs roughly 16MW IT and about 20MW at the facility. These are utility-scale loads requiring dedicated substations, which is why frontier AI datacenters are sited near abundant power and increasingly drive new power-generation projects. This calculator computes the IT and facility power so you can plan the electrical and cooling infrastructure.
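The two cluster sizes quoted above can be checked with a short calculation. The 700 W figure is the H100 SXM TDP; the 2,500 W per-node overhead and the PUE of 1.25 are assumptions picked for illustration, so treat the output as a plausibility check rather than a measurement.

```python
def cluster_power_mw(gpus, gpus_per_node=8, gpu_watts=700.0,
                     node_overhead_watts=2500.0, pue=1.25):
    """Return (IT power, facility power) in MW for a GPU count."""
    nodes = -(-gpus // gpus_per_node)  # ceiling division
    it_watts = gpus * gpu_watts + nodes * node_overhead_watts
    return it_watts / 1e6, it_watts * pue / 1e6


for gpus in (1024, 16384):
    it_mw, facility_mw = cluster_power_mw(gpus)
    print(f"{gpus:>6} GPUs: {it_mw:.2f} MW IT, {facility_mw:.2f} MW facility")
```

Under these assumptions the 1,024-GPU cluster lands at about 1.04 MW IT and 1.30 MW facility, and the 16,384-GPU cluster at about 16.6 MW IT and 20.7 MW facility.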

Question 9

How does this connect to the rest of cluster planning?

Accepted Answer

This tool turns a GPU count into the physical build: nodes, racks, power, network. The GPU count itself comes from your workload, via the training-cost calculator (for a training run) or the LLM-serving calculator (for inference). The facility power feeds the data-center power estimator for energy and cost, and the buy-vs-rent decision for the GPUs is covered by the accelerator-ROI calculator. Use this as the bridge from 'how many GPUs' to 'what does the datacenter look like and draw'.

Question 10

How accurate is this cluster-sizing estimate?

Accepted Answer

The node, rack and power arithmetic is exact for your inputs; the bisection-bandwidth figure is a first-order estimate that assumes a full fat-tree running at the per-GPU NIC rate. Real cluster design adds switch counts and tiers, cable lengths, oversubscription choices, storage and management networks, redundancy, and cooling infrastructure, and rack power limits depend heavily on whether the rack is air- or liquid-cooled. Use this to scope the build (node/rack count, power envelope, rough network) for planning; detailed design requires a network architect and facility engineering. The power and node/rack counts are the most reliable outputs.

Question 11

Does this tool send my data anywhere?

Accepted Answer

No. All cluster-sizing math runs entirely in your browser in JavaScript — nothing is uploaded and there's no telemetry.

