Interconnect Latency Console
On-chip network delay is hops × per-hop cost + serialization. Topology sets the hops, link width sets the serialization, and wire delay no longer shrinks with each process node. Compare topologies and find the zero-load latency floor.
Topology & size → average packet latency.
2D mesh: a simple grid whose average hop count grows linearly with the grid dimension; cheap wiring, higher diameter.
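A torus roughly halves a mesh's worst-case hop count and trims its average by about a quarter under uniform traffic; a ring's hop count grows linearly with node count. All carry the same 4-cycle serialization tax for this packet/link.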
A 2D Mesh of 64 nodes averages 5.3 hops at 3 cycles each, plus 4 cycles to serialize a 512-bit packet over a 128-bit link — 20 cycles ≈ 10.0 ns at 2 GHz.
Hop traversal is 80% of the latency; serialization is 20%. Hop-dominated — a lower-diameter topology or closer placement helps most.
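The arithmetic behind this readout can be reproduced with a minimal sketch. The function name `zero_load_latency` and its parameter names are illustrative, not the console's actual implementation; the inputs are the figures quoted above (5.3 average hops, 3 cycles per hop, 512-bit packet, 128-bit link, 2 GHz).

```python
import math

def zero_load_latency(avg_hops, cycles_per_hop, packet_bits, link_bits, clock_ghz):
    """Return (total_cycles, nanoseconds, hop_share) for an unloaded network."""
    hop_cycles = avg_hops * cycles_per_hop
    # A packet wider than the link is injected over several cycles.
    serialization_cycles = math.ceil(packet_bits / link_bits)
    total_cycles = hop_cycles + serialization_cycles
    nanoseconds = total_cycles / clock_ghz  # a 1 GHz clock completes 1 cycle per ns
    return total_cycles, nanoseconds, hop_cycles / total_cycles

total, ns, hop_share = zero_load_latency(5.3, 3, 512, 128, 2.0)
print(f"{total:.1f} cycles, {ns:.2f} ns, hop share {hop_share:.0%}")
```

With these inputs the total comes to just under 20 cycles and 10 ns, with hop traversal at roughly 80% of the total, in line with the rounded figures shown.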
Drive placement & distances in the Floorplan console; size link throughput in Memory Bandwidth.
Why on-chip distance costs time
On-chip network delay is the number of router hops times the cost of each hop, plus the time to push the packet onto a link. Topology sets the hop count; link width sets the serialization.
A mesh is cheap to wire but has many hops; a torus adds wrap-around links that roughly halve the worst-case hop count and trim the average by about a quarter; richer topologies cut hops further at the cost of more wiring and area. The right choice depends on traffic and floor space.
Transistors got faster every node, but wires didn't — cross-chip signal time is now many cycles. That's why big chips are networks of tiles: you can't drive a signal corner-to-corner in one cycle any more.
A 512-bit packet on a 128-bit link takes four cycles just to inject. Wider links cut serialization but cost wires and power. Latency-sensitive traffic wants wide links; bulk traffic can tolerate narrow ones.
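To see how link width moves the serialization cost, here is a small sketch that sweeps a few widths for the 512-bit packet mentioned above. The helper name `serialization_cycles` and the chosen widths are assumptions for illustration only.

```python
def serialization_cycles(packet_bits, link_bits):
    """Cycles needed to push a packet onto a link, one link-width chunk per cycle."""
    # Ceiling division: a partially filled final chunk still occupies a full cycle.
    return -(-packet_bits // link_bits)

for link_bits in (64, 128, 256, 512):
    print(f"{link_bits:>3}-bit link: {serialization_cycles(512, link_bits)} cycles")
```

Each doubling of the link width halves the injection time until the whole packet fits in a single cycle, after which extra width buys nothing for this packet size.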
Chips became networks
There was a time when a signal could cross a chip in a single clock cycle and global wires were effectively free. That time is gone. Transistors kept getting faster every node, but wires did not: the resistance and capacitance of interconnect scale badly, so relative to the fast logic around them, wires got slower. Driving a signal from one corner of a large die to the other now takes many cycles, and that single physical fact reshaped how chips are built.
The answer was to stop treating the chip as one big circuit and start treating it as a network of tiles connected by routers — a network-on-chip. Communication is pipelined across hops, and latency becomes a structural property of the network: the average number of hops times the cost of each hop, plus the time to serialize a packet onto a link. Topology sets the hop count, the router pipeline and link length set the per-hop cost, and the link width sets the serialization tax.
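How topology sets the hop count can be checked by brute force rather than by closed-form formulas. The sketch below assumes minimal routing and uniform traffic between all distinct node pairs; the function names are illustrative, and the console's own formulas may use a slightly different convention (for example, counting a node talking to itself).

```python
from itertools import product

def mesh_hops(a, b, k):
    """Minimal hops on a k x k mesh: Manhattan distance."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def torus_hops(a, b, k):
    """Minimal hops on a k x k torus: each axis may take the wrap-around link."""
    dx, dy = abs(a[0] - b[0]), abs(a[1] - b[1])
    return min(dx, k - dx) + min(dy, k - dy)

def grid_stats(k, hop_fn):
    """Return (average hops, diameter) over all distinct tile pairs."""
    tiles = list(product(range(k), repeat=2))
    hops = [hop_fn(a, b, k) for a in tiles for b in tiles if a != b]
    return sum(hops) / len(hops), max(hops)

def ring_stats(n):
    """Same statistics for a bidirectional ring of n nodes (symmetric, so one source suffices)."""
    hops = [min(d, n - d) for d in range(1, n)]
    return sum(hops) / len(hops), max(hops)

mesh_avg, mesh_diam = grid_stats(8, mesh_hops)
torus_avg, torus_diam = grid_stats(8, torus_hops)
ring_avg, ring_diam = ring_stats(64)
print(f"mesh  8x8: avg {mesh_avg:.2f}, diameter {mesh_diam}")
print(f"torus 8x8: avg {torus_avg:.2f}, diameter {torus_diam}")
print(f"ring   64: avg {ring_avg:.2f}, diameter {ring_diam}")
```

For 64 nodes this gives about 5.3 average hops for the mesh, about 4.1 for the torus, and about 16.3 for the ring, with diameters of 14, 8, and 32 respectively. Under these assumptions the wrap-around links cut the worst case far more than the average.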
Each of those is a lever with a price. A torus roughly halves a mesh's worst-case hop count and trims its average by about a quarter, but needs wrap-around wires; a wider link shrinks serialization but burns area and power; a shallower router pipeline cuts per-hop latency but may cost frequency. The biggest lever of all isn't in the network: it's placement. Keep communicating blocks close and traffic stays local, hop counts stay low, and the network barely matters. Spread them out and no topology will save you.
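The placement effect can be illustrated with a small sketch. It assumes four blocks that talk to each other equally on an 8×8 mesh with minimal routing, so the hop count between two tiles is their Manhattan distance. The function name `mean_mesh_hops`, the coordinates, and the traffic pattern are hypothetical.

```python
from itertools import permutations

def mean_mesh_hops(placement):
    """Average mesh hops between every ordered pair of distinct blocks."""
    pairs = list(permutations(placement, 2))
    total = sum(abs(ax - bx) + abs(ay - by) for (ax, ay), (bx, by) in pairs)
    return total / len(pairs)

# The same four communicating blocks, placed two different ways on an 8x8 mesh.
clustered = [(0, 0), (0, 1), (1, 0), (1, 1)]  # adjacent tiles
spread = [(0, 0), (0, 7), (7, 0), (7, 7)]     # one block per corner

print(f"clustered: {mean_mesh_hops(clustered):.2f} hops")
print(f"spread:    {mean_mesh_hops(spread):.2f} hops")
```

The clustered placement averages about 1.3 hops and the corner placement about 9.3, a sevenfold difference on the same network with the same topology.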
Use this console to compute the zero-load latency floor, compare topologies for your node count, and see whether hops or serialization dominate — which tells you whether to change topology, widen links, or improve placement. Reason about that placement in the Floorplan console and size link throughput against demand in Memory Bandwidth.
Trusted by NoC & Fabric Architects
“Hops × per-hop + serialization with the right average-hop formulas per topology is exactly the zero-load model I use to compare networks. Showing torus halving the hops vs mesh, and serialization as a fixed injection tax, is the framing juniors need. The wire-delay-doesn't-scale point is the whole reason NoCs exist.”
“Link width as a serialization lever is captured perfectly — I use it to justify wide links for our latency-critical coherence traffic. Router-cycle knob lets me model our pipeline. Pairs naturally with the floorplan tool for placement-driven locality. Exactly the right fidelity for early exploration.”
“Clean zero-load latency across topologies with realistic NoC presets. The latency floor it gives is my design target before simulation. Would love a queueing/load model, but the tool is upfront that it's zero-load, and for topology comparison it's spot-on.”
“Modeling our 16×16 mesh latency and seeing where serialization vs hops dominate told us to widen links rather than change topology. The cross-chip-wire-delay reality is well represented. Feeds straight into floorplan and bandwidth planning. Fast and genuinely useful.”
Latency (cycles) = avg hops × (router + link cycles) + ⌈packet bits ÷ link width⌉; latency (ns) = cycles ÷ clock (GHz) · Last reviewed: 2026-06