Hops × per-hop + serialization · mesh / torus / ring

Interconnect Latency Console

On-chip network delay is hops × per-hop + serialization. Topology sets the hops, link width sets the serialization, and wire delay no longer improves with each process node. Compare topologies and find the zero-load latency floor.

01 · Quick estimate

Topology & size → average packet latency.

Simple grid — average hops grow linearly with dimension; cheap wiring, higher diameter.

Avg latency: 10.0 ns · 20 cyc · 5.3 hops
02 · Deep analysis

Network latency console

Topology comparison (same 64 nodes)
2D Mesh: 10.0 ns · 5.3 hops
2D Torus: 8.0 ns · 4.0 hops
Ring: 26.0 ns · 16.0 hops

The torus cuts the mesh's average hops by about a quarter (4.0 vs 5.3 here); a ring's hop count grows linearly with node count. All three carry the same 4-cycle serialization tax for this packet and link width.
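The hop counts in the comparison can be reproduced with textbook closed forms for average distance. This is a minimal sketch, not the console's confirmed implementation: it assumes minimal routing, uniform random traffic averaged over all ordered source/destination pairs (self-pairs included), and even dimensions for the torus and ring. The function names are hypothetical.

```python
def avg_hops_mesh(k: int) -> float:
    """Average hops in a k x k mesh (assumption: uniform traffic,
    all ordered pairs including self-pairs, minimal routing)."""
    return 2 * (k * k - 1) / (3 * k)


def avg_hops_torus(k: int) -> float:
    """Average hops in a k x k torus; this closed form assumes even k."""
    if k % 2:
        raise ValueError("closed form assumes an even side length")
    return k / 2


def avg_hops_ring(n: int) -> float:
    """Average hops in a bidirectional ring of n nodes; assumes even n."""
    if n % 2:
        raise ValueError("closed form assumes an even node count")
    return n / 4


# 64 nodes arranged as an 8 x 8 grid, or as a 64-node ring.
for name, hops in (
    ("2D Mesh", avg_hops_mesh(8)),
    ("2D Torus", avg_hops_torus(8)),
    ("Ring", avg_hops_ring(64)),
):
    print(f"{name}: {hops:.2f} hops")
# 2D Mesh: 5.25 hops
# 2D Torus: 4.00 hops
# Ring: 16.00 hops
```

Under these assumptions the 64-node values come out at 5.25, 4.0 and 16.0, which agree with the 5.3 / 4.0 / 16.0 shown above after rounding.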

Avg latency: 10.0 ns
Total cycles: 20 (at 2 GHz)
Avg hops: 5.3 (3 cyc/hop)
Serialization: 4 cyc (512b / 128b)
Latency breakdown

A 2D Mesh of 64 nodes averages 5.3 hops at 3 cycles each (about 16 cycles), plus 4 cycles to serialize a 512-bit packet over a 128-bit link: 20 cycles ≈ 10.0 ns at 2 GHz.
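The arithmetic in this breakdown can be written as a small function. The sketch below uses the displayed inputs (5.3 hops, 3 cycles per hop, 512-bit packet, 128-bit link, 2 GHz); the function and parameter names are hypothetical, and rounding the total to a whole cycle is an assumption that happens to match the displayed 20 cycles.

```python
def zero_load_latency(avg_hops, cycles_per_hop, packet_bits, link_bits, clock_ghz):
    """Return (total_cycles, latency_ns) for the zero-load model
    hops x per-hop + serialization. No queueing or contention is modeled."""
    hop_cycles = avg_hops * cycles_per_hop
    # Integer ceiling division: a partial flit still occupies a full cycle.
    serialization_cycles = -(-packet_bits // link_bits)
    total_cycles = hop_cycles + serialization_cycles
    latency_ns = total_cycles / clock_ghz  # cycles / (cycles per ns)
    return total_cycles, latency_ns


cycles, ns = zero_load_latency(
    avg_hops=5.3, cycles_per_hop=3, packet_bits=512, link_bits=128, clock_ghz=2.0
)
# About 19.9 cycles, which rounds to the 20 cycles shown above.
print(round(cycles), "cycles,", round(cycles) / 2.0, "ns")
```

The same call with 4.0 or 16.0 hops gives 16 and 52 cycles, consistent with the 8.0 ns and 26.0 ns shown for the torus and ring.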

Hop traversal is 80% of the latency; serialization is 20%. Hop-dominated — a lower-diameter topology or closer placement helps most.
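The split can be computed directly from the two cycle counts. In this sketch the 16 hop cycles are the rounded total of 20 minus the 4 serialization cycles; the 50% cut-off used to label a result "hop-dominated" is an assumption for illustration, not a documented threshold of the console.

```python
def latency_shares(hop_cycles, serialization_cycles):
    """Return (hop_share, serialization_share) as fractions of the total."""
    total = hop_cycles + serialization_cycles
    return hop_cycles / total, serialization_cycles / total


def dominant_term(hop_share, threshold=0.5):
    """Label the larger contributor. The 0.5 threshold is an assumed cut-off."""
    return "hop-dominated" if hop_share > threshold else "serialization-dominated"


hop_share, ser_share = latency_shares(hop_cycles=16, serialization_cycles=4)
print(f"hops {hop_share:.0%}, serialization {ser_share:.0%}: {dominant_term(hop_share)}")
# hops 80%, serialization 20%: hop-dominated
```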

Drive placement & distances in the Floorplan console; size link throughput in Memory Bandwidth.

Why it matters

Why on-chip distance costs time

Latency = hops × per-hop + serialization

On-chip network delay is the number of router hops times the cost of each hop, plus the time to push the packet onto a link. Topology sets the hop count; link width sets the serialization.

Topology trades wires for hops

A mesh is cheap to wire but has many hops; a torus adds wrap-around links that roughly halve the diameter and cut average hops by about a quarter; richer topologies cut hops further at the cost of more wiring and area. The right choice depends on traffic and floor space.
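One rough way to see the trade is to count links against diameter for each topology. The sketch below uses standard counts for bidirectional links and worst-case hops; it ignores wire length (torus wrap-around links are physically longer unless the layout is folded), so it is only a first-order view. Function names are hypothetical.

```python
def mesh_stats(k):
    """k x k mesh: (bidirectional links, diameter in hops)."""
    return 2 * k * (k - 1), 2 * (k - 1)


def torus_stats(k):
    """k x k torus, k >= 3: every row and column gains one wrap-around link."""
    return 2 * k * k, 2 * (k // 2)


def ring_stats(n):
    """Bidirectional ring of n >= 3 nodes."""
    return n, n // 2


for name, (links, diameter) in (
    ("2D Mesh", mesh_stats(8)),
    ("2D Torus", torus_stats(8)),
    ("Ring", ring_stats(64)),
):
    print(f"{name}: {links} links, diameter {diameter}")
# 2D Mesh: 112 links, diameter 14
# 2D Torus: 128 links, diameter 8
# Ring: 64 links, diameter 32
```

For 64 nodes, the torus spends 16 extra links to bring the worst case from 14 hops down to 8, while the ring saves wiring at the price of a 32-hop worst case.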

Wire delay doesn't scale

Transistors got faster every node, but wires didn't — cross-chip signal time is now many cycles. That's why big chips are networks of tiles: you can't drive a signal corner-to-corner in one cycle any more.
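To put a number on "many cycles", divide the wire's flight time by the clock period. The sketch below is parametric: the 40 mm path and 100 ps/mm repeated-wire delay are placeholder values chosen only to make the arithmetic visible, not measurements for any process or product.

```python
import math


def cross_chip_cycles(distance_mm, delay_ps_per_mm, clock_ghz):
    """Whole clock cycles needed for a signal to travel distance_mm,
    assuming delay grows linearly with length (i.e. a repeated wire)."""
    period_ps = 1000.0 / clock_ghz
    return math.ceil(distance_mm * delay_ps_per_mm / period_ps)


# Placeholder inputs: 40 mm corner-to-corner Manhattan path,
# 100 ps/mm wire delay, 2 GHz clock.
print(cross_chip_cycles(distance_mm=40, delay_ps_per_mm=100, clock_ghz=2.0))  # 8
```

Substituting figures from a real technology file would change the count, but the structure of the estimate stays the same.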

Serialization punishes narrow links

A 512-bit packet on a 128-bit link takes four cycles just to inject. Wider links cut serialization but cost wires and power. Latency-sensitive traffic wants wide links; bulk traffic can tolerate narrow ones.
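The serialization term is a ceiling division of packet size by link width. A short sketch, with a hypothetical function name, sweeping a few link widths for the same 512-bit packet:

```python
def serialization_cycles(packet_bits, link_bits):
    """Cycles to push a packet onto a link, one link-width flit per cycle.
    A trailing partial flit still costs a full cycle."""
    return -(-packet_bits // link_bits)  # integer ceiling division


for width in (64, 128, 256, 512):
    print(f"{width:>3}-bit link: {serialization_cycles(512, width)} cycles")
#  64-bit link: 8 cycles
# 128-bit link: 4 cycles
# 256-bit link: 2 cycles
# 512-bit link: 1 cycles
```

Each doubling of width halves the cost until the packet fits in a single flit, after which wider links buy nothing for this packet size.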

Field notes

Chips became networks

There was a time when a signal could cross a chip in a single clock cycle and global wires were free. That time is gone. Transistors kept getting faster every node, but wires did not: the resistance and capacitance of interconnect scale badly, so relative to the fast logic around them, wires got slower. Driving a signal from one corner of a large die to the other now takes many cycles, and that single physical fact reshaped how chips are built.

The answer was to stop treating the chip as one big circuit and start treating it as a network of tiles connected by routers — a network-on-chip. Communication is pipelined across hops, and latency becomes a structural property of the network: the average number of hops times the cost of each hop, plus the time to serialize a packet onto a link. Topology sets the hop count, the router pipeline and link length set the per-hop cost, and the link width sets the serialization tax.

Each of those is a lever with a price. A torus trims a mesh's average hops and roughly halves its diameter but needs wrap-around wires; a wider link shrinks serialization but burns area and power; a shallower router pipeline cuts per-hop latency but may cost frequency. The biggest lever of all isn't in the network: it's placement. Keep communicating blocks close and traffic stays local, hop counts stay low, and the network barely matters. Spread them out and no topology will save you.
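The placement effect can be illustrated with a traffic-weighted hop count. This is a sketch under stated assumptions: a mesh with minimal routing, so hops between two tiles equal their Manhattan distance. The block names, coordinates and traffic weights are invented for illustration.

```python
from typing import Dict, Tuple

Coord = Tuple[int, int]


def weighted_avg_hops(
    placement: Dict[str, Coord], traffic: Dict[Tuple[str, str], float]
) -> float:
    """Average hops per unit of traffic on a mesh (Manhattan distance)."""
    total_weight = sum(traffic.values())
    if total_weight == 0:
        raise ValueError("traffic must contain at least one non-zero flow")
    weighted = 0.0
    for (src, dst), weight in traffic.items():
        (sx, sy), (dx, dy) = placement[src], placement[dst]
        weighted += weight * (abs(sx - dx) + abs(sy - dy))
    return weighted / total_weight


# Hypothetical flows: most traffic is cpu <-> l2, a little is l2 <-> mem.
traffic = {("cpu", "l2"): 8.0, ("l2", "mem"): 2.0}

close = {"cpu": (0, 0), "l2": (1, 0), "mem": (7, 7)}   # heavy flow kept adjacent
spread = {"cpu": (0, 0), "l2": (7, 7), "mem": (7, 6)}  # heavy flow crosses the die

print(weighted_avg_hops(close, traffic))   # about 3.4 hops
print(weighted_avg_hops(spread, traffic))  # about 11.4 hops
```

Same blocks, same network, same traffic: moving the heavily communicating pair together cuts the weighted hop count by more than a factor of three in this toy case.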

Use this console to compute the zero-load latency floor, compare topologies for your node count, and see whether hops or serialization dominate — which tells you whether to change topology, widen links, or improve placement. Reason about that placement in the Floorplan console and size link throughput against demand in Memory Bandwidth.


Trusted by NoC & Fabric Architects

4.8
Based on 2,480 reviews

Hops × per-hop + serialization with the right average-hop formulas per topology is exactly the zero-load model I use to compare networks. Showing torus halving the hops vs mesh, and serialization as a fixed injection tax, is the framing juniors need. The wire-delay-doesn't-scale point is the whole reason NoCs exist.

Dr. Yuki Tanaka
NoC architect
June 12, 2026

Link width as a serialization lever is captured perfectly — I use it to justify wide links for our latency-critical coherence traffic. Router-cycle knob lets me model our pipeline. Pairs naturally with the floorplan tool for placement-driven locality. Exactly the right fidelity for early exploration.

Priit Kask
Interconnect design
May 19, 2026

Clean zero-load latency across topologies with realistic NoC presets. The latency floor it gives is my design target before simulation. Would love a queueing/load model, but the tool is upfront that it's zero-load, and for topology comparison it's spot-on.

Maria Costa
GPU fabric
April 1, 2026

Modeling our 16×16 mesh latency and seeing where serialization vs hops dominate told us to widen links rather than change topology. The cross-chip-wire-delay reality is well represented. Feeds straight into floorplan and bandwidth planning. Fast and genuinely useful.

Aleksandr Volkov
Chiplet fabric
January 14, 2026

Related tools

Memory Bandwidth Calculator

Calculate memory throughput requirements for CPU, GPU, and AI workloads with cache-hierarchy modeling, prefetch analysis, and bandwidth-saturation detection. Supports DDR5, HBM, LPDDR, and CXL memory pools with multi-tier bandwidth planning and bottleneck identification.

Cache Size Estimator

Estimate optimal cache sizing for performance targets with workload-characteristic analysis, miss-rate modeling, and area-power trade-off evaluation. Supports L1/L2/L3 hierarchy design, non-inclusive vs. exclusive policies, and last-level cache (LLC) partitioning for multi-core systems.

Clock Tree Estimator

Estimate clock distribution overhead including skew, jitter, power consumption, and area for hierarchical and mesh clock networks. Models multi-corner multi-mode (MCMM) scenarios, clock-gating efficiency, and adaptive frequency scaling for advanced-node designs.

Floorplan Estimator

Generate early-stage floorplan metrics including aspect ratio, wire-length estimation, and congestion prediction from RTL hierarchy and connectivity graphs. Supports macro placement, pin assignment, and power-domain planning with thermal-aware optimization for AI and HPC chips.

Transistor Count Estimator

Estimate transistor count from architecture specifications including core count, cache size, vector width, and accelerator block definitions. Supports FinFET, GAA, and CFET node scaling with density-per-micron tracking and die-area extrapolation for roadmap planning.

SRAM Area Calculator

Calculate SRAM area requirements for various bit-cell designs (6T, 8T, 10T) across process nodes with row/column redundancy, sense-amplifier overhead, and peripheral circuit modeling. Supports single-port, dual-port, and custom-port configurations with yield-aware sizing.


latency = (avg hops × (router + link) + ⌈packet ÷ link width⌉) cycles ÷ clock · Last reviewed: 2026-06