Question 1

What does the API Rate Limit Simulator do?

Accepted Answer

You pick a rate-limiting algorithm (token bucket, fixed window, sliding window log or leaky bucket), set its parameters (limit, window, bucket capacity, refill rate) and describe an incoming traffic pattern: a steady requests-per-second rate over a duration, plus an optional spike. The tool replays that traffic second by second through the algorithm's admission rules and shows, for every tick, how many requests were allowed versus throttled (HTTP 429), along with a timeline chart, summary stats and the headers each algorithm would return. It's a quick way to see how a limit behaves before you ship it.

Question 2

Is anything sent to a server?

Accepted Answer

No. The entire simulation runs in your browser in JavaScript — there's no account, no upload and no telemetry, and it works offline once the page has loaded. The arrival sequence is derived deterministically from your inputs (no randomness and no clock), so the same parameters always produce exactly the same result, which makes it easy to reason about and to share.

Question 3

Token bucket vs fixed window vs sliding window vs leaky bucket — which should I use?

Accepted Answer

Token bucket is the usual default: it caps the long-run rate (the refill rate) while still allowing short bursts up to the bucket size, which matches how real clients behave. Fixed window is the cheapest but lets a client send up to double the limit across a window boundary. Sliding window log removes that boundary burst and enforces a smooth, accurate rate, at the cost of storing per-request timestamps. Leaky bucket produces a perfectly even output rate and is ideal when a downstream system needs steady throughput, but it adds queuing latency. Run the same traffic through each here to feel the difference.

Question 4

Why does the fixed window let through more than the limit during a burst?

Accepted Answer

Because the counter resets on a hard boundary. A client can send the full limit in the last second of one window and the full limit again in the first second of the next, so twice the limit is admitted within about two seconds, far less than one window length. The simulator shows this clearly: set a spike near a window boundary with the fixed-window algorithm and watch both windows admit their full quota back-to-back. Sliding window log avoids this because it counts over a rolling window. Token bucket has no window boundary at all, so its worst-case burst is bounded by the bucket capacity instead.
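The boundary burst can be reproduced with a few lines of Python. This is a minimal sketch of a fixed-window counter, not the simulator's own code; the function name `fixed_window` and the traffic figures (limit 10 per 10 s window) are assumptions chosen for illustration.

```python
def fixed_window(arrivals, limit, window):
    """Replay per-second arrival counts through a fixed-window counter.

    arrivals[t] is the number of requests arriving in second t.
    Returns the number of requests admitted in each second.
    """
    allowed = []
    count = 0
    for t, incoming in enumerate(arrivals):
        if t % window == 0:          # hard boundary: the counter resets
            count = 0
        admit = min(incoming, limit - count)
        count += admit
        allowed.append(admit)
    return allowed


# Hypothetical traffic: 10 requests in second 9 (end of window 0)
# and 10 more in second 10 (start of window 1), nothing else.
arrivals = [0] * 20
arrivals[9] = 10
arrivals[10] = 10

allowed = fixed_window(arrivals, limit=10, window=10)
print(sum(allowed[9:11]))  # requests admitted across the boundary
```

Both bursts are admitted in full, so 20 requests pass within two consecutive seconds even though the nominal limit is 10 per 10 seconds.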

Question 5

How are bursts handled?

Accepted Answer

Token and leaky buckets absorb bursts up to their capacity — the spike is admitted as long as there's room (tokens available, or queue space) and throttled once it's exhausted, after which the bucket recovers at the refill/drain rate. Fixed and sliding windows admit up to the remaining count in the window and throttle the rest. Use the 'spike at t=Ns' input to inject a one-off burst and compare how each algorithm copes.
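As a rough illustration of "absorb until full, then throttle, then recover", here is a sketch of a leaky bucket treated as a bounded queue that drains first and enqueues second on each tick. The function name `leaky_bucket`, the capacity of 10 and the drain rate of 2 per second are assumptions for the example, not values taken from the simulator.

```python
def leaky_bucket(arrivals, capacity, drain_rate):
    """Per-tick leaky bucket: drain the queue, then enqueue what fits.

    Returns one (allowed, throttled, queue_level) tuple per second.
    """
    level = 0
    rows = []
    for incoming in arrivals:
        level = max(0, level - drain_rate)        # queue drains at a fixed rate
        admit = min(incoming, capacity - level)   # only free slots can be filled
        level += admit
        rows.append((admit, incoming - admit, level))
    return rows


# Hypothetical pattern: 1 request per second with a one-off burst of 12 at t=2.
for row in leaky_bucket([1, 1, 12, 1, 1], capacity=10, drain_rate=2):
    print(row)
```

The burst fills the queue (10 admitted, 2 throttled), after which the level falls by the drain rate each second while steady traffic continues to be admitted.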

Question 6

What is the Retry-After header and how is it computed?

Accepted Answer

Retry-After tells the client how long to wait before retrying, as a whole number of seconds or as an HTTP date. For a token bucket it's roughly the time to accrue one token, (1 − tokens) / refillRate, rounded up. For a fixed window it's the time remaining until the window resets. For a sliding window it's the time until the oldest in-window request ages out. For a leaky bucket it's the time for the queue to drain enough to accept the request. Honouring Retry-After is what separates a well-behaved client from one that keeps retrying into a 429.
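The four rules above can be sketched as small helper functions. These are illustrative only: the function and parameter names are assumptions, times are in seconds, and each result is rounded up because the delay form of Retry-After is a non-negative integer.

```python
import math


def retry_after_token_bucket(tokens, refill_rate):
    """Seconds until at least one whole token is available."""
    return max(0, math.ceil((1 - tokens) / refill_rate))


def retry_after_fixed_window(now, window):
    """Seconds until the current boundary-aligned window resets."""
    return window - (now % window)


def retry_after_sliding_log(now, oldest_timestamp, window):
    """Seconds until the oldest logged request leaves the rolling window."""
    return max(0, math.ceil(oldest_timestamp + window - now))


def retry_after_leaky_bucket(level, capacity, drain_rate):
    """Seconds until the queue has drained enough to free one slot."""
    return max(0, math.ceil((level - capacity + 1) / drain_rate))


print(retry_after_token_bucket(tokens=0, refill_rate=2))
print(retry_after_fixed_window(now=37, window=60))
print(retry_after_sliding_log(now=37, oldest_timestamp=10, window=60))
print(retry_after_leaky_bucket(level=10, capacity=10, drain_rate=2))
```

With an empty bucket refilling at 2 tokens per second the wait is half a second, which rounds up to a Retry-After of 1.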

Question 7

What are the X-RateLimit-* headers?

Accepted Answer

They're the de facto convention for communicating limit state to clients: X-RateLimit-Limit (the ceiling), X-RateLimit-Remaining (how many requests are left right now) and X-RateLimit-Reset (when the window or budget refreshes). They are not formally standardised, so check whether a given API reports the reset as a timestamp or as seconds remaining. The IETF's RateLimit header fields draft aims to standardise similar semantics. The simulator's per-tick 'remaining/level' column maps directly to the value you'd surface in X-RateLimit-Remaining.
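A minimal sketch of how a limiter's state might be turned into these headers follows. The helper name `rate_limit_headers` is hypothetical, and the choice to report the reset as seconds remaining (rather than a timestamp) is an assumption; real APIs differ on this.

```python
def rate_limit_headers(limit, remaining, reset_seconds, throttled=False):
    """Build response headers from limiter state.

    reset_seconds is the time until the window or budget refreshes.
    Header values are strings, as they would be on the wire.
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        "X-RateLimit-Reset": str(reset_seconds),
    }
    if throttled:
        # A 429 response should also tell the client when to come back.
        headers["Retry-After"] = str(reset_seconds)
    return headers


print(rate_limit_headers(limit=20, remaining=0, reset_seconds=1, throttled=True))
```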

Question 8

How do I choose the right limit?

Accepted Answer

Start from your capacity and your SLA: estimate the sustainable requests-per-second your backend can serve, divide by your expected concurrent clients, and leave headroom. Set the long-run rate (refill rate, or limit/window) to that sustainable figure and size the burst (bucket capacity) to the largest legitimate spike you want to absorb — often a few seconds of traffic. Then simulate your real pattern here and tune until the throttle rate on legitimate traffic is near zero while abusive bursts still get capped.
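The sizing arithmetic can be written down as a short calculation. This is a sketch under stated assumptions: the function name `suggest_limit`, the 25% headroom and the three-second burst allowance are illustrative defaults, not recommendations from the tool.

```python
def suggest_limit(backend_rps, clients, headroom=0.25, burst_seconds=3):
    """Derive a per-client refill rate and bucket capacity.

    backend_rps   sustainable requests per second for the whole backend
    clients       expected number of concurrent clients
    headroom      fraction of capacity held back as a safety margin
    burst_seconds how many seconds of traffic a legitimate spike may contain
    """
    per_client_rate = backend_rps * (1 - headroom) / clients
    bucket_capacity = per_client_rate * burst_seconds
    return per_client_rate, bucket_capacity


rate, capacity = suggest_limit(backend_rps=1000, clients=100)
print(rate, capacity)
```

For a backend that sustains 1000 requests per second shared by 100 clients, this gives a starting point of 7.5 requests per second per client with a bucket of 22.5 tokens, which you would then round and tune against your real traffic pattern.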

Question 9

Does this model distributed rate limiting across many servers?

Accepted Answer

It models a single logical limiter — which is exactly what you get when all nodes share state in a central store like Redis. In a distributed fleet without shared state, each node enforces its own slice of the limit, so the effective global limit is roughly per-node-limit × node-count and bursts can be larger. To approximate that here, divide your global limit by the node count and simulate one node, or simulate the global limit to model the shared-store (Redis token bucket / sorted-set sliding window) design that most platforms actually use.
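To make the per-node arithmetic concrete, here is a small sketch. The function names and the figures (a global limit of 100 split across 4 nodes) are assumptions for illustration, and the model ignores timing entirely: it only counts how many requests each node would admit against its own slice.

```python
def per_node_limit(global_limit, node_count):
    """Split a global limit evenly; integer division leaves any remainder unused."""
    return global_limit // node_count


def admitted_without_shared_state(requests_per_node, node_limit):
    """Each node caps its own traffic independently of the others."""
    return sum(min(r, node_limit) for r in requests_per_node)


node_limit = per_node_limit(global_limit=100, node_count=4)
even = admitted_without_shared_state([25, 25, 25, 25], node_limit)
skewed = admitted_without_shared_state([70, 10, 10, 10], node_limit)
print(node_limit, even, skewed)
```

With evenly balanced traffic the fleet admits the full global limit (per-node limit times node count). When routing is skewed, one node throttles while the others have unused budget, which is one reason a shared store gives more predictable results.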

Question 10

Can I trust the numbers — is the math exact?

Accepted Answer

Yes, for the model it implements. Each algorithm applies its textbook admission rule on every one-second tick: the token bucket refills then spends tokens, the fixed window counts against a boundary-aligned counter, the sliding window evicts aged timestamps from a log before admitting, and the leaky bucket drains then enqueues. There's no sampling and no randomness, so the allowed/throttled totals are the exact outcome for the pattern you described at the simulator's one-second resolution. Adjust any input and every number, the chart and the table recompute instantly.
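As one example of such a rule, the "evict aged timestamps, then admit" step of a sliding window log might look like the sketch below. It is not the simulator's source; the function name `sliding_window_log` and the test traffic (limit 10 per 10 s) are assumptions.

```python
from collections import deque


def sliding_window_log(arrivals, limit, window):
    """Per-tick sliding window log.

    Keeps one timestamp per admitted request and counts only those
    that fall inside the last `window` seconds.
    """
    log = deque()
    allowed = []
    for t, incoming in enumerate(arrivals):
        # Evict timestamps that have aged out of the rolling window.
        while log and log[0] <= t - window:
            log.popleft()
        admit = min(incoming, limit - len(log))
        log.extend([t] * admit)
        allowed.append(admit)
    return allowed


# Hypothetical traffic: 10 requests at t=9 and 10 more at t=10.
arrivals = [0] * 20
arrivals[9] = 10
arrivals[10] = 10
allowed = sliding_window_log(arrivals, limit=10, window=10)
print(allowed[9], allowed[10])
```

The second burst is rejected in full because the ten timestamps from t=9 are still inside the rolling window, so no more than 10 requests are admitted in any 10-second span.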

Per-second simulation log

Time (s)	Incoming	Allowed	Throttled (429)	Tokens left
0	5	5	0	15
1	5	5	0	12
2	5	5	0	9
3	5	5	0	6
4	5	5	0	3
5	5	5	0	0
6	5	2	3	0
7	5	2	3	0
8	5	2	3	0
9	5	2	3	0
10	5	2	3	0
11	5	2	3	0
12	5	2	3	0
13	5	2	3	0
14	5	2	3	0
15	5	2	3	0
16	5	2	3	0
17	5	2	3	0
18	5	2	3	0
19	5	2	3	0
20	5	2	3	0
21	5	2	3	0
22	5	2	3	0
23	5	2	3	0
24	5	2	3	0
25	5	2	3	0
26	5	2	3	0
27	5	2	3	0
28	5	2	3	0
29	5	2	3	0
30	45	2	43	0
31	5	2	3	0
32	5	2	3	0
33	5	2	3	0
34	5	2	3	0
35	5	2	3	0
36	5	2	3	0
37	5	2	3	0
38	5	2	3	0
39	5	2	3	0
40	5	2	3	0
41	5	2	3	0
42	5	2	3	0
43	5	2	3	0
44	5	2	3	0
45	5	2	3	0
46	5	2	3	0
47	5	2	3	0
48	5	2	3	0
49	5	2	3	0
50	5	2	3	0
51	5	2	3	0
52	5	2	3	0
53	5	2	3	0
54	5	2	3	0
55	5	2	3	0
56	5	2	3	0
57	5	2	3	0
58	5	2	3	0
59	5	2	3	0
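The rows above are consistent with a token bucket that refills and then spends on each tick. The sketch below reproduces them under inferred parameters: a capacity of 20 tokens, a refill of 2 tokens per second, 5 requests per second for 60 seconds and 40 extra requests at t=30. Those values are read off the table rather than stated anywhere in the text, so treat them as assumptions.

```python
def token_bucket(arrivals, capacity, refill_rate):
    """Per-tick token bucket: refill (capped at capacity), then spend.

    Returns (time, incoming, allowed, throttled, tokens_left) per second.
    """
    tokens = capacity                      # the bucket starts full
    rows = []
    for t, incoming in enumerate(arrivals):
        tokens = min(capacity, tokens + refill_rate)
        admit = min(incoming, tokens)
        tokens -= admit
        rows.append((t, incoming, admit, incoming - admit, tokens))
    return rows


# Inferred traffic pattern: steady 5 req/s for 60 s plus a spike of 40 at t=30.
arrivals = [5] * 60
arrivals[30] += 40

rows = token_bucket(arrivals, capacity=20, refill_rate=2)
total_allowed = sum(r[2] for r in rows)
total_throttled = sum(r[3] for r in rows)
print(rows[0], rows[6], rows[30])
print(total_allowed, total_throttled)
```

The initial burst allowance lasts six seconds. After that the bucket admits only the refill rate of 2 per second, so the spike at t=30 arrives with no spare tokens and is almost entirely throttled.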
