May 7, 2026 · 6 min read · security, rate-limiting, abuse-prevention, engineering

Rate Limiting and Abuse Prevention Basics

By Mykhailo Boichuk · Co-founder & Vice-President

In short

Any endpoint exposed to the internet will eventually be abused, from brute-force attempts to scraping to denial-of-service. Rate limiting caps how often a client can act, which blunts most automated abuse. Effective protection layers limits at several scopes, fails in a way that protects the service, and distinguishes legitimate bursts from attacks so real users are not punished alongside abusers.

Exposed endpoints attract abuse

An endpoint reachable from the internet will be probed and abused, not because of who you are but because automated tools scan everything. Login forms face credential-stuffing, public APIs face scraping, and any expensive operation faces attempts to overwhelm it. Treating abuse as inevitable rather than hypothetical changes the design posture from reactive to prepared.

Rate limiting is the foundational defense because most abuse depends on doing something many times quickly. Capping the rate at which a client can perform an action removes the volume that brute-force, scraping, and many denial-of-service attempts rely on, without needing to identify the attacker.
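As a rough illustration of capping how often a client can act, here is a minimal fixed-window counter. The class name `FixedWindowLimiter`, the `key` argument, and the injectable `clock` are all hypothetical choices made for this sketch, and the state lives in process memory, which a multi-server deployment would need to replace with a shared store.

```python
import time
from typing import Callable


class FixedWindowLimiter:
    """Allow at most `limit` actions per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._counters = {}  # key -> [window index, actions counted in that window]

    def allow(self, key: str) -> bool:
        window = int(self.clock() // self.window_seconds)
        state = self._counters.get(key)
        if state is None or state[0] != window:
            # First action by this key in the current window: start a fresh count.
            state = [window, 0]
            self._counters[key] = state
        if state[1] >= self.limit:
            return False  # over the cap; the caller should reject the action
        state[1] += 1
        return True


# Hypothetical usage: at most 3 login attempts per minute per client address.
login_limiter = FixedWindowLimiter(limit=3, window_seconds=60)
attempts = [login_limiter.allow("203.0.113.7") for _ in range(4)]
```

Note that the limiter never needs to know who the client is, only which key the action belongs to. A fixed window is the simplest scheme, but it can admit up to twice the limit across a window boundary, which is one reason burst-smoothing approaches are often preferred.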

Limit at several scopes

A single global limit is rarely enough. Effective rate limiting applies at several scopes: per client, per account, and per operation. An attacker controlling many clients can stay under a per-client limit while still abusing a single account, and an expensive operation may need a tighter limit than a cheap one. Layering limits closes the gaps that any single dimension leaves open.

Limit per client and per account, not only globally.
Apply tighter limits to expensive or sensitive operations.
Use approaches that smooth bursts rather than rejecting at a hard cliff.
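One way to sketch layered limits that also smooth bursts is a token bucket per scope, with a request admitted only when every scope has a token to spend. The token bucket is a choice made for this example, not something the article prescribes. The names `TokenBucket` and `LayeredLimiter`, the scope labels, and all numbers below are hypothetical.

```python
import time
from typing import Callable, Dict, Mapping, Tuple


class TokenBucket:
    """Holds up to `capacity` tokens and refills at `refill_per_sec`."""

    def __init__(self, capacity: float, refill_per_sec: float, now: float) -> None:
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.tokens = capacity  # start full so a normal burst passes immediately
        self.updated = now

    def refill(self, now: float) -> None:
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now


class LayeredLimiter:
    """Admits a request only if every scope it touches has a token left."""

    def __init__(self, rules: Mapping[str, Tuple[float, float]],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.rules = rules  # scope name -> (capacity, refill_per_sec)
        self.clock = clock
        self._buckets: Dict[Tuple[str, str], TokenBucket] = {}

    def allow(self, keys: Mapping[str, str]) -> bool:
        """`keys` maps scope name -> identifier, e.g. {"client": ip, "account": user}."""
        now = self.clock()
        buckets = []
        for scope, ident in keys.items():
            bucket = self._buckets.get((scope, ident))
            if bucket is None:
                capacity, rate = self.rules[scope]
                bucket = self._buckets[(scope, ident)] = TokenBucket(capacity, rate, now)
            bucket.refill(now)
            buckets.append(bucket)
        if any(b.tokens < 1 for b in buckets):
            return False  # some scope is exhausted; spend nothing
        for b in buckets:
            b.tokens -= 1
        return True


# Hypothetical rules: generous per client, tighter per account,
# and tightest for an expensive "export" operation.
limiter = LayeredLimiter({
    "client": (20, 5.0),
    "account": (10, 1.0),
    "export": (2, 0.01),
})
ok = limiter.allow({"client": "203.0.113.7", "account": "alice", "export": "alice"})
```

Because the account bucket is shared across clients, an attacker rotating through many client addresses still drains the same account budget. This sketch spends tokens only when all scopes agree; charging a scope even for rejected requests is a stricter alternative.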

Protect the service when limits are hit

When a limit is reached, the response should protect the service and inform the legitimate client. Returning a clear signal that the limit was exceeded, along with when to retry, lets a well-behaved client back off, while the limit itself contains a misbehaving one. The system should also fail safe under extreme load, shedding excess work rather than collapsing, so that an attack degrades the service gracefully instead of taking it down.
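Assuming the service speaks HTTP, the usual way to signal an exceeded limit is status 429 (Too Many Requests) with a `Retry-After` header, and an overloaded server can answer 503 with the same header. The sketch below pairs that signal with a simple load shedder that refuses excess work instead of queueing it. The names `limit_exceeded_response` and `LoadShedder`, the response tuple shape, the body strings, and the one-second retry hint on overload are illustrative assumptions.

```python
import math
import threading
from typing import Callable, Dict, Tuple

# (status code, headers, body): a framework-neutral stand-in for a real response.
Response = Tuple[int, Dict[str, str], str]


def limit_exceeded_response(retry_after_seconds: float) -> Response:
    """Tell a well-behaved client that it hit a limit and when to retry."""
    # Retry-After takes whole seconds; round up so clients never retry early.
    seconds = max(1, math.ceil(retry_after_seconds))
    return 429, {"Retry-After": str(seconds)}, "rate limit exceeded"


class LoadShedder:
    """Caps in-flight work; excess requests are refused instead of queued."""

    def __init__(self, max_in_flight: int) -> None:
        self._slots = threading.BoundedSemaphore(max_in_flight)

    def run(self, work: Callable[[], str]) -> Response:
        # Non-blocking acquire: if no slot is free, shed the request at once.
        if not self._slots.acquire(blocking=False):
            return 503, {"Retry-After": "1"}, "overloaded, try again shortly"
        try:
            return 200, {}, work()
        finally:
            self._slots.release()


# Hypothetical usage: allow at most 100 expensive requests in flight.
shedder = LoadShedder(max_in_flight=100)
response = shedder.run(lambda: "report ready")
```

Refusing immediately keeps memory and latency bounded under attack: requests that cannot be served are turned away cheaply rather than piling up until the process collapses.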

Rate limiting is a blunt instrument, and tuning it badly punishes real users. A limit set too low blocks legitimate bursts; one set too high lets abuse through. The aim is to find the level that stops automated abuse while leaving normal use untouched.
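A little arithmetic can make the tuning trade-off concrete. The sketch below assumes a token-bucket style limit, where a client may spend up to `capacity` requests at once and regains `refill_per_sec` requests each second. It checks whether a legitimate burst would roughly fit and how much a sustained abuser could push through in an hour. The function names and every number here are hypothetical; real values should come from observed traffic.

```python
def max_admitted(capacity: int, refill_per_sec: float, window_sec: float) -> float:
    """Upper bound on requests admitted in `window_sec`, starting from a full bucket."""
    return capacity + refill_per_sec * window_sec


def burst_fits(burst_size: int, burst_sec: float,
               capacity: int, refill_per_sec: float) -> bool:
    """Rough check: could a burst of this size pass without any rejection?"""
    return burst_size <= max_admitted(capacity, refill_per_sec, burst_sec)


# Hypothetical legitimate burst: a page load firing 20 requests within 2 seconds.
LEGIT_BURST_SIZE, LEGIT_BURST_SEC = 20, 2.0

# Candidate (capacity, refill_per_sec) settings: too low, plausible, too high.
candidates = [(5, 0.5), (20, 0.5), (200, 50.0)]

report = [
    (capacity, rate,
     burst_fits(LEGIT_BURST_SIZE, LEGIT_BURST_SEC, capacity, rate),
     max_admitted(capacity, rate, 3600))
    for capacity, rate in candidates
]
for capacity, rate, fits, hourly in report:
    print(f"capacity={capacity} rate={rate}/s burst_ok={fits} max_per_hour={hourly:.0f}")
```

With these made-up numbers, the first setting rejects the legitimate burst, the third admits it but lets a single abuser make about 180,000 requests an hour, and the middle one passes the burst while holding sustained traffic to about 1,800 requests an hour.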

Key takeaways

Any internet-exposed endpoint will be probed and abused by automated tools.
Rate limiting removes the volume that most automated abuse depends on.
Apply limits at several scopes: per client, per account, and per operation.
Signal clearly when a limit is hit and tell well-behaved clients when to retry.
Tune limits to stop abuse while leaving legitimate bursts of use untouched.

Frequently asked questions

Why does every exposed endpoint get abused? Automated tools scan the entire internet, so endpoints are probed regardless of who owns them, facing credential-stuffing, scraping, and overload attempts.
Why limit at multiple scopes instead of one global limit? An attacker with many clients can stay under a per-client limit while abusing an account, so layering per-client, per-account, and per-operation limits closes the gaps.
How do you avoid punishing legitimate users with rate limits? Tune limits to stop automated abuse while leaving normal bursts untouched, smooth bursts rather than rejecting at a hard cliff, and signal clearly when to retry.

About the author

Mykhailo Boichuk

Co-founder & Vice-President

Mykhailo is an engineer who builds native applications and the systems behind them. He concentrates on macOS and iOS performance, local-first data architecture, and the synchronization problems that come with offline-capable software.