
Key Takeaway
Rate limiting is a mechanism used to control the amount of data or requests that can be transmitted between two systems within a specified time period. It helps prevent abuse, protect system resources, and ensure fair usage for all users. By implementing rate limiting, organizations can mitigate the risk of server overload, improve network performance, and enhance overall security.
A digital resource, such as an application programming interface (API), is finite in nature. It can handle only so many requests per minute or hour, assuming it’s deployed on a fixed amount of infrastructure, e.g., one virtual server. To prevent overloading such a resource, its owners will often apply “rate limiting,” a practice that restricts how many requests it will handle for each user in a given period of time. Rate limiting is also sometimes called “throttling,” because the process performs the digital equivalent of narrowing a pipe to restrict the flow through it.
Rate limiting serves multiple purposes. It ensures reliable access to digital resources at a level that meets service quality expectations. With rate limiting, system owners do not have to invest in infrastructure they don’t need. Rate limiting is also an important tool for protecting digital assets from malicious or unauthorized use. It also helps owners understand their traffic needs so they can decide how to scale their environment.
The best way to understand rate limiting is as a policy. Rate limiting is a set of rules that control the rate at which a user can access a digital resource. For example, how many times can a user attempt to log into a website in a minute? If there is no rate limiting policy, the answer is “as many times as they want.” With rate limiting, the answer might be, “the user can attempt to log in up to three times in a given minute.” A fourth attempt within that minute will be blocked.
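The login policy above can be sketched in a few lines of Python. This is a minimal illustration rather than a production design: the class name `LoginRateLimiter`, its parameters, and the fixed-window counting strategy are all assumptions made for the example, and the `now` argument exists only so the behavior can be demonstrated without waiting.

```python
import time
from typing import Optional


class LoginRateLimiter:
    """Hypothetical fixed-window limiter: `limit` attempts per window, per user."""

    def __init__(self, limit: int = 3, window_seconds: float = 60.0):
        self.limit = limit                    # policy: three attempts...
        self.window_seconds = window_seconds  # ...in a given minute
        self._windows = {}                    # user -> (window_start, count)

    def allow(self, user: str, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        start, count = self._windows.get(user, (now, 0))
        if now - start >= self.window_seconds:
            # The previous window has expired, so start a fresh one.
            start, count = now, 0
        if count >= self.limit:
            return False  # over the limit: block this attempt
        self._windows[user] = (start, count + 1)
        return True


limiter = LoginRateLimiter()
# Four attempts in quick succession: the fourth is blocked.
print([limiter.allow("alice", now=t) for t in (0, 1, 2, 3)])
```

The policy lives in the two constructor arguments, while `allow` is the enforcement step. Changing the policy does not require changing the enforcement code.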
Rate limiting occurs when a rate limiting policy gets enforced by some sort of hardware or software solution. Both the policy and its enforcement are necessary. Without the policy, the enforcement is meaningless. Without the policy enforcement point, the policy is meaningless. Rate limiting also implies the existence of some kind of monitoring tool that tracks the rate of resource usage and flags problematic situations, such as a server “running hot” or suspicious activity that indicates that a cyberattack is under way.
Rate limiting is important because it supports three operational concerns: service quality, cybersecurity, and finance. Regarding service quality, rate limiting prevents resources from getting overwhelmed by an excessive number of requests, a situation that can lead to service slowdowns or outages. With well-designed rate limiting policies, all eligible users can enjoy a similar quality of service (QoS). On a related front, rate limiting helps ensure fairness by preventing a single user from monopolizing a digital resource.
For cybersecurity, rate limiting serves as a countermeasure against a number of different threats. A denial of service (DoS) attack, for instance, attempts to flood a resource with requests until it slows down or becomes unavailable. Rate limiting can keep a single source from accomplishing this goal, although an attack distributed across many addresses usually calls for additional defenses. Rate limiting can similarly mitigate brute force attacks, which involve rapid-fire guessing of passwords. Rate limiting also works against credential stuffing, a variant of brute force wherein the attacker quickly tries different stolen username/password pairs to gain unauthorized access to a resource, e.g., a banking website.
Rate limiting can help a business with its finances, as well. A digital asset costs money, so the more efficiently it’s utilized, the better its financial outcome will be. Rate limiting keeps usage of digital assets within predictable bounds. To understand why this matters, consider a scenario where a business has to purchase additional servers to keep up with a high volume of traffic caused by users with unlimited access. That’s not a worthwhile investment.
Looking at this issue from a different perspective, each service request carries a cost. It might be small, perhaps a fraction of a penny, when bandwidth, data center expenses and software/hardware depreciation are taken into account. However, if millions of unwanted service requests are choking the system, that will lead to a waste of money.
Rate limiting may also be part of the monetization of a digital asset. For example, a company might allow a user 5,000 API calls per week for $100. The API owner needs rate limiting to enforce this maximum number of API calls.
Rate limiting is typically keyed to a user ID or to a user’s Internet Protocol (IP) address. A rate limiting solution tracks the identifier associated with each service request. An IP address identifies where a request originates on the network, which lets the rate limiting solution block out-of-policy behavior from that source. It is not a perfect identifier, however: several users behind a shared gateway can appear under a single address.
The actual process of rate limiting involves keeping track of the number of requests made by each user or from each IP address. The rate limiting solution then compares the request activity with its policies. It can easily detect users that are violating the rules and stop them from continuing. In most cases, the rate limiting solution will send an error message to the user; for web APIs, this is commonly an HTTP 429 (Too Many Requests) response.
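The track, compare, and reject steps just described might look like the following sketch. The function name `handle_request`, the limit of 5 requests per 10 seconds, and the sliding-window bookkeeping are illustrative assumptions, not a description of any particular product. The one external fact used is that 429 is the standard HTTP status code for "Too Many Requests."

```python
import time
from collections import defaultdict, deque
from typing import Optional

MAX_REQUESTS = 5        # hypothetical policy: 5 requests...
WINDOW_SECONDS = 10.0   # ...per 10 seconds, per IP address

_recent = defaultdict(deque)  # IP address -> timestamps of recent requests


def handle_request(ip: str, now: Optional[float] = None) -> tuple:
    """Return an (HTTP status, message) pair for a request from `ip`."""
    now = time.monotonic() if now is None else now
    timestamps = _recent[ip]
    # Forget requests that have aged out of the window.
    while timestamps and now - timestamps[0] >= WINDOW_SECONDS:
        timestamps.popleft()
    if len(timestamps) >= MAX_REQUESTS:
        # Out of policy: reject and tell the client when to retry.
        retry_after = WINDOW_SECONDS - (now - timestamps[0])
        return 429, f"Too Many Requests; retry in {retry_after:.0f}s"
    timestamps.append(now)
    return 200, "OK"


# Six requests from one address, one second apart: the sixth is rejected.
for second in range(6):
    print(handle_request("203.0.113.7", now=float(second)))
```

Each address gets its own history, so one noisy client does not affect the counts kept for others.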
API pagination is a related tool. Rather than limiting how often requests are made, it limits how much data a single request can return by splitting large result sets into pages. It is used to ensure that the system is not overloaded by one oversized response and that data is retrieved efficiently. This technique also limits how much data a malicious request can extract at once, which reduces exposure in the event of a data breach. By capping the size of each response, it also helps reduce server load and improves overall performance.
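As a rough illustration of pagination, the sketch below returns one page of results at a time and caps the page size on the server side. The function name `paginate`, the response shape, and the cap of 100 records are hypothetical choices for this example.

```python
MAX_PAGE_SIZE = 100  # hypothetical server-side cap on records per response


def paginate(records: list, page: int = 1, page_size: int = 25) -> dict:
    """Return one page of `records` plus the number of the next page, if any."""
    page = max(page, 1)
    # Clamp the size so a client cannot request the whole dataset at once.
    page_size = max(1, min(page_size, MAX_PAGE_SIZE))
    start = (page - 1) * page_size
    items = records[start:start + page_size]
    has_more = start + page_size < len(records)
    return {"items": items, "next_page": page + 1 if has_more else None}


data = list(range(250))
# The client asks for 1,000 records but receives at most MAX_PAGE_SIZE.
response = paginate(data, page=1, page_size=1000)
print(len(response["items"]), response["next_page"])
```

A client that wants the full dataset has to walk through the pages one request at a time, and each of those requests remains subject to whatever rate limiting policy is in force.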
System owners use rate limiting for a variety of reasons, most of which have to do with QoS and security. The goals are almost always to keep systems running as expected, deliver a good user experience, and protect digital assets from malicious misuse.
This article has focused on rate limiting based on a user’s service requests in a defined time period. However, there are many other ways to limit requests. For example, rate limiting rules can restrict requests both by frequency and by total volume. A user may be forbidden from trying to log into a site more than 10 times per minute. However, the user might also be forbidden from trying to log in more than 100 times per day. Both rules can apply. Otherwise, that user might try to log in 10 times per minute for all 1,440 minutes in a 24-hour period, a total of 14,400 attempts, which is not ideal from a QoS perspective.
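Applying both rules at once can be sketched as follows. Without the daily rule, 10 attempts per minute over 1,440 minutes would permit 10 × 1,440 = 14,400 attempts per day. The class name `LayeredLimiter` and the in-memory log of timestamps are assumptions made for illustration. A real deployment would more likely keep this state in a shared store.

```python
from collections import deque

# (max attempts, window in seconds): the two rules from the login example.
RULES = [(10, 60), (100, 24 * 60 * 60)]


class LayeredLimiter:
    """Hypothetical limiter: an attempt passes only if every rule has room."""

    def __init__(self, rules=RULES):
        self.rules = rules
        self.longest = max(window for _, window in rules)
        self.attempts = {}  # user -> deque of allowed-attempt timestamps

    def allow(self, user: str, now: float) -> bool:
        log = self.attempts.setdefault(user, deque())
        # Drop attempts older than the longest window; no rule needs them.
        while log and now - log[0] >= self.longest:
            log.popleft()
        for limit, window in self.rules:
            recent = sum(1 for t in log if now - t < window)
            if recent >= limit:
                return False
        log.append(now)
        return True


limiter = LayeredLimiter()
# One attempt every 6 seconds never breaks the per-minute rule,
# but the daily rule stops the user after 100 attempts.
results = [limiter.allow("alice", now=6.0 * i) for i in range(101)]
print(results.count(True), results[-1])
```

In this sketch only allowed attempts are recorded. Whether blocked attempts should also count against the user is a policy decision the source does not address.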
It is also possible to do rate limiting based on location. Users from Germany might be allowed 100 login attempts per day, while those in France get 200. Alternatively, this kind of rate limiting policy can trigger re-routing of service requests, e.g., sending traffic from an overworked server in Germany to one in France.
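A location-based policy can be expressed as a lookup table of limits, as in the sketch below. It assumes the caller has already resolved each request to a country code (for example, through an IP geolocation service) and that something outside the snippet calls `reset_daily_counts` once a day. The names, the country codes, and the default limit of 50 are hypothetical.

```python
from collections import Counter

# Daily login-attempt limits by country code, taken from the example above.
DAILY_LIMITS = {"DE": 100, "FR": 200}
DEFAULT_LIMIT = 50  # hypothetical fallback for countries without a rule

_attempts_today = Counter()  # (country, user) -> attempts so far today


def allow_login(user: str, country: str) -> bool:
    """Count one login attempt and report whether it is within the daily limit."""
    limit = DAILY_LIMITS.get(country, DEFAULT_LIMIT)
    key = (country, user)
    if _attempts_today[key] >= limit:
        return False
    _attempts_today[key] += 1
    return True


def reset_daily_counts() -> None:
    """Clear all counters; assumed to be run once per day by a scheduler."""
    _attempts_today.clear()


# A user in Germany is cut off after 100 attempts in a day.
print(sum(allow_login("hans", "DE") for _ in range(150)))
```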
Rate limiting is an essential control and countermeasure for system owners who want to stay secure and prevent lapses in service. The practice also helps ensure desired financial outcomes for investments in digital resources. It’s a simple concept (restrict access based on rules about requests per minute, and so forth), but implementation requires close attention to detail. With effective rate limiting in place, servers and APIs, among other infrastructure elements, will be available for those who are entitled to their use.