A key concern for software architects is protecting APIs and service endpoints from harmful events such as denial-of-service attacks, cascading failures, or overuse of resources. Rate limiting is a technique for controlling the rate at which an API or a service is consumed, which in turn protects you from events that can bring your services to a screeching halt. In a distributed system, it often makes sense to centralize the configuration and management of the rate at which consumers can interact with APIs. Only requests within the defined rate make it to the API. Any request beyond that rate is rejected with an HTTP 429 (“Too Many Requests”) error.
Spring Cloud Gateway is a simple and lightweight component that can be used to limit API consumption rates. In this post, I am going to demonstrate how easily that can be accomplished using a configuration method. As shown in the figure below, the demonstration consists of both a front- and backend service, with a Spring Cloud Gateway service in between.
No code whatsoever is needed to include Spring Cloud Gateway in the architecture. Instead, add the Spring Cloud dependency org.springframework.cloud:spring-cloud-starter-gateway to a vanilla Spring Boot application, and you are set to go once the appropriate configuration settings are in place.
Requests received by Spring Cloud Gateway from a frontend service can be routed to a backend service based on a configured route definition, which makes clear to the gateway how a request should be routed to a backend endpoint. A route configuration usually defines conditions based on information that can be extracted from HTTP requests, such as paths and headers.
For example, the snippet below lists a YAML stanza to configure the condition under which requests should be routed to a backend service; it shows that requests should target the backend service when the gateway is hit with “/backend” in the path. In the configuration, the route is given an identifier and the backend service URL.
spring:
  cloud:
    gateway:
      routes:
        - id: route1
          uri: http://localhost:8081
          predicates:
            - Path=/backend
RequestRateLimiter is one of the many gateway filters offered by Spring Cloud Gateway; it determines whether a request is allowed to proceed or has exceeded its limit. It also lets you (optionally) plug in a key to limit the number of requests to different services. While the way a key is resolved is customizable, the gateway comes with a resolver that leverages a user’s Principal name. A secured gateway is needed to resolve a user’s principal name, but you have the option to implement the KeyResolver interface to instead resolve a different key from the ServerWebExchange. You can point to a custom KeyResolver bean (e.g., named customKeyResolver) in the configuration by using the SpEL expression #{@customKeyResolver}. The following listing shows the KeyResolver interface:
public interface KeyResolver {
    Mono<String> resolve(ServerWebExchange exchange);
}
By default, the gateway denies a request if no key can be resolved. To let the gateway accept requests for which no key was resolved, you can set the following property:
spring.cloud.gateway.filter.request-rate-limiter.deny-empty-key=false
You can also specify a status code that the gateway should report when it cannot figure out a key by setting the following property:
spring.cloud.gateway.filter.request-rate-limiter.empty-key-status-code=
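The decision flow described above can be sketched in plain Java. This is only an illustration of the logic, not the gateway's actual filter code: the class name, the decide method, the FORWARD marker, and the 403 used as the empty-key status are all assumptions made for this example, and the limiter is reduced to a simple predicate on the key.

```java
import java.util.Optional;
import java.util.function.Predicate;

/**
 * Hypothetical model of the decision RequestRateLimiter makes per request.
 * All names are illustrative; none of this is Spring Cloud Gateway API.
 */
public class RateLimitDecisionSketch {
    public static final int FORWARD = 0;             // marker: route the request to the backend
    public static final int TOO_MANY_REQUESTS = 429; // status returned when the limit is exceeded

    public static int decide(Optional<String> key,
                             boolean denyEmptyKey,
                             int emptyKeyStatusCode,
                             Predicate<String> limiterAllows) {
        if (!key.isPresent()) {
            // No key resolved: deny with the configured status, or let the request through.
            return denyEmptyKey ? emptyKeyStatusCode : FORWARD;
        }
        // A key was resolved: ask the rate limiter whether this key is within its limit.
        return limiterAllows.test(key.get()) ? FORWARD : TOO_MANY_REQUESTS;
    }

    public static void main(String[] args) {
        // Assumed limiter for the demo: only "alice" still has budget left.
        Predicate<String> limiter = k -> k.equals("alice");
        System.out.println(decide(Optional.of("alice"), true, 403, limiter)); // 0
        System.out.println(decide(Optional.of("bob"), true, 403, limiter));   // 429
        System.out.println(decide(Optional.empty(), true, 403, limiter));     // 403
        System.out.println(decide(Optional.empty(), false, 403, limiter));    // 0
    }
}
```

Note that in this sketch a request without a key is forwarded without consulting the limiter at all when deny-empty-key is false, so relaxing that property effectively exempts such requests from rate limiting.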
Consider a blueprint architecture in which a gateway controls the limiting of API consumption by using Redis. The provided Redis implementation uses the token bucket algorithm. To enable its use, you need to include the spring-boot-starter-data-redis Spring Boot starter dependency in the gateway application. Basically, the token bucket algorithm uses a balance of tokens to maintain an accumulating budget of utilization. The algorithm assumes tokens are added to a bucket at a certain rate while calls to an API consume tokens from the bucket. One API invocation may perform many operations in order to compose a response that fulfills a request (think of GraphQL-based APIs). In such cases, the algorithm helps Spring Cloud Gateway recognize that one invocation may cost an API more than one token.
The provided Redis implementation lets you define the rate at which users can make calls within a certain time period. It also makes it possible to accommodate sporadic bursts in demand while still being constrained by the defined consumption rate. For example, a configuration can define a replenish rate of 500 tokens per second by setting the redis-rate-limiter.replenishRate=500 property and a burst capacity of 1,000 tokens by setting the redis-rate-limiter.burstCapacity=1000 property. The replenish rate is how many tokens are added to the bucket each second, so sustained consumption is limited to 500 requests every second. The burst capacity is the most tokens the bucket can hold, so if a burst occurs while the bucket is full, up to 1,000 requests are allowed in a single second. Since that exceeds the replenish rate of 500 tokens per second, the bucket is then empty and only 500 further requests can be routed in the next second; anything beyond that is rejected with HTTP 429 until more tokens have accumulated. The configuration also lets you define how many tokens a request costs by setting the redis-rate-limiter.requestedTokens property. Typically, it is set to one.
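To make the interplay of replenish rate, burst capacity, and requested tokens concrete, here is a minimal in-memory token bucket in plain Java. It is a sketch of the general algorithm under simplifying assumptions (whole-second time steps, a single bucket, a bucket that starts full) and not the gateway's Redis-backed implementation; the class and method names are made up for this example.

```java
/**
 * Illustrative in-memory token bucket. It is NOT the gateway's Redis
 * implementation; class and method names here are hypothetical.
 */
public class TokenBucketSketch {
    private final long replenishRate; // tokens added to the bucket per second
    private final long burstCapacity; // maximum number of tokens the bucket holds
    private long tokens;              // current token balance
    private long lastRefillSecond;    // second at which the balance was last updated

    public TokenBucketSketch(long replenishRate, long burstCapacity, long nowSecond) {
        this.replenishRate = replenishRate;
        this.burstCapacity = burstCapacity;
        this.tokens = burstCapacity; // assumption: the bucket starts full
        this.lastRefillSecond = nowSecond;
    }

    /** Refills the bucket for the elapsed time, then tries to pay for one request. */
    public boolean tryConsume(long requestedTokens, long nowSecond) {
        if (nowSecond > lastRefillSecond) {
            long refill = (nowSecond - lastRefillSecond) * replenishRate;
            tokens = Math.min(burstCapacity, tokens + refill); // never exceed capacity
            lastRefillSecond = nowSecond;
        }
        if (tokens < requestedTokens) {
            return false; // over the limit: a gateway would answer with HTTP 429
        }
        tokens -= requestedTokens;
        return true;
    }

    /** Sends count requests within the same second and returns how many were allowed. */
    public long send(long count, long requestedTokens, long nowSecond) {
        long allowed = 0;
        for (long i = 0; i < count; i++) {
            if (tryConsume(requestedTokens, nowSecond)) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        TokenBucketSketch bucket = new TokenBucketSketch(500, 1000, 0);
        System.out.println(bucket.send(1500, 1, 0)); // 1000: a full bucket absorbs the burst
        System.out.println(bucket.send(1000, 1, 1)); // 500: only one second of refill is available
        System.out.println(bucket.send(1000, 1, 3)); // 1000: two idle seconds refill the bucket
    }
}
```

With a request cost of two tokens instead of one, the same full bucket of 1,000 tokens would admit only 500 requests, which is how a more expensive API call can be charged more of the budget.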
To use a gateway with a request limiting feature, it needs to be configured with the RequestRateLimiter gateway filter. The configuration can specify arguments to define a replenish rate, burst capacity, and the number of tokens that a request costs. The example below illustrates how to configure a gateway with these arguments:
spring:
  cloud:
    gateway:
      routes:
        - id: route1
          uri: http://localhost:8081
          predicates:
            - Path=/backend
          filters:
            - name: RequestRateLimiter
              args:
                redis-rate-limiter.replenishRate: 500
                redis-rate-limiter.burstCapacity: 1000
                redis-rate-limiter.requestedTokens: 1
Spring Cloud Gateway provides the flexibility to define your own custom rate limiter implementation; it offers a RateLimiter interface that you can implement and expose as a bean. The rate limiter bean can be configured by using a SpEL expression, as in the case of a custom key resolver. For instance, you can define a custom rate limiter bean named customRateLimiter and a custom key resolver named customKeyResolver and configure a route like this:
@Bean
public KeyResolver customKeyResolver() {
    return exchange -> .... // returns a Mono of String
}
spring:
  cloud:
    gateway:
      routes:
        - id: route1
          uri: http://localhost:8081
          predicates:
            - Path=/backend
          filters:
            - name: RequestRateLimiter
              args:
                rate-limiter: "#{@customRateLimiter}"
                key-resolver: "#{@customKeyResolver}"
You can find the code to try it yourself on GitHub. If you’re interested in learning more about Spring Cloud Gateway, make sure to check out our What is Spring Cloud Gateway and Getting Started with Spring Cloud Gateway guides!