Apply Rate Limit Assertion
The Apply Rate Limit assertion allows you to limit the rate of transactions passing through the CA API Gateway for a given user, client IP address, or other identifier. When this limit is reached, the Gateway can either begin throttling requests or attempt to delay the requests until the rate falls below the limit. You can also set a maximum concurrency level to prevent a user from monopolizing Gateway resources.

Use this assertion only if you need to limit the flow of transactions entering the Gateway. If you have a cluster of Gateways, the limits entered in this assertion are divided among the "up" nodes in the cluster. A node is considered "up" if it has posted its status within the past 8 seconds (configurable via the ratelimit.clusterStatusInterval cluster property). The Apply Rate Limit Assertion checks the status of cluster nodes every 43 seconds (configurable via the ratelimit.clusterPollInterval cluster property). The Gateway automatically adjusts the rates internally when nodes are added to or removed from a cluster, so there is no need to modify the values in this assertion.

If no authenticated user is established in the policy, the IP address of the requestor is used instead in the Apply Rate Limit Assertion.
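The "up" test described above can be pictured with a short sketch. This is an illustration only, not the Gateway's implementation: the node names, the timestamps, the `up_nodes` helper, and the choice to treat a status post that is exactly 8 seconds old as still "up" are all assumptions made for the example.

```python
from typing import Dict, List

# Default quoted in the text for ratelimit.clusterStatusInterval, in seconds.
STATUS_INTERVAL_SECONDS = 8.0


def up_nodes(last_status: Dict[str, float], now: float,
             status_interval: float = STATUS_INTERVAL_SECONDS) -> List[str]:
    """Return the nodes whose latest status post is recent enough to count as "up".

    Treating a post that is exactly status_interval old as "up" is an assumption.
    """
    return sorted(name for name, posted in last_status.items()
                  if now - posted <= status_interval)


# Hypothetical node names and timestamps (seconds on a shared clock).
last_status = {"node-a": 100.0, "node-b": 97.5, "node-c": 80.0}
alive = up_nodes(last_status, now=101.0)
print(alive)             # ['node-a', 'node-b']
print(100 / len(alive))  # 50.0, a limit of 100 shared by the two "up" nodes
```

In this example node-c last posted 21 seconds ago, so it is left out of the count and a limit of 100 would be shared by the two remaining nodes.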
Using the Assertion
- Do one of the following:
  - To add the assertion to the Policy Development window, see Add an Assertion.
  - To change the configuration of an existing assertion, proceed to step 2 below.
- Right-click Apply Rate Limit... in the policy window and choose Rate Limit Properties, or double-click the assertion in the policy window. The assertion properties are displayed.
- Configure the properties as follows:

Maximum requests per second
Specify how many requests per second should be processed by the Gateway or cluster. You can enter a context variable that resolves to the maximum requests value. The context variable must be either single-valued or multivalued with a specific index reference.

Cluster wide
If the Gateway cluster comprises more than one node, this setting determines whether the value entered in the Maximum requests per second field is split among the nodes or applied to each node.
- Select this check box to split the value across all the nodes in the cluster. For example, if the maximum is 100, each node in a 4-node cluster is limited to 25 requests per second. If a node drops out of the cluster, the 100 limit is redistributed across the remaining three nodes.
- Clear this check box to allow the maximum requests value on each node. For example, if the maximum is 100, each node in a 4-node cluster is allowed 100 requests per second, resulting in an effective maximum of 400 requests per second. If one node drops out of the cluster, the effective maximum drops to 300 requests per second (3 x 100).
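The arithmetic behind the two check box states can be written out as a small sketch. The `node_limits` function and its parameter names are hypothetical and exist only to reproduce the numbers in the examples above; they do not correspond to any Gateway API.

```python
from typing import Tuple


def node_limits(max_per_second: float, up_nodes: int,
                cluster_wide: bool) -> Tuple[float, float]:
    """Return (limit enforced on each node, effective limit across the cluster)."""
    if up_nodes < 1:
        raise ValueError("at least one node must be up")
    if cluster_wide:
        # The configured value is shared, so each node gets an equal slice.
        return max_per_second / up_nodes, max_per_second
    # The configured value applies to every node separately.
    return max_per_second, max_per_second * up_nodes


print(node_limits(100, 4, cluster_wide=True))   # (25.0, 100)
print(node_limits(100, 4, cluster_wide=False))  # (100, 400)
print(node_limits(100, 3, cluster_wide=False))  # (100, 300)
```

With the check box selected and one of four nodes down, the same function gives each of the three remaining nodes one third of the configured limit.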

Spread limit over X sec window
Determines whether to allow a burst of requests to be spread across a window of time or whether to enforce a hard cap.
- Select the check box to allow requests to arrive in arbitrary bursts that exceed the Max requests per second rate over an X second window. This can avoid throttling of traffic over prolonged traffic bursts. You may enter a context variable containing the X second window value. This variable can be either single-valued or multivalued with a specific index reference.
- Clear the check box to disallow bursts. In this scenario, the Gateway only accepts a request that arrives at least 1/limit of a second after the previous one. For example, if the Max requests per second is 100, at least 1/100 second must have elapsed between requests. Requests that arrive sooner are either throttled or shaped (based on the "When limit exceeded" setting). Disallowing burst traffic is recommended only for advanced users.
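One common way to model a burst allowance of this kind is a token bucket. The sketch below is a simplified model under stated assumptions, not the Gateway's actual algorithm: the `TokenBucket` class is hypothetical, the burst budget is assumed to be the rate multiplied by the window, and a window of 0 is used here to stand for the cleared check box (a budget of a single request, which forces 1/limit spacing).

```python
class TokenBucket:
    """Budget that refills at `rate` units per second up to a fixed capacity.

    Each accepted request consumes one unit; a request is rejected when
    less than one unit is available.
    """

    def __init__(self, rate: float, window: float, start: float = 0.0):
        self.rate = rate
        # Assumption: the burst budget equals rate * window, never below 1.
        self.capacity = max(rate * window, 1.0)
        self.tokens = self.capacity  # start with a full budget
        self.last = start

    def allow(self, now: float) -> bool:
        # Refill for the time elapsed since the previous call.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


# 40 requests arriving at the same instant, limit of 10 requests per second.
spread = TokenBucket(rate=10, window=3)    # burst spread over a 3 second window
hard_cap = TokenBucket(rate=10, window=0)  # bursts disallowed
burst_allowed = sum(spread.allow(0.0) for _ in range(40))
hard_allowed = sum(hard_cap.allow(0.0) for _ in range(40))
print(burst_allowed, hard_allowed)  # 30 1
```

In this model the spread limiter absorbs the first 30 requests of the burst, while the hard cap accepts only one and leaves the other 39 to be throttled or shaped.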
Disabling burst traffic is not recommended on a counter that will service multiple concurrent requests, particularly at high rates. Doing so can lead to unintended throttling or delaying of multiple requests that arrive at exactly the same time.

The following graph illustrates how spreading the limit allows more traffic and throttles fewer requests.

[Figure: rate_limit_arc2]

The effect is akin to a gas tank that slowly refills when not being used. Each request "consumes" some gas, and the request fails if there is no more gas. The "Spread limit over" setting lets you control the size of the gas tank.

Limit each
Specify how limiting should occur:
- by the User or client IP address
- by the Authenticated username
- by the Client IP address
- by the SOAP operation within the request
- by the SOAP namespace within the request
- by the Gateway node
- by a Custom counter value (enables a limit per value of a context variable); enter the node identifier followed by a context variable that resolves to the correct entity at run time.

To help you construct a custom format, the entry box displays the actual node identifier and context variable associated with each of the other limit options when you select the Custom option. For example, when you first open the Rate Limit Properties, User or client IP is selected by default. Choose Custom and then reselect User or client IP. You see that the format behind this option is <node identifier>-${request.clientid}.

The limit breakdown affects both the maximum number of requests per second and the maximum concurrency. For example, if you choose “by client IP address” and set the maximum concurrency to 10 and the maximum number of requests per second to 100, the assertion fails if any single incoming IP address exceeds either the concurrency of 10 or the 100 requests per second; all IP addresses combined are permitted to exceed these limits, however. You can combine multiple instances of this assertion to impose different limits by different breakdown factors, such as “maximum 10 per IP and maximum 100 for all combined”.

When limit exceeded
Specify what should happen if the rate limit is exceeded:
- Throttle: Excess requests cause this assertion to fail and send audit code 6950 (Rate limit exceeded on rate limiter XXXX) to the audit log.
- Shape: The assertion attempts to delay requests to avoid exceeding the limit. If the API Gateway is unable to spare sufficient resources to hold a request any further, a 503 (Service Unavailable) error may still occur.
- Log Only: The assertion logs that the rate limit has been exceeded, but the assertion does not fail. Audit message 6950 is logged.
- Blackout for X sec: Select this check box to fail all requests for the next X seconds after the limit is exceeded, even if the rate of requests falls below the limits defined in this assertion.
  IMPORTANT: For a blackout period greater than 13 seconds, increase the ratelimit.cleanerPeriod cluster property to prevent the rate limit counters from being flushed before the blackout period ends. If the counters are flushed prematurely, the rate limits are not applied. For more information on this cluster property, see Rate Limit Cluster Properties.
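The blackout behavior can be illustrated with a minimal sketch. The `BlackoutThrottle` class is hypothetical and models only the case of throttling combined with a blackout period; how the Gateway tracks the period internally, and how the blackout interacts with the Shape and Log Only options, is not described here and is not modeled.

```python
class BlackoutThrottle:
    """Fail every request for `blackout_seconds` after the limit is exceeded."""

    def __init__(self, blackout_seconds: float):
        self.blackout_seconds = blackout_seconds
        self.blackout_until = float("-inf")  # no blackout in effect yet

    def check(self, over_limit: bool, now: float) -> bool:
        """Return True if the request may proceed, False if it fails."""
        if now < self.blackout_until:
            # Still inside the blackout, even if the rate has dropped again.
            return False
        if over_limit:
            # Exceeding the limit fails this request and starts the blackout.
            self.blackout_until = now + self.blackout_seconds
            return False
        return True


guard = BlackoutThrottle(blackout_seconds=5)
results = [
    guard.check(over_limit=False, now=0.0),  # under the limit
    guard.check(over_limit=True, now=1.0),   # limit exceeded, blackout starts
    guard.check(over_limit=False, now=3.0),  # rate is fine, blackout still active
    guard.check(over_limit=False, now=6.0),  # blackout has ended
]
print(results)  # [True, False, False, True]
```

The third request fails even though the rate has fallen back under the limit, which is the effect the Blackout option is meant to have.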
The number of threads that can be queued within a node is defined by the ratelimit.maxQueuedThreads cluster property. For more information, see Rate Limit Cluster Properties.

Maximum concurrent requests
Indicate whether to enforce concurrency limits for a given named rate limiter (as specified by the Limit each setting).
- Unlimited: Concurrency is not enforced. A named rate limiter can have an unlimited number of active requests simultaneously in the Gateway or cluster. This may result in someone consuming a disproportionately high amount of system resources.
- Limited to: Ensure that no named rate limiter can have more than the specified number of concurrent requests passing through this assertion. Requests that exceed the concurrency limit cause the assertion to fail, with the audit event 6953 (Concurrency exceeded on rate limiter XXXX). You can enter a context variable that contains the maximum concurrent requests value. This variable can be either single-valued or multivalued with a specific index reference.
- Cluster wide: If the Gateway cluster comprises more than one node, this setting determines whether the value entered in the Limited to field is split among the nodes or applied to each node. This setting is selected by default.
  - Select this check box to split the value across all the nodes in the cluster. For example, if the maximum is 10, each node in a 5-node cluster has a concurrency limit of 2 requests.
  - Clear this check box to allow the maximum concurrent requests value on each node. For example, if the maximum is 10, every node in the cluster is allowed 10 concurrent requests.

Additional note about how the concurrency limit works:
- The concurrency counter is incremented when a request passes through the Apply Rate Limit Assertion (even if the assertion ends up failing). The counter is decremented once the request is completely finished.
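The counting rule in the note above can be sketched as follows. The `ConcurrencyLimiter` class and the example keys are hypothetical, and the sketch is single-threaded for clarity; it only shows that a request is counted on entry, even when it is rejected, and released when it finishes.

```python
from collections import defaultdict
from typing import DefaultDict


class ConcurrencyLimiter:
    """Per-key counter of requests that are currently in flight."""

    def __init__(self, max_concurrent: int):
        self.max_concurrent = max_concurrent
        self.active: DefaultDict[str, int] = defaultdict(int)

    def enter(self, key: str) -> bool:
        """Count the request, then report whether it is within the limit."""
        # The request is counted even if it is about to be rejected.
        self.active[key] += 1
        return self.active[key] <= self.max_concurrent

    def finish(self, key: str) -> None:
        """Release the slot once the request is completely finished."""
        self.active[key] -= 1


limiter = ConcurrencyLimiter(max_concurrent=2)
first_ip = [limiter.enter("10.0.0.1") for _ in range(3)]
other_ip = limiter.enter("10.0.0.2")
print(first_ip, other_ip)  # [True, True, False] True

limiter.finish("10.0.0.1")  # the rejected request finishes, two remain active
still_full = limiter.enter("10.0.0.1")
print(still_full)  # False
```

Because each key has its own counter, the second address is unaffected by the first one reaching its limit, which mirrors the per-identifier breakdown chosen under Limit each.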
- Click [OK].