Rate Limit

Rate limiting defines the maximum number of requests an application can make to the API within a specified time period. This mechanism is crucial for maintaining the overall health and reliability of our services.

Why we use rate limiting

Rate limiting serves several important purposes:

API stability and performance

  • Prevents any single client from overwhelming our servers with too many requests

  • Ensures consistent response times for all users

  • Protects backend services from traffic spikes

  • Maintains predictable load patterns on the infrastructure

Security benefits

  • Defends against brute force attacks

  • Mitigates distributed denial-of-service (DDoS) attacks

  • Reduces the impact of poorly implemented client applications

  • Limits damage from compromised API credentials

Fair resource allocation

  • Ensures equitable access to shared API resources across all clients

  • Prevents aggressive clients from degrading service for other users

  • Helps prioritize traffic according to business needs

  • Encourages efficient API usage patterns

How rate limiting works

Each endpoint may have different limits, defined by the following dimensions (an illustrative sketch appears after the list):

  • Time window (per minute, hour, or day)

  • Application type (standard or enterprise)

  • Operation type (higher limits for read operations, lower for writes)
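For illustration only, one way to picture how these dimensions combine is a lookup keyed by endpoint, plan, and operation. Every name and number in the sketch below is a hypothetical placeholder, not an actual quota:

// Hypothetical illustration of limits keyed by endpoint, plan, and operation.
// All endpoint paths, tier names, and numbers are placeholders, not real quotas.
const rateLimits = {
  "/v1/records": {
    standard: {
      read:  { limit: 100,  window: "minute" },
      write: { limit: 20,   window: "minute" },
    },
    enterprise: {
      read:  { limit: 1000, window: "minute" },
      write: { limit: 200,  window: "minute" },
    },
  },
};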

When you exceed your rate limit, you receive an HTTP 429 (Too Many Requests) response.

Example rate limit exceeded response

When you exceed the rate limit, you'll receive a response like this:

{
  "message": "Too Many Requests"
}
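The body itself carries little information, so your client should key off the status code. If the response happens to include the standard Retry-After header (not every endpoint is guaranteed to send it), you can honor it directly. The sketch below assumes a placeholder URL and performs a single retry for brevity:

// Minimal sketch: detect a 429 and honor the standard Retry-After header when present.
async function fetchOnce(url) {
  const response = await fetch(url);

  if (response.status === 429) {
    // Retry-After may be absent; fall back to a short default wait.
    // (The header can also be an HTTP date; only the seconds form is handled here.)
    const retryAfter = Number(response.headers.get("Retry-After")) || 1;
    console.log(`Rate limited. Waiting ${retryAfter} seconds before retrying`);
    await new Promise((resolve) => setTimeout(resolve, retryAfter * 1000));
    return fetch(url); // single retry for brevity
  }

  return response;
}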

Handling 429 Too Many Requests errors

When you receive a 429 error, implement these strategies to handle it gracefully:

Implement exponential backoff with jitter

Exponential backoff is a retry strategy where you progressively increase the waiting time between retries:

// Exponential backoff with jitter (JavaScript example)
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithBackoff(url, maxRetries = 5) {
  let retries = 0;

  while (retries < maxRetries) {
    // Network errors from fetch are not retried here; they propagate to the caller.
    const response = await fetch(url);
    if (response.status !== 429) {
      return response; // Success or a different error
    }

    // Calculate backoff time with jitter
    const baseWaitTime = Math.pow(2, retries) * 1000; // 1s, 2s, 4s, 8s, 16s
    const jitter = Math.random() * 0.5 * baseWaitTime; // up to 50% extra, randomized
    const waitTime = baseWaitTime + jitter;

    console.log(`Rate limited. Waiting ${(waitTime / 1000).toFixed(1)} seconds before retry ${retries + 1}/${maxRetries}`);
    await sleep(waitTime);
    retries++;
  }

  throw new Error("Maximum retry attempts reached");
}
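A caller can then treat the helper like a normal fetch. The endpoint URL below is a placeholder:

// Example usage (hypothetical endpoint URL)
// Run inside an async function or an ES module with top-level await.
const response = await fetchWithBackoff("https://api.example.com/v1/records");
const data = await response.json();
console.log(data);

The jitter is what keeps many clients that were rate limited at the same moment from retrying in synchronized waves and immediately hitting the limit again.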

Review application behavior

  • Audit your code to identify inefficient API usage patterns

  • Look for unintended loops or redundant API calls

  • Consider batching multiple operations into single requests where applicable

  • Distribute your requests more evenly over time instead of sending them in bursts (see the sketch after this list)
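One way to smooth out bursts is a small client-side queue that enforces a minimum gap between calls. This is a minimal sketch, not a full scheduler; the 500 ms spacing and the endpoint URLs are arbitrary placeholders:

// Minimal request spacer: runs queued calls one at a time with a fixed gap.
// The 500 ms interval is a placeholder; tune it to your actual limits.
function createRequestSpacer(minGapMs = 500) {
  let chain = Promise.resolve();

  return function enqueue(task) {
    const result = chain.then(task);
    // Whether the task succeeds or fails, wait out the gap before the next one.
    chain = result.catch(() => {}).then(
      () => new Promise((resolve) => setTimeout(resolve, minGapMs))
    );
    return result;
  };
}

// Usage: wrap API calls so they never fire closer together than minGapMs.
const enqueue = createRequestSpacer(500);
const first = enqueue(() => fetch("https://api.example.com/v1/records/1"));
const second = enqueue(() => fetch("https://api.example.com/v1/records/2"));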

Implement caching

  • Cache frequently accessed and rarely changing data

  • Implement proper cache invalidation strategies

  • Use appropriate cache timeouts based on data volatility

  • Consider using ETags or Last-Modified headers for conditional requests (a sketch follows this list)
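ETags let the server answer 304 Not Modified instead of resending an unchanged body, which saves bandwidth and avoids wasting your quota on unchanged data. The sketch below assumes the endpoint returns an ETag header; if it does not, the code simply behaves like an uncached fetch:

// Minimal in-memory conditional cache keyed by URL.
// Assumes the server returns an ETag header; otherwise every call refetches.
const cache = new Map(); // url -> { etag, body }

async function fetchWithETag(url) {
  const cached = cache.get(url);
  const headers = cached ? { "If-None-Match": cached.etag } : {};

  const response = await fetch(url, { headers });

  if (response.status === 304 && cached) {
    return cached.body; // Unchanged: reuse the cached copy
  }

  const body = await response.json();
  const etag = response.headers.get("ETag");
  if (etag) {
    cache.set(url, { etag, body });
  }
  return body;
}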
