Rate Limiting

ARX enforces per-organization rate limits to protect the platform from abuse and ensure fair resource allocation. Rate limiting is backed by Redis using a sliding window algorithm.

Default Limits

Endpoint Category | Default Limit
API endpoints | 1,000 requests per minute
Auth endpoints (/v1/auth/*, /v1/sso/*, /v1/scim/*) | 100 requests per minute

Auth endpoints have a lower limit to protect against credential stuffing and brute-force attacks.
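
As an illustration of the two categories, the sketch below maps a request path to its category and default limit. The function names and the example paths in the usage note are hypothetical, and prefix matching on the three auth paths is an assumption about how the wildcard patterns are evaluated.

```python
# Path prefixes taken from the table above; prefix matching is an assumption.
AUTH_PREFIXES = ("/v1/auth/", "/v1/sso/", "/v1/scim/")

# Default requests-per-minute limits per endpoint category.
DEFAULT_LIMITS_RPM = {"api": 1000, "auth": 100}


def endpoint_category(path):
    """Return "auth" for auth, SSO, and SCIM paths, otherwise "api"."""
    return "auth" if path.startswith(AUTH_PREFIXES) else "api"


def default_limit(path):
    """Return the default requests-per-minute limit for a path."""
    return DEFAULT_LIMITS_RPM[endpoint_category(path)]
```

For example, default_limit("/v1/auth/login") returns 100, while any path outside the three auth prefixes returns 1000.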

How It Works

ARX uses a sliding window algorithm implemented with Redis sorted sets:

  1. Each request is timestamped and added to a sorted set keyed by organization (or client IP for unauthenticated requests).
  2. Entries older than the 60-second window are removed.
  3. The total count of entries in the window is compared against the limit.
  4. If the count exceeds the limit, the request is rejected with 429 Too Many Requests.

The sliding window approach avoids the boundary problem of fixed windows, where a burst at the end of one window followed by a burst at the start of the next can let through up to double the intended rate in a short span.
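
The four steps above can be sketched in memory as follows. This is not the ARX implementation: it replaces the Redis sorted set with a sorted Python list so that it runs without a Redis server, and the class and method names are hypothetical. The comments note the Redis sorted-set command that roughly corresponds to each step.

```python
import bisect
import time
from collections import defaultdict

WINDOW_SECONDS = 60


class SlidingWindowLimiter:
    """In-memory stand-in for a Redis sorted-set limiter (illustrative only)."""

    def __init__(self, limit, window=WINDOW_SECONDS):
        self.limit = limit
        self.window = window
        self._entries = defaultdict(list)  # key -> sorted list of timestamps

    def allow(self, key, now=None):
        """Record a request for key and return True if it is within the limit."""
        now = time.time() if now is None else now
        entries = self._entries[key]
        # Step 1: record the request timestamp (ZADD in Redis).
        bisect.insort(entries, now)
        # Step 2: drop entries older than the window (ZREMRANGEBYSCORE).
        cutoff = now - self.window
        del entries[:bisect.bisect_left(entries, cutoff)]
        # Step 3: count the entries left in the window (ZCARD).
        count = len(entries)
        # Step 4: reject when the count exceeds the limit.
        return count <= self.limit
```

Because the request is recorded before the count is taken, rejected requests still occupy the window in this sketch, so a client that keeps retrying immediately stays throttled for longer.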

Rate Limit Keys

Context | Key Format
Authenticated (org-scoped) | arxsec:ratelimit:org:<org_id>:api or arxsec:ratelimit:org:<org_id>:auth
Unauthenticated (IP-scoped) | arxsec:ratelimit:ip:<client_ip>:api or arxsec:ratelimit:ip:<client_ip>:auth

Authenticated requests are rate-limited by organization. All users in the same org share the org's rate limit pool. Unauthenticated requests are rate-limited by client IP using the X-Forwarded-For header (or the direct client address if the header is absent).
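
A minimal sketch of how a key could be derived from a request, following the formats above. The function name and parameters are hypothetical. It also assumes that the first (leftmost) X-Forwarded-For entry identifies the client, which the text above does not specify, and that headers arrive as a plain dict with this exact header casing.

```python
def rate_limit_key(category, org_id=None, headers=None, remote_addr=None):
    """Build the rate limit key for a request; category is "api" or "auth"."""
    if category not in ("api", "auth"):
        raise ValueError(f"unknown category: {category!r}")
    if org_id is not None:
        # Authenticated: every user in the org shares one pool.
        return f"arxsec:ratelimit:org:{org_id}:{category}"
    forwarded = (headers or {}).get("X-Forwarded-For", "")
    # Assumption: take the leftmost entry; fall back to the direct
    # client address when the header is absent or empty.
    client_ip = forwarded.split(",")[0].strip() or remote_addr
    return f"arxsec:ratelimit:ip:{client_ip}:{category}"
```

X-Forwarded-For is set by clients and proxies, so which entry can be trusted depends on the proxy chain in front of the service.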

Custom Per-Organization Limits

Organizations can have custom rate limits stored in the orgs table:

Column | Description
rate_limit_rpm | Custom API requests-per-minute limit
rate_limit_auth_rpm | Custom auth requests-per-minute limit

If these columns are NULL, the default limits apply. Custom limits are cached in memory with a 5-minute TTL to minimize database lookups.

To set custom limits:

UPDATE orgs
SET rate_limit_rpm = 5000, rate_limit_auth_rpm = 500
WHERE id = '<org-id>';

The cache refreshes automatically; changes take effect within 5 minutes.
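
The lookup with NULL fallback and a 5-minute in-memory cache might look like the sketch below. The class name is hypothetical, and so is fetch_row, which stands in for whatever query reads the two columns from the orgs table. The clock is injectable only so the expiry behavior can be exercised without waiting.

```python
import time

# Defaults applied when a column is NULL (None) or the row is missing.
DEFAULT_LIMITS = {"rate_limit_rpm": 1000, "rate_limit_auth_rpm": 100}
CACHE_TTL_SECONDS = 300  # 5 minutes


class OrgLimitCache:
    """Cache per-organization limits in memory with a fixed TTL."""

    def __init__(self, fetch_row, ttl=CACHE_TTL_SECONDS, clock=time.monotonic):
        self._fetch_row = fetch_row  # hypothetical: org_id -> dict of columns
        self._ttl = ttl
        self._clock = clock
        self._cache = {}  # org_id -> (expires_at, limits)

    def get(self, org_id):
        """Return the effective limits for an org, hitting the DB on expiry."""
        now = self._clock()
        hit = self._cache.get(org_id)
        if hit and hit[0] > now:
            return hit[1]
        row = self._fetch_row(org_id) or {}
        limits = {
            col: row[col] if row.get(col) is not None else default
            for col, default in DEFAULT_LIMITS.items()
        }
        self._cache[org_id] = (now + self._ttl, limits)
        return limits
```

With this shape, an UPDATE to the orgs table becomes visible the next time the cached entry expires, which matches the "within 5 minutes" behavior described above.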

Response Headers

Every response includes rate limit headers:

Header | Description
X-RateLimit-Limit | The maximum number of requests allowed in the current window
X-RateLimit-Remaining | The number of requests remaining in the current window
X-RateLimit-Reset | Unix timestamp when the current window resets

When a request is throttled, the response also includes:

Header | Description
Retry-After | Number of seconds to wait before retrying
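
One way these headers could be derived from the window state is sketched below. The function and its parameters are hypothetical. In particular, treating the reset time as the moment the oldest entry leaves the 60-second window is an assumption; the exact semantics ARX uses for X-RateLimit-Reset in a sliding window are not stated here.

```python
import math


def rate_limit_headers(limit, count, oldest_ts, now, window=60):
    """Build rate limit headers from the current window state.

    count is the number of entries in the window, oldest_ts the Unix
    timestamp of the oldest entry, and now the current Unix time.
    """
    # Assumption: the window "resets" when its oldest entry expires.
    reset_at = oldest_ts + window
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, limit - count)),
        "X-RateLimit-Reset": str(math.ceil(reset_at)),
    }
    if count > limit:
        # Throttled: tell the client how long to wait, at least 1 second.
        headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
    return headers
```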

Handling 429 Responses

A throttled request returns 429 Too Many Requests with the following body:

{
  "detail": "Rate limit exceeded. Please retry after the specified time."
}

Recommended client behavior:

  1. Read the Retry-After header to determine the wait time.
  2. Implement exponential backoff with jitter for repeated 429s.
  3. Avoid tight retry loops, which will extend the throttling period.
  4. For batch operations, spread requests evenly across the window rather than sending bursts.
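
The first three recommendations can be combined in a small retry helper, sketched below. It is not part of any ARX client library. The send callable is hypothetical and stands in for your HTTP call, returning a (status_code, headers, body) tuple, and the delay policy (the larger of Retry-After and the exponential backoff, plus random jitter) is one reasonable choice among several.

```python
import random
import time


def request_with_backoff(send, max_attempts=5, base_delay=1.0, max_delay=60.0,
                         sleep=time.sleep, rng=random.random):
    """Call send() until it returns a non-429 response or attempts run out."""
    for attempt in range(max_attempts):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_attempts - 1:
            break  # out of attempts; return the last 429 to the caller
        # Exponential backoff: base_delay, 2x, 4x, ... capped at max_delay.
        backoff = min(max_delay, base_delay * 2 ** attempt)
        try:
            retry_after = float(headers.get("Retry-After", backoff))
        except ValueError:
            retry_after = backoff  # header present but not a number
        # Never wait less than the server asked for; add jitter so many
        # clients do not retry at the same instant.
        sleep(max(retry_after, backoff) + rng() * backoff)
    return status, headers, body
```

The sleep and rng parameters exist only to make the helper easy to test; in normal use the defaults apply.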

Bypassed Paths

The following paths are exempt from rate limiting:

Path | Reason
/health | Health check endpoint for load balancers
/docs | API documentation
/redoc | Alternative API documentation
/openapi.json | OpenAPI schema

Fail-Open Behavior

If Redis is unavailable, the rate limiter fails open: requests are allowed through without rate limiting. This ensures that a Redis outage does not cause a complete platform outage. A warning is logged (rate_limit.redis_unavailable) when this occurs.
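
The fail-open pattern can be sketched as a thin wrapper around the limit check. The function name, the logger name, and the check callable are hypothetical. The sketch catches Python's built-in ConnectionError and TimeoutError so that it runs without a Redis client installed; with redis-py you would catch redis.exceptions.RedisError instead, since its connection errors do not derive from the built-ins.

```python
import logging

logger = logging.getLogger("arx.ratelimit")  # hypothetical logger name


def is_allowed(check, key, limit):
    """Return True if the request may proceed; fail open on backend errors.

    check is a hypothetical callable (key, limit) -> bool that talks to Redis.
    """
    try:
        return check(key, limit)
    except (ConnectionError, TimeoutError) as exc:
        # Fail open: log the event name from the docs and let the request through.
        logger.warning(
            "rate_limit.redis_unavailable",
            extra={"rate_limit_key": key, "error": str(exc)},
        )
        return True
```

The trade-off is deliberate: during a Redis outage the platform stays available but is temporarily unprotected by rate limits, so the warning is worth alerting on.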

Redis Configuration

Rate limiting requires a Redis instance. Configure the connection via environment variable:

REDIS_URL=redis://localhost:6379

The Redis client uses a 2-second connection and socket timeout to avoid blocking request processing.