Rate Limiting

ARX enforces per-organization rate limits to protect the platform from abuse and ensure fair resource allocation. Rate limiting is backed by Redis using a sliding window algorithm.

Default Limits

Endpoint Category	Default Limit
API endpoints	1,000 requests per minute
Auth endpoints (`/v1/auth/*`, `/v1/sso/*`, `/v1/scim/*`)	100 requests per minute

Auth endpoints have a lower limit to protect against credential stuffing and brute-force attacks.
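As a rough illustration of how a request path might map to one of the two categories above, here is a minimal Python sketch. The function names, the prefix-matching rule, and the sample paths are assumptions made for illustration; only the three prefixes and the two default limits come from the table.

```python
# Prefixes taken from the table above; prefix matching itself is an assumption.
AUTH_PREFIXES = ("/v1/auth/", "/v1/sso/", "/v1/scim/")

# Default requests-per-minute limits from the table above.
DEFAULT_LIMITS = {"api": 1000, "auth": 100}


def endpoint_category(path: str) -> str:
    """Return "auth" for auth-related paths and "api" for everything else."""
    return "auth" if path.startswith(AUTH_PREFIXES) else "api"


def default_limit(path: str) -> int:
    """Look up the default per-minute limit for a request path."""
    return DEFAULT_LIMITS[endpoint_category(path)]


# Hypothetical paths, used only to show the mapping.
print(endpoint_category("/v1/auth/login"), default_limit("/v1/auth/login"))
print(endpoint_category("/v1/agents"), default_limit("/v1/agents"))
```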

How It Works

ARX uses a sliding window algorithm implemented with Redis sorted sets:

1. Each request is timestamped and added to a sorted set keyed by organization (or client IP for unauthenticated requests).
2. Entries older than the 60-second window are removed.
3. The total count of entries in the window is compared against the limit.
4. If the count exceeds the limit, the request is rejected with 429 Too Many Requests.

The sliding window approach avoids the boundary-burst problem of fixed windows, where a client can spend a full quota at the end of one window and another full quota at the start of the next, briefly reaching double the intended rate.
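The four steps above can be sketched in Python. This is an in-memory stand-in, not ARX's implementation: a plain list plays the role of the Redis sorted set, and the class and method names are hypothetical. The comments name the Redis sorted-set commands that would typically cover each step.

```python
import time
from collections import defaultdict
from typing import Dict, List, Optional

WINDOW_SECONDS = 60.0  # window length stated above


class SlidingWindowLimiter:
    """In-memory sketch of a sliding window limiter (hypothetical class)."""

    def __init__(self, window: float = WINDOW_SECONDS) -> None:
        self.window = window
        # key -> request timestamps; stands in for one sorted set per key
        self._entries: Dict[str, List[float]] = defaultdict(list)

    def allow(self, key: str, limit: int, now: Optional[float] = None) -> bool:
        now = time.time() if now is None else now
        # Step 1: record this request (ZADD with the timestamp as score).
        self._entries[key].append(now)
        # Step 2: drop entries older than the window (ZREMRANGEBYSCORE).
        cutoff = now - self.window
        self._entries[key] = [t for t in self._entries[key] if t > cutoff]
        # Step 3: count what is left in the window (ZCARD).
        count = len(self._entries[key])
        # Step 4: reject only when the count exceeds the limit.
        return count <= limit


limiter = SlidingWindowLimiter()
# With a limit of 3, the fourth request inside the window is rejected.
print([limiter.allow("demo", 3, now=float(t)) for t in range(4)])
```

Because the request is recorded before the count is checked, rejected requests also occupy the window in this sketch, which mirrors the order of the steps as listed.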

Rate Limit Keys

Context	Key Format
Authenticated (org-scoped)	`arxsec:ratelimit:org:<org_id>:api` or `arxsec:ratelimit:org:<org_id>:auth`
Unauthenticated (IP-scoped)	`arxsec:ratelimit:ip:<client_ip>:api` or `arxsec:ratelimit:ip:<client_ip>:auth`

Authenticated requests are rate-limited by organization, so all users in the same org share the org's rate limit pool. Unauthenticated requests are rate-limited by client IP, taken from the `X-Forwarded-For` header (or the direct client address if the header is absent).
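A minimal sketch of how such keys could be assembled follows. The function names are hypothetical, and taking the leftmost `X-Forwarded-For` entry as the client address is an assumption; the text above does not say which entry ARX uses.

```python
from typing import Mapping, Optional

KEY_PREFIX = "arxsec:ratelimit"  # prefix from the key formats above


def client_ip(headers: Mapping[str, str], peer_addr: str) -> str:
    """Pick the client IP, falling back to the direct peer address.

    Assumption: the leftmost X-Forwarded-For entry is the client.
    Header lookup here is case-sensitive for simplicity.
    """
    forwarded = headers.get("X-Forwarded-For", "")
    first = forwarded.split(",")[0].strip()
    return first or peer_addr


def rate_limit_key(
    category: str,
    org_id: Optional[str],
    headers: Mapping[str, str],
    peer_addr: str,
) -> str:
    """Build an org-scoped key when authenticated, else an IP-scoped key."""
    if org_id is not None:
        return f"{KEY_PREFIX}:org:{org_id}:{category}"
    return f"{KEY_PREFIX}:ip:{client_ip(headers, peer_addr)}:{category}"


# Hypothetical org ID and documentation-range IP addresses.
print(rate_limit_key("api", "org_123", {}, "10.0.0.5"))
print(rate_limit_key("auth", None, {"X-Forwarded-For": "203.0.113.7, 10.0.0.1"}, "10.0.0.5"))
```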

Custom Per-Organization Limits

Organizations can have custom rate limits stored in the orgs table:

Column	Description
`rate_limit_rpm`	Custom API requests-per-minute limit
`rate_limit_auth_rpm`	Custom auth requests-per-minute limit

If these columns are NULL, the default limits apply. Custom limits are cached in memory with a 5-minute TTL to minimize database lookups.

To set custom limits:

UPDATE orgs
SET rate_limit_rpm = 5000, rate_limit_auth_rpm = 500
WHERE id = '<org-id>';

The cache refreshes automatically; changes take effect within 5 minutes.
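The lookup-with-fallback behavior described in this section can be sketched as a small TTL cache. Everything named here is hypothetical: `OrgLimitCache`, the `load_row` callback standing in for the database query, and the choice to fall back to the default for each column independently.

```python
import time
from typing import Callable, Dict, Optional, Tuple

DEFAULT_RPM = 1000        # default API limit
DEFAULT_AUTH_RPM = 100    # default auth limit
CACHE_TTL_SECONDS = 300   # 5-minute TTL

Limits = Tuple[int, int]                   # (api rpm, auth rpm)
Row = Tuple[Optional[int], Optional[int]]  # (rate_limit_rpm, rate_limit_auth_rpm)


class OrgLimitCache:
    """In-memory TTL cache for per-org limits (illustrative sketch)."""

    def __init__(
        self,
        load_row: Callable[[str], Row],
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._load_row = load_row  # hypothetical stand-in for the DB lookup
        self._clock = clock
        self._cache: Dict[str, Tuple[float, Limits]] = {}

    def get(self, org_id: str) -> Limits:
        now = self._clock()
        hit = self._cache.get(org_id)
        if hit is not None and hit[0] > now:
            return hit[1]  # still fresh, skip the database
        rpm, auth_rpm = self._load_row(org_id)
        # Assumption: a NULL (None) column falls back to its own default.
        limits = (
            DEFAULT_RPM if rpm is None else rpm,
            DEFAULT_AUTH_RPM if auth_rpm is None else auth_rpm,
        )
        self._cache[org_id] = (now + CACHE_TTL_SECONDS, limits)
        return limits


cache = OrgLimitCache(lambda org_id: (5000, None))
print(cache.get("example-org"))
```

A cache like this is why an UPDATE is not visible immediately: a value read just before the change stays in memory until its entry expires.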

Response Headers

Every response includes rate limit headers:

Header	Description
`X-RateLimit-Limit`	The maximum number of requests allowed in the current window
`X-RateLimit-Remaining`	The number of requests remaining in the current window
`X-RateLimit-Reset`	Unix timestamp when the current window resets

When a request is throttled, the response also includes:

Header	Description
`Retry-After`	Number of seconds to wait before retrying
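To make the relationship between these headers concrete, here is a hedged sketch that derives them from a limit, a current count, and a reset time. The function name and parameters are hypothetical, and the reset time is taken as an input because the text above does not say how ARX computes it.

```python
import math
from typing import Dict


def rate_limit_headers(
    limit: int, count: int, window_reset_at: float, now: float
) -> Dict[str, str]:
    """Build rate limit headers for one response (illustrative only)."""
    headers = {
        "X-RateLimit-Limit": str(limit),
        # Never report a negative remaining count.
        "X-RateLimit-Remaining": str(max(0, limit - count)),
        "X-RateLimit-Reset": str(math.ceil(window_reset_at)),
    }
    if count > limit:
        # Throttled: tell the client how many whole seconds to wait.
        wait = max(1, math.ceil(window_reset_at - now))
        headers["Retry-After"] = str(wait)
    return headers


# Example values are made up for illustration.
print(rate_limit_headers(1000, 250, 1700000060.0, 1700000000.0))
print(rate_limit_headers(100, 101, 1700000060.0, 1700000030.5))
```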

Handling 429 Responses

When you receive a 429 Too Many Requests response:

{
  "detail": "Rate limit exceeded. Please retry after the specified time."
}

Recommended client behavior:

- Read the `Retry-After` header to determine the wait time.
- Implement exponential backoff with jitter for repeated 429s.
- Avoid tight retry loops: each retry is itself recorded in the sliding window, so rapid retries extend the throttling period.
- For batch operations, spread requests evenly across the window rather than sending bursts.
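One way a client could combine `Retry-After` with exponential backoff and jitter is sketched below. The helper names, the base and cap values, and the shape of the `send` callable (returning a status code and a header mapping) are all assumptions for illustration, not part of any ARX client library.

```python
import random
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers)


def backoff_delay(
    attempt: int,
    retry_after: float,
    base: float = 1.0,
    cap: float = 60.0,
    rng: Callable[[], float] = random.random,
) -> float:
    """Wait at least retry_after, plus jitter that grows with each attempt."""
    exponential = min(cap, base * (2 ** attempt))
    return retry_after + rng() * exponential


def send_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send() until it stops returning 429 or attempts run out."""
    for attempt in range(max_attempts):
        status, headers = send()
        if status != 429 or attempt == max_attempts - 1:
            return status, headers
        # Retry-After is documented above as a number of seconds.
        retry_after = float(headers.get("Retry-After", "0"))
        sleep(backoff_delay(attempt, retry_after))
    raise ValueError("max_attempts must be at least 1")


# Simulated server: two throttled responses, then success.
replies = iter([(429, {"Retry-After": "2"}), (429, {"Retry-After": "1"}), (200, {})])
waits = []
print(send_with_retry(lambda: next(replies), sleep=waits.append)[0], len(waits))
```

Adding the jitter on top of `Retry-After`, rather than replacing it, keeps the server's minimum wait while preventing many clients from retrying at the same instant.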

Bypassed Paths

The following paths are exempt from rate limiting:

Path	Reason
`/health`	Health check endpoint for load balancers
`/docs`	API documentation
`/redoc`	Alternative API documentation
`/openapi.json`	OpenAPI schema
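A bypass check for these paths could be as small as the sketch below. The names are hypothetical, and exact matching is an assumption: the table does not say whether sub-paths are also exempt.

```python
# Paths from the table above.
EXEMPT_PATHS = frozenset({"/health", "/docs", "/redoc", "/openapi.json"})


def is_exempt(path: str) -> bool:
    """Return True when the path skips rate limiting (exact match assumed)."""
    return path in EXEMPT_PATHS


print(is_exempt("/health"), is_exempt("/v1/auth/login"))
```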

Fail-Open Behavior

If Redis is unavailable, the rate limiter fails open: requests are allowed through without rate limiting, so a Redis outage does not become a complete platform outage. A warning (`rate_limit.redis_unavailable`) is logged when this occurs.
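The fail-open pattern can be sketched as a thin wrapper around the limit check. The function name and the `check` callable are hypothetical, and the built-in exception types are used only to keep the example self-contained.

```python
import logging
from typing import Callable

logger = logging.getLogger("rate_limit")


def allow_request(check: Callable[[], bool]) -> bool:
    """Run the limiter check, allowing the request if the backend is down.

    `check` is a hypothetical callable that returns True when the request
    is within its limit and raises when the backing store is unreachable.
    With redis-py, the exceptions to catch would instead be
    redis.exceptions.ConnectionError and redis.exceptions.TimeoutError.
    """
    try:
        return check()
    except (ConnectionError, TimeoutError) as exc:
        # Event name taken from the text above.
        logger.warning("rate_limit.redis_unavailable: %s", exc)
        return True  # fail open


def redis_down() -> bool:
    raise ConnectionError("connection refused")


print(allow_request(redis_down))       # backend down, request allowed
print(allow_request(lambda: False))    # backend up, limit exceeded
```

The trade-off is that limits are not enforced during an outage, which is why the warning is worth alerting on.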

Redis Configuration

Rate limiting requires a Redis instance. Configure the connection via environment variable:

REDIS_URL=redis://localhost:6379

The Redis client uses a 2-second connection and socket timeout to avoid blocking request processing.
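As an illustration of how these settings might be gathered in application code, the sketch below reads `REDIS_URL` and pairs it with the two timeouts. The function name is hypothetical, and falling back to the localhost URL when the variable is unset is an assumption.

```python
import os
from typing import Any, Dict, Mapping

DEFAULT_REDIS_URL = "redis://localhost:6379"  # assumed fallback
TIMEOUT_SECONDS = 2.0                         # timeout stated above


def redis_client_settings(env: Mapping[str, str] = os.environ) -> Dict[str, Any]:
    """Collect connection settings for the rate limiter's Redis client."""
    return {
        "url": env.get("REDIS_URL", DEFAULT_REDIS_URL),
        "socket_connect_timeout": TIMEOUT_SECONDS,
        "socket_timeout": TIMEOUT_SECONDS,
    }


settings = redis_client_settings({"REDIS_URL": "redis://cache.internal:6379"})
print(settings)
# With redis-py installed, these could be passed along roughly as:
#   url = settings.pop("url")
#   client = redis.Redis.from_url(url, **settings)
```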