Designing rate limits for auth endpoints: the numbers that actually matter

Auth endpoints are the most attack-sensitive endpoints in any application. They are targeted by credential stuffing, brute force, account enumeration, and SMS pumping — and they serve legitimate users simultaneously. The rate limits you set directly determine whether attacks are feasible and whether real users experience friction. Picking numbers without understanding what they are protecting against leads to either over-permissive limits that let attacks proceed or over-restrictive limits that break legitimate flows.

The login endpoint

The login endpoint accepts email and password. The threat is credential stuffing: automated submission of username/password pairs from breach databases. Left unconstrained, a credential stuffing tool can test on the order of 1,000 credentials per second.

A rate limit of 5 failed attempts per minute per IP address stops low-sophistication attacks. But as covered elsewhere, IP-based limiting alone is insufficient because attackers rotate IPs. The more effective limit is 5 failed attempts per account per 10-minute window, regardless of source IP. This caps any single account at 30 guesses per hour no matter how many IPs the attacker controls, which slows the attack and leaves more time for detection.

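// Rate limiter using a sliding-window log stored in Redis sorted sets
// Two dimensions: per-IP and per-account
// Note: this counts every attempt; to count only failures, record the entry after the password check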
class AuthRateLimiter {
  constructor(private redis: Redis) {}

  async checkLogin(ip: string, email: string): Promise<RateLimitResult> {
    const now = Date.now();
    const windowMs = 10 * 60 * 1000;  // 10-minute window
    const windowStart = now - windowMs;

    // Check per-IP limit (5/min = 50/10min)
    const ipKey = `rl:login:ip:${ip}`;
    const ipCount = await this.countAndAdd(ipKey, now, windowStart, 50);
    if (ipCount.count > 50) {
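      // Report the time until the oldest entry leaves the window, not a fixed penalty
      return { allowed: false, reason: 'ip_limit', retryAfterMs: ipCount.oldestEntryMs + windowMs - now };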
    }

    // Check per-account limit (5/10min)
    const accountKey = `rl:login:account:${email.toLowerCase()}`;
    const accountCount = await this.countAndAdd(accountKey, now, windowStart, 5);
    if (accountCount.count > 5) {
      return {
        allowed: false,
        reason: 'account_limit',
        retryAfterMs: accountCount.oldestEntryMs + windowMs - now
      };
    }

    return { allowed: true };
  }

  private async countAndAdd(
    key: string,
    now: number,
    windowStart: number,
    limit: number
  ) {
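    const pipe = this.redis.pipeline();
    pipe.zremrangebyscore(key, 0, windowStart);       // drop entries older than the window
    pipe.zadd(key, now, `${now}-${Math.random()}`);   // record this attempt under a unique member
    pipe.zcard(key);                                  // attempts in the window, including this one
    pipe.zrange(key, 0, 0, 'WITHSCORES');             // oldest surviving entry and its timestamp
    pipe.expire(key, 600);                            // let idle keys expire after the 10-minute window
    const results = await pipe.exec();
    if (!results) throw new Error('Redis pipeline returned no results');

    // Each pipeline result is an [error, value] pair
    const count = results[2][1] as number;
    const oldest = results[3][1] as string[];
    const oldestEntryMs = oldest.length ? parseInt(oldest[1], 10) : now;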

    return { count, oldestEntryMs };
  }
}
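
The sorted-set operations above amount to a sliding-window log: drop entries older than the window, record the new attempt, and count what remains. As a rough illustration, the same logic can be sketched in memory, which is convenient for unit tests and for reasoning about edge cases. `SlidingWindowLog` and its `hit` method are hypothetical names, and this version is single-process only. Unlike the pipeline above, it does not record rejected attempts, so the reported retry time is exactly when the oldest entry ages out.

```typescript
// In-memory sketch of a sliding-window log (hypothetical helper, single process only).
interface WindowResult {
  allowed: boolean;
  count: number;        // attempts currently inside the window
  retryAfterMs: number; // 0 when the attempt is allowed
}

class SlidingWindowLog {
  private entries = new Map<string, number[]>();
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Register an attempt for `key` at time `now` (epoch ms).
  hit(key: string, now: number): WindowResult {
    const windowStart = now - this.windowMs;
    // Keep only timestamps still inside the window (same effect as ZREMRANGEBYSCORE).
    const recent = (this.entries.get(key) ?? []).filter((t) => t > windowStart);

    if (recent.length >= this.limit) {
      // Rejected attempts are not recorded; retry once the oldest entry expires.
      this.entries.set(key, recent);
      const oldest = recent[0] ?? now;
      return { allowed: false, count: recent.length, retryAfterMs: oldest + this.windowMs - now };
    }

    recent.push(now);
    this.entries.set(key, recent);
    return { allowed: true, count: recent.length, retryAfterMs: 0 };
  }
}
```

With a limit of 5 per 10 minutes, the sixth attempt inside the window is rejected, and an attempt made `retryAfterMs` later is accepted again.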

The token endpoint

The OAuth token endpoint is different from the login endpoint. It serves multiple grant types: authorization code, refresh token, client credentials. The appropriate limit depends on the client type.

Authorization code grants: these are one-shot — each code can only be used once. A limit of 10 per minute per client is generous. If a client is exchanging codes faster than 10/min, something is wrong.
Refresh token grants: clients typically refresh when the access token expires, for example every 15 minutes. A user with 5 open tabs, each refreshing at token expiry, produces 5 near-simultaneous requests. A limit of 60 per minute per user per client leaves ample headroom for most applications.
Client credentials grants: these are machine-to-machine. The appropriate limit depends on the client's use case. A service that calls the token endpoint to refresh its own access token once per hour needs a very different limit than a service that issues tokens to end users on their behalf. Default to 100/minute per client and make it configurable.

// Token endpoint rate limiting — per-client and per-user
app.post('/oauth/token', async (req, res) => {
  const { grant_type, client_id } = req.body;

  // Per-client rate limit
  const clientLimit = await rateLimiter.check(
    `rl:token:client:${client_id}`,
    { limit: 100, windowSeconds: 60 }
  );

  if (!clientLimit.allowed) {
    return res.status(429)
      .set('Retry-After', String(clientLimit.retryAfterSeconds))
      .json({ error: 'rate_limit_exceeded' });
  }

  if (grant_type === 'refresh_token') {
    const token = await validateRefreshToken(req.body.refresh_token);
    // Per-user rate limit for refresh grants
    const userLimit = await rateLimiter.check(
      `rl:token:user:${token.user_id}:${client_id}`,
      { limit: 60, windowSeconds: 60 }
    );
    if (!userLimit.allowed) {
      return res.status(429)
        .set('Retry-After', String(userLimit.retryAfterSeconds))
        .json({ error: 'rate_limit_exceeded' });
    }
  }

  // ... process grant
});
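
Because the right limit differs by grant type, one option is to keep the defaults in a lookup table keyed by `grant_type`, with per-client overrides for client credentials. The following is a minimal sketch, not a definitive design: `GRANT_LIMITS`, `limitFor`, and the `overrides` map are hypothetical names, and the numbers are simply the defaults suggested in this section.

```typescript
type GrantType = 'authorization_code' | 'refresh_token' | 'client_credentials';

interface LimitConfig {
  limit: number;
  windowSeconds: number;
  // Which identifier the counter is keyed on
  scope: 'client' | 'user_and_client';
}

// Default limits per grant type (values taken from the text above)
const GRANT_LIMITS: Record<GrantType, LimitConfig> = {
  authorization_code: { limit: 10, windowSeconds: 60, scope: 'client' },
  refresh_token: { limit: 60, windowSeconds: 60, scope: 'user_and_client' },
  client_credentials: { limit: 100, windowSeconds: 60, scope: 'client' },
};

// Resolve the limit for a request. Only client_credentials is configurable
// per client in this sketch; the other grant types always use the defaults.
function limitFor(
  grantType: GrantType,
  clientId: string,
  overrides: Map<string, number> = new Map()
): LimitConfig {
  const base = GRANT_LIMITS[grantType];
  const override = grantType === 'client_credentials' ? overrides.get(clientId) : undefined;
  return override === undefined ? base : { ...base, limit: override };
}
```

A handler could call `limitFor(grant_type, client_id, overrides)` and pass the result to the rate limiter instead of hardcoding a single limit for every grant type.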

The registration endpoint

Registration endpoints are targets for account farming — creating large numbers of accounts for spam, fraud, or to exhaust free tier credits. They are also targeted by phone number verification abuse (SMS pumping), where attackers trigger verification SMS to expensive phone numbers to generate carrier revenue.

Effective limits: 3 registrations per hour per IP, 5 registrations per hour per email domain (for domains not on your allowlist of major providers), and 1 SMS verification per phone number per 10 minutes, capped at 3 per phone number per day. The SMS limit is critical because SMS pumping attacks send thousands of verification messages in minutes.

// Registration rate limits
async function checkRegistrationLimits(
  ip: string,
  email: string,
  phone?: string
): Promise<void> {
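  // Take the part after the last '@' and lowercase it so lookups and keys are case-insensitive
  const emailDomain = email.slice(email.lastIndexOf('@') + 1).toLowerCase();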
  const isMajorProvider = MAJOR_EMAIL_PROVIDERS.has(emailDomain);

  // 3 new accounts per hour per IP
  await enforceLimit(`rl:reg:ip:${ip}`, 3, 3600);

  // For non-major email providers, limit by domain (catches disposable email services)
  if (!isMajorProvider) {
    await enforceLimit(`rl:reg:domain:${emailDomain}`, 5, 3600);
  }

  // SMS verification: 1 per phone per 10 minutes, 3 per phone per day
  if (phone) {
    const normalizedPhone = normalizePhone(phone);
    await enforceLimit(`rl:sms:10m:${normalizedPhone}`, 1, 600);
    await enforceLimit(`rl:sms:day:${normalizedPhone}`, 3, 86400);
  }
}

async function enforceLimit(key: string, limit: number, windowSeconds: number) {
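  const count = await redis.incr(key);
  // Start the window on the first hit. INCR and EXPIRE are separate commands, so also
  // repair a key that lost its TTL (for example, after a crash between the two calls).
  if (count === 1 || (await redis.ttl(key)) === -1) {
    await redis.expire(key, windowSeconds);
  }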
  if (count > limit) {
    throw new RateLimitError(key, windowSeconds);
  }
}

MFA verification endpoints

TOTP and SMS OTP codes are 6-digit numbers, so there are 1,000,000 possible values. Against a fixed code, an attacker would need about 500,000 guesses on average; at 1 attempt per second, that is roughly 6 days of continuous guessing. TOTP codes rotate every 30 seconds, so the attacker never gets that long against a single code: each guess has about a 1-in-1,000,000 chance of success (somewhat higher if the server also accepts codes from adjacent windows). Brute force is impractical as long as you limit attempts to around 3–5 per code window.
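
To put rough numbers on this, the sketch below estimates an attacker's chance of success when each code window allows a fixed number of guesses. It assumes guesses are distinct within a window, each window's code is independent and uniformly random, and the server accepts only the current code. The function name is illustrative.

```typescript
const CODE_SPACE = 1000000; // number of possible 6-digit codes

// Probability that at least one guess succeeds across `windows` code windows,
// when the attacker gets `attemptsPerWindow` distinct guesses in each window.
function bruteForceSuccessProbability(attemptsPerWindow: number, windows: number): number {
  const perWindow = Math.min(attemptsPerWindow, CODE_SPACE) / CODE_SPACE;
  return 1 - Math.pow(1 - perWindow, windows);
}

// A full day of 30-second windows at 3 guesses per window
const windowsPerDay = (24 * 60 * 60) / 30; // 2880
const dailyOdds = bruteForceSuccessProbability(3, windowsPerDay); // about 0.0086
```

Under these assumptions, 3 guesses per window sustained for a full day succeed a little under 1% of the time, which is one reason to also cap attempts per login session rather than per window alone.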

// MFA attempt limiting
async function verifyMfaCode(
  userId: string,
  code: string,
  sessionId: string
): Promise<boolean> {
  // Refuse immediately if this MFA session has already been locked
  if (await redis.exists(`mfa:locked:${sessionId}`)) {
    throw new MfaLockedError('Too many attempts. Please restart login.');
  }

  const attemptKey = `mfa:attempts:${userId}:${sessionId}`;
  const attempts = await redis.incr(attemptKey);

  if (attempts === 1) {
    // Expire after the MFA session timeout (10 minutes)
    await redis.expire(attemptKey, 600);
  }

  if (attempts > 3) {
    // Lock out this MFA session; the user must restart login
    await redis.setex(`mfa:locked:${sessionId}`, 300, '1');
    throw new MfaLockedError('Too many attempts. Please restart login.');
  }

  const valid = await validateTotp(userId, code);

  if (valid) {
    // Clear attempt counter on success
    await redis.del(attemptKey);
  }

  return valid;
}

When an MFA code attempt fails, do not reveal how many attempts remain before lockout. Revealing "1 attempt remaining" helps an attacker calibrate their timing. Return a generic "invalid code" message until the session is locked.

Responding to rate limit violations

Always return a Retry-After header. Clients need to know when they can retry, and without this header they either retry immediately (hammering your rate limiter) or use an arbitrary backoff. The Retry-After value should be the remaining time in the current window, not a fixed penalty time.

Use HTTP 429 with a consistent application/json body that includes an error code, a human-readable message, and the retry timing. Include the limit and current usage in response headers for developer debugging:

HTTP/1.1 429 Too Many Requests
Retry-After: 47
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1664236847
Content-Type: application/json

{
  "error": "rate_limit_exceeded",
  "message": "Too many login attempts. Retry after 47 seconds.",
  "retry_after": 47
}
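
A small helper can keep these headers and the body consistent across endpoints. The sketch below derives `Retry-After` from the time remaining in the current window rather than a fixed penalty. `buildRateLimitResponse` and the `RateLimitState` shape are assumptions made for illustration, and the `X-RateLimit-*` headers are a widely used convention rather than a formal standard.

```typescript
interface RateLimitState {
  limit: number;
  remaining: number;
  resetAtMs: number; // epoch ms at which the current window ends
}

interface RateLimitResponse {
  status: 429;
  headers: Record<string, string>;
  body: { error: string; message: string; retry_after: number };
}

function buildRateLimitResponse(
  state: RateLimitState,
  nowMs: number,
  action: string = 'requests'
): RateLimitResponse {
  // Round up so clients never retry before the window has reset; minimum 1 second.
  const retryAfter = Math.max(1, Math.ceil((state.resetAtMs - nowMs) / 1000));
  return {
    status: 429,
    headers: {
      'Retry-After': String(retryAfter),
      'X-RateLimit-Limit': String(state.limit),
      'X-RateLimit-Remaining': String(Math.max(0, state.remaining)),
      'X-RateLimit-Reset': String(Math.ceil(state.resetAtMs / 1000)), // epoch seconds
      'Content-Type': 'application/json',
    },
    body: {
      error: 'rate_limit_exceeded',
      message: `Too many ${action}. Retry after ${retryAfter} seconds.`,
      retry_after: retryAfter,
    },
  };
}
```

In an Express handler this could be applied as `res.status(r.status).set(r.headers).json(r.body)`, where `r` is the helper's return value.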
