Mining your auth logs: detecting attacks from patterns you're already collecting

Every login attempt, successful or failed, contains information about what your auth system is experiencing. Most applications log this data and never query it for patterns. The logs sit in a datastore accumulating evidence of ongoing attacks that no one is looking at. The signals needed to detect credential stuffing, account takeover, and mass enumeration are already in your auth logs — they just need queries written against them.

What to log for every auth event

Useful auth log analysis requires structured logs, not just plaintext. At minimum, log these fields for every login attempt:

// Structured auth event log schema
interface AuthEvent {
  eventType: 'login_success' | 'login_failure' | 'mfa_success' | 'mfa_failure'
             | 'password_reset_requested' | 'session_created' | 'session_revoked';
  timestamp: string;       // ISO 8601
  userId: string | null;   // null for failed attempts with nonexistent accounts
  email: string;           // Always log, even for failures
  ip: string;
  country: string | null;  // GeoIP resolution
  region: string | null;
  asn: string | null;      // Autonomous system number — identifies hosting vs residential
  userAgent: string;
  deviceFingerprint: string | null;
  sessionId: string | null;
  failureReason: string | null;  // 'wrong_password', 'account_locked', 'mfa_failed', etc.
  riskScore: number | null;
}
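To make these fields consistent at write time, it helps to normalize raw request data through one helper before anything hits the log store. A minimal sketch, assuming a GeoIP lookup has already run upstream (the `GeoInfo` shape and option names here are illustrative, not part of the schema above):

```typescript
// Hypothetical normalizer producing the AuthEvent shape above.
type AuthEventType =
  | 'login_success' | 'login_failure' | 'mfa_success' | 'mfa_failure'
  | 'password_reset_requested' | 'session_created' | 'session_revoked';

interface GeoInfo { country: string; region: string; asn: string; }

function buildAuthEvent(
  eventType: AuthEventType,
  email: string,
  ip: string,
  userAgent: string,
  opts: { userId?: string; geo?: GeoInfo; failureReason?: string } = {},
) {
  return {
    eventType,
    timestamp: new Date().toISOString(), // ISO 8601, UTC
    userId: opts.userId ?? null,         // null when the account doesn't exist
    email,                               // Always present, even for failures
    ip,
    country: opts.geo?.country ?? null,
    region: opts.geo?.region ?? null,
    asn: opts.geo?.asn ?? null,
    userAgent,
    deviceFingerprint: null,             // Populated by client-side collection if available
    sessionId: null,
    failureReason: opts.failureReason ?? null,
    riskScore: null,
  };
}
```

Centralizing construction this way means every code path that records an auth event emits the same fields, so the queries below never have to special-case missing columns.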

Failed login velocity analysis

The most basic detection query: how many failed logins occurred in the last hour, broken down by account and by IP? Elevated counts on either dimension are a signal.

-- PostgreSQL: accounts with high failure rate in last hour
SELECT
  email,
  COUNT(*) AS failures,
  COUNT(DISTINCT ip) AS distinct_ips,
  MIN(timestamp) AS first_attempt,
  MAX(timestamp) AS last_attempt
FROM auth_events
WHERE
  event_type = 'login_failure'
  AND timestamp > NOW() - INTERVAL '1 hour'
GROUP BY email
HAVING COUNT(*) > 10
ORDER BY failures DESC;
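What you do with the flagged accounts matters as much as finding them. Locking an account that many IPs are hammering punishes the victim, so a distributed attack should trigger step-up auth rather than lockout. A sketch of that decision, consuming rows from the query above (the thresholds and action names are illustrative assumptions):

```typescript
// Decide a per-account response from the failure-velocity query results.
// Thresholds here are starting points, not tuned values.
interface AccountFailureRow {
  email: string;
  failures: number;
  distinct_ips: number;
}

type ResponseAction = 'require_mfa' | 'captcha' | 'none';

function chooseResponse(row: AccountFailureRow): ResponseAction {
  // Many source IPs against one account: distributed attack.
  // Step up auth instead of locking out the legitimate owner.
  if (row.distinct_ips >= 5) return 'require_mfa';
  // Single-source brute force: a captcha stops the tool cheaply.
  if (row.failures > 20) return 'captcha';
  return 'none';
}
```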

-- IPs with distributed targeting (low per-account, high total)
SELECT
  ip,
  country,
  asn,
  COUNT(*) AS total_failures,
  COUNT(DISTINCT email) AS distinct_accounts_targeted,
  ROUND(COUNT(*)::numeric / COUNT(DISTINCT email), 1) AS avg_attempts_per_account
FROM auth_events
WHERE
  event_type = 'login_failure'
  AND timestamp > NOW() - INTERVAL '1 hour'
GROUP BY ip, country, asn
HAVING COUNT(DISTINCT email) > 50  -- Targeting many accounts
ORDER BY distinct_accounts_targeted DESC;
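The distributed-targeting rows can feed an automated throttle, but blocking purely on volume catches shared NATs and corporate egress IPs. One hedged approach: only auto-throttle when the attack shape (many accounts, few attempts each) coincides with a hosting-provider ASN, where legitimate login traffic is rare. The ASN list and thresholds below are illustrative:

```typescript
// Sketch: select IPs to rate-limit from the distributed-targeting query.
interface IpFailureRow {
  ip: string;
  asn: string | null;
  total_failures: number;
  distinct_accounts_targeted: number;
}

// Example hosting ASNs (AWS, DigitalOcean) — maintain a real list in production
const HOSTING_ASNS = new Set(['AS16509', 'AS14061']);

function selectIpsToThrottle(rows: IpFailureRow[]): string[] {
  return rows
    .filter(r =>
      // Many accounts with few attempts each: classic credential-stuffing shape
      r.distinct_accounts_targeted >= 50 &&
      r.total_failures / r.distinct_accounts_targeted <= 3 &&
      // Only auto-throttle known hosting ASNs; residential ranges get a
      // softer response (e.g. captcha) to limit collateral damage
      r.asn !== null && HOSTING_ASNS.has(r.asn)
    )
    .map(r => r.ip);
}
```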

Geolocation clustering and impossible travel

Impossible travel detection: a successful login from New York followed by a successful login from Tokyo 30 minutes later is physically impossible. This is a strong signal of credential sharing or account compromise, though VPNs and corporate egress proxies do produce false positives, so treat a hit as a step-up trigger rather than an automatic block. Implement it as a post-login check against the last known login location.

// TypeScript: check for impossible travel after successful login
async function checkImpossibleTravel(userId: string, currentIP: string): Promise<boolean> {
  const lastLogin = await db.query(`
    SELECT ip, country, timestamp
    FROM auth_events
    WHERE user_id = $1
      AND event_type = 'login_success'
      AND timestamp > NOW() - INTERVAL '24 hours'
    ORDER BY timestamp DESC
    LIMIT 1
  `, [userId]);

  if (!lastLogin.rows.length) return false;

  const { ip: lastIP, country: lastCountry, timestamp: lastTime } = lastLogin.rows[0];
  // Resolve the promise first; optional chaining on the unresolved promise
  // would always yield undefined
  const geo = await geoip.lookup(currentIP);
  const currentCountry = geo?.country ?? null;

  if (!currentCountry || !lastCountry) return false;
  if (currentCountry === lastCountry) return false;

  // Calculate minimum travel time between countries (rough approximation)
  const minTravelHours = getMinTravelTime(lastCountry, currentCountry);
  const hoursSinceLastLogin = (Date.now() - new Date(lastTime).getTime()) / 3600000;

  return hoursSinceLastLogin < minTravelHours;
}
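The getMinTravelTime helper above is doing the real work. One rough way to implement it, assuming nothing more precise than country-level geolocation: great-circle distance between country centroids divided by commercial flight speed. The centroid table here is a tiny illustrative subset, not production data:

```typescript
// Sketch of getMinTravelTime: haversine distance between country centroids
// at ~900 km/h. Illustrative centroids only — use a full dataset in practice.
const COUNTRY_CENTROIDS: Record<string, [number, number]> = {
  US: [39.8, -98.6],
  JP: [36.2, 138.3],
  GB: [54.0, -2.9],
};

function haversineKm(
  [lat1, lon1]: [number, number],
  [lat2, lon2]: [number, number],
): number {
  const toRad = (d: number) => (d * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 6371 * 2 * Math.asin(Math.sqrt(a)); // Earth radius ≈ 6371 km
}

function getMinTravelTime(countryA: string, countryB: string): number {
  const a = COUNTRY_CENTROIDS[countryA];
  const b = COUNTRY_CENTROIDS[countryB];
  if (!a || !b) return 0; // Unknown country: don't flag
  return haversineKm(a, b) / 900; // Minimum hours at ~900 km/h
}
```

Centroid distance understates travel time for large countries (a login from Hawaii vs. Maine), which is fine for this purpose: the check should only fire on clearly impossible transitions.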

User-agent anomaly detection

Browser-based logins from real users have User-Agent strings that match established browser versions. Bot traffic often uses outdated UA strings, blank UAs, or library-specific strings like python-requests/2.28.0. Compare the UA distribution of failed logins to successful logins — a sudden spike in a specific UA in the failure population indicates automated tooling.

-- User-agents appearing disproportionately in failures
SELECT
  user_agent,
  SUM(CASE WHEN event_type = 'login_failure' THEN 1 ELSE 0 END) AS failures,
  SUM(CASE WHEN event_type = 'login_success' THEN 1 ELSE 0 END) AS successes,
  ROUND(
    SUM(CASE WHEN event_type = 'login_failure' THEN 1.0 ELSE 0 END) /
    NULLIF(COUNT(*), 0) * 100, 1
  ) AS failure_pct
FROM auth_events
WHERE timestamp > NOW() - INTERVAL '24 hours'
GROUP BY user_agent
HAVING COUNT(*) > 100
ORDER BY failure_pct DESC;
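The UAs surfaced by that query can then be triaged cheaply: library and headless-tool strings are strong bot indicators on their own. A minimal sketch — the pattern list is an illustrative starting point, not exhaustive:

```typescript
// Cheap UA triage for rows flagged by the failure-ratio query.
const BOT_UA_PATTERNS: RegExp[] = [
  /^$/,               // Blank UA
  /python-requests/i, // Python requests library
  /curl\//i,
  /go-http-client/i,
  /okhttp/i,          // Common in mobile scripting frameworks
  /headlesschrome/i,  // Headless browser automation
];

function looksAutomated(userAgent: string): boolean {
  return BOT_UA_PATTERNS.some(p => p.test(userAgent.trim()));
}
```

Note that sophisticated attackers spoof current browser UAs, so an empty result from this pass is not evidence of legitimacy; it just filters out the lazy tooling quickly.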

Time-of-day baselines

For a given user or org, login activity has a predictable pattern. An enterprise SaaS product with US customers will see near-zero logins between 2am and 6am Eastern. A sudden spike at 3am suggests automated access. Build per-org baselines and alert when activity deviates significantly.

// Build hourly baseline for an org (last 30 days)
async function getOrgLoginBaseline(orgId: string) {
  const result = await db.query(`
    SELECT
      EXTRACT(HOUR FROM ae.timestamp AT TIME ZONE o.org_timezone) AS hour,
      EXTRACT(DOW FROM ae.timestamp AT TIME ZONE o.org_timezone) AS day_of_week,
      COUNT(*) AS login_count
    FROM auth_events ae
    JOIN users u ON ae.user_id = u.id
    JOIN orgs o ON u.org_id = o.id
    WHERE
      o.id = $1
      AND ae.event_type = 'login_success'
      AND ae.timestamp > NOW() - INTERVAL '30 days'
    GROUP BY hour, day_of_week
    ORDER BY day_of_week, hour
  `, [orgId]);

  return result.rows;
}

// Alert if the current window is significantly above baseline
async function checkAnomalousLoginVolume(orgId: string, orgTimezone: string) {
  // The baseline is bucketed in the org's local time, so compare in that
  // timezone rather than UTC
  const nowLocal = new Date(
    new Date().toLocaleString('en-US', { timeZone: orgTimezone })
  );
  const currentHour = nowLocal.getHours();
  const currentDOW = nowLocal.getDay();
  const baseline = await getOrgLoginBaseline(orgId);

  // node-postgres returns EXTRACT() values as strings, so coerce before comparing
  const baselineForNow = baseline.find(
    b => Number(b.hour) === currentHour && Number(b.day_of_week) === currentDOW
  );
  // ~4 occurrences of each (hour, weekday) bucket in a 30-day window
  const expectedLogins = Number(baselineForNow?.login_count ?? 0) / 4;

  const recentLogins = await db.query(`
    SELECT COUNT(*) FROM auth_events
    WHERE user_id IN (SELECT id FROM users WHERE org_id = $1)
      AND event_type = 'login_success'
      AND timestamp > NOW() - INTERVAL '15 minutes'
  `, [orgId]);

  const currentRate = parseInt(recentLogins.rows[0].count) * 4; // Project to hourly

  if (expectedLogins > 5 && currentRate > expectedLogins * 5) {
    await alerting.warn({
      title: 'Anomalous login volume',
      orgId,
      currentRate,
      expectedRate: expectedLogins,
    });
  }
}

Retention and archival

Retain auth logs for at least 90 days. Security investigations frequently need to look back weeks or months to understand the full scope of a compromise, and PCI DSS requires 12 months of log retention (with the most recent 3 months immediately available); SOC 2 auditors commonly expect similar. Compress and archive logs after 30 days to a cheaper storage tier, but keep them accessible for queries.