Apr 2026
Next.js
Surface
6 min read

Vercel Firewall & Rate Limiting

Protect your APIs from abuse without external services. Use Vercel's Edge Runtime and middleware for rate limiting and firewall capabilities.

vercel
firewall
rate-limiting
middleware
edge-runtime
security

Protecting your API from abuse doesn't require third-party services. Vercel's built-in Edge Runtime and middleware provide powerful rate limiting and firewall capabilities—no Cloudflare, no API gateway, just your Next.js app.

The Problem: API Abuse Without External Services

Traditional approaches to API protection require external infrastructure:

  • Cloudflare WAF rules and rate limiting
  • AWS API Gateway throttling
  • Auth0 or Firebase rate limiting

These add complexity, cost, and latency. Vercel's Edge Runtime lets you implement protection at the edge, with zero additional infrastructure.

Pattern 1: Middleware-Based Rate Limiting

Use Next.js middleware with Edge Runtime for IP-based rate limiting:

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

// Simple in-memory rate limiting (single instance)
const rateLimit = new Map<string, { count: number; resetTime: number }>();

export async function middleware(request: NextRequest) {
  // request.ip is only populated on Vercel (and was removed in Next.js 15;
  // fall back to the x-forwarded-for header there)
  const ip = request.ip ?? request.headers.get('x-forwarded-for')?.split(',')[0] ?? 'anonymous';
  const now = Date.now();
  const limit = 100; // requests per window
  const window = 60 * 1000; // 1 minute

  const record = rateLimit.get(ip);

  if (!record || now > record.resetTime) {
    rateLimit.set(ip, {
      count: 1,
      resetTime: now + window,
    });
  } else if (record.count >= limit) {
    return NextResponse.json(
      { error: 'Too many requests' },
      { status: 429, headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': '0',
        'X-RateLimit-Reset': record.resetTime.toString(),
      }}
    );
  } else {
    record.count++;
  }

  return NextResponse.next();
}

export const config = {
  matcher: '/api/:path*',
};
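The 429 branch above sets X-RateLimit-* headers; it is also common to attach them to successful responses so well-behaved clients can pace themselves before hitting the limit. A small sketch (rateLimitHeaders is a hypothetical helper, not part of any Vercel or Next.js API):

```typescript
// Build the conventional X-RateLimit-* headers from the current counter
// state. Reset is reported in unix seconds, the usual convention.
function rateLimitHeaders(limit: number, used: number, resetTimeMs: number): Record<string, string> {
  return {
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': String(Math.max(0, limit - used)),
    'X-RateLimit-Reset': String(Math.ceil(resetTimeMs / 1000)),
  };
}

rateLimitHeaders(100, 97, 1_700_000_000_500);
// { 'X-RateLimit-Limit': '100', 'X-RateLimit-Remaining': '3', 'X-RateLimit-Reset': '1700000001' }
```

In the middleware above you would call this with record.count and record.resetTime, then copy the entries onto the NextResponse.next() result with response.headers.set().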

Pattern 2: KV-Backed Rate Limiting for Scale

In-memory rate limiting resets on deployments and doesn't work across multiple Edge locations. Use Vercel KV for distributed rate limiting:

// middleware.ts
import { kv } from '@vercel/kv';

export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const key = `rate-limit:${ip}`;
  const limit = 100;

  // INCR first, then set the TTL only on the first hit: calling EXPIRE on
  // every request keeps pushing the window forward, and a separate
  // GET-then-INCR races against concurrent requests
  const current = await kv.incr(key);

  if (current === 1) {
    await kv.expire(key, 60); // 1 minute window
  }

  if (current > limit) {
    return NextResponse.json(
      { error: 'Too many requests' },
      { status: 429 }
    );
  }

  return NextResponse.next();
}
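A variant that sidesteps TTL handling entirely is to bake the window number into the key, so every window gets a fresh counter and stale keys simply fall out of use (a sketch; windowKey is a hypothetical helper, and you would still give keys a TTL for cleanup, but correctness no longer depends on when EXPIRE runs):

```typescript
// Derive a fixed-window key: the same identifier maps to a new key each
// time the window rolls over, so counters never need to be reset.
function windowKey(identifier: string, windowMs: number, now: number = Date.now()): string {
  return `rate-limit:${identifier}:${Math.floor(now / windowMs)}`;
}

windowKey('1.2.3.4', 60_000, 125_000); // "rate-limit:1.2.3.4:2"
```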

Pattern 3: Route-Based Rate Limits

Different endpoints need different limits:

// middleware.ts
const rateLimits: Record<string, { limit: number; window: number }> = {
  '/api/auth/login': { limit: 5, window: 300 }, // 5 per 5 minutes
  '/api/auth/register': { limit: 3, window: 3600 }, // 3 per hour
  '/api/subscribe': { limit: 10, window: 86400 }, // 10 per day
  '/api/search': { limit: 60, window: 60 }, // 60 per minute
};

export async function middleware(request: NextRequest) {
  const path = request.nextUrl.pathname;
  const limitConfig = Object.entries(rateLimits).find(([route]) =>
    path.startsWith(route)
  )?.[1];

  if (!limitConfig) {
    return NextResponse.next();
  }

  // Apply rate limit based on route
  const ip = request.ip || 'anonymous';
  const key = `rate-limit:${path}:${ip}`;

  const current = await kv.incr(key);

  if (current === 1) {
    await kv.expire(key, limitConfig.window);
  }

  if (current > limitConfig.limit) {
    return NextResponse.json(
      { error: 'Rate limit exceeded' },
      { status: 429, headers: {
        'Retry-After': limitConfig.window.toString(),
      }}
    );
  }

  return NextResponse.next();
}

Pattern 4: IP-Based Blocking

Block malicious IPs at the edge:

// middleware.ts
const blockedIPs = new Set([
  '203.0.113.7', // example addresses from the TEST-NET documentation range
  '198.51.100.23',
]);

export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';

  if (blockedIPs.has(ip)) {
    return new Response('Blocked', { status: 403 });
  }

  return NextResponse.next();
}

For large blocklists, use Vercel KV:

// middleware.ts
export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';

  const isBlocked = await kv.sismember('blocked-ips', ip);

  if (isBlocked) {
    return new Response('Blocked', { status: 403 });
  }

  return NextResponse.next();
}
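Real blocklists often hold CIDR ranges rather than single addresses. A minimal IPv4 membership check you could run before the KV lookup (a sketch; ipToInt and inCidr are hypothetical helpers, IPv4 only):

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip: string): number {
  return ip.split('.').reduce((acc, octet) => (acc << 8) | parseInt(octet, 10), 0) >>> 0;
}

// True if `ip` falls inside the CIDR block, by masking both addresses
// down to the network prefix and comparing.
function inCidr(ip: string, cidr: string): boolean {
  const [base, bits] = cidr.split('/');
  const mask = bits === '0' ? 0 : (~0 << (32 - Number(bits))) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(base) & mask);
}

inCidr('10.0.0.5', '10.0.0.0/24'); // true
inCidr('10.0.1.5', '10.0.0.0/24'); // false
```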

Edge Case 1: Cached Responses Still Count

Middleware runs before Vercel's CDN cache, so rate limiting applies even to requests that would have been served from cache, which can surprise users. You also can't detect a cache hit from inside middleware: x-vercel-cache is a response header, set after middleware has already run. The practical workaround is to exempt routes you know are cacheable (the prefix below is an example):

// middleware.ts
const CACHEABLE_PREFIXES = ['/api/public/']; // routes served from the CDN

export async function middleware(request: NextRequest) {
  const path = request.nextUrl.pathname;

  // Skip rate limiting for routes whose responses are cached anyway
  if (CACHEABLE_PREFIXES.some((prefix) => path.startsWith(prefix))) {
    return NextResponse.next();
  }

  // Apply rate limit for everything else
  // ...
}

Edge Case 2: Deployment Propagation Delays

When you deploy rate limit changes, the rollout across the edge network isn't instant: for a short window, old and new middleware versions serve traffic side by side, and a changed KV key schema or a transient KV error can make the limiter itself throw. Build in a graceful fallback:

// middleware.ts
export async function middleware(request: NextRequest) {
  try {
    // Apply rate limit
  } catch (error) {
    // Graceful fallback when KV is unreachable or mid-rollout state is
    // inconsistent
    console.error('Rate limit error:', error);

    // Allow the request through on failure (fail-open); respond 503
    // instead if you'd rather fail closed
    return NextResponse.next();
  }
}

Edge Case 3: Shared IP Addresses

Many users share IP addresses (corporate proxies, VPNs, NAT). IP-based rate limiting can unfairly block legitimate users:

// middleware.ts
export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const userAgent = request.headers.get('user-agent') || '';

  // Crude heuristic: most VPN traffic carries a normal browser user
  // agent, so treat this as a weak signal at best
  const isVPN = userAgent.includes('VPN') || userAgent.includes('Proxy');

  // Stricter limits when the heuristic fires
  const limit = isVPN ? 20 : 100;

  // Apply rate limit...
}

Better approach: use user-specific rate limits with authentication:

// middleware.ts
export async function middleware(request: NextRequest) {
  // getSession: whatever your auth layer provides (e.g. next-auth)
  const session = await getSession(request);

  // Use user ID if authenticated, otherwise IP
  const identifier = session?.userId || request.ip || 'anonymous';

  // Higher limits for authenticated users
  const limit = session?.userId ? 1000 : 100;

  // Apply rate limit based on identifier...
}

Pattern 5: Token Bucket Algorithm

Fixed window rate limiting has burst problems. Use token bucket for smoother limits:
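To see the burst problem concretely: with a 60-second fixed window, a client can spend its whole budget just before a window boundary and a fresh budget just after it, passing twice the intended limit in a couple of seconds. A standalone simulation of that behavior:

```typescript
// Replay a list of request timestamps (ms) against a fixed-window
// limiter and count how many get through.
function fixedWindowAllowed(times: number[], limit: number, windowMs: number): number {
  const counts = new Map<number, number>();
  let allowed = 0;
  for (const t of times) {
    const window = Math.floor(t / windowMs);
    const count = (counts.get(window) ?? 0) + 1;
    counts.set(window, count);
    if (count <= limit) allowed++;
  }
  return allowed;
}

// 100 requests just before the 60s boundary, 100 just after:
// all 200 pass within about two seconds of wall-clock time.
const burst = [
  ...Array.from({ length: 100 }, (_, i) => 59_000 + i),
  ...Array.from({ length: 100 }, (_, i) => 60_000 + i),
];
fixedWindowAllowed(burst, 100, 60_000); // 200
```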

// middleware.ts
interface TokenBucket {
  tokens: number;
  lastUpdate: number;
}

// Same caveat as Pattern 1: this in-memory state is per-instance
const buckets = new Map<string, TokenBucket>();

function refillBucket(bucket: TokenBucket, capacity: number, refillRate: number): number {
  const now = Date.now();
  const elapsed = now - bucket.lastUpdate;
  const tokensToAdd = (elapsed / 1000) * refillRate;

  bucket.tokens = Math.min(capacity, bucket.tokens + tokensToAdd);
  bucket.lastUpdate = now;

  return bucket.tokens;
}

export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const capacity = 100;
  const refillRate = 1; // 1 token per second

  let bucket = buckets.get(ip);

  if (!bucket) {
    bucket = { tokens: capacity, lastUpdate: Date.now() };
    buckets.set(ip, bucket);
  }

  refillBucket(bucket, capacity, refillRate);

  if (bucket.tokens < 1) {
    return NextResponse.json(
      { error: 'Rate limit exceeded' },
      { status: 429 }
    );
  }

  bucket.tokens -= 1;

  return NextResponse.next();
}
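Two knobs govern the bucket: capacity bounds the burst size, and refillRate bounds the sustained request rate. The refill arithmetic above, isolated as a pure function for a quick sanity check (tokensAfter mirrors refillBucket's math, with the same assumed parameters of capacity 100 and 1 token per second):

```typescript
// Tokens remaining after `elapsedMs` of refill, capped at capacity.
function tokensAfter(tokens: number, elapsedMs: number, capacity: number, refillRate: number): number {
  return Math.min(capacity, tokens + (elapsedMs / 1000) * refillRate);
}

tokensAfter(0, 30_000, 100, 1);  // 30: an exhausted bucket recovers 1 token/s
tokensAfter(90, 30_000, 100, 1); // 100: refill never exceeds capacity
```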

Monitoring Rate Limits

Track rate limit violations for security monitoring:

// middleware.ts
export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';

  // Apply rate limit, setting isRateLimited...

  if (isRateLimited) {
    // Log to your monitoring service (logToDatadog is a stand-in)
    await logToDatadog({
      event: 'rate_limit_exceeded',
      ip,
      path: request.nextUrl.pathname,
      userAgent: request.headers.get('user-agent'),
    });

    // Auto-block after 10 violations within an hour
    const violations = await kv.incr(`violations:${ip}`);
    if (violations === 1) {
      await kv.expire(`violations:${ip}`, 3600);
    }

    if (violations >= 10) {
      await kv.sadd('blocked-ips', ip);
      // Caution: EXPIRE applies to the whole set, so this resets the
      // 24-hour clock for every blocked IP; use per-IP keys if that matters
      await kv.expire('blocked-ips', 86400); // block for 24 hours

      return new Response('Blocked due to repeated violations', { status: 403 });
    }
  }

  return NextResponse.next();
}
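A flat 24-hour block treats first-time and chronic offenders the same. One option (a hypothetical policy, not part of the setup above) is to escalate the block duration with each repeat offense, capped at some maximum:

```typescript
// Block duration in seconds: doubles per offense, starting at one hour
// and capped at one week.
function blockSeconds(offense: number, baseSeconds = 3600, maxSeconds = 7 * 86400): number {
  return Math.min(maxSeconds, baseSeconds * 2 ** Math.max(0, offense - 1));
}

blockSeconds(1); // 3600 (1 hour)
blockSeconds(5); // 57600 (16 hours)
```

You would store the offense count alongside the violations counter and pass the result as the TTL when writing the block key.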

Key Takeaways

  • In-memory: Fast but doesn't scale across instances
  • Vercel KV: Distributed rate limiting for production
  • Route-based: Different limits per endpoint
  • Token bucket: Smoother rate limiting than fixed windows
  • Caching: Middleware runs before the CDN cache, so exempt cacheable routes explicitly
  • User vs IP: Prefer user-specific limits when possible
