Apr 2026
Vercel
Surface
6 min read

Vercel Firewall & Rate Limiting

Protect your APIs from abuse without external services. Use Vercel's Edge Runtime and middleware for rate limiting and firewall capabilities.

Vercel
Firewall
Rate Limiting
API Security
Edge Functions

Protecting your API from abuse doesn't require third-party services. Vercel's built-in Edge Runtime and middleware provide powerful rate limiting and firewall capabilities—no Cloudflare, no API gateway, just your Next.js app.

The Problem: API Abuse Without External Services

Traditional approaches to API protection require external infrastructure:

  • Cloudflare WAF rules and rate limiting
  • AWS API Gateway throttling
  • Auth0 or Firebase rate limiting

These add complexity, cost, and latency. Vercel's Edge Runtime lets you implement protection at the edge, with zero additional infrastructure.

Pattern 1: Middleware-Based Rate Limiting

Use Next.js middleware with Edge Runtime for IP-based rate limiting:

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

// Simple in-memory rate limiting. Note: this Map is per-isolate and
// per-region, and resets on every deployment (see Pattern 2)
const rateLimit = new Map<string, { count: number; resetTime: number }>();

export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const now = Date.now();
  const limit = 100; // requests per window
  const window = 60 * 1000; // 1 minute

  const record = rateLimit.get(ip);

  if (!record || now > record.resetTime) {
    rateLimit.set(ip, {
      count: 1,
      resetTime: now + window,
    });
  } else if (record.count >= limit) {
    return NextResponse.json(
      { error: 'Too many requests' },
      { status: 429, headers: {
        'X-RateLimit-Limit': limit.toString(),
        'X-RateLimit-Remaining': '0',
        'X-RateLimit-Reset': record.resetTime.toString(),
      }}
    );
  } else {
    record.count++;
  }

  return NextResponse.next();
}

export const config = {
  matcher: '/api/:path*',
};
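The window logic above can be factored into a pure helper that also computes the values for X-RateLimit-* response headers. A minimal sketch—the function name and shape are my own, not a Vercel API:

```typescript
interface WindowRecord {
  count: number;
  resetTime: number;
}

interface RateLimitResult {
  allowed: boolean;
  remaining: number;
  resetTime: number;
}

// Fixed-window check: mutates the store, returns the decision plus
// the values needed for X-RateLimit-* response headers.
function checkFixedWindow(
  store: Map<string, WindowRecord>,
  key: string,
  now: number,
  limit: number,
  windowMs: number
): RateLimitResult {
  const record = store.get(key);

  // No record yet, or the previous window has expired: start fresh
  if (!record || now > record.resetTime) {
    const resetTime = now + windowMs;
    store.set(key, { count: 1, resetTime });
    return { allowed: true, remaining: limit - 1, resetTime };
  }

  // Window still open and quota exhausted
  if (record.count >= limit) {
    return { allowed: false, remaining: 0, resetTime: record.resetTime };
  }

  record.count++;
  return { allowed: true, remaining: limit - record.count, resetTime: record.resetTime };
}
```

In the middleware, you would call this once per request and copy `remaining` and `resetTime` into the response headers for both the 429 and the success path, so clients can pace themselves before hitting the limit.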

Pattern 2: KV-Backed Rate Limiting for Scale

In-memory rate limiting resets on deployments and doesn't work across multiple Edge locations. Use Vercel KV for distributed rate limiting:

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
import { kv } from '@vercel/kv';

export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const key = `rate-limit:${ip}`;
  const limit = 100;

  // INCR is atomic, so concurrent requests can't race past the limit
  // the way a separate GET followed by INCR would
  const current = await kv.incr(key);

  // Set the TTL only when the key is first created; calling EXPIRE on
  // every request would keep extending the window indefinitely
  if (current === 1) {
    await kv.expire(key, 60); // 1 minute window
  }

  if (current > limit) {
    return NextResponse.json(
      { error: 'Too many requests' },
      { status: 429 }
    );
  }

  return NextResponse.next();
}

Pattern 3: Route-Based Rate Limits

Different endpoints need different limits:

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
import { kv } from '@vercel/kv';

const rateLimits: Record<string, { limit: number; window: number }> = {
  '/api/auth/login': { limit: 5, window: 300 }, // 5 per 5 minutes
  '/api/auth/register': { limit: 3, window: 3600 }, // 3 per hour
  '/api/subscribe': { limit: 10, window: 86400 }, // 10 per day
  '/api/search': { limit: 60, window: 60 }, // 60 per minute
};

export async function middleware(request: NextRequest) {
  const path = request.nextUrl.pathname;
  const limitConfig = Object.entries(rateLimits).find(([route]) =>
    path.startsWith(route)
  )?.[1];

  if (!limitConfig) {
    return NextResponse.next();
  }

  // Apply the route's rate limit, keyed by path + IP
  const ip = request.ip || 'anonymous';
  const key = `rate-limit:${path}:${ip}`;

  // Atomic INCR; set the TTL only on key creation (see Pattern 2)
  const current = await kv.incr(key);
  if (current === 1) {
    await kv.expire(key, limitConfig.window);
  }

  if (current > limitConfig.limit) {
    return NextResponse.json(
      { error: 'Rate limit exceeded' },
      { status: 429, headers: {
        'Retry-After': limitConfig.window.toString(),
      }}
    );
  }

  return NextResponse.next();
}
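One pitfall with the Object.entries(...).find(...) lookup: it returns the first prefix that matches, so object insertion order decides which limit wins when routes nest (say, /api/auth alongside /api/auth/login). A longest-prefix match avoids that; a sketch, with a helper name of my own:

```typescript
interface LimitConfig {
  limit: number;
  window: number; // seconds
}

// Pick the most specific (longest) matching route prefix, so
// '/api/auth/login' beats '/api/auth' regardless of object order.
function findLimit(
  rateLimits: Record<string, LimitConfig>,
  path: string
): LimitConfig | undefined {
  let best: { route: string; config: LimitConfig } | undefined;
  for (const [route, config] of Object.entries(rateLimits)) {
    if (path.startsWith(route) && (!best || route.length > best.route.length)) {
      best = { route, config };
    }
  }
  return best?.config;
}
```
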

Pattern 4: IP-Based Blocking

Block malicious IPs at the edge:

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';

const blockedIPs = new Set([
  '192.168.1.1',
  '10.0.0.1',
]);

export async function middleware(request: NextRequest) {
  // request.ip can be undefined (e.g. local dev), so fall back safely
  const ip = request.ip || '';

  if (blockedIPs.has(ip)) {
    return new Response('Blocked', { status: 403 });
  }

  return NextResponse.next();
}

For large blocklists, use Vercel KV:

// middleware.ts
import { NextResponse } from 'next/server';
import type { NextRequest } from 'next/server';
import { kv } from '@vercel/kv';

export async function middleware(request: NextRequest) {
  const ip = request.ip || '';
  if (!ip) {
    return NextResponse.next();
  }

  const isBlocked = await kv.sismember('blocked-ips', ip);

  if (isBlocked) {
    return new Response('Blocked', { status: 403 });
  }

  return NextResponse.next();
}
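Real blocklists usually contain ranges, not individual addresses. A minimal IPv4 CIDR check can handle those—these helpers are my own sketch, not part of any Vercel package:

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit integer.
function ipToInt(ip: string): number {
  return ip.split('.').reduce((acc, octet) => (acc << 8) + parseInt(octet, 10), 0) >>> 0;
}

// True if `ip` falls inside the `cidr` range, e.g. '203.0.113.0/24'.
function inCidr(ip: string, cidr: string): boolean {
  const [range, bitsStr] = cidr.split('/');
  const bits = parseInt(bitsStr, 10);
  // Build the network mask; /0 means "match everything"
  const mask = bits === 0 ? 0 : (~0 << (32 - bits)) >>> 0;
  return (ipToInt(ip) & mask) === (ipToInt(range) & mask);
}

// Linear scan is fine for small lists; for thousands of ranges you
// would want a sorted structure or a radix trie instead.
function isBlockedByRange(ip: string, blockedRanges: string[]): boolean {
  return blockedRanges.some((cidr) => inCidr(ip, cidr));
}
```
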

Edge Case 1: Cached Responses

Middleware runs before the edge cache, so rate limiting applies even to requests that would be served from cache—which can be confusing for users. And there's no reliable way to detect a cache hit from inside middleware: x-vercel-cache is a response header, set after middleware runs. If cached traffic shouldn't count against the limit, exempt it explicitly instead—by method, or by excluding cacheable routes in the matcher:

// middleware.ts
export async function middleware(request: NextRequest) {
  // A cache hit can't be detected here (x-vercel-cache is set on the
  // response, after middleware runs), so skip rate limiting for safe,
  // cacheable methods and only count mutating requests:
  if (request.method === 'GET' || request.method === 'HEAD') {
    return NextResponse.next();
  }

  // Apply rate limit to POST/PUT/DELETE/...
  // ...
}

Edge Case 2: Deployment Propagation Delays

When you deploy rate limit changes, propagation across edge regions isn't instant: old and new middleware versions can briefly run side by side, and a KV lookup can fail mid-transition. Decide explicitly whether to fail open or closed:

// middleware.ts
export async function middleware(request: NextRequest) {
  try {
    // Apply rate limit...
    // (KV lookups, INCR, etc.)
  } catch (error) {
    // Graceful fallback on deployment issues
    console.error('Rate limit error:', error);

    // Allow the request on failure (fail-open). For sensitive routes
    // like login, you may prefer fail-closed: return a 429 here instead.
  }

  return NextResponse.next();
}
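The fail-open decision can be isolated into a tiny wrapper so every limiter check in the middleware gets the same behavior. A sketch—the wrapper name and shape are mine:

```typescript
// Run an async limiter check; if it throws (KV outage, deploy race),
// fail open and allow the request rather than blocking real traffic.
// The check resolves to `false` when the request should be allowed,
// `true` when it is rate limited.
async function failOpen(check: () => Promise<boolean>): Promise<boolean> {
  try {
    return await check();
  } catch (error) {
    console.error('Rate limit check failed, allowing request:', error);
    return false; // treat as "not limited"
  }
}
```

Usage inside the middleware would look like `const limited = await failOpen(() => kvCheck(ip));`, where kvCheck is whatever KV-backed check you use.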

Edge Case 3: Shared IP Addresses

Many users share IP addresses (corporate proxies, VPNs, NAT). IP-based rate limiting can unfairly block legitimate users:

// middleware.ts
export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const userAgent = request.headers.get('user-agent') || '';

  // Crude heuristic: very few proxies or VPNs announce themselves in
  // the user agent, so treat this as a weak signal, not real detection
  const isVPN = userAgent.includes('VPN') || userAgent.includes('Proxy');

  // Stricter limits when the heuristic fires
  const limit = isVPN ? 20 : 100;

  // Apply rate limit...
}

Better approach: use user-specific rate limits with authentication:

// middleware.ts
// getSession() stands in for your auth library's session helper
export async function middleware(request: NextRequest) {
  const session = await getSession(request);

  // Use the user ID if authenticated, otherwise fall back to IP
  const identifier = session?.userId || request.ip || 'anonymous';

  // Higher limits for authenticated users
  const limit = session?.userId ? 1000 : 100;

  // Apply rate limit based on identifier...
}
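The identifier-and-limit selection can live in a small pure function, which keeps the middleware body focused on the KV calls. A sketch with names of my own:

```typescript
interface SessionLike {
  userId?: string;
}

// Choose the rate-limit key and ceiling: authenticated users get their
// own bucket and a higher limit; anonymous traffic shares per-IP buckets.
function pickIdentity(
  session: SessionLike | null,
  ip: string | undefined
): { identifier: string; limit: number } {
  if (session?.userId) {
    return { identifier: `user:${session.userId}`, limit: 1000 };
  }
  return { identifier: `ip:${ip || 'anonymous'}`, limit: 100 };
}
```

Prefixing the key (`user:` vs `ip:`) also prevents a crafted user ID from colliding with an IP-based bucket.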

Pattern 5: Token Bucket Algorithm

Fixed window rate limiting has burst problems. Use token bucket for smoother limits:

// middleware.ts
// Same caveat as Pattern 1: in-memory buckets are per-isolate only;
// back them with Vercel KV for accuracy across regions
interface TokenBucket {
  tokens: number;
  lastUpdate: number;
}

const buckets = new Map<string, TokenBucket>();

function refillBucket(bucket: TokenBucket, capacity: number, refillRate: number): number {
  const now = Date.now();
  const elapsed = now - bucket.lastUpdate;
  const tokensToAdd = (elapsed / 1000) * refillRate;

  bucket.tokens = Math.min(capacity, bucket.tokens + tokensToAdd);
  bucket.lastUpdate = now;

  return bucket.tokens;
}

export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';
  const capacity = 100;
  const refillRate = 1; // 1 token per second

  let bucket = buckets.get(ip);

  if (!bucket) {
    bucket = { tokens: capacity, lastUpdate: Date.now() };
    buckets.set(ip, bucket);
  }

  refillBucket(bucket, capacity, refillRate);

  if (bucket.tokens < 1) {
    return NextResponse.json(
      { error: 'Rate limit exceeded' },
      { status: 429 }
    );
  }

  bucket.tokens -= 1;

  return NextResponse.next();
}
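Because refill depends on wall-clock time, the bucket math is easiest to verify with an injected clock. Here is the same refill-then-take logic as a single pure, testable function (the name is mine):

```typescript
interface Bucket {
  tokens: number;
  lastUpdate: number; // ms timestamp
}

// Try to take one token at time `now` (ms). Refills first at
// `refillRate` tokens/second, capped at `capacity`. Mutates the bucket
// and returns whether the request should be allowed.
function tryTake(
  bucket: Bucket,
  now: number,
  capacity: number,
  refillRate: number
): boolean {
  const elapsed = now - bucket.lastUpdate;
  bucket.tokens = Math.min(capacity, bucket.tokens + (elapsed / 1000) * refillRate);
  bucket.lastUpdate = now;

  if (bucket.tokens < 1) {
    return false;
  }
  bucket.tokens -= 1;
  return true;
}
```

In the middleware, `Date.now()` would be passed as `now`; in tests, a fixed timestamp makes the refill behavior deterministic.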

Monitoring Rate Limits

Track rate limit violations for security monitoring:

// middleware.ts
// logToDatadog() is a stand-in for your monitoring client
export async function middleware(request: NextRequest) {
  const ip = request.ip || 'anonymous';

  // Apply rate limit... (assume it sets isRateLimited)

  if (isRateLimited) {
    // Log to monitoring service
    await logToDatadog({
      event: 'rate_limit_exceeded',
      ip,
      path: request.nextUrl.pathname,
      userAgent: request.headers.get('user-agent'),
    });

    // Auto-block after 10 violations within an hour
    const violations = await kv.incr(`violations:${ip}`);
    if (violations === 1) {
      await kv.expire(`violations:${ip}`, 3600);
    }

    if (violations >= 10) {
      // Redis sets have no per-member TTL, and expiring a whole
      // 'blocked-ips' set would unblock every IP at once. A per-IP key
      // with its own TTL blocks just this IP for 24 hours:
      await kv.set(`blocked:${ip}`, 1, { ex: 86400 });

      return new Response('Blocked due to repeated violations', { status: 403 });
    }
  }

  return NextResponse.next();
}

Key Takeaways

  • In-memory: Fast but doesn't scale across instances
  • Vercel KV: Distributed rate limiting for production
  • Route-based: Different limits per endpoint
  • Token bucket: Smoother rate limiting than fixed windows
  • Cache hits: Skip rate limiting for cached responses
  • User vs IP: Prefer user-specific limits when possible
