Apr 2026

Vercel Firewall & Rate Limiting

Built-in WAF protection without external services or complex middleware configurations

Need to return 429s from your API endpoints? Your first instinct might be to reach for Redis and Upstash, Auth0 rate limiting, or Cloudflare Workers. But if you're on Vercel Pro or Enterprise, you already have enterprise-grade rate limiting built into the platform: the Vercel Firewall WAF operates at the edge, before your serverless functions even execute.

Vercel Firewall vs DIY Middleware

Most developers reach for custom middleware solutions when they need API protection:

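// Traditional middleware approach (in-memory fixed window)
import { NextResponse, type NextRequest } from 'next/server';

const WINDOW_MS = 60_000; // illustrative window length
const MAX_REQUESTS = 10; // illustrative limit per window
const hits = new Map<string, { count: number; windowStart: number }>();

export function middleware(request: NextRequest) {
  // request.ip was removed in Next.js 15; read the forwarded header instead
  const ip = request.headers.get('x-forwarded-for')?.split(',')[0].trim() ?? 'anonymous';
  const now = Date.now();
  const entry = hits.get(ip);
  if (!entry || now - entry.windowStart >= WINDOW_MS) {
    hits.set(ip, { count: 1, windowStart: now }); // start a new window
  } else if (++entry.count > MAX_REQUESTS) {
    return NextResponse.json({ error: 'Rate limited' }, { status: 429 });
  }
  return NextResponse.next();
}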

This works, but it has limitations: the in-memory Map resets on every deployment and cold start, it isn't shared across instances or edge regions, and your code still executes for every request. Vercel Firewall WAF intercepts requests at the CDN level, before middleware even runs.

Dashboard WAF Rules: Zero-Code Protection

Configure rate limits through the Vercel dashboard for instant, production-ready protection. This approach is most effective for protecting authentication endpoints and blocking abuse patterns across your entire application.

// WAF rule configuration (via dashboard)
Rule Name: "Auth endpoint protection"
Condition: Request path contains "/api/auth"
Rate Limit: 10 requests per 60 seconds per IP
Algorithm: Fixed Window
Action: Deny (429)

// This automatically protects:
// /api/auth/login
// /api/auth/register
// /api/auth/reset
// /api/auth/verify

The firewall operates at Vercel's edge network, blocking requests before they consume serverless function invocations or middleware execution time. For authentication endpoints, this can reduce compute costs by 20-40% during traffic spikes.

Fixed Window vs Token Bucket

Vercel offers two rate limiting algorithms. Fixed Window (available on all plans) counts requests in discrete time periods. Token Bucket (Enterprise only) provides smoother, burstable limits:

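// Fixed Window (all plans)
100 requests per 60 seconds
// At 0:00 - window opens, user gets 100 requests
// At 0:59 - user has hit the limit
// At 1:00 - counter resets, user immediately gets 100 new requests
//           (so up to 200 requests can land within about a second of the boundary)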

// Token Bucket (Enterprise)
100 token capacity, refill 1.67 tokens/second
// Smoother distribution
// Natural burst handling up to capacity
// No sudden resets
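
To make the difference concrete, here is a minimal in-memory sketch of both algorithms. It is an illustration of the general techniques, not Vercel's implementation; the class names and the `burst` helper are made up for this example, and time is passed in explicitly so the behavior is deterministic.

```typescript
// Illustrative models only; not how Vercel implements its limiters.

class FixedWindowLimiter {
  private limit: number;
  private windowMs: number;
  private windowStart = 0;
  private count = 0;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  allow(nowMs: number): boolean {
    // Windows are aligned to multiples of windowMs (0:00, 1:00, 2:00, ...).
    const currentWindow = Math.floor(nowMs / this.windowMs) * this.windowMs;
    if (currentWindow !== this.windowStart) {
      this.windowStart = currentWindow;
      this.count = 0; // hard reset at the window boundary
    }
    if (this.count >= this.limit) return false;
    this.count += 1;
    return true;
  }
}

class TokenBucketLimiter {
  private capacity: number;
  private refillPerSecond: number;
  private tokens: number;
  private lastRefillMs = 0;

  constructor(capacity: number, refillPerSecond: number) {
    this.capacity = capacity;
    this.refillPerSecond = refillPerSecond;
    this.tokens = capacity; // start with a full bucket
  }

  allow(nowMs: number): boolean {
    // Refill continuously based on elapsed time, capped at capacity.
    const elapsedSeconds = (nowMs - this.lastRefillMs) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillPerSecond);
    this.lastRefillMs = nowMs;
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }
}

// Send n requests at the same instant and count how many are allowed.
function burst(limiter: { allow(nowMs: number): boolean }, nowMs: number, n: number): number {
  let allowed = 0;
  for (let i = 0; i < n; i++) {
    if (limiter.allow(nowMs)) allowed++;
  }
  return allowed;
}

const fixed = new FixedWindowLimiter(100, 60_000);
const bucket = new TokenBucketLimiter(100, 100 / 60);

// 100 requests at 0:59, then 100 more at 1:00.
const fixedTotal = burst(fixed, 59_000, 100) + burst(fixed, 60_000, 100);
const bucketTotal = burst(bucket, 59_000, 100) + burst(bucket, 60_000, 100);
```

In this sketch the fixed window lets all 200 requests through because the counter resets at 1:00, while the token bucket allows only 101: the initial 100, plus the single token refilled during the one second in between.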

SDK Approach: Granular Control

For complex business logic or user-specific rate limiting, the @vercel/firewall SDK provides programmatic control while maintaining edge-level performance.

import { checkRateLimit } from '@vercel/firewall';

export async function POST(request: Request) {
  // Custom rate limiting based on user tier.
  // authenticateUser is your application's own helper, not part of the SDK.
  const auth = await authenticateUser(request);
  const limitId = auth.tier === 'pro' ? 'pro-api-limit' : 'free-api-limit';

  const { rateLimited } = await checkRateLimit(limitId, {
    request,
    rateLimitKey: auth.userId, // Per-user rather than per-IP
  });

  if (rateLimited) {
    // Response.json sets the Content-Type header for us
    return Response.json(
      { error: 'Rate limit exceeded', tier: auth.tier, upgradeUrl: '/pricing' },
      { status: 429 },
    );
  }

  // API logic continues
}

The SDK requires a corresponding dashboard rule using @vercel/firewall as the condition and a matching Rate limit ID.
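
Because each Rate limit ID in code has to match a dashboard rule, it can help to keep the tier-to-ID mapping and the 429 response in one place. The sketch below is one possible way to do that. The `Tier` type, the two helper names, and the `retryAfterSeconds` parameter are assumptions for illustration; the value for `Retry-After` is supplied by the caller (for example, the window length of the matching rule) rather than read from the SDK.

```typescript
type Tier = 'free' | 'pro';

// Hypothetical mapping from plan tier to Rate limit ID. Each ID must match
// a rule configured in the Vercel dashboard.
const LIMIT_IDS: Record<Tier, string> = {
  free: 'free-api-limit',
  pro: 'pro-api-limit',
};

function limitIdForTier(tier: Tier): string {
  return LIMIT_IDS[tier];
}

// Build the 429 response. retryAfterSeconds is an assumption provided by the
// caller; this sketch does not obtain it from the firewall.
function rateLimitedResponse(tier: Tier, retryAfterSeconds: number): Response {
  const body = { error: 'Rate limit exceeded', tier, upgradeUrl: '/pricing' };
  return new Response(JSON.stringify(body), {
    status: 429,
    headers: {
      'Content-Type': 'application/json',
      // Standard HTTP header telling well-behaved clients when to retry.
      'Retry-After': String(retryAfterSeconds),
    },
  });
}
```

A route handler could then call `checkRateLimit(limitIdForTier(auth.tier), ...)` and return `rateLimitedResponse(auth.tier, 60)` when `rateLimited` is true.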

Per-Region Counting: The Hidden Gotcha

Rate limit counters are tracked per region, not globally. A sophisticated attacker hitting your API through multiple regions can exceed your configured limit by a factor of up to the number of active regions.

// Configuration: 100 requests per minute
// Reality with 3 active regions:
// - us-east-1: 100 requests/min
// - eu-west-1: 100 requests/min
// - ap-southeast-1: 100 requests/min
// Total possible: 300 requests/min

This behavior is intentional—global rate limiting would create cross-region latency as each edge location checks with a central counter. For most applications, per-region limits provide sufficient protection while maintaining low latency.

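If you need truly global limits, combine Vercel Firewall with application-level checks backed by a global store such as Redis. (Vercel KV, the earlier first-party option, has since been replaced by Marketplace Redis integrations such as Upstash.)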
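
A minimal sketch of what such an application-level check could look like, assuming a shared store that can atomically increment a counter. The `CounterStore` interface, the `InMemoryCounterStore` stand-in, and `isGloballyLimited` are hypothetical names invented for this example; in production the store would be backed by your actual global database client, and the in-memory version below exists only so the logic can be run locally.

```typescript
// Hypothetical interface: adapt it to whatever shared store you use.
interface CounterStore {
  // Atomically increment the counter at `key` and return the new value.
  // Implementations should expire the key after ttlSeconds.
  increment(key: string, ttlSeconds: number): Promise<number>;
}

// Local stand-in for testing. It ignores the TTL and is NOT shared across
// regions, so it only demonstrates the control flow.
class InMemoryCounterStore implements CounterStore {
  private counters = new Map<string, number>();

  async increment(key: string, _ttlSeconds: number): Promise<number> {
    const next = (this.counters.get(key) ?? 0) + 1;
    this.counters.set(key, next);
    return next;
  }
}

// Fixed-window global check: one counter per key per window, shared by all regions.
async function isGloballyLimited(
  store: CounterStore,
  rateLimitKey: string,
  limit: number,
  windowSeconds: number,
  nowMs: number = Date.now(),
): Promise<boolean> {
  const windowIndex = Math.floor(nowMs / (windowSeconds * 1000));
  const key = `global-rl:${rateLimitKey}:${windowIndex}`;
  const count = await store.increment(key, windowSeconds);
  return count > limit;
}
```

The trade-off is the one described above: every request that reaches this check pays a round trip to the shared store, so it makes sense to keep the firewall rule in front as a coarse first filter and reserve the global check for endpoints that genuinely need it.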

Advanced Patterns

Organization-Level Rate Limiting

Use request headers and custom rate limit keys to implement organization-wide limits:

// Dashboard rule
Condition: Request header "x-org-id" exists
Rate Limit ID: "org-api-limit"

// Code implementation
const { rateLimited } = await checkRateLimit('org-api-limit', {
  request,
  rateLimitKey: auth.orgId, // Shared limit across org users
});

if (rateLimited) {
  return new Response(JSON.stringify({
    error: 'Organization rate limit exceeded',
    contact: 'Contact support to increase limits'
  }), { status: 429 });
}

JA4 Fingerprinting

Vercel Firewall includes JA4 TLS fingerprinting for bot detection. This goes beyond IP-based blocking to identify automated clients:

// Dashboard configuration
Rate Limit Key: JA4 Digest
Condition: JA4 fingerprint matches known bot patterns
Action: Challenge or Deny

// Catches:
// - Automated tools using specific TLS libraries
// - Headless browsers with detectable fingerprints
// - Scripts using default HTTP client configurations

Monitoring and Observability

The Firewall dashboard provides traffic insights showing rate limit triggers, blocked requests, and patterns over time. Key metrics to watch:

  • Block rate: Percentage of requests denied by firewall rules
  • Geographic distribution: Where rate limit violations originate
  • Rule effectiveness: Which rules trigger most frequently
  • False positives: Legitimate traffic caught by aggressive rules

// Using the Log action for testing
Action: Log (rather than Deny)
// Allows monitoring rule effectiveness without blocking traffic
// Review logs before switching to Deny action
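
While a rule is in Log mode, one useful number is how much traffic it would have blocked. The sketch below computes that from a list of log entries. The `FirewallLogEntry` shape is purely hypothetical (it is not the format of Vercel's logs), so treat it as a placeholder for whatever fields your log export or drain actually provides.

```typescript
// Hypothetical log entry shape, for illustration only.
interface FirewallLogEntry {
  path: string;
  ruleName: string | null; // null when no custom rule matched
  action: 'allow' | 'log' | 'deny' | 'challenge';
}

// Fraction of all requests that the given rule matched while in Log mode,
// i.e. the share that would have been denied had the action been Deny.
function wouldBlockRate(entries: FirewallLogEntry[], ruleName: string): number {
  if (entries.length === 0) return 0;
  const matched = entries.filter(
    (entry) => entry.ruleName === ruleName && entry.action === 'log',
  ).length;
  return matched / entries.length;
}

const sampleEntries: FirewallLogEntry[] = [
  { path: '/api/auth/login', ruleName: 'Auth endpoint protection', action: 'log' },
  { path: '/api/auth/login', ruleName: null, action: 'allow' },
  { path: '/api/products', ruleName: null, action: 'allow' },
  { path: '/api/auth/verify', ruleName: null, action: 'allow' },
];

const sampleRate = wouldBlockRate(sampleEntries, 'Auth endpoint protection'); // 0.25
```

If the would-block rate is high for traffic you recognize as legitimate, that is a sign the rule is too aggressive and should be loosened before switching to Deny.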

Cost Implications

Vercel Firewall rate limiting includes 1 million allowed requests on Pro and Enterprise plans. Beyond that, pricing is region-based (approximately $0.50-$1.00 per million additional requests).
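
As a rough worked example of the arithmetic, the helper below estimates the overage charge from the figures quoted above. The function name is made up for this example, and the price is a parameter because the actual rate depends on region; check your own plan's pricing before relying on the result.

```typescript
// Estimate the rate limiting overage cost in USD.
// All inputs are assumptions supplied by the caller.
function estimateOverageCostUsd(
  allowedRequests: number,
  includedRequests: number,
  pricePerMillionUsd: number,
): number {
  const overage = Math.max(0, allowedRequests - includedRequests);
  return (overage / 1_000_000) * pricePerMillionUsd;
}

// 5 million allowed requests, 1 million included, at both ends of the quoted range.
const lowEstimate = estimateOverageCostUsd(5_000_000, 1_000_000, 0.5); // 2
const highEstimate = estimateOverageCostUsd(5_000_000, 1_000_000, 1.0); // 4
```

At the quoted prices, 5 million allowed requests in a month would cost somewhere between $2 and $4 beyond the included allowance.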

Performance benefits over external solutions:

  • Pre-function blocking: Requests never reach serverless functions, saving compute costs
  • Minimal latency overhead: Rate limit checks happen within Vercel's CDN infrastructure, with no extra network hop
  • Regional optimization: Counters are local to edge regions
  • No external dependencies: No Redis, no third-party API calls

For API-heavy applications, edge-level rate limiting can reduce serverless function invocations by 15-35% during traffic spikes, providing significant cost savings on compute bills.
