ASP.NET Core Rate Limiting: Protect Your API (2026)

Learn ASP.NET Core rate limiting with practical C# examples. Fixed window, sliding window, token bucket: protect your API from abuse today.

Every public API faces the same threat: one badly behaved client can hammer your endpoints, exhaust your database connections, and bring down the service for everyone. ASP.NET Core rate limiting gives you a built-in, production-ready defense against exactly this problem, with no third-party packages required.

Since .NET 7, the Microsoft.AspNetCore.RateLimiting middleware has shipped as a first-class feature. In this guide, you'll learn how to configure every built-in algorithm, apply limits per-client and per-endpoint, and avoid the mistakes that quietly break rate limiting in production.

Why Rate Limiting in ASP.NET Core Matters

Rate limiting controls how many requests a client can make within a time window. Without it, your API is vulnerable to:

Denial-of-service attacks: intentional flooding that crashes your service
Accidental overload: a client bug that sends thousands of duplicate requests
Resource exhaustion: database connection pools and memory drained by unchecked traffic
Unfair usage: one heavy consumer starving out everyone else

Before .NET 7, most teams reached for AspNetCoreRateLimit, a popular NuGet package. That still works, but the built-in middleware is now the recommended approach: it integrates directly with minimal APIs and controllers, and it is maintained by the .NET team.

Getting Started: Add the Rate Limiter Middleware

The rate limiting middleware lives in Microsoft.AspNetCore.RateLimiting, which is included in the ASP.NET Core shared framework. No extra NuGet install is needed for .NET 7 and later.

Here's the minimal setup:

using Microsoft.AspNetCore.RateLimiting;
using System.Threading.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.AddFixedWindowLimiter("fixed", config =>
    {
        config.PermitLimit = 10;
        config.Window = TimeSpan.FromSeconds(10);
        config.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        config.QueueLimit = 2;
    });

    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

var app = builder.Build();

app.UseRateLimiter();

app.MapGet("/api/data", () => Results.Ok(new { message = "Success" }))
    .RequireRateLimiting("fixed");

app.Run();

This creates a fixed window policy named "fixed" that allows 10 requests every 10 seconds. Because the policy has no partition key, that budget is shared by every caller of the endpoints it covers. Once the limit is exceeded, further requests receive a 429 Too Many Requests response. The QueueLimit of 2 means up to 2 excess requests wait in a queue for the next window instead of being rejected immediately.
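
To see the permit and queue mechanics outside the HTTP pipeline, here is a small console sketch that drives the same limiter type directly. It assumes a project that can reference System.Threading.RateLimiting (web projects get it from the shared framework; a plain console project would need the NuGet package), and it disables AutoReplenishment so the window never resets during the run. Both are choices made for this illustration, not part of the setup above.

```csharp
using System;
using System.Threading.RateLimiting;
using System.Threading.Tasks;

// Same numbers as the "fixed" policy, but with replenishment turned off so the
// window cannot reset while the demo runs (an assumption made for determinism).
var limiter = new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
{
    PermitLimit = 10,
    Window = TimeSpan.FromSeconds(10),
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    QueueLimit = 2,
    AutoReplenishment = false
});

// The first 10 requests in the window get a permit immediately.
int granted = 0;
for (int i = 0; i < 10; i++)
{
    if (limiter.AttemptAcquire().IsAcquired) granted++;
}

// Requests 11 and 12 are queued and stay pending until the window is replenished.
ValueTask<RateLimitLease> queued1 = limiter.AcquireAsync();
ValueTask<RateLimitLease> queued2 = limiter.AcquireAsync();
bool firstQueuedPending = !queued1.IsCompleted;
bool secondQueuedPending = !queued2.IsCompleted;

// Request 13 finds the queue full and is rejected right away.
ValueTask<RateLimitLease> overflow = limiter.AcquireAsync();
bool overflowRejected = overflow.IsCompleted && !overflow.Result.IsAcquired;

Console.WriteLine($"granted={granted}, queued pending={firstQueuedPending && secondQueuedPending}, " +
                  $"overflow rejected={overflowRejected}");

// Disposing the limiter completes the still-queued requests with failed leases.
limiter.Dispose();
```

In the middleware, a rejected lease like the last one is what turns into the 429 response.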

Understanding the Four Rate Limiting Algorithms

ASP.NET Core ships four algorithms in the System.Threading.RateLimiting namespace. Each fits different use cases.

1. Fixed Window Rate Limiting

The simplest approach. It counts requests within fixed time segments. When the window resets, the count drops to zero.

Problem: Burst traffic at window boundaries. With a limit of 10 requests per 10-second window, a client can send 10 requests at second 9, then 10 more at second 11: effectively 20 requests in 2 seconds.

options.AddFixedWindowLimiter("fixed", config =>
{
    config.PermitLimit = 100;
    config.Window = TimeSpan.FromMinutes(1);
});
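
The boundary burst is easy to check with a few lines of arithmetic. The sketch below is a simplified model of fixed window counting, not the library's implementation: the helper CountAccepted is hypothetical, and it assumes windows aligned to t = 0, whereas a real limiter starts its first window when it is created.

```csharp
using System;
using System.Collections.Generic;

// Simplified model: the window index is floor(timestamp / window), and each
// window admits at most `limit` requests.
static int CountAccepted(IEnumerable<double> arrivalSeconds, double windowSeconds, int limit)
{
    var usedPerWindow = new Dictionary<long, int>();
    int accepted = 0;
    foreach (double t in arrivalSeconds)
    {
        long window = (long)Math.Floor(t / windowSeconds);
        usedPerWindow.TryGetValue(window, out int used);
        if (used < limit)
        {
            usedPerWindow[window] = used + 1;
            accepted++;
        }
    }
    return accepted;
}

// 10 requests at t = 9s and 10 more at t = 11s, against "10 per 10 seconds".
var arrivals = new List<double>();
for (int i = 0; i < 10; i++) arrivals.Add(9.0);
for (int i = 0; i < 10; i++) arrivals.Add(11.0);

int acceptedCount = CountAccepted(arrivals, windowSeconds: 10, limit: 10);
Console.WriteLine($"Accepted {acceptedCount} of {arrivals.Count} requests sent within 2 seconds");
```

All 20 requests pass because the two bursts fall on opposite sides of the reset at second 10.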

2. Sliding Window Rate Limiting

Divides the window into segments and slides forward, smoothing out the burst problem that fixed windows have.

options.AddSlidingWindowLimiter("sliding", config =>
{
    config.PermitLimit = 100;
    config.Window = TimeSpan.FromMinutes(1);
    config.SegmentsPerWindow = 6; // each segment = 10 seconds
});

With 6 segments, the window slides every 10 seconds. This prevents the boundary-burst problem because old segments roll out gradually rather than all at once.
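
To make the effect concrete, here is the same 9-second/11-second burst run through a simplified segment-counting model. This is a sketch of the idea rather than the library's code: CountAcceptedSliding is a hypothetical helper, segments are aligned to t = 0, and the numbers (a 10-second window cut into 5 segments, limit 10) are chosen for the illustration.

```csharp
using System;
using System.Collections.Generic;

// Simplified model: time is cut into segments of window/segments seconds. A request
// is admitted only if the most recent `segments` segments hold fewer than `limit`
// admitted requests. Arrivals must be supplied in chronological order.
static int CountAcceptedSliding(IEnumerable<double> arrivalSeconds, double windowSeconds, int segments, int limit)
{
    double segmentSeconds = windowSeconds / segments;
    var admittedPerSegment = new Dictionary<long, int>();
    int accepted = 0;
    foreach (double t in arrivalSeconds)
    {
        long current = (long)Math.Floor(t / segmentSeconds);

        // Sum what is still inside the sliding window.
        int inWindow = 0;
        for (long s = current - segments + 1; s <= current; s++)
        {
            if (admittedPerSegment.TryGetValue(s, out int count)) inWindow += count;
        }

        if (inWindow < limit)
        {
            admittedPerSegment.TryGetValue(current, out int used);
            admittedPerSegment[current] = used + 1;
            accepted++;
        }
    }
    return accepted;
}

// 10 requests at t = 9s, 10 at t = 11s, and 10 at t = 20s.
var arrivals = new List<double>();
for (int i = 0; i < 10; i++) arrivals.Add(9.0);
for (int i = 0; i < 10; i++) arrivals.Add(11.0);
for (int i = 0; i < 10; i++) arrivals.Add(20.0);

int acceptedCount = CountAcceptedSliding(arrivals, windowSeconds: 10, segments: 5, limit: 10);
Console.WriteLine($"Accepted {acceptedCount} of {arrivals.Count}");
```

The burst at second 11 is refused because the requests from second 9 are still inside the window; capacity only returns once their segment has rolled out.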

3. Token Bucket Rate Limiting

Token bucket rate limiting in C# is ideal when you want to allow short bursts while enforcing a steady average rate. Tokens refill at a constant rate, and each request costs one token.

options.AddTokenBucketLimiter("token", config =>
{
    config.TokenLimit = 20;           // max burst size
    config.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
    config.TokensPerPeriod = 5;       // 5 tokens every 10 seconds
    config.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    config.QueueLimit = 2;
    config.AutoReplenishment = true;
});

This allows a burst of up to 20 requests, then settles to a sustained rate of roughly 30 requests per minute. It's the best choice for most public APIs.
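
The burst and the sustained rate can be checked with the limiter class itself. This is a minimal console sketch, assuming the System.Threading.RateLimiting package is referenced; AutoReplenishment is turned off and the queue is set to 0 so that no tokens are added during the run, which is a simplification for the demo rather than a recommended production setting.

```csharp
using System;
using System.Threading.RateLimiting;

var bucket = new TokenBucketRateLimiter(new TokenBucketRateLimiterOptions
{
    TokenLimit = 20,
    ReplenishmentPeriod = TimeSpan.FromSeconds(10),
    TokensPerPeriod = 5,
    QueueLimit = 0,
    AutoReplenishment = false // no refills during the demo
});

// The bucket starts full, so a burst of 25 requests gets exactly TokenLimit permits.
int burstGranted = 0;
for (int i = 0; i < 25; i++)
{
    if (bucket.AttemptAcquire().IsAcquired) burstGranted++;
}

// A rejected token bucket lease carries a hint for when to try again.
using RateLimitLease rejected = bucket.AttemptAcquire();
bool hasRetryAfter = rejected.TryGetMetadata(MetadataName.RetryAfter, out TimeSpan retryAfter);

// Sustained rate once the burst is spent: TokensPerPeriod scaled to one minute.
double sustainedPerMinute = 5 * (60.0 / 10.0);

Console.WriteLine($"burst granted={burstGranted}, sustained={sustainedPerMinute}/min, " +
                  $"retry hint present={hasRetryAfter} ({retryAfter.TotalSeconds}s)");
bucket.Dispose();
```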

4. Concurrency Limiter

Unlike the others, this doesn't count requests over time. Instead, it limits how many requests are in flight simultaneously. When a request completes, its slot opens up immediately.

options.AddConcurrencyLimiter("concurrent", config =>
{
    config.PermitLimit = 5;
    config.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    config.QueueLimit = 10;
});

Use this for endpoints that call slow external services or run heavy database queries. It prevents resource exhaustion regardless of request rate.
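
The slot behaviour is easiest to see by holding and releasing leases by hand. The sketch below assumes the System.Threading.RateLimiting package is referenced and uses a deliberately small PermitLimit of 2 with no queue; in the middleware, the lease is released for you when the request finishes.

```csharp
using System;
using System.Threading.RateLimiting;

var gate = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
    QueueLimit = 0
});

// Two requests are in flight and hold both slots.
RateLimitLease first = gate.AttemptAcquire();
RateLimitLease second = gate.AttemptAcquire();

// A third concurrent request finds no free slot.
RateLimitLease third = gate.AttemptAcquire();
bool thirdRejected = !third.IsAcquired;

// Disposing a lease models "the request completed"; its slot is free at once.
first.Dispose();
RateLimitLease fourth = gate.AttemptAcquire();
bool fourthAdmitted = fourth.IsAcquired;

Console.WriteLine($"third rejected={thirdRejected}, fourth admitted after release={fourthAdmitted}");

second.Dispose();
third.Dispose();
fourth.Dispose();
gate.Dispose();
```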

Per-Client API Rate Limiting with Partitions

The policies above each enforce a single limit shared by all callers. In production, you need per-client rate limiting, where each user gets their own bucket. You achieve this with partitioned rate limiters.

builder.Services.AddRateLimiter(options =>
{
    options.AddPolicy("per-user", context =>
    {
        var userId = context.User?.FindFirst("sub")?.Value
                     ?? context.Connection.RemoteIpAddress?.ToString()
                     ?? "anonymous";

        return RateLimitPartition.GetTokenBucketLimiter(userId, _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = 20,
            ReplenishmentPeriod = TimeSpan.FromSeconds(10),
            TokensPerPeriod = 5,
            AutoReplenishment = true
        });
    });

    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = async (context, cancellationToken) =>
    {
        context.HttpContext.Response.ContentType = "application/json";
        await context.HttpContext.Response.WriteAsJsonAsync(new
        {
            error = "Too many requests",
            retryAfter = context.Lease.TryGetMetadata(
                MetadataName.RetryAfter, out var retryAfter)
                ? retryAfter.TotalSeconds
                : 60
        }, cancellationToken);
    };
});

The partition key (here, the user's sub claim or their IP address) isolates each client's rate limit. User A hitting 20 requests doesn't affect User B at all.

The OnRejected callback customizes the 429 response body with a JSON payload and retryAfter hint, which is critical for well-behaved API clients.
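
Partition isolation can be demonstrated without a web host. In this sketch the resource type is a plain string standing in for HttpContext, and the client names "alice" and "bob" are made up; it assumes the System.Threading.RateLimiting package is referenced. AutoReplenishment is off so the run is too short for any meaningful refill.

```csharp
using System;
using System.Threading.RateLimiting;

// One token bucket per client id; the factory runs the first time a key is seen.
using PartitionedRateLimiter<string> perClient =
    PartitionedRateLimiter.Create<string, string>(clientId =>
        RateLimitPartition.GetTokenBucketLimiter(clientId, _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = 20,
            ReplenishmentPeriod = TimeSpan.FromSeconds(10),
            TokensPerPeriod = 5,
            QueueLimit = 0,
            AutoReplenishment = false
        }));

// "alice" sends 25 requests back to back and drains her own bucket.
int aliceGranted = 0;
for (int i = 0; i < 25; i++)
{
    if (perClient.AttemptAcquire("alice").IsAcquired) aliceGranted++;
}

// "bob" has a separate bucket and is unaffected.
bool bobStillAllowed = perClient.AttemptAcquire("bob").IsAcquired;

Console.WriteLine($"alice granted={aliceGranted}, bob still allowed={bobStillAllowed}");
```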

Applying Rate Limits to Controllers and Endpoints

You have three options for applying rate limiting policies to your routes.

Minimal API Endpoints

app.MapGet("/api/products", GetProducts)
    .RequireRateLimiting("per-user");

app.MapPost("/api/orders", CreateOrder)
    .RequireRateLimiting("per-user");

// Exempt health checks from rate limiting
app.MapGet("/health", () => Results.Ok())
    .DisableRateLimiting();

Controller Attributes

[ApiController]
[Route("api/[controller]")]
[EnableRateLimiting("per-user")]
public class ProductsController : ControllerBase
{
    [HttpGet]
    public IActionResult GetAll() => Ok(new[] { "Product1", "Product2" });

    [HttpGet("{id}")]
    [DisableRateLimiting] // exempt this specific action
    public IActionResult GetById(int id) => Ok($"Product {id}");
}

Global Rate Limiting

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(context =>
{
    var ip = context.Connection.RemoteIpAddress?.ToString() ?? "unknown";

    return RateLimitPartition.GetFixedWindowLimiter(ip, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 200,
        Window = TimeSpan.FromMinutes(1)
    });
});

The global limiter runs before any per-endpoint policy. Use it as a safety net, a generous limit that catches only truly abusive traffic, and use per-endpoint policies for tighter, targeted limits.

Returning Proper Rate Limit Headers

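Well-designed APIs tell clients when to retry through response headers. The middleware does not emit these headers for you, and headers cannot be changed once the response has started, so a middleware that waits for next() to finish is too late. The reliable place to set Retry-After is the OnRejected callback, which receives the rejected lease:

// Requires: using System.Globalization;
options.OnRejected = (context, cancellationToken) =>
{
    // Limiters that know when capacity returns expose it as RetryAfter metadata.
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        context.HttpContext.Response.Headers.RetryAfter =
            ((int)retryAfter.TotalSeconds).ToString(NumberFormatInfo.InvariantInfo);
    }

    return ValueTask.CompletedTask;
};

OnRejected is a single callback, so if you also write a JSON body on rejection, set the header in that same callback before writing the body.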

Best Practices for ASP.NET Core Rate Limiting

After seeing rate limiting misconfigured in dozens of production systems, here are the practices that actually matter:

Use token bucket for public APIs. It handles bursty traffic gracefully while enforcing a steady average, which makes it the best default for most scenarios.
Always partition by user identity, not just IP. Behind corporate NATs and VPNs, thousands of users share one IP address. Rate limiting by IP alone punishes entire offices for one user's behavior.
Set the global limiter generous, per-endpoint tight. The global limiter is your emergency brake (200-500 requests/minute). Per-endpoint limits are your real policy.
Return Retry-After headers. Well-behaved clients back off automatically. Without the header, they keep hammering and your rejection rate stays high.
Don't rate-limit health check endpoints. Load balancers and orchestrators call /health constantly. Rate limiting them causes false-positive service restarts.
Log rejections. Use the OnRejected callback to log which clients are being throttled. This data is essential for tuning limits and detecting abuse.
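Middleware order matters. Place UseRateLimiter() after UseAuthentication() and UseAuthorization() so you have access to the user's identity for per-user partitioning. If you call UseRouting() explicitly, UseRateLimiter() must also come after it for endpoint-specific policies to apply.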
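
Several of these practices fit into one Program.cs. The following is a sketch under stated assumptions, not a drop-in configuration: cookie authentication is only a stand-in for whatever scheme your API uses, the logger category "RateLimiting" is an arbitrary name, and the limits are the illustrative numbers used earlier in this guide.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Authentication.Cookies;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// Stand-in authentication so context.User can be populated (assumption for the sketch).
builder.Services.AddAuthentication(CookieAuthenticationDefaults.AuthenticationScheme).AddCookie();
builder.Services.AddAuthorization();

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Partition by user identity first, then fall back to IP address.
    options.AddPolicy("per-user", context =>
    {
        var key = context.User.FindFirst("sub")?.Value
                  ?? context.Connection.RemoteIpAddress?.ToString()
                  ?? "anonymous";

        return RateLimitPartition.GetTokenBucketLimiter(key, _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = 20,
            ReplenishmentPeriod = TimeSpan.FromSeconds(10),
            TokensPerPeriod = 5
        });
    });

    // Log every rejection so limits can be tuned from real data.
    options.OnRejected = (context, cancellationToken) =>
    {
        var logger = context.HttpContext.RequestServices
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("RateLimiting");

        logger.LogWarning("Throttled {Path} for {Client}",
            context.HttpContext.Request.Path,
            context.HttpContext.User.Identity?.Name
                ?? context.HttpContext.Connection.RemoteIpAddress?.ToString()
                ?? "unknown");

        return ValueTask.CompletedTask;
    };
});

var app = builder.Build();

// Order: routing, then authentication/authorization, then rate limiting.
app.UseRouting();
app.UseAuthentication();
app.UseAuthorization();
app.UseRateLimiter();

// Health checks stay outside the limiter.
app.MapGet("/health", () => Results.Ok()).DisableRateLimiting();

app.MapGet("/api/data", () => Results.Ok(new { message = "Success" }))
    .RequireRateLimiting("per-user");

app.Run();
```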

Common Pitfalls to Avoid

These are the mistakes that silently break rate limiting:

Forgetting UseRateLimiter(). Registering policies in AddRateLimiter does nothing without the middleware call. The code compiles fine, but you have zero protection.
Wrong middleware order. Placing UseRateLimiter() before UseAuthentication() means context.User is still unauthenticated when your partition function runs, so the sub claim is never found. Every request falls back to the IP address or the shared anonymous partition.
Using only in-memory rate limiting behind a load balancer. The built-in middleware stores state in memory. If you have 4 app instances, each client can get up to 4x the intended limit. For distributed scenarios, use Redis-backed rate limiting or move enforcement to your API gateway.
Queue limits too high. A QueueLimit of 100 means 100 requests held open waiting for a permit. Under sustained load, this ties up connections and memory and inflates latency. It's often better to reject immediately with a low queue limit (0-5).
Not testing under load. Rate limiting logic that looks correct in unit tests can fail under concurrency. Use tools like k6 or bombardier to verify behavior at scale.
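
The load balancer pitfall can be reproduced in a few lines. This sketch models 4 app instances as 4 independent in-memory limiters with strict round-robin distribution; the instance count, the limit of 10, and the round-robin assumption are all illustrative. It assumes the System.Threading.RateLimiting package is referenced.

```csharp
using System;
using System.Linq;
using System.Threading.RateLimiting;

const int instances = 4;
const int intendedLimit = 10;

// One independent in-memory limiter per app instance.
FixedWindowRateLimiter[] perInstance = Enumerable.Range(0, instances)
    .Select(_ => new FixedWindowRateLimiter(new FixedWindowRateLimiterOptions
    {
        PermitLimit = intendedLimit,
        Window = TimeSpan.FromMinutes(1),
        QueueLimit = 0,
        AutoReplenishment = false
    }))
    .ToArray();

// One client sends 100 requests; request n lands on instance n % 4.
int admitted = 0;
for (int request = 0; request < 100; request++)
{
    if (perInstance[request % instances].AttemptAcquire().IsAcquired) admitted++;
}

Console.WriteLine($"intended limit={intendedLimit}, actually admitted={admitted}");

foreach (var instanceLimiter in perInstance) instanceLimiter.Dispose();
```

With sticky sessions the multiplier would be lower, which is why the effective limit is best described as "up to" the instance count times the configured value.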

Distributed Rate Limiting with Redis

For multi-instance deployments, you need shared state. While the built-in middleware doesn't include a Redis provider out of the box, you can implement a custom IRateLimiterPolicy backed by Redis or use a library like RedisRateLimiting:

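// Install: dotnet add package RedisRateLimiting
// (third-party; the AddRedis* policy extensions appear to ship in the companion
// RedisRateLimiting.AspNetCore package, so check the project's README for the
// exact package and using directives)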

using RedisRateLimiting;
using StackExchange.Redis;

var redis = ConnectionMultiplexer.Connect("localhost:6379");

builder.Services.AddRateLimiter(options =>
{
    options.AddRedisTokenBucketLimiter("redis-token", config =>
    {
        config.ConnectionMultiplexerFactory = () => redis;
        config.TokenLimit = 20;
        config.ReplenishmentPeriod = TimeSpan.FromSeconds(10);
        config.TokensPerPeriod = 5;
    });
});

This lets all instances of your application draw from the same Redis-backed counter, giving you accurate enforcement regardless of how many pods or servers are running.

Testing Your Rate Limiting Configuration

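Always verify your limits work before deploying. Here's a quick integration test using WebApplicationFactory (from the Microsoft.AspNetCore.Mvc.Testing package) against the "fixed" policy from the first example:

using System.Linq;
using System.Net;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Testing;
using Xunit;

// With top-level statements you may need `public partial class Program { }`
// in Program.cs so the test project can reference the entry point.
public class RateLimiterTests
{
    [Fact]
    public async Task RateLimiter_Returns429_WhenLimitExceeded()
    {
        await using var app = new WebApplicationFactory<Program>();
        using var client = app.CreateClient();

        // 15 concurrent requests against a limit of 10 with a queue of 2:
        // the overflow is rejected, and queued requests wait for the next window.
        var tasks = Enumerable.Range(0, 15)
            .Select(_ => client.GetAsync("/api/data"));

        var responses = await Task.WhenAll(tasks);

        var tooMany = responses.Count(r => r.StatusCode == HttpStatusCode.TooManyRequests);

        Assert.True(tooMany > 0,
            "Expected at least one 429 response when exceeding rate limit");
    }
}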

Conclusion

ASP.NET Core rate limiting is no longer optional for production APIs. It's a first-class middleware that takes minutes to set up and prevents hours of downtime. Here are the key takeaways:

Use the built-in Microsoft.AspNetCore.RateLimiting middleware. It ships with .NET 7+ and requires no extra packages.
Choose token bucket for public APIs, sliding window for strict quotas, and concurrency limiter for protecting slow downstream services.
Always partition by authenticated user identity, falling back to IP address for anonymous clients.
Return Retry-After headers and meaningful JSON error bodies on 429 responses.
For multi-instance deployments, back your rate limiting with Redis to get accurate distributed counters.

Start with the token bucket example above, tune the limits based on your actual traffic patterns, and add OnRejected logging from day one. Your future self, and your on-call team, will thank you.


About csharp-coder.com
Your go-to resource for C#, .NET, and modern software development. Follow along for daily tutorials, tips, and real-world examples.
