
Middleware

The backend uses a stack of ASGI middleware to handle cross-cutting concerns like rate limiting, request size validation, caching, and metrics collection. Middleware runs in order from outermost to innermost, with response processing in reverse order.

Middleware Stack

The middleware is applied in this order (outermost first):

  1. RequestSizeLimitMiddleware - Rejects oversized requests
  2. RateLimitMiddleware - Enforces per-user/per-endpoint limits
  3. CacheControlMiddleware - Adds cache headers to responses
  4. MetricsMiddleware - Collects HTTP request metrics
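
If the app is built on FastAPI/Starlette, a stack like this is typically assembled with add_middleware, where the last middleware registered becomes the outermost. A minimal sketch under that assumption (the import paths mirror the Key Files section below; this is not the project's actual startup code):

    from fastapi import FastAPI

    from core.middlewares.cache import CacheControlMiddleware
    from core.middlewares.metrics import MetricsMiddleware
    from core.middlewares.rate_limit import RateLimitMiddleware
    from core.middlewares.request_size_limit import RequestSizeLimitMiddleware

    app = FastAPI()

    # add_middleware() wraps the app, so the LAST middleware added runs OUTERMOST.
    # Registering innermost-first reproduces the documented order.
    app.add_middleware(MetricsMiddleware)                            # 4. innermost
    app.add_middleware(CacheControlMiddleware)                       # 3.
    app.add_middleware(RateLimitMiddleware)                          # 2.
    app.add_middleware(RequestSizeLimitMiddleware, max_size_mb=10)   # 1. outermost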

Request Size Limit

Rejects requests exceeding a configurable size limit (default 10MB). This protects against denial-of-service attacks from large payloads.

from starlette.types import ASGIApp


class RequestSizeLimitMiddleware:
    """Middleware to limit request size, default 10MB"""

    def __init__(self, app: ASGIApp, max_size_mb: int = 10) -> None:
        self.app = app
        self.max_size_bytes = max_size_mb * 1024 * 1024

Requests exceeding the limit receive a 413 response:

{"detail": "Request too large. Maximum size is 10.0MB"}

The middleware checks the Content-Length header before reading the body, avoiding wasted processing on oversized requests.
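
A minimal sketch of that check, assuming Starlette's type aliases and JSONResponse (simplified; the project's actual __call__ may differ):

    from starlette.responses import JSONResponse
    from starlette.types import Receive, Scope, Send


    class RequestSizeLimitMiddleware:  # continuation of the excerpt above
        ...

        async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
            if scope["type"] == "http":
                headers = dict(scope["headers"])  # ASGI headers are (bytes, bytes) pairs
                content_length = headers.get(b"content-length")
                if content_length is not None and int(content_length) > self.max_size_bytes:
                    response = JSONResponse(
                        {"detail": f"Request too large. Maximum size is {self.max_size_bytes / (1024 * 1024):.1f}MB"},
                        status_code=413,
                    )
                    await response(scope, receive, send)
                    return
            await self.app(scope, receive, send)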

Rate Limit

The RateLimitMiddleware intercepts all HTTP requests and checks them against configured rate limits. See Rate Limiting for the full algorithm details.

Excluded paths bypass rate limiting:

    # Paths exempt from rate limiting
    EXCLUDED_PATHS = frozenset(
        {
            "/health",
            "/metrics",
            "/docs",
            "/openapi.json",
            "/favicon.ico",
            "/api/v1/auth/login",  # Auth endpoints handle their own limits
            "/api/v1/auth/register",
            "/api/v1/auth/logout",
        }
    )

When a request is allowed, rate limit headers are added to the response. When rejected, a 429 response is returned with Retry-After indicating when to retry.
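
As an illustration only (the exact response body and header set live in rate_limit.py and the Rate Limiting doc), a rejection could be built like this:

    from starlette.responses import JSONResponse


    def too_many_requests(retry_after_seconds: int) -> JSONResponse:
        """Hypothetical helper: build the 429 returned when a request is rejected."""
        return JSONResponse(
            {"detail": "Rate limit exceeded"},
            status_code=429,
            headers={"Retry-After": str(retry_after_seconds)},
        )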

Cache Control

Adds appropriate Cache-Control headers to GET responses based on endpoint patterns:

from starlette.types import ASGIApp, Receive, Scope, Send


class CacheControlMiddleware:
    def __init__(self, app: ASGIApp) -> None:
        self.app = app
        self.cache_policies: dict[str, str] = {
            "/api/v1/k8s-limits": "public, max-age=300",  # 5 minutes
            "/api/v1/example-scripts": "public, max-age=600",  # 10 minutes
            "/api/v1/notifications": "private, no-cache",  # Always revalidate
            "/api/v1/notifications/unread-count": "private, no-cache",  # Always revalidate
        }

    async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        # ... remainder (GET check and header injection) omitted

Endpoint                    Policy             TTL
/api/v1/k8s-limits          public             5 minutes
/api/v1/example-scripts     public             10 minutes
/api/v1/auth/verify-token   private, no-cache  -
/api/v1/notifications       private, no-cache  -

Public endpoints also get a Vary: Accept-Encoding header for proper proxy caching. Cache headers are only added to successful (200) responses.
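
In pure ASGI middleware, response headers are typically injected by wrapping send and editing the http.response.start message. A sketch of that pattern (not necessarily cache.py's exact implementation):

    from starlette.types import Message, Send


    def wrap_send_with_cache_headers(send: Send, policy: str | None) -> Send:
        """Return a send callable that appends Cache-Control (and Vary for public
        policies) to successful responses."""

        async def send_wrapper(message: Message) -> None:
            if message["type"] == "http.response.start" and policy and message["status"] == 200:
                headers = list(message.get("headers", []))
                headers.append((b"cache-control", policy.encode()))
                if policy.startswith("public"):
                    headers.append((b"vary", b"Accept-Encoding"))
                message = {**message, "headers": headers}
            await send(message)

        return send_wrapper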

Metrics

The MetricsMiddleware collects HTTP request telemetry using OpenTelemetry:

            name="http_requests_total", description="Total number of HTTP requests", unit="requests"
        )

        self.request_duration = self.meter.create_histogram(
            name="http_request_duration_seconds", description="HTTP request duration in seconds", unit="seconds"
        )

        self.request_size = self.meter.create_histogram(
            name="http_request_size_bytes", description="HTTP request size in bytes", unit="bytes"
        )

        self.response_size = self.meter.create_histogram(
            name="http_response_size_bytes", description="HTTP response size in bytes", unit="bytes"
        )

        self.active_requests = self.meter.create_up_down_counter(
            name="http_requests_active", description="Number of active HTTP requests", unit="requests"
        )

The middleware tracks:

  • Request count by method, path template, and status code
  • Request duration histogram
  • Request/response size histograms
  • Active requests gauge
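
For context, recording against those instruments per request might look roughly like this (a sketch; the attribute names and send wrapper are illustrative, and request_count follows the excerpt above):

    import time

    from starlette.types import Receive, Scope, Send


    class MetricsMiddleware:  # sketch; instrument creation shown above
        ...

        async def __call__(self, scope: Scope, receive: Receive, send: Send) -> None:
            if scope["type"] != "http":
                await self.app(scope, receive, send)
                return

            attributes = {
                "http.method": scope["method"],
                "http.route": _get_path_template(scope["path"]),  # helper shown below
            }
            self.active_requests.add(1, attributes)
            start = time.perf_counter()
            status = {"code": 0}

            async def send_wrapper(message) -> None:
                if message["type"] == "http.response.start":
                    status["code"] = message["status"]
                await send(message)

            try:
                await self.app(scope, receive, send_wrapper)
            finally:
                self.active_requests.add(-1, attributes)
                self.request_count.add(1, {**attributes, "http.status_code": status["code"]})
                self.request_duration.record(time.perf_counter() - start, attributes)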

Path templates use pattern replacement to reduce metric cardinality:

    def _get_path_template(path: str) -> str:
        """Convert path to template for lower cardinality."""
        # Common patterns to replace

        # UUID pattern
        path = re.sub(r"/[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}", "/{id}", path)

        # MongoDB ObjectIds (24 hex chars): collapsed before numeric IDs so that an
        # ObjectId starting with digits is not partially consumed by the \d+ rule
        path = re.sub(r"/[0-9a-f]{24}", "/{id}", path)

        # Numeric IDs
        path = re.sub(r"/\d+", "/{id}", path)

        return path

UUIDs, numeric IDs, and MongoDB ObjectIds are replaced with {id} to prevent metric explosion.
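
For example (illustrative inputs):

    assert _get_path_template("/api/v1/jobs/42") == "/api/v1/jobs/{id}"
    assert _get_path_template("/api/v1/jobs/550e8400-e29b-41d4-a716-446655440000") == "/api/v1/jobs/{id}"
    assert _get_path_template("/api/v1/jobs/507f1f77bcf86cd799439011") == "/api/v1/jobs/{id}"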

System Metrics

In addition to HTTP metrics, the middleware module provides system-level observables:

    # Memory usage
    def get_memory_usage(_: CallbackOptions) -> list[Observation]:
        """Get current memory usage."""
        memory = psutil.virtual_memory()
        return [
            Observation(memory.used, {"type": "used"}),
            Observation(memory.available, {"type": "available"}),
            Observation(memory.percent, {"type": "percent"}),
        ]

    meter.create_observable_gauge(
        name="system_memory_bytes", callbacks=[get_memory_usage], description="System memory usage", unit="bytes"
    )

    # CPU usage
    def get_cpu_usage(_: CallbackOptions) -> list[Observation]:
        """Get current CPU usage."""
        cpu_percent = psutil.cpu_percent(interval=1)
        return [Observation(cpu_percent)]

These expose:

  • system_memory_bytes - System memory (used, available, percent)
  • system_cpu_percent - System CPU utilization
  • process_metrics - Process RSS, VMS, CPU, thread count
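
The process-level observables are not shown above; a sketch of how they might be gathered with psutil (the label names here are assumptions):

    import psutil
    from opentelemetry.metrics import CallbackOptions, Observation

    _process = psutil.Process()


    def get_process_metrics(_: CallbackOptions) -> list[Observation]:
        """Report RSS/VMS memory, CPU percent, and thread count for this process."""
        mem = _process.memory_info()
        return [
            Observation(mem.rss, {"type": "rss"}),
            Observation(mem.vms, {"type": "vms"}),
            Observation(_process.cpu_percent(), {"type": "cpu_percent"}),
            Observation(_process.num_threads(), {"type": "threads"}),
        ]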

Key Files

File                                     Purpose
core/middlewares/__init__.py             Middleware exports
core/middlewares/rate_limit.py           Rate limiting
core/middlewares/cache.py                Cache headers
core/middlewares/request_size_limit.py   Request size validation
core/middlewares/metrics.py              HTTP and system metrics