MAY 27, 2026 · CVSS 6.5 MEDIUM (NVD) · 7.0 (CVSS v4) · 6 MIN READ

CVE-2026-48710: Finding Vulnerable Starlette and FastAPI Services on Your Network

A Host header injection flaw in Starlette (the ASGI framework behind FastAPI) lets attackers bypass path-based authentication middleware. The real risk: thousands of AI infrastructure services — vLLM, LiteLLM, MCP servers, Ray Serve — run on Starlette and are often deployed without authentication middleware at all.

The Vulnerability

CVE-2026-48710 (CWE-444) is an HTTP Host header manipulation flaw in Starlette versions before 1.0.1. Starlette reconstructs request URLs by concatenating the Host header with the request path — without validating the Host header against RFC 9112. An attacker can inject path separators into the Host header, causing request.url.path to return a different path than the one actually routed.

  • CVSS: 6.5 Medium (NVD/v3.1) · 7.0 High (X41 advisory / CVSS v4.0)
  • AFFECTED: Starlette 0.8.3 through 1.0.0 — and every framework built on it (FastAPI, etc.)
  • DISCOVERED: January 27, 2026 by X41 D-Sec (funded by OSTIF)
  • PATCHED: Starlette 1.0.1 — released May 21, 2026
  • GHSA: GHSA-86qp-5c8j-p5mr
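The affected range above can be turned into a quick local check. The following is a minimal sketch, assuming the 0.8.3 through 1.0.0 range quoted in this post; the helper names are illustrative, and the parser deliberately ignores pre-release and post-release suffixes.

```python
import re
from importlib import metadata

# Range as quoted in the advisory summary above: 0.8.3 through 1.0.0.
FIRST_AFFECTED = (0, 8, 3)
FIRST_PATCHED = (1, 0, 1)


def parse_version(version: str) -> tuple:
    """Return the leading numeric release components, e.g. '0.36.3' -> (0, 36, 3)."""
    match = re.match(r"[0-9]+(?:\.[0-9]+)*", version.strip())
    if match is None:
        raise ValueError(f"unrecognised version string: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad short versions so that '1.0' compares like (1, 0, 0).
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def is_affected(version: str) -> bool:
    """True if the version falls inside the affected range quoted above."""
    return FIRST_AFFECTED <= parse_version(version) < FIRST_PATCHED


def installed_starlette_status() -> str:
    """Describe the Starlette version installed in the current environment."""
    try:
        version = metadata.version("starlette")
    except metadata.PackageNotFoundError:
        return "starlette is not installed in this environment"
    state = "AFFECTED" if is_affected(version) else "outside the affected range"
    return f"starlette {version}: {state}"
```

Calling `installed_starlette_status()` inside each virtualenv or container image gives a first-pass answer; it says nothing about whether a reverse proxy or middleware design already mitigates the issue.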

How the Attack Works

A request to /protected with the Host header set to example.com/health?x= causes Starlette to reconstruct the URL as http://example.com/health?x=/protected. The routing layer processes /protected normally, but any middleware checking request.url.path sees /health — a path it considers safe.
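The mismatch is easy to reproduce with the standard library alone. The sketch below is a simplified model of the behaviour described above, not Starlette's actual code: it concatenates the Host header with the routed path and then re-parses the result, which is where the two views of the path diverge. The function names are illustrative.

```python
from urllib.parse import urlsplit


def reconstruct_url(scheme: str, host_header: str, routed_path: str) -> str:
    """Naive reconstruction: trust the Host header and append the routed path."""
    return f"{scheme}://{host_header}{routed_path}"


def path_seen_by_middleware(host_header: str, routed_path: str) -> str:
    """Path obtained by re-parsing the reconstructed URL."""
    return urlsplit(reconstruct_url("http", host_header, routed_path)).path


# Benign request: the router and the middleware agree.
print(path_seen_by_middleware("example.com", "/protected"))            # /protected

# Crafted Host header: the routed path is pushed into the query string.
print(path_seen_by_middleware("example.com/health?x=", "/protected"))  # /health
```

In the second case the real path survives only as the query string `x=/protected`, which path-based checks never look at.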

This breaks both allowlist ("only these paths need auth") and denylist ("these paths are protected") middleware patterns. MCP servers are especially vulnerable because the MCP spec mandates unauthenticated OAuth discovery endpoints (/.well-known/oauth-protected-resource), providing a reliable injection path for allowlist bypass. The attack requires raw TCP sockets because standard HTTP clients normalize the Host header, which is why it evaded detection for years.

Why This Matters for Your Network

The medium CVSS rating materially understates the real-world risk. Starlette receives 325 million downloads per week (per Ars Technica) and underpins thousands of production services — not just AI, but any Python web application using FastAPI or raw Starlette. X41 D-Sec, who discovered the flaw, classifies it as critical severity. The vulnerability is particularly dangerous in AI/ML infrastructure, where FastAPI-based services are the default deployment pattern and authentication is often an afterthought:

  • vLLM — LLM inference server, default port 8000
  • LiteLLM — LLM proxy/gateway, default port 4000
  • MCP servers — Model Context Protocol endpoints (JSON-RPC over HTTP)
  • Ray Serve — distributed ML serving, dashboard on 8265
  • BentoML — ML model serving, default port 3000
  • Label Studio, Gradio, MLflow — ML tooling commonly deployed on internal networks

Many of these services are deployed by ML teams on internal networks with minimal security review. They often run with no authentication at all, which makes the Host header bypass moot: the exposed service itself is the real problem. Either way, you need to know what's listening.

Beyond auth bypass, exploitation can lead to SSRF (server-side request forgery) and, in some cases, remote code execution. MCP servers are especially high-value targets because they store credentials for external systems — databases, email accounts, calendars, cloud APIs. An attacker who breaches an MCP server gets the keys to everything it connects to.

What's Already Exposed

X41 D-Sec's internet scan has already identified vulnerable production systems with unauthenticated access to:

  • Biopharma AI — clinical trial databases, M&A data
  • Identity verification — face analysis, live PII, KYB data
  • IoT/Industrial — SSH to devices via bastion hosts, RCE
  • Email/SaaS — full mailbox read/send/delete, S3 export
  • HR/Recruitment — candidate PII, hiring pipeline data
  • Cloud monitoring — AWS topology, distributed traces
  • Cybersecurity tools — asset inventory, live Nuclei scanner access

This is not theoretical. These are real systems found exposed on the public internet.

Investigation Workflow

Unlike appliance CVEs where you're looking for one vendor's product, this is a framework-level flaw. You're scanning for a class of services: anything built on Starlette/FastAPI, especially AI infrastructure that ML teams may have deployed outside normal change management.

RECON CVE Lookup showing CVE-2026-48710: Starlette Host header injection, CVSS 6.5 Medium.

Start by confirming the advisory in RECON's CVE Lookup. A search for CVE-2026-48710 pulls the full NVD entry — CVSS 6.5, CWE-444, and the Starlette version range — so you know exactly what you're scanning for.

1. Port Scan: Find Python API Services

Scan your internal subnets for common FastAPI/AI infrastructure ports. The targets:

  • 8000 — Uvicorn/FastAPI default, vLLM
  • 8080 — common alternative for HTTP APIs
  • 4000 — LiteLLM proxy
  • 8265 — Ray dashboard
  • 3000 — BentoML, Gradio
  • 5000 — MLflow, Flask apps (not Starlette-based themselves, but often deployed alongside Starlette services)
  • 7860 — Gradio default
  • 9090 — common for internal services

Any open port on a host in your GPU cluster, ML training VLAN, or data science subnet is worth investigating.
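For readers who want to script this step outside RECON, the following is a minimal TCP connect scan over the ports listed above. It is a sketch only: the function names and timeout are assumptions, and it should be run solely against networks you are authorized to scan.

```python
import socket
from concurrent.futures import ThreadPoolExecutor

# Ports from the list above.
AI_SERVICE_PORTS = (8000, 8080, 4000, 8265, 3000, 5000, 7860, 9090)


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Attempt a full TCP connect; True if the handshake completes."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        return sock.connect_ex((host, port)) == 0


def scan_host(host: str, ports=AI_SERVICE_PORTS, timeout: float = 1.0) -> list:
    """Return the sorted subset of `ports` accepting connections on `host`."""
    ports = tuple(ports)
    with ThreadPoolExecutor(max_workers=max(1, len(ports))) as pool:
        results = list(pool.map(lambda p: (p, port_is_open(host, p, timeout)), ports))
    return sorted(port for port, is_open in results if is_open)
```

A call such as `scan_host("10.0.0.5")` (a placeholder address) returns the list of open ports on that host; an open port only says something is listening, so the fingerprinting step still has to confirm what it is.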

Port Scan targeting an internal host. Scan management subnets and GPU clusters for common FastAPI/AI service ports.

2. HTTP Headers: Fingerprint Starlette/Uvicorn

Starlette services behind Uvicorn are straightforward to fingerprint. Look for:

  • server: uvicorn response header (Uvicorn is the default ASGI server for FastAPI)
  • content-type: application/json on root path with a JSON body
  • FastAPI's auto-generated /docs (Swagger UI) and /redoc endpoints
  • /openapi.json, which FastAPI serves by default unless explicitly disabled
  • vLLM: /v1/models endpoint (OpenAI-compatible API)
  • LiteLLM: /health and /v1/models endpoints
  • MCP servers: JSON-RPC responses to POST requests

A host returning server: uvicorn with /openapi.json accessible is almost certainly a FastAPI application running on Starlette. If /docs is open, you can read the full API specification — which also tells you whether authentication middleware exists.
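The two strongest signals, the server header and /openapi.json, can be combined in a few lines. This is a rough sketch using only the standard library; the verdict strings and function names are made up for illustration, and the probe speaks plain HTTP only.

```python
import http.client
import json


def classify(server_header, openapi_status, openapi_body) -> str:
    """Combine the server header and the /openapi.json response into a verdict."""
    is_uvicorn = "uvicorn" in (server_header or "").lower()
    has_openapi = False
    if openapi_status == 200:
        try:
            doc = json.loads(openapi_body)
            # OpenAPI 3.x documents carry a top-level "openapi" version field.
            has_openapi = isinstance(doc, dict) and "openapi" in doc
        except (ValueError, TypeError):
            has_openapi = False
    if is_uvicorn and has_openapi:
        return "likely FastAPI on Starlette"
    if is_uvicorn:
        return "ASGI service behind Uvicorn (framework unconfirmed)"
    if has_openapi:
        return "OpenAPI service (server unconfirmed)"
    return "no Starlette indicators"


def fingerprint(host: str, port: int, timeout: float = 3.0) -> str:
    """Fetch /openapi.json over plain HTTP and classify the response."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/openapi.json")
        resp = conn.getresponse()
        body = resp.read(1_000_000)  # cap the read; API specs can be large
        return classify(resp.getheader("server"), resp.status, body)
    finally:
        conn.close()
```

A "no Starlette indicators" result is not proof of absence: the server header can be suppressed and the OpenAPI route disabled, so treat this as triage rather than a verdict.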

HTTP Headers module configured for the target. Server headers and response patterns confirm Starlette/FastAPI instances.

3. TLS Inspect: Check for Exposed HTTPS Services

Many AI services run plain HTTP on internal networks, but some are deployed behind TLS. Pull certificates on any HTTPS port — self-signed certs with generic subjects (like localhost or *.internal) are common on hastily deployed ML services. The certificate's Subject Alternative Names may reveal internal hostnames or service names.

4. DNS: Find Shadow AI Infrastructure

Query internal DNS for common AI infrastructure naming patterns: vllm.*, llm-*, litellm.*, ml-*, gpu-*, inference.*, api-*, mcp-*. Reverse DNS on subnets allocated to GPU nodes or ML teams can surface services deployed outside your asset inventory.
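One way to automate this is a reverse-DNS sweep filtered by the naming patterns above. The sketch below uses only the standard library; the function names are illustrative, and the sweep is sequential, so it suits small subnets rather than whole address plans.

```python
import ipaddress
import socket
from fnmatch import fnmatchcase

# Naming patterns from the paragraph above.
AI_NAME_PATTERNS = ("vllm.*", "llm-*", "litellm.*", "ml-*",
                    "gpu-*", "inference.*", "api-*", "mcp-*")


def looks_like_ai_host(hostname: str) -> bool:
    """True if the hostname matches one of the AI infrastructure patterns."""
    name = hostname.strip().rstrip(".").lower()
    return any(fnmatchcase(name, pattern) for pattern in AI_NAME_PATTERNS)


def reverse_sweep(cidr: str) -> dict:
    """Map each address in `cidr` that has a PTR record to its hostname."""
    found = {}
    for addr in ipaddress.ip_network(cidr, strict=False).hosts():
        try:
            hostname, _aliases, _addrs = socket.gethostbyaddr(str(addr))
        except OSError:  # no PTR record, or a resolver error
            continue
        found[str(addr)] = hostname
    return found
```

Filtering the result of `reverse_sweep(...)` through `looks_like_ai_host` yields a shortlist to compare against your asset inventory. Broad patterns such as `api-*` will also match unrelated services, so expect false positives.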

Testing for the Vulnerability

Once you've found Starlette-based services, the BadHost Scanner from X41 D-Sec can test whether specific instances are vulnerable. It operates in three modes:

  • MCP Server mode — targets MCP JSON-RPC endpoints
  • AI Infrastructure mode — auto-discovers vLLM and LiteLLM paths
  • Custom mode — arbitrary Starlette/FastAPI applications

The scanner uses raw TCP sockets to send the malformed Host header — standard HTTP clients like curl or requests normalize the header and can't reproduce the attack. For static analysis, X41 published Semgrep rules and CodeQL queries to detect vulnerable middleware patterns in your own code.

The Reverse Proxy Question

RFC-compliant reverse proxies (nginx, Caddy, Traefik, HAProxy), Cloudflare, and AWS ALBs reject malformed Host headers and will block this attack before it reaches Starlette — but verify your specific configuration. This is likely why the bug went undetected for eight years — most internet-facing FastAPI services sit behind proxies that silently neutralize it.

But that's exactly what makes the remaining exposure dangerous. The services that aren't behind proxies tend to be internal AI infrastructure: GPU cluster endpoints, ML experiment servers, evaluation dashboards — deployed directly by data science teams who skip the proxy layer. These are also the services most likely to hold sensitive model weights, training data, and API credentials.
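For services that cannot be put behind a proxy right away, the same kind of check can be approximated in application code. The sketch below is a deliberately strict simplification of the host[:port] grammar, not a full RFC implementation, and the function name is an assumption: it accepts only hostnames, IPv4 addresses, and bracketed IPv6 literals with an optional port.

```python
import re

# Simplified host[:port] grammar: a bracketed IPv6 literal, or a run of
# letters, digits, dots and hyphens, optionally followed by a numeric port.
_HOST_RE = re.compile(r"(\[[0-9A-Fa-f:.]+\]|[A-Za-z0-9.-]+)(?::([0-9]{1,5}))?")


def host_header_is_well_formed(value: str) -> bool:
    """Reject Host values containing '/', '?', '#', '@', whitespace, and so on."""
    match = _HOST_RE.fullmatch(value)
    if match is None:
        return False
    port = match.group(2)
    return port is None or int(port) <= 65535


print(host_header_is_well_formed("example.com:8000"))       # True
print(host_header_is_well_formed("example.com/health?x="))  # False
```

Because the pattern is stricter than the RFC grammar, it may reject unusual but legal hostnames (for example, names containing underscores), so test it against your own naming scheme before enforcing it.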

Detection Gaps

There's no WAF signature or IDS rule for this yet. The root cause is a classic parser disagreement across three components — ASGI servers pass the raw Host header through, Starlette trusts it for URL reconstruction, and middleware authors assume request.url.path reflects the actual request. Each component behaves reasonably in isolation; the vulnerability only emerges from their interaction. Detection relies on identifying vulnerable deployments proactively — which is why network scanning matters more than log analysis here.

Remediation

  1. Update Starlette to 1.0.1+ (and FastAPI to a version that pins Starlette ≥1.0.1). This is the definitive fix.
  2. Audit your middleware. If you use request.url.path for auth decisions in any BaseHTTPMiddleware, switch to scope["path"] or move auth to endpoint-level dependencies (Depends(), Security()). FastAPI's built-in dependency injection auth relies on route matching, not request.url.path, so it is not vulnerable.
  3. Deploy behind a reverse proxy. An RFC-compliant proxy (nginx, Caddy, Traefik) validates Host headers and blocks the attack vector at the network layer.
  4. Inventory your AI services. Many organizations don't know what ML teams have deployed. Check Docker images, Helm charts, and transitive dependencies — old vLLM or LiteLLM containers may pin vulnerable Starlette versions. The bigger risk isn't the auth bypass — it's the services running with no auth at all.
  5. Restrict network access. AI inference endpoints should not be reachable from untrusted networks. Segment GPU clusters and ML services behind firewall rules.
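To illustrate step 2, here is a minimal pure-ASGI middleware that makes its auth decision from scope["path"] instead of a reconstructed URL. It is a sketch under stated assumptions: the class name, the /health allowlist, and the x-api-key header are hypothetical, and only HTTP scopes are gated.

```python
import hmac

PUBLIC_PATHS = frozenset({"/health"})  # hypothetical allowlist


class PathAuthMiddleware:
    """Pure-ASGI auth gate that reads the routed path, not the rebuilt URL."""

    def __init__(self, app, api_key: str, public_paths=PUBLIC_PATHS):
        self.app = app
        self.api_key = api_key.encode()
        self.public_paths = public_paths

    async def __call__(self, scope, receive, send):
        # Non-HTTP scopes (lifespan, websocket) pass through in this sketch.
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return
        # scope["path"] comes from the request line, so a crafted Host
        # header cannot change it.
        if scope["path"] in self.public_paths or self._authorized(scope):
            await self.app(scope, receive, send)
            return
        await send({"type": "http.response.start", "status": 401,
                    "headers": [(b"content-type", b"text/plain")]})
        await send({"type": "http.response.body", "body": b"unauthorized"})

    def _authorized(self, scope) -> bool:
        # ASGI delivers headers as (name, value) byte pairs with lowercase names.
        headers = dict(scope.get("headers", []))
        supplied = headers.get(b"x-api-key", b"")
        return hmac.compare_digest(supplied, self.api_key)
```

Wrapping an application as `PathAuthMiddleware(app, api_key=...)` leaves routing untouched. A production version would also need to gate websocket scopes and load the key from a secret store; endpoint-level dependencies remain the simpler option where they fit.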

Every tool used in this investigation — port scan, TLS inspect, HTTP headers, DNS — runs from your phone in RECON. Get it on the App Store.

By Vladimir Slavin · Founder, RECON · support@slvn.net