Cloudflare Engineering · 30-day report
The single proxy worker: Cloudflare's internal AI stack, built on the platform they ship.
By Ayush Thakur, Scott Roe-Meschke, Rajesh Bhatia · Cloudflare Blog · April 20, 2026
Cloudflare published a rare full-stack tour of how they actually run AI internally — not a vendor pitch, but a 30-day production accounting of the architecture every Cloudflare engineer touches when they prompt anything.
The shape of it: every internal AI request passes through a single proxy Worker that fronts AI Gateway, which in turn routes to both frontier models (~91% of requests) and Workers AI (~9%, the cost-sensitive long tail). Authentication is Cloudflare Access at the edge. The portal layer collapses MCP tool schemas via "Code Mode" so token budgets don't explode. Agentic state lives in McpAgent and Durable Objects via the Agents SDK, sandboxed code execution runs in dynamic Workers, orchestration uses Cloudflare Workflows, and a Backstage-based knowledge graph plus repo-level AGENTS.md files give every agent the context map it needs.
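The proxy-plus-gateway shape above can be sketched as a single Worker that picks a backend per request. This is an illustrative sketch, not Cloudflare's actual code: the routing rule, account/gateway path segments, and request shape are all assumptions; only the general AI Gateway URL pattern and the `@cf/` prefix for Workers AI models are drawn from Cloudflare's public docs.

```typescript
type Backend = "frontier" | "workers-ai";

// Assumed routing rule: Workers AI model IDs use the "@cf/" prefix;
// anything else goes out to a frontier provider through AI Gateway.
export function pickBackend(model: string): Backend {
  return model.startsWith("@cf/") ? "workers-ai" : "frontier";
}

// Worker entry point. Every internal client hits this one URL, which is
// what lets attribution, catalog checks, and permissions be added here
// later without touching client configs. ACCOUNT/GATEWAY are placeholders.
export default {
  async fetch(request: Request): Promise<Response> {
    const { model, ...body } = (await request.json()) as { model: string };
    const backend = pickBackend(model);
    const gatewayUrl =
      backend === "workers-ai"
        ? "https://gateway.ai.cloudflare.com/v1/ACCOUNT/GATEWAY/workers-ai"
        : "https://gateway.ai.cloudflare.com/v1/ACCOUNT/GATEWAY/openai";
    // Forward the body unchanged; the proxy only decides where it goes.
    return fetch(gatewayUrl, {
      method: "POST",
      headers: { "content-type": "application/json" },
      body: JSON.stringify({ model, ...body }),
    });
  },
};
```

The point of the sketch is the single choke point: the 91/9 split is just a return value in `pickBackend`, so rebalancing toward in-house inference is a one-line change rather than a client migration.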
3,683 active users · 30 days
295 teams using agentic tools
The architectural insight that earned the lede: by routing everything through a proxy Worker from day one, they could add per-user attribution, model-catalog management, and permission enforcement after the fact, without changing a single client config. It's a textbook lesson in placing the indirection point before you know what you'll need it for. The same essay quietly admits the cost: portal-level Code Mode requires architectural changes upstream, and the 91/9 split between frontier and in-house inference is an active cost-management lever, not a stable equilibrium.
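The "add it after the fact" payoff can be made concrete with a small sketch of attribution at the proxy. The header name `Cf-Access-Authenticated-User-Email` is the identity header Cloudflare Access sets on authenticated requests; the function shape and the permission rule are assumptions for illustration, not the post's implementation.

```typescript
// Attribution and permission enforcement bolted onto the proxy later.
// Because every client already points at the proxy, none of them change.
export function attribute(headers: Headers): { user: string; allowed: boolean } {
  // Cloudflare Access injects the authenticated user's email at the edge.
  const user = headers.get("Cf-Access-Authenticated-User-Email") ?? "unknown";
  // Placeholder policy: reject requests that didn't come through Access.
  const allowed = user !== "unknown";
  return { user, allowed };
}
```

Per-user attribution, model-catalog checks, and per-team permissions all hang off this same hook, which is the essay's argument for placing the indirection point before you know what you'll need it for.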
"Centralizing through a Worker meant we could add per-user attribution, model catalog management, and permission enforcement later without touching any client configs."
Why it matters to anyone shipping agentic systems: the post is one of the few public, instrumented references for what an actual production-grade AI platform looks like once 60% of the company is in it. If you're scaffolding internal AI tooling now, the order of operations is the lesson — proxy first, governance second, models third — not the specific Cloudflare primitives.
Read on the Cloudflare blog →