Vol. I · No. 05 Thursday · April 23, 2026 Morning Edition 12 Spreads

The Agent Backbone.

Platforms harden, compute tightens, and the coding agent quietly graduates from demo to day job. A week where stable releases and datacenter deals mattered more than model scores.

Editor
Aziz · by hand
Window
Last 72 hours
Theme
Agent infra · Compute supply · Shipping
Reading time
~16 minutes
Spread 01 · Tools of the trade For You

Cursor 3 lands, and the IDE stops pretending it isn't an agent.

Cursor's third major version drops the autocomplete pretense entirely. The new agentic default lets the editor plan, edit across files, run tests, and report back — squaring it directly against Claude Code and OpenAI's Codex CLI.

The headline capability isn't the story — every tool claims it now — the commitment is. Cursor is betting the default surface area is a background worker, not a keystroke. For senior engineers, the relevant question is no longer “is autocomplete good” but “how does its plan-and-edit loop behave when you interrupt it mid-refactor.” Worth an afternoon of experimentation before your team's next retro.

Spread 02 · Compute · The long capex

Anthropic commits $100B to AWS over the next decade.

Amazon adds $5B immediately and up to $20B more over time. In return, Anthropic locks in five gigawatts of AI capacity and deep use of Trainium2/3 silicon.

5 GW

Two things to read into this. First, frontier labs are now explicit about their compute trajectory in dollar terms; the days of opaque training runs are over, and capex commitments are the real benchmark. Second, Trainium is no longer the "economy option." A hundred-billion-dollar commitment from Anthropic is the loudest validation Amazon's silicon has received, and it changes the negotiation posture of every enterprise weighing Trainium against H-series GPUs for its inference stack.

Spread 03 · Supply shock For You
Bulletin · Compute rationing

GitHub quietly paused Copilot sign-ups.

The reason isn't demand curves. It's GPUs.

GitHub paused new Copilot sign-ups this week as AI coding loads outran available capacity. Read the signal: the largest coding-assistant provider in the world, sitting on Azure's first-class compute, hit a wall. If Copilot can run out, your self-serve agent PoC, running on best-effort quota on whichever hyperscaler you picked, is one demand spike away from the same wall; you just haven't noticed because nobody's lining up at your door yet.

Action item: if you have anything agent-adjacent in production, check your per-region reserved capacity this week. The waiting-list phase is now a leading indicator, not a trailing one.
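
That check is scriptable. A minimal sketch, assuming AWS and boto3; substitute your own regions, or the equivalent quota API on your provider:

# Sketch: list EC2 capacity reservations per region, to see what is actually
# reserved versus riding on best-effort quota. Regions are examples.
import boto3

REGIONS = ["us-east-1", "us-west-2", "eu-west-1"]

for region in REGIONS:
    ec2 = boto3.client("ec2", region_name=region)
    for cr in ec2.describe_capacity_reservations()["CapacityReservations"]:
        print(
            f'{region}  {cr["InstanceType"]}  '
            f'reserved={cr["TotalInstanceCount"]}  '
            f'available={cr["AvailableInstanceCount"]}  '
            f'state={cr["State"]}'
        )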

Spread 04 · Models · Open weights For You

$ qwen-3.6-35B-A3B · 73.4% SWE-Bench Verified

Alibaba shipped Qwen 3.6-35B-A3B on April 17. On paper it's a 35B MoE with only 3B active parameters per inference pass. In practice it scored 73.4% on SWE-Bench Verified — which puts an open-weight model in the same postcode as frontier closed-source coders for the first time this year.

active_params = 3_000_000_000
total_params  = 35_000_000_000
sparsity      = 91.4%    # fraction of parameters idle per token (1 - 3/35)

swe_bench_verified = 0.734
→ open-weight ceiling moved up ~6pp in 30 days

The architect's read: the MoE cost curve is now sharp enough that running a local 35B-A3B on a single well-provisioned node gives you “good-enough” coding autonomy with no external API at all. Compliance-minded teams should put this on the evaluation shortlist this quarter.
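
A minimal local-serving sketch of that "no external API" claim, assuming the weights land on Hugging Face; the repo id below is a guess at the naming, not a confirmed identifier:

# Sketch: fully local inference with the open-weight MoE. Note that all 35B
# parameters must be resident in memory; only the compute is sparse.
# The repo id is hypothetical - check the actual release before running.
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Qwen/Qwen3.6-35B-A3B"  # hypothetical id
tok = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype="auto",   # use the precision the checkpoint ships with
    device_map="auto",    # shard across the node's GPUs
)

prompt = "Refactor this function to remove the global state:"
inputs = tok(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=256)
print(tok.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))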

Spread 05 · Open models · On-device
Story 05 / 10

Gemma 4 arrives Apache-2.0, on-device, agent-ready.

Google DeepMind's most capable open family yet. The headline isn't benchmarks — it's where it runs.

Gemma 4 ships under Apache 2.0 with variants small enough to run inside the AICore developer preview on Android. That's the actual story: agentic skills delivered by a model sitting on the phone, not the datacenter. The economics now run the other way: latency and privacy are the selling points, and cloud inference becomes the fallback.

For engineers shipping consumer products, the planning horizon compresses: any feature that depended on “we can always route to the cloud” should be re-evaluated on a 12-month horizon against a local variant of Gemma-class models. The production question is no longer “can we run it on-device” but “why are we still paying for the round-trip.”
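
The inversion is easy to encode. A toy sketch, with stand-in handles for whatever on-device runtime and cloud client you actually use; none of these names are a real SDK:

# Sketch: on-device first, cloud as the fallback path rather than the default.
# `local_model` and `cloud_client` are illustrative stand-ins.

LOCAL_BUDGET_S = 2.0  # latency budget for the on-device attempt

def generate(prompt: str, local_model, cloud_client) -> str:
    try:
        # Nothing leaves the device on the happy path.
        return local_model.generate(prompt, timeout=LOCAL_BUDGET_S)
    except (TimeoutError, RuntimeError):
        # Pay the round-trip only when the local variant can't serve it.
        return cloud_client.generate(prompt)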

Spread 06 · Silicon · Competitive pressure
Story 06

Google splits TPU v8 into two purpose-built chips.

Google Cloud's eighth-generation TPU family ships in two dies — one tuned for training, one for inference — aimed squarely at the part of Nvidia's market that's loudest about lock-in.

8th gen TPU

The “two dies, one family” design matters more than the numbers. A training-dedicated chip admits what everyone has known privately: inference-serving workloads have fundamentally different profiles, and the one-size-fits-all GPU era is ending. Expect cloud pricing pages to get very strange over the next two quarters as providers try to explain four different SKUs per model size to customers who just want a quota.

Spread 07 · Agent frameworks No. 07 / 10 For You

Microsoft Agent Framework goes 1.0 — and means it.

Stable APIs. Long-term support. Both .NET and Python. The enterprise-shaped shoe drops.

The 1.0 tag here is not marketing. Microsoft paired the release with an explicit long-term support commitment, pitching the framework at regulated-industry teams that couldn't stomach a weekly-breaking-change agent library. If you've been holding off on agentic features because the surface area was too young to audit, that excuse just got weaker.

Paired with the Agent Governance Toolkit — sub-millisecond policy engine, cryptographic agent identity, SOC2 / HIPAA / EU-AI-Act mapping — Microsoft is selling a reference stack for teams who answer to GRC. For indie developers and startups, the interesting question is how much of that stack is usable à la carte, without opting into the full Azure gravity well.
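
How much is usable à la carte remains to be seen, but the gating pattern itself is vendor-neutral. A deny-by-default sketch; every name here is illustrative, not the toolkit's actual API:

# Sketch: a policy gate that every agent tool call must pass through.
# The rules and identity shape are illustrative, not Microsoft's toolkit.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentIdentity:
    agent_id: str
    team: str

# Deny-by-default policy table: (team, tool) pairs that are allowed.
ALLOWED: set[tuple[str, str]] = {
    ("payments", "read_ledger"),
    ("payments", "open_ticket"),
}

def authorize(identity: AgentIdentity, tool: str) -> bool:
    """Allow the call only if this agent's team is cleared for this tool."""
    return (identity.team, tool) in ALLOWED

def call_tool(identity: AgentIdentity, tool: str, args: dict):
    if not authorize(identity, tool):
        raise PermissionError(f"{identity.agent_id} may not call {tool}")
    ...  # dispatch to the real tool here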

The unglamorous work of stabilization is, of course, exactly the work competitors will use to ridicule this as “last year's problem.” Ignore the noise: the teams in production a year from now are the ones choosing boring, documented, SLA-backed frameworks today.

Spread 08 · SaaS integrations
08 / 10

The write-back era.

ChatGPT's Linear, Notion, Dropbox, and Box connectors gained write capability this week — a small patch note with very large implications.

For a year, chat integrations were read-only tours: "summarize my tickets," "find that doc." The write permission flips the relationship. Now the chat interface is a first-class operator of those tools: create a ticket, move it, file a doc, restructure a folder. For any team whose day-to-day is split across those four SaaS surfaces, the meeting-to-ticket round-trip just collapsed.
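
For a feel of what "write" means mechanically, here is what connector-style ticket creation looks like against Linear's public GraphQL API; the key and team id are placeholders, and the exact payload the ChatGPT connector sends isn't public:

# Sketch: creating a Linear issue via its GraphQL API, the kind of write a
# connector now performs on your behalf. Token and team id are placeholders.
import requests

MUTATION = """
mutation($input: IssueCreateInput!) {
  issueCreate(input: $input) { success issue { identifier url } }
}
"""

resp = requests.post(
    "https://api.linear.app/graphql",
    headers={"Authorization": "lin_api_..."},  # placeholder API key
    json={
        "query": MUTATION,
        "variables": {"input": {
            "teamId": "TEAM_UUID",  # placeholder
            "title": "Follow-ups from Thursday's planning meeting",
        }},
    },
)
resp.raise_for_status()
print(resp.json()["data"]["issueCreate"]["issue"]["url"])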

Linear · Notion · Dropbox · Box · +write
Spread 09 · Editor updates For You
09 / 10

VS Code's April releases quietly rewrote the terminal.

Two consecutive releases (April 8 and April 15) added a companion app, session-level debugging for agent runs, sharper terminal interaction, and built-in Copilot functionality that used to live in extensions.

Spread 10 · Platform · Cloudflare Agents Week
10 / 10

Cloudflare shipped an entire agent backplane in five days.

Agents Week 2026 (April 13–17) was less a conference than a product dump. The standouts are the ones that compound — a mesh, a sandbox, a router.

Dynamic Workers

Per-agent isolation with lazy cold-start; priced closer to a request than a container.

Sandboxes GA

Full dev-env containers on demand — clone, install, test, discard. The Codex pattern, generalised.

Cloudflare Mesh

Agent-to-agent networking as a first-class primitive. Identities carry across calls.

AI Gateway unified router

14+ providers behind one proxy, with per-user attribution and model catalog enforcement.

The architectural thesis is blunt: agents need a networked substrate, not a function runtime. See today's Architecture in the Wild spread for the internal flavour of the same ideas.
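
Of the four, the unified router is the easiest to try: repoint an existing OpenAI-style client at the gateway and the proxy gains attribution and catalog control without touching application code. A sketch with placeholder ids; the base URL follows AI Gateway's per-provider path convention, so check the docs for your account's exact form:

# Sketch: routing an existing OpenAI-style client through AI Gateway.
# ACCOUNT_ID and GATEWAY_ID are placeholders; the provider key still
# authenticates the upstream call.
from openai import OpenAI

client = OpenAI(
    api_key="sk-...",  # placeholder provider key
    base_url="https://gateway.ai.cloudflare.com/v1/ACCOUNT_ID/GATEWAY_ID/openai",
)

resp = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping through the gateway"}],
)
print(resp.choices[0].message.content)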

Hacker News · top five

24h window · HN Firebase
01

I am building a cloud

793 points · 404 comments · crawshaw.io

David Crawshaw's manifesto on building a new small-cloud provider from first principles. The technical meat is how much of a hyperscaler you can rebuild from UNIX primitives when you aren't optimising for a Fortune-500 compliance matrix. Reads like an indie hacker's love letter to the operator's mindset the big clouds abandoned years ago.

02

Bitwarden CLI compromised in ongoing Checkmarx supply-chain campaign

268 points · 136 comments · socket.dev

Socket.dev uncovered a malicious version of the Bitwarden CLI package published as part of a broader Checkmarx-targeted supply-chain campaign. The payload exfiltrated credentials at install time. Rotate any secrets you've touched via a CI that installed Bitwarden CLI this week — and double-check your package-pinning posture while you're at it.

03

Show HN: Honker — Postgres NOTIFY/LISTEN semantics for SQLite

146 points · 23 comments · github.com/russellromney/honker

A tiny Go library that layers a pub/sub channel on top of SQLite's WAL so embedded apps can get the LISTEN/NOTIFY pattern they know from Postgres. The interesting detail is how it handles multi-writer coordination without a separate daemon — a nice piece of plumbing for anyone building local-first or edge-deployed services.

04

France confirms data breach at agency that manages citizens' IDs

117 points · 29 comments · techcrunch.com

The French ID-management agency confirmed an intrusion affecting personal data for an as-yet-unspecified subset of citizens. The thread's most interesting turn is the split between “government agencies are an inevitable target” and “this is a governance story, not a technical one” — both true, neither reassuring for any org handling comparable identity records.

05

Incident with multiple GitHub services

21 points · 4 comments · githubstatus.com

GitHub's status page logged a multi-service incident this morning — Actions, Issues, and the API all impacted. Low on points because it was still actively unfolding at publish time, but worth a mention: a GitHub brownout now casts a wide shadow on CI pipelines, agent-PR workflows, and anyone whose deploy path runs through Actions.

Architecture in the Wild · Feature

The AI engineering stack Cloudflare built internally — on the platform it ships.

This is the piece to sit with this week. The authors — platform engineers at Cloudflare — walk through the actual internal AI-engineering stack they use to develop Cloudflare itself, and it happens to be built almost entirely on Workers, AI Gateway, Durable Objects, Sandboxes, and D1. It is simultaneously an engineering memoir and a (very polished) proof-of-dogfood.

The architectural spine is a single-proxy Worker pattern. Every LLM call inside Cloudflare — from IDE completions to MR review agents — routes through one AI Gateway Worker. That gives them per-user attribution without touching client configs, model catalog enforcement, and a single place to slot new providers. Between the lines is the usual lesson: “direct connections don't scale” was learned the hard way.
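
The pattern is worth internalizing even if you never touch Workers. A minimal sketch of the chokepoint in Python; the catalog and provider clients are illustrative stand-ins, not Cloudflare's internals:

# Sketch: one gateway function that stamps attribution and enforces a model
# catalog before any call reaches a provider.
import time

MODEL_CATALOG = {"small-coder", "frontier-reviewer"}  # approved model names

def log_usage(**fields) -> None:
    print(fields)  # swap for your metrics pipeline

def gateway_call(user: str, model: str, prompt: str, providers: dict) -> str:
    if model not in MODEL_CATALOG:
        raise ValueError(f"model {model!r} is not in the approved catalog")
    started = time.monotonic()
    reply = providers[model].complete(prompt)  # stand-in provider client
    # Per-user attribution lives here, not in N client configs.
    log_usage(user=user, model=model, latency_s=time.monotonic() - started)
    return reply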

The second-most interesting idea is how they compress MCP tool schemas. At Cloudflare scale (2,055 services, 375 teams, 3,900 repos), naïvely exposing every tool to every agent blows out the context window. Their Code Mode portal collapses the many per-tool schemas into two portal-level tools, shrinking context overhead from roughly 15,000 tokens to near zero per request. If you've ever watched your MCP-heavy agent sessions slow down linearly with tool-catalog size, this is the pattern to study.
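
The exact internals aren't published, but the shape of the trick is reproducible: keep the catalog server-side and hand the agent only two schemas, one to discover tools and one to invoke them. A sketch under those assumptions:

# Sketch: two portal tools standing in for a large MCP tool catalog.
# The agent's context carries only these two schemas; the full catalog
# stays server-side and is searched on demand.

TOOL_REGISTRY: dict[str, dict] = {}  # name -> {"schema": ..., "fn": callable}

def register(name: str, schema: dict, fn) -> None:
    TOOL_REGISTRY[name] = {"schema": schema, "fn": fn}

# Portal tool 1: discover tools relevant to the task at hand.
def search_tools(query: str, limit: int = 5) -> list[dict]:
    hits = [
        {"name": n, "schema": t["schema"]}
        for n, t in TOOL_REGISTRY.items()
        if query.lower() in n.lower()
    ]
    return hits[:limit]

# Portal tool 2: invoke a discovered tool by name.
def call_tool(name: str, args: dict):
    return TOOL_REGISTRY[name]["fn"](**args)

Context cost now scales with two fixed schemas instead of the catalog; discovery becomes a tool call the agent makes only when it needs one.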

“One thing we got right early: routing through a single proxy Worker from day one. The proxy pattern gives you a control plane that direct connections don't.”

Two more details deserve flagging for senior engineers. First, Backstage-as-knowledge-graph: they feed the service catalog, ownership metadata, and dependency graph into agents as structured context — it turns “who owns this?” from a Slack question into a tool call. Second, AGENTS.md files auto-generated per repo act as the bootstrap context every agent reads before touching code; think of it as README.md for machines, versioned alongside source.

The cost-engineering note is quietly the most actionable: 51.47 billion input tokens per month through Workers AI using Kimi K2.5, at ~77% lower cost than frontier for the bulk of non-critical work. Frontier models still handle the sharp-edged tasks. This is the “routing is the feature” thesis made real at production scale — and it's the kind of number that earns you a budget conversation with your CFO.
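
The arithmetic generalizes. Only the monthly token volume below comes from the piece; both per-million-token prices are placeholder inputs to replace with your own contract numbers:

# Sketch: the routing-economics arithmetic with placeholder prices.
monthly_input_tokens = 51.47e9          # from the Cloudflare piece

frontier_usd_per_mtok = 3.00            # placeholder frontier input price
cheap_usd_per_mtok = frontier_usd_per_mtok * (1 - 0.77)  # ~77% lower

frontier_cost = monthly_input_tokens / 1e6 * frontier_usd_per_mtok
routed_cost = monthly_input_tokens / 1e6 * cheap_usd_per_mtok

print(f"frontier-only: ${frontier_cost:,.0f}/mo")
print(f"routed bulk:   ${routed_cost:,.0f}/mo")
print(f"saved:         ${frontier_cost - routed_cost:,.0f}/mo")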

Read the full piece on blog.cloudflare.com →