Google rewrites the cap table at Anthropic, OpenAI ships open weights for the first time since GPT-2, Cognition courts a $25B mark, and Cloudflare publishes the most candid LLM-in-CI deep dive yet — twelve spreads on the day capital and infrastructure both moved.
Bloomberg confirmed Friday that Alphabet plans to invest up to $40B in Anthropic — $10B upfront, with another $30B unlocked against performance milestones. It’s the largest single check ever written into an AI lab, and it lands the same week Anthropic shipped its Claude Code postmortem and Google rolled Gemini 3.1 Ultra to general availability. The two companies are no longer hedging — they’re paired.
Vercel disclosed Thursday that it has identified a second batch of customer accounts showing signs of compromise — separate from, but in the wake of, the April Context.ai OAuth supply-chain incident. The original entry vector dates back to February: Lumma Stealer malware hit Context.ai, the attackers harvested Google Workspace OAuth tokens, and used them to pivot into Vercel’s internal systems. Two months of dwell time. Environment variables not explicitly flagged as sensitive were exposed, and those included API keys, signing keys, and DB credentials. If you ship serverless on Vercel, today is a credential-rotation day.
Read the bulletin →
For the first time since the original GPT-2 release, OpenAI has dropped open-weight models under a license you can actually use commercially: gpt-oss-120b and gpt-oss-20b, both Apache 2.0. The 20B variant is tuned for consumer hardware; the 120B targets a single H100 with offloading. Strong reasoning, native tool use, and a stated benchmark lead over similarly-sized open models. After two years of Llama and DeepSeek setting the open frontier, the incumbent has rejoined the table.
Read the announcement →
A doubled mark in a single quarter, while category leaders consolidate around agents-in-CI.
Cognition AI — the company behind Devin, the autonomous software-engineer agent that demoed and overpromised in equal measure last year — is now in talks with investors for a round that would land its valuation at $25B, more than doubling its previous mark. The pitch has shifted: less “Devin replaces engineers,” more “Devin runs in your CI alongside humans.”
The valuation matters less than the signal. After Cursor’s $2B raise and Anthropic’s new $40B tranche, the market is putting roughly $70B into the “agentic coding” layer of the stack. Either the productivity gains land, or the vintage of 2026 funds the messiest AI correction yet. Place your bets.
Three weeks after the Cursor 3 ship — parallel agents, Design Mode, Composer 2 — the team rolled an end-of-week update focused on the boring middle: actually fixing bugs that don’t reproduce. The new /debug command tells the agent to generate hypotheses, drop log statements at runtime, gather evidence from the running process, and only then propose a targeted fix. The CLI continues to close the parity gap with the IDE.
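Cursor hasn't published /debug's internals, but the loop the update describes (hypothesize, instrument, observe, only then fix) is a general technique. A minimal sketch under assumed names, none of which are Cursor's actual API:

```python
# Illustrative hypothesis-driven debug loop, loosely modeled on the
# workflow described for Cursor's /debug. All names are hypothetical.

def buggy_mean(values):
    # The bug under investigation: integer division truncates the mean.
    return sum(values) // len(values)

# Each hypothesis pairs a description with a probe that returns True
# when the code behaves correctly (i.e., the hypothesis is rejected).
hypotheses = [
    ("empty input crashes", lambda: buggy_mean([]) is not None),
    ("result is truncated", lambda: buggy_mean([1, 2]) == 1.5),
]

def gather_evidence():
    """Run each probe against the live code and record what happened."""
    evidence = {}
    for name, probe in hypotheses:
        try:
            evidence[name] = "confirmed" if not probe() else "rejected"
        except Exception as exc:  # a crash is evidence too
            evidence[name] = f"confirmed ({type(exc).__name__})"
    return evidence

evidence = gather_evidence()
# Only once evidence is in does the agent propose a targeted fix.
confirmed = [name for name, verdict in evidence.items()
             if verdict.startswith("confirmed")]
print(confirmed)
```

The point of the ordering is that the fix is scoped to confirmed hypotheses rather than to whatever the model guessed first.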
Microsoft drops a seven-package toolkit mapping agent telemetry to EU AI Act, HIPAA, and SOC2.
Meta is preparing roughly 8,000 cuts; Microsoft is offering buyouts to about 7% of its U.S. workforce. Both are simultaneously expanding capex on AI infrastructure and AI-focused hires. The pattern is not subtle: the headcount that funded the cloud era is being recycled into the GPU era. Senior engineers in non-AI orgs should read this as a signal about where org charts are headed, not just a quarterly cost line.
Ahead of the U.S. midterms, Anthropic shipped both the safeguards and — more interestingly — the methodology and dataset behind its political-neutrality scoring. Two recent Claude models scored 95% and 96% on its in-house neutrality benchmark. Whether the test is rigorous enough is a fair debate. What’s harder to argue with is that the lab is showing its work.
“Showing the eval set is the part that lets you actually argue with the score.”
Read the writeup →
Gemini 3.1 Ultra is now generally available at the 2M-token context that Google previewed in March. The pitch is native multimodality without transcription middlemen — text, image, audio, and video stream into the same context window, no preprocessing required. On benchmarks it shares the top of the table with GPT-5.4 Pro at 57 on the Artificial Analysis Intelligence Index. For codebase-wide refactors it’s now genuinely competitive with Claude on context capacity.
Read the launch post →
Rust 1.95.0 shipped this month, and the routine point release reads as a quiet victory lap. The language now powers parts of the Linux kernel, Firefox’s rendering engine, and Discord’s backend in production — three different organizations with three very different risk profiles all relying on the same toolchain. Async closures, stabilized in the prior release, made async Rust dramatically more ergonomic; the certification program for regulated industries lands the boring final mile.
Add it to the “languages you can pick without a memo” list. The decade-long argument is over.
Read the release notes →
Five stories ranked highest on HN this morning, with a brief editor’s note. Anthropic’s $40B headline appeared at #2 but is already this issue’s lead, so we slide one rung down.
Jeff Geerling benchmarks a new generation of bus-powered USB-C 10 GbE NICs that finally don’t double as space heaters. Sub-$80, smaller than a credit card, and cool enough to leave clipped to a laptop while sustaining line rate. The home-lab and on-set-video crowds were waiting for exactly this part to commoditize.
Kevin Lynagh on the specific mode of failure where you’re too good at seeing how things could be — and that competence becomes the enemy of shipping. The structural-diffing bit is especially sharp: it’s when you keep rewriting in your head rather than committing what you have. Worth reading the morning before any planning meeting.
The author noticed open port 22 on their RØDECaster Duo, logged in over the LAN, and found a full embedded Linux box behind a music gadget. It’s a great write-up about modern consumer hardware quietly being root-by-default — and a reminder that “just an audio interface” is increasingly a misnomer.
A demo that swaps a real IBM quantum back-end with a dummy that just returns randomness from /dev/urandom — and shows the noisy quantum hardware is, on many small benchmarks, statistically indistinguishable. It’s a pointed comment on quantum’s current signal-to-noise ratio, not an indictment of the field.
An affectionate, slightly prickly defense of plain text as the format that has outlasted every “rich” replacement. The piece reads less like nostalgia and more like an inventory: what survives on a 30-year horizon, and what doesn’t. Useful framing if you’re picking a format to bet your future tooling on.
Cloudflare didn’t bolt a single LLM onto its CI and call it AI review. Skidmore’s deep-dive walks through a deliberately decomposed system built on OpenCode: up to seven specialized reviewer agents — security, performance, documentation, types, tests, ergonomics, dependencies — each given a tight prompt about what to flag and what to ignore. A coordinator agent dedupes their findings and makes the final approve/reject call. The piece is unusually candid about the production economics: 131,246 reviews in 30 days across 48,095 merge requests, median completion time of 3 minutes 39 seconds, and an honest cost ledger that runs $0.20 for typo fixes and $1.68 for complex refactors. Risk-tiered prompting is treated as a first-class lever, not an afterthought. The system uses prompt caching aggressively — an 85.7% cache hit rate saves “five figures monthly” — and ships circuit breakers and provider-failover chains so that a single LLM-vendor outage doesn’t stall every CI pipeline at Cloudflare. Engineers used the “break glass” override to bypass the agent 0.6% of the time, which is the most useful single number in the entire piece: it’s the empirical false-positive rate, and it’s tiny.
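The post doesn't share code for the circuit breakers or the failover chain, but the pattern it names is standard. A generic sketch, with all names assumed rather than taken from Cloudflare's system:

```python
import time

class CircuitBreaker:
    """Trips open after `threshold` consecutive failures; allows a
    retry (half-open) once `cooldown` seconds have elapsed."""

    def __init__(self, threshold=3, cooldown=60.0):
        self.threshold = threshold
        self.cooldown = cooldown
        self.failures = 0
        self.opened_at = None

    def available(self):
        if self.opened_at is None:
            return True
        return time.monotonic() - self.opened_at >= self.cooldown

    def record(self, ok):
        if ok:
            self.failures, self.opened_at = 0, None
        else:
            self.failures += 1
            if self.failures >= self.threshold:
                self.opened_at = time.monotonic()

def review_with_failover(prompt, providers):
    """Try each (name, call, breaker) in order, skipping providers
    whose breaker is tripped, so one vendor outage doesn't stall CI."""
    for name, call, breaker in providers:
        if not breaker.available():
            continue
        try:
            result = call(prompt)
            breaker.record(ok=True)
            return name, result
        except Exception:
            breaker.record(ok=False)
    raise RuntimeError("all providers unavailable")
```

In use, a timing-out primary provider trips its breaker and subsequent reviews flow to the next provider in the chain until the cooldown lets the primary be retried.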
“Telling an LLM what not to do is where the actual prompt engineering value resides.” — Ryan Skidmore, Cloudflare
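The decomposed shape, specialized reviewers with tight scopes feeding a coordinator that dedupes and decides, can be sketched generically. The reviewer names, issue keys, and thresholds below are illustrative assumptions, not Cloudflare's:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    file: str
    line: int
    issue: str      # normalized issue key, so overlapping findings collapse
    severity: int   # 0 = nit .. 3 = blocker

def security_reviewer(diff):
    # Each reviewer flags only its own category and ignores everything else.
    return [Finding("auth.py", 12, "hardcoded-secret", 3)] if "SECRET=" in diff else []

def types_reviewer(diff):
    # A second reviewer may surface the same spot at a different severity.
    return [Finding("auth.py", 12, "hardcoded-secret", 1)] if "SECRET=" in diff else []

REVIEWERS = [security_reviewer, types_reviewer]

def coordinate(diff, block_at=2):
    """Fan out to every reviewer, dedupe overlapping findings (keeping
    the highest severity per location+issue), then make the final
    approve/reject call."""
    merged = {}
    for reviewer in REVIEWERS:
        for f in reviewer(diff):
            key = (f.file, f.line, f.issue)
            if key not in merged or f.severity > merged[key].severity:
                merged[key] = f
    findings = sorted(merged.values(), key=lambda f: -f.severity)
    verdict = "reject" if any(f.severity >= block_at for f in findings) else "approve"
    return verdict, findings
```

Keying the dedupe on (file, line, issue) rather than on raw finding text is what lets seven independently prompted agents produce one coherent review instead of seven overlapping ones.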