News
What's happening in web development and AI.
OpenAI Sites Lets Codex Build, Deploy, and Host Interactive Web Apps From a Prompt
OpenAI pulls Codex into ChatGPT and previews Sites: a plugin that turns a prompt or existing project into a hosted, shareable site or app. Builds are Cloudflare Worker-compatible with D1-style storage, and the partner roster spans Wix, Webflow, Figma, Replit, and Lovable.
Claude Opus 4.8 Lands With Record Coding Scores, Effort Control, and Dynamic Workflows
Just 42 days after 4.7, Opus 4.8 sets a public-model record at 69.2% on SWE-Bench Pro, is ~4x less likely to let its own code flaws slide, and ships at the same price. Plus user-facing effort control and Dynamic Workflows in Claude Code.
Project Glasswing Update: Claude Patches 2,100 Bugs and Mythos Eyes a Public Release
Anthropic's first Glasswing update: Claude Security in public beta has patched 2,100+ enterprise vulnerabilities, with 6,202 severe open-source bugs surfaced at a 90.6% true-positive rate. Mythos-class models—previously held back—now framed for eventual general release.
OpenAI Adds Appshots and Goal Mode to Codex for Multi-Day Agent Runs
Cmd-Cmd on macOS attaches any app window—screenshot plus full text including off-screen content—to a Codex thread. Goal Mode goes GA across app, IDE, and CLI for runs that span hours or days. Plus locked-screen computer use and 4M weekly Codex users.
Chrome DevTools for Agents Hits 1.0, Giving Coding Agents Live Runtime Vision
A stable 1.0 release lets coding agents debug in a live browser, run Lighthouse-style audits as a quality gate, take heap snapshots to catch memory leaks, and validate WebMCP tools in real time instead of guessing from static code.
Google Launches Gemini Spark, an Always-On Personal AI Agent on 3.5 Flash
Gemini Spark runs 24/7 on Google Cloud VMs, powered by Gemini 3.5 Flash. Native integration with Gmail, Docs, Sheets, and Slides; MCP connectors for Canva, OpenTable, and Instacart.
Anthropic Lets Claude Managed Agents Run Inside Your Own Perimeter
Self-hosted sandboxes (public beta) and MCP tunnels (research preview). Claude's orchestration loop stays on Anthropic's side; tool execution and file writes move inside your infrastructure.
Cloudflare Environments Becomes the Runtime for Claude Managed Agents
A Workers-based control plane spins up a fresh, secure sandbox for every Claude agent session. Zero Trust, WAF, and audit logging applied before the agent reaches a tool.
Google I/O 2026 Ships Gemini 3.5 Flash, Antigravity 2.0, and an Agent-First Web
Gemini 3.5 Flash beats Gemini 3.1 Pro on most benchmarks at 4x the speed. Antigravity 2.0 desktop, Managed Agents in the Gemini API, WebMCP origin trial, and 200+ skills. Google didn't ship a product—it shipped a vertical agent stack.
WebMCP Lets Browser Agents Call JavaScript Functions and HTML Forms as Tools
An open web standard that lets sites expose JS functions and HTML forms as MCP tools for in-browser AI agents. The experimental origin trial starts in Chrome 149.
Managed Agents in the Gemini API Spin Up an Antigravity Sandbox in One Call
A single API call provisions an isolated Linux sandbox where the Antigravity agent reasons, runs tools, and executes code. Agents defined as versionable AGENTS.md and SKILL.md files.
Cursor Ships Composer 2.5 and Begins Training a 10x Larger Model on Colossus 2
Composer 2.5 is the new default in Cursor: 69.3 on Terminal-Bench 2.0, 25x more synthetic tasks, $0.50/M input. In parallel, Cursor and SpaceXAI train a 10x larger model from scratch.
Anthropic Launches Claude for Small Business With 15 Ready-to-Run Workflows
A toggle install inside Claude Cowork that drops Claude into QuickBooks, PayPal, HubSpot, Canva, Docusign, and Microsoft 365—plus 15 prebuilt workflows for closing the month and chasing invoices.
Claude Platform on AWS Goes GA With Full Native API Parity
Anthropic's native Claude Platform is now GA through AWS. Auth via IAM, audit via CloudTrail, billing via existing AWS invoice, and day-one access to every native API feature.
OpenAI Ships GPT-Realtime-2, Translate, and Whisper for Live Voice Apps
Three new realtime audio models. GPT-Realtime-2 brings GPT-5-class reasoning to voice with 128K context, parallel tool calls, and a 26-point lift in Zillow's adversarial call benchmark.
Claude Agents Now Dream: Anthropic's Dev Conference Reframes the Race Around the Harness
Four updates to Claude Managed Agents: Multi-Agent Orchestration (~33% cheaper), Memory (Global + Personal markdown), Dreaming (agents review past sessions and rewrite their own memory between runs), and Outcomes.
Anthropic Doubles Claude Code Limits and Lands a 220K-GPU SpaceX Deal
Claude Code's five-hour limits doubled across all tiers. Peak-hour throttling removed. Funded by a SpaceX Colossus 1 deal—300+ MW and 220,000+ NVIDIA GPUs within the month.
Cloudflare Dynamic Workflows Bring Durable Execution to Multi-Tenant Apps
A 300-line MIT library that lets a single Worker dispatch durable workflow runs to per-tenant code. Closes the gap between dynamic deployment and durable execution.
OpenAI Releases GPT-5.5, Its Smartest Model Yet for Coding and Agents
82.7% on Terminal-Bench 2.0, 58.6% on SWE-Bench Pro, 78.7% on OSWorld-Verified—matching GPT-5.4 latency with fewer tokens. API at $5/$30 per 1M tokens with a 1M context window.
OpenAI Brings Workspace Agents to ChatGPT for Teams
Codex-powered shared agents that handle long-running team workflows, runnable in ChatGPT or Slack, governed by org permissions. Five launch-day agent patterns. Free until May 6, then credit-based.
Google Launches Gemini Enterprise Agent Platform for the Agentic Era
Vertex AI becomes the Gemini Enterprise Agent Platform with graph-based ADK, Memory Bank, A2A orchestration, Agent Identity, Model Armor, and an Agent Gallery with validated partner agents.
Google Ships Deep Research and Deep Research Max, Built on Gemini 3.1 Pro
Two autonomous research agents in the Gemini API—one tuned for speed, one for depth via extended test-time compute. Both ship with MCP support and native chart generation.
Claude Design Lets You Prototype Polished UI Through Conversation
A Claude Opus 4.7-powered tool for designs, prototypes, slides, and one-pagers. Brand systems baked in from your codebase, inline edits, and a one-instruction handoff bundle for Claude Code.
OpenAI Introduces GPT-Rosalind, a Frontier Reasoning Model for Life Sciences
A purpose-built reasoning model for biology, drug discovery, and translational medicine, plus a free Life Sciences research plugin for Codex. Beats GPT-5.4 on 6 of 11 LABBench2 tasks.
OpenAI Expands Codex for (Almost) Everything
Background computer use on Mac, an in-app browser, gpt-image-1.5, 90+ plugins mixing skills and MCP, GitHub PR review support, multi-terminal and alpha SSH devboxes, and deeper automations.
Anthropic Ships Claude Opus 4.7 for Harder Coding Work and Sharper Vision
Stronger long-horizon software engineering, higher-resolution vision for screenshots and diagrams, a new xhigh reasoning tier, and API task budgets—pricing unchanged from Opus 4.6.
Meta Launches Muse Spark, Its First Model From the Superintelligence Lab
A multimodal reasoning model with multi-agent orchestration, 10x compute efficiency over Llama 4, and a Contemplating mode that hits 58% on Humanity's Last Exam.
90% of Developers Now Use AI Coding Tools at Work
JetBrains surveyed 10,000 developers worldwide. Claude Code grew 6x in nine months to match Cursor at 18% adoption, while Copilot's growth stalled at 29%.
Cursor 3 Rebuilds the IDE Around Agents
A new interface built from scratch around AI agents. Run parallel agents across repos, hand off between local and cloud, compare models with /best-of-n, and annotate UI in Design Mode.
Cloudflare Launches EmDash, a Serverless CMS Built to Replace WordPress
An open-source TypeScript CMS built on Astro that sandboxes plugins in Worker isolates, scales to zero, and ships with MCP and AI agent tooling built in. MIT-licensed.
JetBrains Central Gives Teams a Control Plane for AI Coding Agents
An open system that connects coding agents from any ecosystem — Claude, Codex, Gemini CLI — with governance, execution infrastructure, and shared semantic context.
Claude Code Auto Mode Replaces the Permission Prompt With an AI Classifier
A two-layer classifier system that approves safe actions and blocks dangerous ones, replacing the approval fatigue that pushed developers to skip permissions entirely.
Spline Launches Omma, an AI Canvas That Turns Prompts Into Interactive Web Experiences
A generative AI canvas that unifies 3D, motion, animation, and UI into a single natural language workflow. Build production-ready interactive experiences in minutes.
Claude Can Now Use Your Computer — Dispatch Lets You Assign Tasks From Your Phone
Computer use in Claude Cowork and Claude Code, letting Claude point, click, and navigate your screen. Paired with Dispatch, you can assign work from your phone and walk away.
Google AI Studio Turns Prompts Into Full-Stack Apps With Firebase
The Antigravity coding agent in AI Studio with built-in Firebase integration. Build multiplayer apps, add databases and auth, and connect to real-world services — all from a single prompt.
WordPress.com AI Agents Can Now Create and Manage Content
19 write capabilities added to its MCP integration, letting AI agents draft posts, build pages, manage comments, and organize content — all with approval safeguards.
Cloudflare Workers AI Enters the Large Model Game, Starting With Kimi K2.5
Workers AI now serves frontier-scale open-source models with a 256k context window. Cloudflare cut its own security agent costs by 77% and ships prefix caching and async APIs.
Cursor Ships Composer 2, a Frontier Coding Model That Rivals Anthropic and OpenAI
Cursor builds its own frontier-level coding model with continued pretraining and RL. Scores 61.3 on CursorBench and 73.7 on SWE-bench Multilingual, priced at $0.50/M input tokens.
Next.js 16.2 Ships AI-First Developer Experience
Bundles AGENTS.md in create-next-app for 100% eval pass rates, forwards browser errors to the terminal for agent debugging, and adds experimental Agent DevTools.
Netlify Turns AI Prompts Into Production-Ready Software
Agent Runners let teams start web projects from prompts using Claude Code, Codex, or Gemini CLI — with live apps on production infrastructure in minutes.
Google Unveils Stitch, an AI-Powered Design-to-Code Tool
A Google Labs preview that converts design mockups into production-ready front-end code with surprising fidelity.
GitHub MCP Server Adds Secret Scanning for AI Coding Agents
GitHub's MCP Server can now scan code changes for exposed secrets before committing, letting AI coding agents catch credential leaks in real time.
Vercel Ships a Plugin That Gives Coding Agents Platform Expertise
A plugin for Claude Code and Cursor injects 47 skills and platform-specific knowledge directly into the agent's context.
Perplexity Launches Personal Computer, a 24/7 AI Agent on Your Mac
A Mac mini-based AI agent that orchestrates 20+ models across 400+ apps, positioned as the "serious" alternative with full audit trails and a kill switch.
VS Code Moves to Weekly Releases, Powered by AI Agents
The world's most popular editor shifts from monthly to weekly stable releases, enabled by AI agents handling code review, issue triage, and validation.
Garry Tan Launches gstack, a Curated Set of AI Coding Skills
The YC president open-sources a collection of task-specific prompt sets designed for LLM-powered development workflows.
Cursor Launches Automations, Shifting AI Coding to Fully Autonomous
Event-driven coding agents that trigger from Slack, GitHub, and PagerDuty — no human prompt required. Cursor also built a browser from scratch with zero humans for a week.
OpenClaw Surpasses React as the Most-Starred Project in GitHub History
An AI agent built in an hour with Claude Code reached 250,000 GitHub stars in 60 days, obliterating React's decade-long record. Its creator then joined OpenAI.
Cloudflare Rebuilt Next.js on Vite With AI in One Week
One engineer and Claude AI produced vinext across 800+ sessions in 7 days for $1,100 in API tokens. A drop-in Next.js replacement with 4.4x faster builds and 57% smaller bundles.