Hey Everyone đź‘‹,
John Lindquist here with the 25th issue of AI Dev Essentials!
As always, it's been another packed news week in the AI dev space. The new Meta model is intriguing because of how they trained it. The Google Chrome MCP is the most powerful MCP release to date. OpenAI keeps expanding infrastructure, investing ludicrous amounts of money. And research shows that 90% of developers are using some form of AI to write code, which is a pretty dang high number.
Personally, I've been experimenting with the Chrome DevTools MCP, and I'm looking forward to adding it to my Claude Code workshop and replacing a lot of my Playwright MCP usage. I've also been diving deeper into the Claude Code SDK, which had a minor release that added a bunch of types that had been missing, making it much easier to work with. If you're interested in my next workshop, you can sign up at the link below.
Sign up for the next workshop here: https://egghead.io/workshop/claude-code
🎓 New egghead.io Lessons This Week
Claude Code Essentials Course Updates - View Course
New lessons on Claude Code hooks:
- Type-Safe Hooks with Bun - Parse JSON payloads with TypeScript types
- Prompt Rewriting - Create custom shortcuts like `:plan` to transform prompts
- Live Data Injection - Build `load()` functions that fetch API data on demand
- Template-Driven Solutions - Generate N variations with `v(5)` commands
- Hook Guardrails - Block dangerous prompts with exit codes and stderr feedback
These lessons show how to extend Claude Code with custom automation, from simple prompt shortcuts to sophisticated guardrails. Perfect for teams looking to standardize their AI workflows.
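If you're curious what the guardrail pattern looks like in practice, here's a minimal sketch of a Bun hook along the lines of those lessons: it parses the JSON payload from stdin with a TypeScript type and blocks prompts that match a deny-list by exiting with code 2 and writing feedback to stderr. The payload shape and deny patterns are simplified assumptions for illustration, not the exact code from the course.

```typescript
#!/usr/bin/env bun
// Minimal guardrail hook sketch for a UserPromptSubmit-style hook.
// Assumes the hook receives a JSON payload on stdin with a `prompt` field.

type HookPayload = {
  session_id?: string;
  prompt?: string;
};

// Hypothetical deny-list of patterns we never want forwarded to the model.
const DENY_PATTERNS = [/rm\s+-rf/i, /DROP\s+TABLE/i];

const raw = await Bun.stdin.text(); // read the full JSON payload
const payload = JSON.parse(raw) as HookPayload;
const prompt = payload.prompt ?? "";

const hit = DENY_PATTERNS.find((re) => re.test(prompt));
if (hit) {
  // Exit code 2 blocks the prompt; stderr is surfaced back as feedback.
  console.error(`Blocked by guardrail hook: matched ${hit}`);
  process.exit(2);
}

// Exit 0 lets the prompt through unchanged.
process.exit(0);
```

Register a script like this as a prompt-submission hook in your settings and anything matching the deny-list never reaches the model.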
🚀 Major Announcements
Meta Unveils Code World Model for Advanced Code Generation
Meta FAIR released Code World Model (CWM) on September 24, 2025, a 32-billion-parameter research model that introduces "world modeling" to understand how code execution affects program state, not just syntax.
Key capabilities:
- World Modeling Innovation: Trained on Python execution traces and Docker environment interactions, recording how each line affects local variable states
- Impressive Benchmarks: 68.6% on LiveCodeBench v5, 76% on AIME 2024, 65.8% on SWE-bench Verified, 96.6% on Math-500
- Three Model Variants: Pretrained (`cwm-pretrain`), supervised fine-tuned (`cwm-sft`), and instruction-tuned (`cwm`) checkpoints
- Neural Debugger: Functions like a neural `pdb` that can be set to any initial frame state for reasoning queries
- Hardware Requirements: Needs 160GB combined GPU VRAM for inference
- Research License: Released under FAIR Non-Commercial Research License for academic use
(Meta AI Research, GitHub Repository, MarkTechPost)
I actually really love this idea because I often feed build output, logs, and traces back into the AI to help it understand the code it wrote. Baking that approach into training seems like it could save me that step and make the model smarter about what the final output will actually be. The fact that this is only version one, and it's already this performant at such a small size, is extremely impressive. It's definitely not something to rely on right away, since it doesn't match the latest state-of-the-art models, but we'll see if it catches up.
Claude Now Available in Microsoft 365 Copilot
Microsoft announced on September 24, 2025, that Claude Sonnet 4 and Claude Opus 4.1 are now available in Microsoft 365 Copilot, expanding AI model choices for enterprise users beyond OpenAI.
Integration details:
- Model Options: Claude Opus 4.1 powers Researcher agent for complex reasoning, Claude Sonnet 4 available in Copilot Studio
- Enterprise Focus: Building agents with deep reasoning and workflow automation capabilities
- Rollout Strategy: Available through Microsoft's Frontier Program to opted-in Copilot-licensed customers
- Data Considerations: Claude models hosted outside Microsoft environments, subject to Anthropic's Terms of Service
- Strategic Shift: Microsoft embracing multi-model orchestration rather than exclusive OpenAI reliance
(Microsoft 365 Blog, Anthropic Official, TechCrunch, CNBC)
I'm not going to pretend to understand the inner workings of the business contracts between all these powerhouses, but last I checked, Microsoft and OpenAI were pretty tight bedfellows. So this announcement of Anthropic stepping in as an option really catches me off guard. Honestly, I prefer Anthropic's models for things like writing documents and authoring code, and being flexible enough to support all the different providers is probably the best play for Microsoft in the long run. But at a certain point, wouldn't you expect Microsoft to just start providing its own models?
OpenAI Expands Stargate With Five New AI Data Centers
OpenAI, Oracle, and SoftBank announced five new Stargate AI data center sites on September 23, 2025, accelerating their $500 billion infrastructure commitment ahead of schedule.
Expansion details:
- New Locations: Shackelford County TX, Doña Ana County NM, Lordstown OH, Milam County TX, plus one Midwest site
- Investment Scale: $400+ billion over three years, approaching full $500 billion commitment
- Capacity Growth: Nearly 7 gigawatts planned capacity, progressing toward 10-gigawatt goal
- Partnership Structure: Oracle leading three sites, SoftBank managing two through SB Energy
- Compute Offerings: New compute-intensive features for ChatGPT Pro subscribers with additional fees
- Timeline: Ahead of schedule to meet full commitment by end of 2025
(OpenAI Official, TechCrunch, Bloomberg, CNBC)
Scale, scale, scale. It feels like all the major providers are having infrastructure measuring contests at this point. I know they all think that more equals smarter. Given the global impact of these investments in infrastructure and energy costs, the entire world better hope that's true.
🛠️ Developer Tooling Updates
Cursor Ships Custom Commands and GPT-5-Codex Support
Cursor version 1.6 introduced custom slash commands on September 12, 2025, allowing developers to create reusable prompts in `.cursor/commands/[command].md` files, plus integration with OpenAI's GPT-5-Codex model.
New features:
- Custom Commands: Define reusable prompts in the `.cursor/commands/` directory, accessible via `/` in Agent input
- GPT-5-Codex Integration: Access to OpenAI's latest coding-optimized model (announced September 23)
- Command Examples: Light code reviews, test runners with auto-fix, PR comment resolution
- Website Evolution: Positioning as "the new way to build software" beyond just an IDE
- Developer Experience: Commands work with CLI integration for automated workflows
(Cursor Changelog, Cursor Documentation, Cursor Blog)
When I first used Cursor, I would create a `/workflows` directory and then @-include a workflow file in the agent chat to essentially trigger that workflow. If you've been to any of my Cursor workshops, you saw how I set that up. Commands are the evolution of that: a workflow prompt you can keep in your project, share with the team, and trigger to take action. While I love that commands are coming to Cursor (they're already available in tools like Claude Code), my favorite part is that everyone is standardizing on the slash as the way to reference prompts that should take action. It's a great developer experience, and I'm glad everyone is on board with that now.
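To make that concrete, here's roughly what a command file could look like. The filename and prompt content below are hypothetical examples of my own, not something from the Cursor docs; the pattern is just a markdown file under `.cursor/commands/` whose name becomes the slash command.

```markdown
<!-- .cursor/commands/light-review.md (hypothetical example) -->

Do a light review of the current changes.

- Flag obvious bugs, unused imports, and missing error handling
- Keep feedback short: one bullet per issue, with file and line
- Only suggest a rewrite when the fix is a one-liner
```

With that file in the repo, typing `/light-review` in the Agent input runs the prompt, and the whole team shares the same workflow.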
Chrome DevTools MCP Brings Browser Automation to AI Agents
Google launched Chrome DevTools MCP on September 23, 2025, enabling AI coding agents to control Chrome's debugging and automation capabilities through the Model Context Protocol.
MCP capabilities:
- Automation Control: Programmatic handling of clicks, form fills, dialogs, and navigation
- Performance Analysis: Record traces and extract optimization insights from web apps
- Advanced Debugging: Analyze network requests, console messages, screenshots, script evaluation
- Browser Emulation: Test with CPU slowdown, network throttling, various screen sizes
- Modern App Support: Specifically designed for complex single-page applications
- AI Integration: Works with Claude, Cursor, and any MCP-compatible tools
(Chrome Developers Blog, GitHub Repository)
Chrome exposing an MCP server is essentially AI web debugging. I know this is only a public preview, but as a sign of things to come, allowing any of the AI agents to check in with Chrome to see what's happening in browser windows is going to allow agents to do a lot more work and check their work—everything from debugging console logs and catching errors to inspecting layouts or traces to find performance issues. This is a huge first step towards allowing agents to better automate web development and I'm all here for it.
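If you want to try it with Claude Code or Cursor, the setup looks like any other MCP server. The snippet below is a sketch using the standard `mcpServers` config format and the `chrome-devtools-mcp` npm package named in the repo; check the repository README for the exact, current instructions.

```json
{
  "mcpServers": {
    "chrome-devtools": {
      "command": "npx",
      "args": ["chrome-devtools-mcp@latest"]
    }
  }
}
```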
VS Code Introduces Auto Mode for Intelligent Model Selection
Microsoft released Auto Mode for VS Code on September 24, 2025, automatically selecting the optimal AI model based on task complexity and context.
Auto Mode features:
- Smart Model Routing: Automatically chooses between different AI models based on query complexity
- Context-Aware Selection: Considers file type, project context, and task requirements
- Cost Optimization: Uses lighter models for simple tasks, powerful models for complex ones
- Seamless Experience: No manual model switching required during development
- Performance Balance: Optimizes for both speed and accuracy based on needs
(VS Code Blog, Microsoft Developer Blog)
I know VS Code started out a little behind, but they sure are catching up. If you haven't had a chance to try VS Code instead of the other AI editors out there, I strongly recommend installing it and checking out their plans to see if it's a good fit for you. They're doing an amazing job of responding to community feedback, and you can see the passion they're putting into all the new updates.
Vercel Releases Open Source Coding Agent Template
Vercel launched an open-source coding agent template on September 23, 2025, powered by AI SDK, AI Gateway, and Sandbox, providing infrastructure for building custom coding agents.
Template features:
- Multi-Agent Support: Works with Claude, Codex, Cursor, and opencode models
- Isolated Sandboxes: Each task runs in its own secure environment
- OpenAI-Compatible Gateway: Unified API for different model providers
- Parallel Execution: Run multiple agent tasks simultaneously
- GitHub Integration: Ready for CI/CD workflows and automation
- Cloud Execution: Keep local machines safe from unwanted changes
(GitHub Repository, Vercel Blog)
It's fascinating to watch the wave of containerized agents that you can run from the browser or trigger via webhooks. OpenAI's Codex has done it for a long time, and there are many others out there. Typically people compare these to pull request reviews that just look through code changes for bugs or syntax issues. But these containerized approaches are much more powerful because they allow for code execution and inspection that just looking at the code can't achieve.
Seeing Vercel execute on this is particularly interesting because they own the stack their apps and services are hosted on. They can containerize on a platform where they're already building the applications and respond to errors that happen in the container. This really takes it to the next step of AI agents having fixes ready for you when things go wrong. Because Vercel already has the logs, build errors, history, and deployments in place, they can reference so many aspects of your project and leverage an incredible amount of context. I'm going to keep a close eye on this one because I think Vercel is in an awesome spot to really execute on coding agents as part of their infrastructure, not just as an add-on that does a simple code review on GitHub.
🤖 AI Ecosystem Updates
Google's 2025 DORA Report: 90% of Developers Now Using AI
Google released the 2025 DORA Report in August 2025, revealing that 90% of software professionals use AI tools, with complex impacts on productivity and delivery stability.
Key findings:
- Adoption Rate: 90% of developers use AI (up from 76% in 2024), with 65% reporting "heavy" usage
- Productivity Impact: 80%+ report increased productivity, 59% report improved code quality
- Trust Gap: 30% have little to no trust in AI-generated code despite widespread adoption
- Delivery Metrics: AI improves throughput (reversing 2024's negative trend) but increases instability
- Time Investment: Developers spend median of 2 hours daily working with AI tools
- Success Factors: Platform engineering and strong foundations critical for AI value realization
(Google Cloud Blog, Google Developers Blog, The Register)
The 30% trust gap is the most interesting finding here. We're all using AI, but a third of us don't trust what it produces. I'm happy to see devs are still using their brains instead of blindly trusting the output.
DeepSeek Releases V3.1-Terminus With Agent Improvements
DeepSeek launched V3.1-Terminus on September 22, 2025, addressing user feedback with improved language consistency and stronger agent performance.
Model improvements:
- Language Consistency: Fixed Chinese/English mix-ups and random character generation
- Code Agent Upgrades: Enhanced performance for coding tasks and tool use
- Search Agent Enhancement: Better web search integration and result synthesis
- Stability Improvements: More reliable outputs for production use
- API Compatibility: Maintains compatibility with existing DeepSeek integrations
(DeepSeek Official, DeepSeek API)
Their code agent improvements are noticeable, though still not quite at Claude or GPT-5 level for complex tasks. Definitely worth keeping an eye on, though.
OpenAI Study Reveals ChatGPT Usage Patterns
OpenAI released a large-scale study on September 22, 2025, analyzing how people use ChatGPT across personal and professional contexts.
Study highlights:
- Broad Adoption: Consumer adoption expanded beyond early tech adopters
- Value Creation: Significant economic value through both personal and professional use
- Usage Patterns: Diverse applications from coding to creative writing to research
- Professional Integration: Increasing use in workplace workflows and decision-making
- Learning Applications: Heavy use for education and skill development
AI tools are becoming essential in every field of work. The shift from "early adopter toy" to "professional necessity" happened faster than I would have predicted. I still haven't quite got my parents to fully embrace AI, but I'm working on it!
⚡ Quick Updates
Chan Zuckerberg Initiative Launches Educational AI Tools
- Knowledge Graph: AI-powered educational content organization
- Claude Connector: Direct integration with Claude for educational workflows
- Evaluators: Tools for assessing AI-generated educational content
- Trust Foundation: Building reliable AI tools teachers can depend on
(CZI Blog)
Vercel Ship AI 2025 Conference - October 23
- Expert Speakers: Teams from Anthropic, Slack, and Google DeepMind
- Topics: AI SDK, AI Gateway, and AI Cloud infrastructure
- Format: Virtual and in-person options available
- Early Bird: Tickets available with discount pricing
Sam Altman Announces Compute-Intensive Pro Features
- New Offerings: Compute-heavy features initially Pro-subscriber only
- Additional Fees: Some features will cost extra beyond Pro subscription
- Long-term Vision: Drive intelligence costs down over time
- Experimentation: Testing limits of current model capabilities
✨ Workshop Spotlight (Early Bird Pricing Ends Tonight 🚨)
Claude Code Power User Workshop - Friday, October 3rd
Following four sold-out sessions, the next Claude Code Power User workshop is scheduled for next Friday, October 3rd, from 9am to 2pm PDT.
What we'll cover:
- Building AI pipelines with multiple Claude instances
- MCP workflows (now with the Google Chrome MCP!)
- Context engineering for complex projects
- Real-world automation workflows
- Live Q&A and troubleshooting
Hope to see you there!
Register here: https://egghead.io/workshop/claude-code
Read this far? Share "AI Dev Essentials" with a friend! - https://egghead.io/newsletters/ai-dev-essentials
- John Lindquist