Hey Everyone 👋,
John Lindquist here with the 26th issue of AI Dev Essentials!
This week was absolutely packed with major releases across the AI development landscape. Claude Sonnet 4.5 dropped with impressive benchmarks, claiming the crown for best coding model available. OpenAI released Sora 2, their next-generation text-to-video model that powers a new TikTok-style social app for AI-generated videos with synchronized audio. The iOS app has already become one of the most downloaded on the App Store. Beyond these headline releases, Cursor shipped hooks for agent lifecycle control, OpenAI launched their commerce protocol with Stripe, and we saw significant updates from DeepSeek, Cerebras, and Google. The pace of innovation continues to accelerate, and the tooling is getting more sophisticated by the day.
On a personal note, I've been running Sonnet 4.5 through the wringer with advanced scenarios and exploring SDK customizations to gain ultimate control of agentic workflows through hooks. My major focus has been developing Claude Code agents that leverage the new Chrome DevTools MCP: bespoke agents that inspect and analyze your projects through Chrome, automatically suggest fixes for performance and SEO, then test the changes. It's all really pretty incredible, and I'm excited to teach it in my workshop tomorrow.
🚨 LAST CHANCE to grab tickets for tomorrow's workshop! 🚨 Register NOW
🎓 New egghead.io Lessons This Week
Claude Code Essentials Course Updates - View Course
Three new lessons on PreToolUse hooks for controlling Claude Code's tool execution:
- Block Tool Commands Before Execution - Intercept Bash, Write, or Edit commands before they execute using exit code 2 to block unwanted operations and enforce workflow conventions
- Guide Claude with Rich PreToolUse Feedback - Return structured JSON from PreToolUse hooks to deny commands with clear reasons and suggestions, creating a self-correcting feedback loop where Claude learns your conventions in real-time
- Enforce Global Rules with User-Level Hooks - Move PreToolUse hooks to ~/.claude/settings.json to enforce conventions like "always pnpm, never npm" across every project automatically from one central location
These lessons show how to build guardrails around Claude Code's tool usage—from simple command blocking to sophisticated self-correcting systems. PreToolUse hooks let you enforce conventions without constant manual intervention, whether you're preventing destructive operations or teaching Claude your team's preferred tools.
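To make the pattern from these lessons concrete, here's a minimal sketch of a PreToolUse hook script in Python, assuming the contract the lessons describe: the proposed tool call arrives as JSON on stdin, and a JSON response with a `decision` and `reason` tells Claude why a command was denied. The npm-to-pnpm rule mirrors the user-level hooks lesson.

```python
# Minimal PreToolUse hook sketch: enforce "always pnpm, never npm".
# Assumed hook contract (per the lessons above): Claude Code sends the
# proposed tool call as JSON on stdin, and a JSON "decision"/"reason"
# response blocks the command and explains why, creating the
# self-correcting feedback loop described above.
import json


def check_command(tool_name: str, command: str) -> dict:
    """Return a block decision for npm commands, or {} to allow."""
    if tool_name == "Bash" and command.strip().startswith("npm "):
        return {
            "decision": "block",
            "reason": "This project uses pnpm. Re-run with `pnpm` instead of `npm`.",
        }
    return {}


# In a real hook you'd read the payload from stdin:
#   payload = json.load(sys.stdin)
#   command = payload.get("tool_input", {}).get("command", "")
# Here we simulate the payload Claude Code would send for `npm install`:
demo = {"tool_name": "Bash", "tool_input": {"command": "npm install react"}}
print(json.dumps(check_command(demo["tool_name"], demo["tool_input"]["command"])))
```

Register a script like this as a `PreToolUse` command hook matching the `Bash` tool in `.claude/settings.json` (or `~/.claude/settings.json` for the global variant from the third lesson). The simpler alternative the first lesson covers is exiting with code 2 and a message on stderr.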
🚀 Major Announcements
Claude Sonnet 4.5: A New Standard for AI Coding
Anthropic released Claude Sonnet 4.5 on September 29, 2025, claiming it's the best coding model available. The benchmarks support this—it achieved 77.2% on SWE-bench Verified in standard runs (82.0% with test-time compute), outperforming GPT-5 Codex (74.5%) and Gemini 2.5 Pro (67.2%).
Performance highlights:
- SWE-bench Verified Leadership: 77.2% standard runs, 82.0% with parallel test-time compute
- 30-Hour Autonomous Coding: Built Slack-like chat app with ~11,000 lines of code unassisted
- OSWorld Benchmark: 61.4% (up from Sonnet 4's 42.2% four months prior)
- Computer Use Excellence: More than 3x as skilled at browser navigation compared to October 2024
- Reasoning & Math Gains: Substantial improvements across reasoning, math, and domain-specific tasks
- Pricing: $3/$15 per million tokens (same as Sonnet 4)
New Claude Code features:
- Checkpointing: Auto-saves state before each change, rewind with Esc twice or the /rewind command
- VS Code Extension (Beta): Dedicated sidebar with inline diffs, real-time change viewing, automatic diagnostic sharing
- Subagent Support: Hooks and long-running autonomous task capabilities
- Imagine with Claude: 5-day research preview generating software on-the-fly (Max users only)
Industry validation from Cursor CEO Michael Truell: "state-of-the-art coding performance," and Windsurf CEO Jeff Wang: "new generation of coding models." Anthropic also notes it's "the most aligned frontier model we've ever released" with reduced sycophancy, deception, and power-seeking behaviors.
(Anthropic Official, Anthropic Claude Code Features, TechCrunch, VentureBeat, AWS Blog)
I've noticed a mixed reaction from the community: some say it's the best thing since sliced bread, while others claim it's no better than Sonnet 4.0. But experts building professional agentic tooling, like the teams at Cognition Labs (they build Devin and Windsurf), have been extremely impressed. Their opinions matter much more to me than random comments on social media. Personally, I've had a lot of luck with it. It's completed tasks with minimal prompts and followed instructions really well. Overall, I'm thrilled and looking forward to pushing it even further.
Sora 2 Launches with Synchronized Audio and Social Features
OpenAI released Sora 2 on September 30, 2025, their next-generation text-to-video model that powers a new TikTok-style social app. Sora 2 generates short AI videos with synchronized audio (dialogue and sound effects) from text prompts, packaged in a vertical "For You" feed with likes, comments, and remixing capabilities. The iOS app became one of the most downloaded apps on the App Store within days of launch.
What is Sora 2:
- Text-to-Video Model: Generate short videos with both visuals and synchronized audio from text prompts
- Social Video App: TikTok-style vertical feed with swipes, likes, comments, and remix features
- Digital Cameos: Users can register their likeness (voice + facial scan) to be inserted into AI-generated scenes with permission and notifications
- Invite-Only Launch: Currently iOS-only, available in US and Canada
Key improvements over Sora 1:
- Synchronized Audio: Integrated audio and lip-syncing that matches visuals, unlike earlier silent versions
- Better Physical Realism: More convincing motion, object interactions, and physical dynamics
- Enhanced Controllability: Greater precision and control over generated content
- Stricter Guardrails: Robust filters, identity consent mechanisms, restrictions on public figures without permission
(OpenAI Official Blog, TechCrunch, VentureBeat, Wired, The Verge)
I've generated about 100 videos, and I feel like only about 1 in 10 is really impressive. There seem to be a lot of limitations, though, especially around prompt length and customization. And it unfortunately hasn't been able to create a video of me dunking a basketball, which I could send to my wife and brag about.
Claude Now Available in Slack
Anthropic announced on October 1, 2025, that Claude is now available directly in Slack through two integration methods, bringing AI assistance to where teams already collaborate.
Integration features:
- Slack App Marketplace: Available for paid Slack plans with DM chat, @mentions in threads, AI assistant panel
- MCP Integration: Claude Team and Enterprise plans can connect via Model Context Protocol
- Full Capabilities: Web search, document analysis, connected tools access
- Deep Workspace Context: Search channels, DMs, and files for meeting prep, project updates, documentation
- Permission-Aware: Respects existing Slack access controls
(Anthropic Official, Slack Blog)
This was just released, so I haven't had a chance to check it out yet, and I'll reserve judgment. I'm also not really someone who lives in Slack; I tolerate it and only check when necessary. For teams that do live in Slack, though, having Claude available where they already work seems convenient.
Claude Launches "Built with Sonnet 4.5" Challenge
Anthropic is running a week-long challenge from October 1-7, 2025, inviting developers to showcase projects built with Claude Sonnet 4.5, with four winners receiving one year of Claude Max 20x and $1k in API credits.
Contest categories:
- "Keep Coding" Award: Most technically impressive implementation
- "Keep Researching" Award: Most compelling exploration of a topic
- "Keep Learning" Award: Best educational application
- "Keep Creating" Award: Most artistic use-case
Entry requirements: Quote post the announcement tweet by October 7 with project details, screenshots/demos, and explanation of how it was built with Sonnet 4.5. Winners announced by October 10.
Put on your thinking caps (or ask AI to come up with ideas for you), and go win some money! I bet the exposure would be pretty amazing and might even land you a job somewhere...
🛠️ Developer Tooling Updates
Cursor 1.7 Ships with Agent Lifecycle Hooks
Cursor released version 1.7 on September 29, 2025, introducing Hooks—a beta system for observing, controlling, and extending the agent loop using custom scripts.
New features:
- Hooks (Beta): Script every part of agent lifecycle for auditing, command blocking, secret redaction
- Agent Autocomplete: Suggestions appear while typing prompts, Tab to accept
- Shareable Deeplinks: Share prompts with others via simple links
- Team-wide Rules: Shared rules across all projects, including Bugbot for code reviews
- Menubar Support: Check agent status directly from menubar
- Browser Control (Early Preview): Agent can take screenshots, improve UI, debug client issues with Sonnet 4.5
(Cursor Official Changelog, Cursor Deeplinks Docs, InfoQ, Cursor Community Forum)
I think my favorite new feature is Agent Autocomplete in the agent text box. There's something about hitting Tab and accepting a suggestion that just feels so good. I'm going to have to dive into the new Cursor hooks and compare them to Claude Code hooks. It's actually making me want to spin up new Cursor workshops to explore these more advanced workflows.
OpenAI and Stripe Launch Agentic Commerce Protocol
OpenAI and Stripe co-developed the Agentic Commerce Protocol (ACP) on September 29, 2025, a new open standard for AI-driven commerce that enables programmatic transactions between buyers, AI agents, and businesses.
Protocol features:
- Open Standard: Co-developed with Stripe for agent-initiated commerce flows
- Shared Payment Tokens (SPT): Initiate payments without exposing buyer credentials
- Instant Checkout in ChatGPT: Available now for US Etsy sellers (US ChatGPT users)
- Shopify Integration: Coming soon for 1M+ merchants including Glossier, Vuori, Spanx, SKIMS
- Broad Support: Early partners include Microsoft Copilot, Anthropic, Perplexity, Vercel, Lovable, Replit
- Merchant Control: Full control over products, branding, and fulfillment
(Stripe Official Newsroom, Stripe Blog, OpenAI Official Blog, Shopify News, TechCrunch)
This definitely smells like ChatGPT is going to start serving ads. They've mentioned that agents will start sending you things unprompted, and they're setting up ways to learn from users and figure out what to sell them. You can imagine how much they could challenge Instagram: when you're doom-scrolling through Instagram, you always see things you want to buy. How much more targeted will ChatGPT's pitches be, based on all the questions you've asked about mental health and fashion? I bet they'll make a bajillion dollars by keeping ChatGPT free and selling people stuff through conversations.
GLM-4.6 Released with 200K Context Window
Z.ai released GLM-4.6 on September 30, 2025, an open-source 355B-parameter coding model with a 200K token context window and improved agentic capabilities.
Model improvements:
- Near Sonnet 4 Parity: 48.6% win rate vs Claude Sonnet 4 on CC-Bench real-world coding tests
- Token Efficiency: Completes tasks with ~15% fewer tokens than GLM-4.5
- Expanded Context: 200K token input window (up from 128K)
- Cost Advantage: $0.60/$2 per 1M tokens (5x cheaper than Claude Sonnet 4.5)
- Open Weights: MIT license for local deployment via vLLM and SGLang
- Agent Integration: Works with Claude Code, Cline, Roo Code, Kilo Code
(Z.ai Official Blog, Z.ai Documentation, South China Morning Post, MarkTechPost)
This model is so stinking cheap that it's definitely something I've been experimenting with as well. It's kind of a fallback model in case I start running into limits with Claude Code. It integrates into Claude Code perfectly as far as calling MCPs and subagents and everything. I don't have a final verdict on it, but it's been working pretty well. I did have some scenarios where I think it could have thought through some of the issues a bit deeper, where I'm sure Sonnet 4.5 would have solved it. But other than that, for the price and performance, it's been pretty remarkable.
DeepSeek V3.2-Exp Cuts API Costs in Half
DeepSeek released V3.2-Exp on September 29, 2025, an experimental model introducing DeepSeek Sparse Attention (DSA)—a new architecture designed to optimize training and inference efficiency for long-context operations.
Technical improvements:
- Sparse Attention Architecture: DSA mechanism focuses compute where it matters most
- 50%+ Cost Reduction: API pricing down to $0.028 per million input tokens (cache hits)
- V3.1-Terminus Foundation: Built on proven architecture with efficiency improvements
- Open Weights: Released on Hugging Face under MIT License
- Multi-Platform: Available on App, Web, and API
(DeepSeek Official Docs, TechCrunch, VentureBeat)
DeepSeek keeps on improving. I've just been so busy with everything else I've mentioned that I haven't even had a spare second to test it out. But it's definitely high on my list, and I love seeing the open source models continue to grow.
⚡ Quick Updates
Cerebras Raises $1.1B at $8.1B Valuation
- Series G Funding: $1.1B led by Fidelity Management & Research Company and Atreides Management
- Post-money Valuation: $8.1 billion
- Performance Claims: Up to 21x faster inference than Nvidia DGX B200 on specific workloads
- IPO Status: Delayed due to CFIUS regulatory review
(Cerebras Press Release, Business Wire, TechCrunch)
Google Adds Visual Search to AI Mode
- Visual Search Fan-out: Runs multiple related queries in background for better understanding
- Multimodal Integration: Combines Google Search, Lens, and Image search with Gemini 2.5
- Shopping Focus: Initial emphasis on product discovery with 50B+ product listings
- US English Rollout: Available to US English users
(Google Blog, Google Shopping Blog, 9to5Google)
Anthropic Publishes Context Engineering Guide
- New Framework: "The art and science of curating what goes into the limited context window"
- Agent Definition: "LLMs autonomously using tools in a loop"
- Strategic Focus: Managing context effectively in agentic workflows
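Anthropic's definition above ("LLMs autonomously using tools in a loop") can be sketched in a few lines. Everything here is hypothetical scaffolding, not a real API: `call_model` stands in for any LLM call, and tools are plain functions.

```python
# A bare sketch of "LLMs autonomously using tools in a loop",
# per Anthropic's agent definition above. `call_model` is a
# hypothetical stand-in for any LLM API; tools are plain functions.
def run_agent(call_model, tools, task, max_steps=10):
    """Ask the model, execute any requested tool, feed the result back."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # Model returns either {"tool": ..., "args": ...} or {"answer": ...}
        reply = call_model(history)
        if "answer" in reply:
            return reply["answer"]
        result = tools[reply["tool"]](**reply["args"])
        # The tool result is what consumes context window space; context
        # engineering is deciding what (and how much) of it to append here.
        history.append({"role": "tool", "content": str(result)})
    return None
```

The loop makes the context-window concern concrete: every tool result appended to `history` competes for the same limited budget, which is exactly what the guide's curation framing is about.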
Google AI Studio Offers Free Voice Agent Development
- Live API Access: Build voice AI agents with Gemini Live for free
- Generous Limits: 60 model requests per minute, 1,000 requests per day
- Voice Features: Voice activity detection, tool use, function calling, session management
- Consumer Access: Gemini Live with camera and screen sharing free on Android and iOS
(Google AI Developers: Live API, Google AI Developers: Pricing)
LlamaIndex Releases Semantic Filesystem Tools
- Claude Code Integration: Semantic search over any documents via CLI
- Persistent Workspaces: Manually or agentically create indexes over file subsets
- Dynamic Indexing: Agents can create and use indexes instead of rebuilding each time
- Tool Chaining: Works with grep, cat, and other CLI tools
✨ Workshop Spotlight
🚨 LAST CHANCE 🚨 Claude Code Power User Workshop - TOMORROW, October 3rd
Following four sold-out sessions, the next Claude Code Power User workshop is scheduled for TOMORROW, October 3rd at 9am-2pm PDT.
🚨 This is your absolute last chance to grab tickets before the workshop! 🚨
What we'll cover:
- Building AI pipelines with multiple Claude instances
- MCP workflows and server development
- Context engineering for complex projects
- Real-world automation workflows
- Live Q&A and troubleshooting
Hope to see you there!
Register NOW: https://egghead.io/workshop/claude-code
Read this far? Share "AI Dev Essentials" with a friend! - https://egghead.io/newsletters/ai-dev-essentials
- John Lindquist