Hey Everyone 👋,
John Lindquist here with the 26th issue of AI Dev Essentials!
This week was absolutely packed with major releases across the AI development landscape. Claude Sonnet 4.5 dropped with impressive benchmarks, claiming the crown for best coding model available. OpenAI released Sora 2, their next-generation text-to-video model that powers a new TikTok-style social app for AI-generated videos with synchronized audio. The iOS app has already become one of the most downloaded on the App Store. Beyond these headline releases, Cursor shipped hooks for agent lifecycle control, OpenAI launched their commerce protocol with Stripe, and we saw significant updates from DeepSeek, Cerebras, and Google. The pace of innovation continues to accelerate, and the tooling is getting more sophisticated by the day.
On a personal note, I've been running Sonnet 4.5 through the wringer with advanced scenarios and exploring SDK customizations to gain ultimate control of agentic workflows through hooks. My major focus has been developing Claude Code agents that leverage the new Chrome DevTools MCP: bespoke agents that inspect and analyze your projects through Chrome, automatically suggest fixes for performance and SEO, then test the changes. It's all really pretty incredible, and I'm excited to teach it in my workshop tomorrow.
🚨 LAST CHANCE to grab tickets for tomorrow's workshop! 🚨 Register NOW
🎓 New egghead.io Lessons This Week
Claude Code Essentials Course Updates - View Course
Three new lessons on PreToolUse hooks for controlling Claude Code's tool execution:
- Block Tool Commands Before Execution - Intercept Bash, Write, or Edit commands before they execute using exit code 2 to block unwanted operations and enforce workflow conventions
- Guide Claude with Rich PreToolUse Feedback - Return structured JSON from PreToolUse hooks to deny commands with clear reasons and suggestions, creating a self-correcting feedback loop where Claude learns your conventions in real-time
- Enforce Global Rules with User-Level Hooks - Move PreToolUse hooks to ~/.claude/settings.json to enforce conventions like "always pnpm, never npm" across every project automatically from one central location
These lessons show how to build guardrails around Claude Code's tool usage—from simple command blocking to sophisticated self-correcting systems. PreToolUse hooks let you enforce conventions without constant manual intervention, whether you're preventing destructive operations or teaching Claude your team's preferred tools.
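To make the pattern from these lessons concrete, here's a minimal sketch of a PreToolUse hook script in Python, assuming the contract the lessons describe: the proposed tool call arrives as JSON on stdin, and a JSON response with a `decision` and `reason` tells Claude why a command was denied. The npm-to-pnpm rule mirrors the user-level hooks lesson.

```python
# Minimal PreToolUse hook sketch: enforce "always pnpm, never npm".
# Assumed hook contract (per the lessons above): Claude Code sends the
# proposed tool call as JSON on stdin, and a JSON "decision"/"reason"
# response blocks the command and explains why, creating the
# self-correcting feedback loop described above.
import json


def check_command(tool_name: str, command: str) -> dict:
    """Return a block decision for npm commands, or {} to allow."""
    if tool_name == "Bash" and command.strip().startswith("npm "):
        return {
            "decision": "block",
            "reason": "This project uses pnpm. Re-run with `pnpm` instead of `npm`.",
        }
    return {}


# In a real hook you'd read the payload from stdin:
#   payload = json.load(sys.stdin)
#   command = payload.get("tool_input", {}).get("command", "")
# Here we simulate the payload Claude Code would send for `npm install`:
demo = {"tool_name": "Bash", "tool_input": {"command": "npm install react"}}
print(json.dumps(check_command(demo["tool_name"], demo["tool_input"]["command"])))
```

Register a script like this as a `PreToolUse` command hook matching the `Bash` tool in `.claude/settings.json` (or `~/.claude/settings.json` for the global variant from the third lesson). The simpler alternative the first lesson covers is exiting with code 2 and a message on stderr.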
🚀 Major Announcements
Claude Sonnet 4.5: A New Standard for AI Coding
Anthropic released Claude Sonnet 4.5 on September 29, 2025, claiming it's the best coding model available. The benchmarks support this—it achieved 77.2% on SWE-bench Verified in standard runs (82.0% with test-time compute), outperforming GPT-5 Codex (74.5%) and Gemini 2.5 Pro (67.2%).
Performance highlights:
- SWE-bench Verified Leadership: 77.2% standard runs, 82.0% with parallel test-time compute
- 30-Hour Autonomous Coding: Built Slack-like chat app with ~11,000 lines of code unassisted
- OSWorld Benchmark: 61.4% (up from Sonnet 4's 42.2% four months prior)
- Computer Use Excellence: More than 3x as skilled at browser navigation compared to October 2024
- Reasoning & Math Gains: Substantial improvements across reasoning, math, and domain-specific tasks
- Pricing: $3/$15 per million tokens (same as Sonnet 4)
New Claude Code features:
- Checkpointing: Auto-saves state before each change, rewind with Esc twice or the /rewind command
- VS Code Extension (Beta): Dedicated sidebar with inline diffs, real-time change viewing, automatic diagnostic sharing
- Subagent Support: Hooks and long-running autonomous task capabilities
- Imagine with Claude: 5-day research preview generating software on-the-fly (Max users only)
Industry validation from Cursor CEO Michael Truell: "state-of-the-art coding performance," and Windsurf CEO Jeff Wang: "new generation of coding models." Anthropic also notes it's "the most aligned frontier model we've ever released" with reduced sycophancy, deception, and power-seeking behaviors.
(Anthropic Official, Anthropic Claude Code Features, TechCrunch, VentureBeat, AWS Blog)
I've noticed a mixed reaction from the community: some say it's the best thing since sliced bread, while others claim it's no better than Sonnet 4.0. But experts building professional agentic tooling, like the teams at Cognition Labs (they build Devin and Windsurf), have been extremely impressed. Their opinions matter much more to me than random comments on social media. Personally, I've had a lot of luck with it. It's completed tasks with minimal prompts and followed instructions really well. Overall, I'm thrilled and looking forward to pushing it even further.
Sora 2 Launches with Synchronized Audio and Social Features
OpenAI released Sora 2 on September 30, 2025, their next-generation text-to-video model that powers a new TikTok-style social app. Sora 2 generates short AI videos with synchronized audio (dialogue and sound effects) from text prompts, packaged in a vertical "For You" feed with likes, comments, and remixing capabilities. The iOS app became one of the most downloaded apps on the App Store within days of launch.
What is Sora 2:
- Text-to-Video Model: Generate short videos with both visuals and synchronized audio from text prompts
- Social Video App: TikTok-style vertical feed with swipes, likes, comments, and remix features
- Digital Cameos: Users can register their likeness (voice + facial scan) to be inserted into AI-generated scenes with permission and notifications
- Invite-Only Launch: Currently iOS-only, available in US and Canada
Key improvements over Sora 1:
- Synchronized Audio: Integrated audio and lip-syncing that matches visuals, unlike earlier silent versions
- Better Physical Realism: More convincing motion, object interactions, and physical dynamics
- Enhanced Controllability: Greater precision and control over generated content
- Stricter Guardrails: Robust filters, identity consent mechanisms, restrictions on public figures without permission
(OpenAI Official Blog, TechCrunch, VentureBeat, Wired, The Verge)
I've generated about 100 videos, and I feel like only about 1 in 10 is really impressive. There seem to be a lot of limitations, though, especially around prompt length and customization. And it unfortunately hasn't been able to create a video of me dunking a basketball, which I could send to my wife and brag about.
Claude Now Available in Slack
Anthropic announced on October 1, 2025, that Claude is now available directly in Slack through two integration methods, bringing AI assistance to where teams already collaborate.
Integration features:
- Slack App Marketplace: Available for paid Slack plans with DM chat, @mentions in threads, AI assistant panel
- MCP Integration: Claude Team and Enterprise plans can connect via Model Context Protocol
- Full Capabilities: Web search, document analysis, connected tools access
- Deep Workspace Context: Search channels, DMs, and files for meeting prep, project updates, documentation
- Permission-Aware: Respects existing Slack access controls
(Anthropic Official, Slack Blog)
This was just released, so I haven't had a chance to check it out yet, and I'll reserve judgment. I'm also not really someone who lives in Slack; I tolerate it and only check when necessary. For teams that do live in Slack, though, having Claude available where they already work seems convenient.
Claude Launches "Built with Sonnet 4.5" Challenge
Anthropic is running a week-long challenge from October 1-7, 2025, inviting developers to showcase projects built with Claude Sonnet 4.5, with four winners receiving one year of Claude Max 20x and $1k in API credits.
Contest categories:
- "Keep Coding" Award: Most technically impressive implementation
- "Keep Researching" Award: Most compelling exploration of a topic
- "Keep Learning" Award: Best educational application
- "Keep Creating" Award: Most artistic use-case
Entry requirements: Quote post the announcement tweet by October 7 with project details, screenshots/demos, and explanation of how it was built with Sonnet 4.5. Winners announced by October 10.
Put on your thinking caps (or ask AI to come up with ideas for you), and go win some money! I bet the exposure would be pretty amazing and might even land you a job somewhere...
🛠️ Developer Tooling Updates
Cursor 1.7 Ships with Agent Lifecycle Hooks
Cursor released version 1.7 on September 29, 2025, introducing Hooks—a beta system for observing, controlling, and extending the agent loop using custom scripts.
New features:
- Hooks (Beta): Script every part of agent lifecycle for auditing, command blocking, secret redaction
- Agent Autocomplete: Suggestions appear while typing prompts, Tab to accept
- Shareable Deeplinks: Share prompts with others via simple links
- Team-wide Rules: Shared rules across all projects, including Bugbot for code reviews
- Menubar Support: Check agent status directly from menubar
- Browser Control (Early Preview): Agent can take screenshots, improve UI, debug client issues with Sonnet 4.5
(Cursor Official Changelog, Cursor Deeplinks Docs, InfoQ, Cursor Community Forum)
I think my favorite new feature is Agent Autocomplete in the agent text box. There's something about hitting Tab and accepting a suggestion that just feels so good. I'm going to have to dive into the new Cursor hooks and compare them to Claude Code hooks. It's actually making me want to spin up new Cursor workshops to explore these more advanced workflows.
OpenAI and Stripe Launch Agentic Commerce Protocol
OpenAI and Stripe co-developed the Agentic Commerce Protocol (ACP) on September 29, 2025, a new open standard for AI-driven commerce that enables programmatic transactions between buyers, AI agents, and businesses.
Protocol features:
- Open Standard: Co-developed with Stripe for agent-initiated commerce flows
- Shared Payment Tokens (SPT): Initiate payments without exposing buyer credentials
- Instant Checkout in ChatGPT: Available now for US Etsy sellers (US ChatGPT users)
- Shopify Integration: Coming soon for 1M+ merchants including Glossier, Vuori, Spanx, SKIMS
- Broad Support: Early partners include Microsoft Copilot, Anthropic, Perplexity, Vercel, Lovable, Replit
- Merchant Control: Full control over products, branding, and fulfillment
(Stripe Official Newsroom, Stripe Blog, OpenAI Official Blog, Shopify News, TechCrunch)
This definitely smells like ChatGPT is going to start serving ads. They've mentioned that agents will start sending you things unprompted, and they're setting up ways to learn from users and figure out what to sell them. You can imagine how much they could challenge Instagram: when you're doom-scrolling through Instagram, you always see things you want to buy. How much more targeted will ChatGPT's pitches be, based on all the questions you've asked about mental health and fashion? I bet they'll make a bajillion dollars by keeping ChatGPT free and selling people stuff through conversations.
GLM-4.6 Released with 200K Context Window
Z.ai released GLM-4.6 on September 30, 2025, an open-source 355B-parameter coding model with a 200K token context window and improved agentic capabilities.
Model improvements:
- Near Sonnet 4 Parity: 48.6% win rate vs Claude Sonnet 4 on CC-Bench real-world coding tests
- Token Efficiency: Completes tasks with ~15% fewer tokens than GLM-4.5
- Expanded Context: 200K token input window (up from 128K)
- Cost Advantage: $0.60/$2 per 1M tokens (5x cheaper than Claude Sonnet 4.5)
- Open Weights: MIT license for local deployment via vLLM and SGLang
- Agent Integration: Works with Claude Code, Cline, Roo Code, Kilo Code
(Z.ai Official Blog, Z.ai Documentation, South China Morning Post, MarkTechPost)
This model is so stinking cheap that it's definitely something I've been experimenting with as well. It's kind of a fallback model in case I start running into limits with Claude Code. It integrates into Claude Code perfectly as far as calling MCPs and subagents and everything. I don't have a final verdict on it, but it's been working pretty well. I did have some scenarios where I think it could have thought through some of the issues a bit deeper, where I'm sure Sonnet 4.5 would have solved it. But other than that, for the price and performance, it's been pretty remarkable.
DeepSeek V3.2-Exp Cuts API Costs in Half
DeepSeek released V3.2-Exp on September 29, 2025, an experimental model introducing DeepSeek Sparse Attention (DSA)—a new architecture designed to optimize training and inference efficiency for long-context operations.
Technical improvements:
- Sparse Attention Architecture: DSA mechanism focuses compute where it matters most
- 50%+ Cost Reduction: API pricing down to $0.028 per million input tokens (cache hits)
- V3.1-Terminus Foundation: Built on proven architecture with efficiency improvements
- Open Weights: Released on Hugging Face under MIT License
- Multi-Platform: Available on App, Web, and API
(DeepSeek Official Docs, TechCrunch, VentureBeat)
DeepSeek keeps on improving. I've just been so busy with everything else I've mentioned that I haven't even had a spare second to test it out. But it's definitely high on my list, and I love seeing the open source models continue to grow.
⚡ Quick Updates
Cerebras Raises $1.1B at $8.1B Valuation
- Series G Funding: $1.1B led by Fidelity Management & Research Company and Atreides Management
- Post-money Valuation: $8.1 billion
- Performance Claims: Up to 21x faster inference than Nvidia DGX B200 on specific workloads
- IPO Status: Delayed due to CFIUS regulatory review
(Cerebras Press Release, Business Wire, TechCrunch)
Google Adds Visual Search to AI Mode
- Visual Search Fan-out: Runs multiple related queries in background for better understanding
- Multimodal Integration: Combines Google Search, Lens, and Image search with Gemini 2.5
- Shopping Focus: Initial emphasis on product discovery with 50B+ product listings
- US English Rollout: Available to US English users
(Google Blog, Google Shopping Blog, 9to5Google)
Anthropic Publishes Context Engineering Guide
- New Framework: "The art and science of curating what goes into the limited context window"
- Agent Definition: "LLMs autonomously using tools in a loop"
- Strategic Focus: Managing context effectively in agentic workflows
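Anthropic's definition above ("LLMs autonomously using tools in a loop") can be sketched in a few lines. Everything here is hypothetical scaffolding, not a real API: `call_model` stands in for any LLM call, and tools are plain functions.

```python
# A bare sketch of "LLMs autonomously using tools in a loop",
# per Anthropic's agent definition above. `call_model` is a
# hypothetical stand-in for any LLM API; tools are plain functions.
def run_agent(call_model, tools, task, max_steps=10):
    """Ask the model, execute any requested tool, feed the result back."""
    history = [{"role": "user", "content": task}]
    for _ in range(max_steps):
        # Model returns either {"tool": ..., "args": ...} or {"answer": ...}
        reply = call_model(history)
        if "answer" in reply:
            return reply["answer"]
        result = tools[reply["tool"]](**reply["args"])
        # The tool result is what consumes context window space; context
        # engineering is deciding what (and how much) of it to append here.
        history.append({"role": "tool", "content": str(result)})
    return None
```

The loop makes the context-window concern concrete: every tool result appended to `history` competes for the same limited budget, which is exactly what the guide's curation framing is about.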
Google AI Studio Offers Free Voice Agent Development
- Live API Access: Build voice AI agents with Gemini Live for free
- Generous Limits: 60 model requests per minute, 1,000 requests per day
- Voice Features: Voice activity detection, tool use, function calling, session management
- Consumer Access: Gemini Live with camera and screen sharing free on Android and iOS
(Google AI Developers: Live API, Google AI Developers: Pricing)
LlamaIndex Releases Semantic Filesystem Tools
- Claude Code Integration: Semantic search over any documents via CLI
- Persistent Workspaces: Manually or agentically create indexes over file subsets
- Dynamic Indexing: Agents can create and use indexes instead of rebuilding each time
- Tool Chaining: Works with grep, cat, and other CLI tools
✨ Workshop Spotlight
🚨 LAST CHANCE 🚨 Claude Code Power User Workshop - TOMORROW, October 3rd
Following four sold-out sessions, the next Claude Code Power User workshop is scheduled for TOMORROW, October 3rd at 9am-2pm PDT.
🚨 This is your absolute last chance to grab tickets before the workshop! 🚨
What we'll cover:
- Building AI pipelines with multiple Claude instances
- MCP workflows and server development
- Context engineering for complex projects
- Real-world automation workflows
- Live Q&A and troubleshooting
Hope to see you there!
Register NOW: https://egghead.io/workshop/claude-code
Read this far? Share "AI Dev Essentials" with a friend! - https://egghead.io/newsletters/ai-dev-essentials
- John Lindquist