AI Dev Essentials #11: Gemini 2.5 Stable, MCP goes mainstream

John Lindquist
Instructor

Hey Everyone πŸ‘‹,

John Lindquist here with the eleventh issue of AI Dev Essentials! I've been digging more and more into using Claude Code to spin up projects. Here are a couple of fun ones I've built using only Claude Code:

  • dotagent (https://www.npmjs.com/package/dotagent) - A tool for syncing rules/instructions across Cursor/Claude/Windsurf/etc.
  • chromancer (https://www.npmjs.com/package/chromancer) - A CLI for interacting with Chrome. It wraps around Claude Code to generate and save automated browser workflows.
  • Script Kit (https://www.scriptkit.com/) - I added MCP support so Cursor/etc can invoke your scripts.

There's obviously no way I could've done this much work in the past week without using AI agents (mostly Claude Code).

I'm still determining my guidelines for Cursor vs. Claude Code. Cursor recently introduced a new Ultra plan at $200/month with 20x the usage of Pro, which should let you run o3 (an awesome model) full time. My quick, too-early take:

  • Cursor gives you more control and customizations. It's perfect for "focused" work where you want to be an active participant in the process.
  • Claude Code is great for "automated" work where you just want to leave the desk and come back to review a feature.

As always, the model matters more than the tool. Since Cursor can also use Claude Opus/Sonnet and has background agents + Bugbot, it's simply a more fully-featured offering that can only continue to improve. I love CLIs (I mean I really, really love CLIs), but there always comes a point where you need to open a GUI. Anyway, I'll keep using both for now.

I was also just accepted into the OpenAI Open Source fund which granted me $5000 in credits to use the Codex CLI (OpenAI's version of Claude Code), so I should have strong opinions on that very soon. I love OpenAI's Codex through the web interface where it can spin up your project and submit PRs, so I have high hopes for the CLI.

I've honestly had more fun programming over the past couple of months than I've had during my entire 20+ year career!

πŸŽ“ New egghead.io Lessons This Week

The Building Local AI Agents with Ollama and the AI SDK course is now live! I've released the first 5 lessons covering:

  • Setting up Ollama for local AI development
  • Building your first agent with the AI SDK
  • Implementing tool calling and function execution
  • Managing agent state and conversation flow
  • Deploying local agents with persistent memory

Check out the course at https://egghead.io/courses/scripting-local-language-models-with-ollama-and-the-vercel-ai-sdk~gmi9k
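
If you want a feel for where the course is headed, here's a minimal sketch of tool calling with the AI SDK against a local Ollama model. It assumes the community `ollama-ai-provider` package and a stubbed `getWeather` tool, so treat the specifics as placeholders rather than actual course code:

```ts
import { generateText, tool } from "ai";
import { ollama } from "ollama-ai-provider"; // community provider (assumption)
import { z } from "zod";

const { text } = await generateText({
  // any locally pulled model works, e.g. after `ollama pull llama3.1`
  model: ollama("llama3.1"),
  prompt: "What's the weather in Portland right now?",
  tools: {
    getWeather: tool({
      description: "Get the current weather for a city",
      parameters: z.object({ city: z.string() }),
      // stubbed out: a real agent would call a weather API here
      execute: async ({ city }) => ({ city, tempF: 72, conditions: "cloudy" }),
    }),
  },
  // allow one tool call plus a follow-up generation
  maxSteps: 2,
});

console.log(text);
```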


πŸ€– Model & Platform Updates

Gemini 2.5 Family Goes Stable

Google has restructured their Gemini 2.5 lineup with significant improvements:

Gemini 2.5 Pro (Stable, no changes from 06-05)

  • Now the stable long-term support model
  • Maintains existing pricing structure
  • "gemini-2.5-pro" model name for production use

Gemini 2.5 Flash (Stable, updated pricing)

  • Simplified pricing based on developer feedback
  • Performance improvements from 05-20 variant
  • Now the go-to model for cost-conscious applications

NEW: Gemini 2.5 Flash-Lite (Preview)

  • Small reasoning model at just $0.10/1M input, $0.40/1M output
  • Pushes the cost/intelligence frontier
  • Ideal for high-volume applications requiring reasoning capabilities

(Google Blog, Google Cloud Blog)

The introduction of Flash-Lite is particularly exciting. We're seeing Google compete aggressively on the cost front while maintaining reasoning capabilities. This could be a game-changer for applications that need to process massive amounts of data with some level of intelligence. Honestly, I'm kinda glad they've shipped 2.5 so they can focus on whatever 3.0 is going to be πŸ˜…
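
For reference, calling Flash-Lite from the `@google/genai` SDK looks roughly like this. The preview model ID below is my best guess at the time of writing, so check the Gemini API model list for the current one:

```ts
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const response = await ai.models.generateContent({
  // preview model ID may have changed; see the Gemini API docs
  model: "gemini-2.5-flash-lite-preview-06-17",
  contents: "Classify this support ticket as billing, bug, or feature request: ...",
});

console.log(response.text);
```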

OpenAI's Open Source Model Progress Update

At Y Combinator's AI startup school, Sam Altman provided an update on OpenAI's open-source model initiative, first announced in March 2025. The model, which will be OpenAI's first open-weight release since GPT-2 in 2019, is being developed with reasoning capabilities similar to o3-mini and is aimed for release in early summer.

Key details:

  • Text-in, text-out model designed for high-end consumer hardware
  • Highly permissive license with few usage restrictions
  • Toggle-able reasoning capabilities
  • Led by VP of Research Aidan Clark

(TechCrunch, Reuters, VentureBeat)

OpenAI has been gathering community feedback since March, and Altman himself admitted they've been "on the wrong side of history" regarding open source. This shift is clearly influenced by the success of Meta's Llama models (over 1 billion downloads) and pressure from competitors like DeepSeek. I'm looking forward to seeing how this balances OpenAI's commercial interests with community needs.

Revolutionary AI Performance in Medical Research

A combination of GPT-4.1, o3-mini-high, and Gemini 2.0 Flash achieved superhuman performance in systematic medical reviews (medRxiv):

  • Completed 12 work-years of Cochrane reviews in just 2 days
  • 96.7% sensitivity vs 81.7% for human reviewers
  • Found 54 additional eligible studies missed by original authors
  • Generated new statistically significant findings in 2 reviews

The system correctly identified all originally included studies while discovering critical findings humans missed, like preoperative immune-enhancing supplementation reducing hospital stays by one day.

This is the kind of AI application that genuinely excites me. It's not replacing researchers but augmenting them to find insights that would otherwise be missed. The fact that blinded reviewers sided with the AI 69.3% of the time when there were disagreements speaks volumes.


πŸ”§ Developer Tooling & MCP Updates

MCP Goes Mainstream

The Model Context Protocol is having a moment:

Cloudflare Open Sources use-mcp

  • React library connecting to any MCP server in 3 lines of code
  • Supports SSE and Streamable HTTP remote servers
  • Handles OAuth flow automatically
  • Part of the official MCP organization for reference implementation

(Cloudflare Blog, GitHub)
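
Here's roughly what those "3 lines" look like in practice, sketched from the announcement. The server URL is a placeholder and field names may differ slightly from the current README:

```tsx
import { useMcp } from "use-mcp/react";

function McpToolList() {
  // connects to a remote MCP server and handles the OAuth flow for you
  const { state, tools, callTool } = useMcp({
    url: "https://your-mcp-server.example.com/sse", // placeholder URL
  });

  if (state !== "ready") return <p>Connecting… ({state})</p>;

  return (
    <ul>
      {tools.map((t) => (
        <li key={t.name}>
          <button onClick={() => callTool(t.name, {})}>{t.name}</button>
        </li>
      ))}
    </ul>
  );
}
```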

OpenAI Adds MCP Support to ChatGPT

  • Limited initially to search and fetch tools for deep research
  • Remote MCP with OAuth support
  • Documentation available at platform.openai.com/docs/mcp

Google Cloud Run MCP Tutorial

  • Deploy remote MCP servers in under 10 minutes
  • Comprehensive guide for production deployments

(Google Cloud Blog)

Block's MCP Best Practices

  • Lessons from building 60+ MCP servers
  • Design patterns and architecture recommendations

(Block Engineering Blog)

MCP is becoming the standard for AI tool integration. The fact that major players are all adopting it suggests we're moving toward a more interoperable AI ecosystem. I'll definitely have more educational materials around MCPs soon!

OpenAI's Prompt Management Revolution

OpenAI introduced Prompts as an API primitive:

  • Centrally manage, version, and optimize prompts
  • Use across Playground, API, Evals, and Stored Completions
  • Preconfigured with tools, models, and messages
  • New "Optimize" button in Playground for API optimization
  • Replaces the old presets system

(OpenAI Community, Documentation)

Finally! Prompt management has been a pain point for production AI applications. This brings prompts to the same level as other API resources, making it easier to maintain consistency across teams and applications.
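
As a rough sketch, referencing a dashboard-managed prompt from the Responses API looks something like this. The prompt ID and variables are placeholders for whatever you configure in the Playground:

```ts
import OpenAI from "openai";

const client = new OpenAI();

const response = await client.responses.create({
  prompt: {
    id: "pmpt_YOUR_PROMPT_ID", // placeholder: created and versioned in the dashboard
    variables: { customer_name: "Ada" }, // fills template slots defined on the prompt
  },
});

console.log(response.output_text);
```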

Developer Resources

models.dev - Open Source AI Model Database

  • Comprehensive pricing and limits information
  • Powers opencode under the hood
  • Community-contributed with API access
  • Star and contribute at models.dev

(Official Site, GitHub)
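
If you want to pull the data programmatically, my understanding is that the whole database is exposed as a single JSON blob. The endpoint path is my assumption from the project docs, so double-check it before relying on it:

```ts
// fetch the full model database and inspect its top-level keys
const res = await fetch("https://models.dev/api.json");
const db = await res.json();

// the rough shape is provider -> model -> metadata (pricing, limits, etc.);
// log and explore rather than trusting my field-name guesses
console.log(Object.keys(db));
```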

Container-use Now Works with OpenAI Codex CLI

  • Solomon Hykes confirmed integration
  • Enables containerized development workflows

(GitHub)

Giving AI Agents enough control to rm -rf is why the community is so bullish on containers for everything!


πŸ’» Cursor Corner

Cursor Ultra Plan Launches at $200/month

Cursor introduced a new Ultra tier with 20x more usage than Pro:

  • Addresses heavy user demands for unlimited-style usage
  • Positioned between Pro and enterprise offerings
  • Reflects the growing appetite for AI-assisted development at scale

(Cursor Blog, TechCrunch)

This is exactly what I've been hoping for! The mental overhead of tracking usage on Pro was real. While $200/month isn't cheap, for professional developers who live in Cursor, it's a no-brainer. I'm curious to see if this pressures other tools to offer similar "all you can eat" plans. There's a bit of buzz around how ambiguous the messaging/pricing is, and it's obviously frustrating for anyone trying to budget out AI costs.

πŸš€ Quick Updates

  • OpenAI Codex Best-of-N: Generate multiple solutions simultaneously, rolling out to Pro, Enterprise, Team, Edu, and Plus users (Changelog)
  • Claude Code "ultrathink" Trigger: Use thinking triggers for better results, with "ultrathink" being the most powerful (Anthropic Docs)
  • Google Labs Flow Updates: Complete tutorial thread on ingredients, frames, camera controls, and scene builder for Veo 3 (Google Blog, FAQ)
  • ElevenLabs v3: Revolutionary emotion control with audio tags like [excited], [whispers], [sighs], [laughs] (ElevenLabs v3, Blog)

πŸ”’ Security & Best Practices

The Lethal Trifecta Warning

Simon Willison echoed many of my thoughts around MCP + security. Definitely worth a read (Blog Post):

The Lethal Trifecta = Private Data + Untrusted Content + External Communication

Any AI system with all three components is vulnerable to data exfiltration attacks. This is especially critical for MCP users who might inadvertently combine:

  • File system access (private data)
  • Web browsing (untrusted content)
  • API calls or messaging (external communication)

Mitigation Strategies:

  • Never combine all three capabilities in one agent
  • Implement strict sandboxing between components
  • Use human-in-the-loop verification for sensitive operations
  • Audit your MCP tool combinations carefully

This is the kind of security thinking we need more of. As we give AI agents more capabilities, we need to think adversarially about how they could be exploited. The MCP ecosystem makes it trivially easy to create vulnerable combinations.
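
One lightweight way to enforce this in your own agent setup is a guard that refuses to start when a toolset spans all three categories. This is purely a hypothetical sketch (the tool names and category mapping are mine, not from Simon's post):

```ts
type Capability = "private-data" | "untrusted-content" | "external-comms";

// hypothetical mapping of your MCP tools to trifecta categories
const toolCapabilities: Record<string, Capability> = {
  read_file: "private-data",
  fetch_url: "untrusted-content",
  send_email: "external-comms",
};

function assertNotLethalTrifecta(toolNames: string[]): void {
  const caps = new Set(
    toolNames.map((name) => toolCapabilities[name]).filter(Boolean)
  );
  if (caps.size === 3) {
    throw new Error(
      "Refusing to start agent: tools combine private data, untrusted content, and external communication."
    );
  }
}

assertNotLethalTrifecta(["read_file", "fetch_url"]); // fine
assertNotLethalTrifecta(["read_file", "fetch_url", "send_email"]); // throws
```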


⚑ AI Ecosystem Observations

Google Cloud IAM Outage Cascade

A major lesson in infrastructure dependencies:

  • Google Cloud IAM failure cascaded to Cloudflare, Anthropic, Spotify, Discord, and Replit
  • Highlighted single points of failure in the AI infrastructure stack
  • Key takeaway: Implement graceful fallbacks and avoid single-provider dependencies

(Full Analysis)

Did you feel this? I sure did. That was a strange couple of hours where I felt lost without my AI agents... I'm just glad it was a simple failure and not a more serious issue.
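
A minimal version of "graceful fallback" with the AI SDK might look like this: try your primary provider, and if it errors (say, during an upstream outage), retry the same prompt against a second one. The model IDs here are just examples; swap in whatever you actually run:

```ts
import { generateText } from "ai";
import { anthropic } from "@ai-sdk/anthropic";
import { openai } from "@ai-sdk/openai";

async function generateWithFallback(prompt: string) {
  try {
    // primary provider
    return await generateText({ model: anthropic("claude-sonnet-4-20250514"), prompt });
  } catch (err) {
    // fall back rather than failing the whole request
    console.warn("Primary provider failed, falling back:", err);
    return await generateText({ model: openai("gpt-4.1-mini"), prompt });
  }
}
```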

AI Agents Playing Diplomacy

Fascinating behavioral patterns emerged:

  • Claude: Couldn't lie, was exploited by other players
  • Gemini 2.5 Pro: Nearly conquered Europe with tactical brilliance
  • o3: Orchestrated secret coalitions, backstabbed allies, won

These experiments reveal fundamental differences in how models approach strategy and deception. When choosing models for your applications, consider not just capabilities but behavioral tendencies. I'm sure we'll see more about this in the future.

AGI as Product Experience

Logan Kilpatrick's perspective on AGI resonated strongly this week:

  • AGI won't be a single model breakthrough
  • Product experience integrating memory, reasoning, and context matters more
  • Small model improvements + great design = "AGI moment" for users
  • The narrative is shifting from model-centric to product-centric

(Source)


🚨 URGENT: Cursor Workshop Early Bird Ends TOMORROW! 🚨

Turn Failures into Fuel with Cursor - Save $49 NOW!

⏰ LAST CHANCE: Early bird pricing expires in less than 24 hours!

I'm genuinely excited to share this workshop with you. After months of pushing the boundaries with AI coding tools, I've distilled everything into a 6-hour intensive that will transform how you work with Cursor (and any AI coding assistant).

Why This Workshop is Different:

  • Real workflows, not tutorials: Learn the exact processes I use daily to build production apps
  • Failure-focused approach: Turn every AI mishap into a learning opportunity
  • Tool-agnostic principles: Master concepts that work across Cursor, Windsurf, Copilot, and beyond
  • Live debugging sessions: Watch me handle real errors and stuck agents in real-time

What You'll Master:

  βœ… Project generation that actually works (with proper structure and best practices)
  βœ… Context curation for massive codebases without token overload
  βœ… Agent behavior patterns - know when they're stuck BEFORE wasting credits
  βœ… Composer pipelines that automate your entire workflow
  βœ… Git + AI strategies for safe experimentation
  βœ… Multi-file refactoring without the chaos

Proven Results:

"Take John's cursor workshop it is πŸ”₯ HIGHLY Recommend" β€”David Wells

"I used the skills you taught me to effectively one-shot OAuth issuer support in the Epic Stack. So cool!" β€”Kent C. Dodds

"John is the cursor god" β€”Sunil Pai

Workshop Details:

  • When: Friday, June 27, 2025, 9:00 AM - 2:00 PM (PDT)
  • Format: Live Zoom workshop with Q&A throughout
  • Includes: Recording access, workshop materials, and my personal Cursor rules

πŸ’° Pricing:

  • ~~Regular Price: $249~~
  • 🎯 EARLY BIRD: $200 (Save $49!)
  • ⚠️ Price increases TOMORROW at midnight!

Perfect for:

  • Developers frustrated with AI tools that "almost" work
  • Teams wanting to standardize their AI workflows
  • Anyone who's felt the pain of agents modifying the wrong files
  • Developers ready to 10x their productivity with AI

➑️ See Full Workshop Details

➑️ SECURE YOUR EARLY BIRD SPOT NOW β†’

P.S. - Team training packages available. Get your whole team up to speed with group rates!

P.P.S. - Seriously, don't wait. The $49 savings ends tomorrow and I'd hate for you to miss out. This workshop will pay for itself in saved time within the first week.


That's all for this week! Feel free to reply directly to this email with any questions or feedback.

Keep on shipping! πŸš€

John Lindquist

egghead.io

Share with a coworker