AI Dev Essentials #37: Gemini 3 Flash & GPT-5.2-Codex Arrive

John
Instructor

John

Hey Everyone πŸ‘‹,

John Lindquist here with the 37th issue of AI Dev Essentials!

Google released Gemini 3 Flash globally, delivering Pro-level performance at Flash speed. OpenAI launched GPT-5.2-Codex, their most advanced agentic coding model with a 400K context window. Anthropic made Agent Skills an open standard with Microsoft, Cursor, and others already adopting it, plus launched a Skills Directory with partners like Vercel, Figma, Notion, and Stripe.

On the tooling front, NVIDIA dropped Nemotron 3 Nano with a 1M context window, Cline moved to Vercel AI Gateway cutting errors by 24%, and OpenCode launched their Desktop beta. Google also introduced CC, an experimental AI productivity agent in Gmail, and integrated Opal (their vibe-coding tool) directly into Gemini.

On a personal note, I spoke at Microsoft AI Dev Days last week and shipped a nice little agent scripting tool called mdflow. Let's start this off with a sweet little deal...


🎁 egghead.io Lifetime Sale πŸ‘€

Since I'm not teaching a workshop this month, I wanted to do something a little different and open up a sweet little bundle:

egghead Lifetime. Pay once, keep it for life.

Plus, during this sale you get two full workshop recordings:

  • Cursor Power User workshop recording ($375 value)
  • Claude Code Power User workshop recording ($375 value)

That's $750 in bonus training along with everything else on egghead. If you want to end the year strong and start 2026 with momentum, this is your next step.

πŸ‘‰ Get egghead Lifetime here


πŸš€ mdflow Launch

I launched mdflow this week. It's a scripting language for AI agents built on executable markdown. You write instructions and code blocks in a .md file, and mdflow runs them in sequence, passing context between steps. Think of it as literate programming meets agent orchestration. It's perfect for creating repeatable workflows that AI agents can follow, or for making your documentation actually runnable. The syntax is just markdown, so there's no new language to learn.


πŸš€ Major Announcements

Breaking Google Releases Gemini 3 Flash with Pro-Level Performance

Google released Gemini 3 Flash on December 17, 2025, making it the default model in the Gemini app globally. The model delivers "frontier intelligence built for speed" with Pro-level performance at Flash-level latency.

Benchmark Performance:

  • GPQA Diamond: 90.4% (PhD-level reasoning)
  • Humanity's Last Exam: 33.7% without tools (3x the previous Flash model's 11%)
  • MMMU-Pro: 81.2% (state-of-the-art, comparable to Gemini 3 Pro)
  • Video-MMMU: 87.6% (multimodal video understanding)
  • SWE-bench Verified: 78% for agentic coding (outperforms Gemini 3 Pro's 72.8%)

Technical Specs:

  • Context Window: 1 million tokens input, 64K output
  • Speed: 3x faster than Gemini 2.5 Pro, 218 output tokens/second
  • Token Efficiency: Uses 30% fewer tokens than 2.5 Pro on average
  • Thinking Levels: Four modes (minimal, low, medium, high) for controlling reasoning depth

Pricing:

  • Input: $0.50 per 1M tokens
  • Output: $3.00 per 1M tokens
  • Context Caching: 90% cost reduction for repeated tokens
  • Batch API: 50% cost savings for async processing

Availability: Gemini app (default), Google AI Studio, Vertex AI, Gemini API, Gemini CLI, Google Antigravity, and AI Mode in Google Search.

(Google Blog, Google Developers Blog, TechCrunch, SiliconANGLE)

πŸ’¬ Flash 3 is my normal life "daily driver". It's the default model on Android for all of my dictated actions. It's my default model on gemini for replacing "searching". I'm still deeply in love with Opus 4.5 for development, but I'm experimenting with Opus 4.5 being a "manager" and multiple Flash 3s being the "workers". Nothing to share yet, but I'll keep you posted on how it turns out.


OpenAI Releases GPT-5.2-Codex for Agentic Coding

OpenAI launched GPT-5.2-Codex on December 18, 2025, describing it as "the most advanced agentic coding model yet for complex, real-world software engineering."

Technical Capabilities:

  • Context Window: 400,000 tokens with 128,000 max output tokens
  • Native Compaction: Optimized for long-horizon work through context compaction
  • Windows Support: First OpenAI model trained to operate in Windows environments
  • Enhanced Security: Strongest cybersecurity capabilities of any OpenAI model

Benchmark Performance:

  • SWE-Bench Pro: 56.4% accuracy (highest of all coding models)
  • Terminal-Bench 2.0: 64% accuracy

Real-World Validation: Security engineer Andrew MacPherson used GPT-5.1-Codex-Max to discover three React Server Components vulnerabilities (CVE-2025-55183, CVE-2025-55184, CVE-2025-67779).

Availability: All Codex surfaces for paid ChatGPT users. API access coming in the following weeks.

(OpenAI Blog, OpenAI System Card, SiliconANGLE)

πŸ’¬ The Opus 4.5 challenger arrives! I've heard great things from friends about it, but since it just dropped yesterday and I'm deep in Claude Code work, I haven't had a chance to spin up Codex, etc and give it a fair shake. Initial anectodal reports that "it's really smart in different ways", so I think deciding when to use which model might have to be learned trait based on your specific requirements. I'll have to post more about mdflow and how it can bring together multiple models/agents into a single prompt...


Breaking Anthropic Makes Agent Skills an Open Standard

Anthropic announced on December 18, 2025 that Agent Skills is now an open standard, with a new Skills Directory launching at claude.com/connectors.

Open Standard Adoption:

  • Microsoft: Adopted within VS Code and GitHub
  • Coding Agents: Cursor, Goose, Amp, and OpenCode already supporting skills
  • Directory Partners: Atlassian (Jira, Confluence), Canva, Figma, Notion, Cloudflare, Zapier, Stripe, Vercel

Skills Directory Features:

  • Browse and enable skills at claude.com/connectors
  • One-click installation for verified partner skills
  • Skills work across Claude.ai, Claude Desktop, and Claude Code

(Anthropic - Skills Open Standard, Claude Blog - Skills Directory, Claude Blog - Introducing Skills)

πŸ’¬ We all win here! I've grown to love skills over the past couple of months. It's a great way of lazy-loading a ton of relevant context for specific tasks and they're completely decoupled from the tools that need them. So it's awesome to see the AI ecosystem adopt them and I'm sure we'll see an explosions of shared skills in 2026.


Tool Claude Code Chrome Extension Enables Browser Testing

Anthropic released Claude Code integration with Chrome on December 18, 2025, enabling a complete build-test-verify workflow directly from the terminal.

Capabilities:

  • Test code directly in the browser to validate work
  • Read client-side errors via console logs
  • Access DOM state, network requests, and page content
  • Interact with authenticated web apps

Workflow: Build with Claude Code in your terminal, test in the browser with the Chrome extension, debug using console logs. Launch with /chrome command.

(Claude Code Docs - Chrome, Claude Blog - Chrome Pilot)

πŸ’¬ What distinguishes this from other solutions is how you get to use "Real Chrome" and not an isolated playwright browser. I've had a lot of luck using the Claude Chrome extension for interacting with real web apps. I can't wait to hook it up to Claude Code for dev projects.


πŸ› οΈ Developer Tooling Updates

Tool xAI Launches Grok Voice Agent APIs

xAI released the Grok Voice Agent API on December 17, 2025, enabling developers to build voice agents with sub-second response times.

Technical Specs:

  • Latency: Average time-to-first-audio under 1 second (5x faster than closest competitor)
  • Benchmark: #1 on Big Bench Audio
  • Languages: 100+ with automatic detection
  • Pricing: $0.05 per minute flat rate

Available Voices: Sal, Rex, Eve, Leo, Ara, Mika, Valentin. Developers can prompt auditory cues like [whisper], [sigh], and [laugh].

(xAI Documentation, LiveKit Partnership)


Tool NVIDIA Releases Nemotron 3 Nano with 1M Context

NVIDIA released Nemotron 3 Nano on December 15, 2025, a hybrid reasoning model with 1M token context window.

Architecture:

  • Parameters: 3.2B active (31.6B total) with hybrid Mamba-Transformer MoE
  • Context: 1M tokens (68.2% accuracy at full context on RULER)
  • Throughput: Up to 3.3x higher than comparable open-source models

Open Source: Weights, datasets, training recipes, 3T pre-training tokens. Runs on 24GB RAM via GGUF.

(NVIDIA Research, Hugging Face Blog)


Tool Cline Moves to Vercel AI Gateway

Cline announced their migration to Vercel AI Gateway on December 16, 2025, reporting significant improvements.

Performance: Error rates down 24.3%, P99 streaming latency improved 10-14%, automatic failover during provider outages.

Scale: 1M+ developers and 4M+ installations.

(Vercel Blog, Cline Blog)


Tool OpenCode Desktop Launches in Beta

OpenCode released their Desktop application in beta on December 15, 2025.

Features: Native cross-platform app (Tauri + SolidJS), multi-project workspace, persistent sessions, embedded terminal, diff viewers. Free models included or connect any provider.

Install: brew install --cask opencode-desktop on macOS.

(OpenCode Official, GitHub)


Tool Claude Code Ships Terminal Rendering Overhaul

Anthropic announced on December 17, 2025 that Claude Code's terminal rendering was rewritten, reducing flickering by ~85%.

Improvements: Fixed diff view on resize, 3x better memory usage for large conversations, fixed rendering during token streaming.

December 16 Updates: Syntax highlighting for diffs, prompt suggestions, first-party plugins marketplace, shareable guest passes.

(ClaudeLog Changelog)

πŸ’¬ We'll see if this sticks. I just read that they're rolling it back temporarily due to some edge cases in some environments.


πŸ€– AI Ecosystem Updates

Tool Google Labs Launches CC AI Productivity Agent

Google Labs released CC on December 16, 2025, an experimental AI agent delivering "Your Day Ahead" briefings via Gmail, Calendar, and Drive.

Capabilities: Morning briefings, email drafting, calendar link creation, web search context, cross-document connections.

Limitations: Cannot send emails or reschedule meetings autonomously.

Availability: Early access for Google AI Ultra and paid Gemini subscribers (US/Canada).

(Google Blog, 9to5Google)

πŸ’¬ This is actually pretty awesome. I definitely recommend giving it a shot to see a clean summary of the emails of your day that you might not otherwise read.


Tool Gemini Deep Research Adds Visual Reports

Google enhanced Deep Research on December 16, 2025 with AI-generated images, charts, and interactive simulations.

Capabilities: Custom visuals from research, interactive variable adjustment, reports in Canvas. Exclusive to AI Ultra ($250/month).

(Google Blog, 9to5Google)

πŸ’¬ Some of the graphs and charts they produce are super clean. I hope they expose ways to have full control over the visualizations and allow users to customize them further.


Tool NotebookLM Integration Rolling Out in Gemini

Google began rolling out NotebookLM integration in Gemini on December 13-17, 2025.

How It Works: Attach notebooks as context via the plus menu. Gemini responds based on curated sources rather than web search.

(9to5Google, Android Central)

πŸ’¬ This is the integration I've been waiting for! Being able to use NotebookLM as context in Gemini means I can leverage curated research directly.


Ecosystem OpenAI Launches ChatGPT App Store

OpenAI launched the ChatGPT App Store on December 18, 2025, opening to 800+ million users.

Partners: Booking.com, Spotify, Dropbox, DoorDash, Zillow. Built on MCP. Submit apps through OpenAI Developer Platform.

(TechCrunch, Engadget)

πŸ’¬ This is a big opportunity for devs who want to jump in and try to claim one of those "first to the app store" spots.


Research Apple Releases SHARP for Instant 3D Photo Conversion

Apple published SHARP on December 11, 2025, converting single 2D photos to 3D in under one second.

Technical: Single feedforward pass, 3D Gaussian at 100+ fps, 25-34% LPIPS improvement, three orders of magnitude faster than previous methods. Open source under Apache 2.0.

(arXiv, Apple ML Research, GitHub)


⚑ Quick Updates

Gemini 2.5 Flash Native Audio Upgrade

  • Release: December 12, 2025
  • Function Calling: ComplexFuncBench Audio score of 71.5%
  • Instruction Following: 90% adherence rate (up from 84%)
  • Live Translation: Streaming speech-to-speech in 70+ languages

(Google Blog)

Standard JSON Schema Specification

  • Created by: Zod, Valibot, and ArkType creators
  • Purpose: Unified JSON Schema with preserved type inference
  • AI SDK 6 Beta: Native jsonSchema helper function

(Standard JSON Schema)

Google Agent Development Kit for TypeScript

  • Release: December 17, 2025
  • Version: ADK TypeScript v0.2.0
  • Features: Code-first multi-agent systems, type-safe development

(Google Developers Blog)

React Grab Agent Integration

  • Feature: Click any React element, copy exact source location
  • Agents: Works with Claude Code, Cursor, OpenCode, Codex, Gemini
  • Install: npx @react-grab/cli@latest

(GitHub)


πŸ“’ Workshop Update

Due to the raging success of this past year's AI Workshops, I'm moving them to a monthly cadence under dev.build to help with bookkeeping. There are no workshops left this year, but we're planning 1 per month next year with the first tentatively planned for Jan 30. We'll announce more soon.


Happy Holidays πŸŽ…!

This is the last newsletter of the year (I really hope no one is releases anything interesting next week...). I'll be on vacation with my family and loved ones next week and I sincerely hope you and yours have a wonderful holiday season! See you in 2026!

Read this far? Share "AI Dev Essentials" with a friend! β€” egghead.io/newsletters/ai-dev-essentials

β€” John Lindquist

egghead.io


Did you enjoy this issue? Share it with a friend.

Β© 2025 John Lindquist β€’ egghead.io

Share with a coworker