Hey Everyone,
John Lindquist here with the 15th issue of AI Dev Essentials! A new open-source model named "Kimi K2" is taking the AI world by storm, and I'm HERE FOR IT! I hooked Kimi K2 up to Groq inside of Cline, and my jaw hit the floor watching the sheer speed and intelligence absolutely demolish tasks. I honestly feel like it has 4x'ed my productivity. My biggest concern is going to be hitting limits (both on tokens and on cost), but just knowing that this is what the near future of development looks like is so freakin' exciting.
After a lot of thought and discussions with trusted friends, here's a quick update from previous announcements:
All Upcoming AI Courses, Tips, and Tricks will be published on egghead.io
After announcing "CursorPro.ai" 2 weeks ago, I received a truckload of questions around "What about Claude Code?", "Is it only Cursor?", "When will it be finished?", format, pricing, etc.
To keep it simple, I will publish all upcoming AI courses as paid courses on egghead.io. So all you have to do is subscribe (or re-subscribe) to egghead.io and you'll receive all the latest AI Dev Essentials lessons the moment they're released. This will immensely streamline the publishing process so you can get the latest tutorials as soon as possible. We'll be increasing the subscription price in the near future, so consider locking in at the current price, which effectively becomes the new "early bird" pricing:
Subscribe here: https://egghead.io/pricing
And yes, I'll focus much more on Claude Code, context engineering, scripting AI tools, etc., while still allowing myself breathing room to cover some of the Cursor best practices and workflows.
Tomorrow, I'll publish my first batch of lessons, called "Create a Markdown-Based Knowledge Base with MCPs," where I can show how to hook it into both Claude Code and Cursor without having to dance around the "CursorPro.ai" naming.
Major Announcements
Kimi K2: The Open-Source Giant That Changes Everything
Moonshot AI just dropped Kimi K2, and it's not just another model release - it's a paradigm shift. This 1 trillion parameter MoE model with 32B active parameters is now the open-source champion, outperforming Claude 4 and Gemini 2.5 on key benchmarks while being completely open and accessible.
The numbers are staggering:
- Architecture: 1 trillion total parameters with 32B active parameters per token
- Performance: LiveCodeBench v6 score of 53.7 (vs Claude Opus 4 at 47.4)
- Cost: $0.15/M input tokens, $2.50/M output tokens (100x cheaper on input and 30x cheaper on output than Claude Opus 4)
- Context: 128K-token context window with stable performance
- Training: Pre-trained on 15.5T tokens with zero instability
- License: Modified MIT License - truly open source
(Official GitHub Repository, MarkTechPost Coverage, Hugging Face Weights)
As I already said above, I'm extremely excited about this. It'll be fascinating to watch what kind of impact it has on the community at large.
Groq + Kimi K2: The Speed Revolution
Groq integrated Kimi K2 onto their LPU infrastructure, achieving 185 tokens per second - significantly faster than the 16.2 tokens/second average across all providers. While not the claimed 1,000 tokens/second, it's still the fastest Kimi K2 inference available.
Groq's official announcement confirms "YOLO Launch Kimi K2 is now in preview on GroqCloud at 185 tokens/sec. Build fast." For context, Groq has achieved up to 800+ tokens/second with other models like Llama 3, showing the potential for future optimizations.
(Groq Official Announcement, Groq Documentation, Community Discussion)
There's just something special about watching code appear faster than you can read it. One of my biggest pain points with any sort of AI tooling is falling out of the zone: because things happen in the background, I end up context switching between all the various tasks I'm working on. If we can finally get to the point where models generate quality code fast enough to keep the work in the foreground, so I can stay in the zone and focused on the task, that will completely change the way I think about using these tools and the way I approach all of my projects. It's a drastic shift from how I've thought about using them in the past.
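If you want to poke at this outside of Cline, below is a minimal sketch of streaming Kimi K2 through Groq's OpenAI-compatible endpoint. It assumes the `openai` npm package, a `GROQ_API_KEY` environment variable, and the model slug `moonshotai/kimi-k2-instruct` - treat the slug as an assumption and check GroqCloud's current model list.

```ts
// Minimal sketch: streaming Kimi K2 from Groq's OpenAI-compatible API.
import OpenAI from "openai";

const groq = new OpenAI({
  apiKey: process.env.GROQ_API_KEY,
  baseURL: "https://api.groq.com/openai/v1",
});

const stream = await groq.chat.completions.create({
  model: "moonshotai/kimi-k2-instruct", // assumed slug - verify on GroqCloud
  messages: [
    { role: "user", content: "Write a TypeScript function that deduplicates an array." },
  ],
  stream: true,
});

// Print tokens as they arrive - at 185 tokens/sec the output stays readable in real time.
for await (const chunk of stream) {
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
```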
Cognition Acquires Windsurf: The Strategic Masterstroke
Cognition (makers of Devin AI) officially acquired Windsurf on Monday, July 14, 2025, following a whirlwind weekend deal. The acquisition includes Windsurf's IP, trademark, brand, and all remaining employees after Google hired key executives in a $2.4B reverse-acquihire.
Key deal details:
- Business acquired: $82M ARR with 350+ enterprise customers
- Team structure: 100% of employees participate financially with all vesting cliffs waived
- Timeline: First call Friday evening, agreement signed Monday morning
- Context: Came after OpenAI's failed $3B acquisition offer expired
(Cognition Official Blog, TechCrunch Coverage, VentureBeat Analysis)
The way this deal was structured shows real respect for the team that built Windsurf. Instead of the usual "acqui-hire and gut the product" approach, they're treating the team as equals. This is how you build long-term value in the AI space. Most of the people I've talked to see this as a perfect pairing between two companies, and on a personal level I'm really happy that the people on the Windsurf side are being respected and will get to work on what they want to work on.
OpenRouter's Kimi K2 Pricing Discounts
OpenRouter is offering Kimi K2 through their platform at a discount via both a paid tier and a free tier, though the actual pricing differs from initial claims. Current verified pricing shows competitive but not revolutionary rates.
Actual pricing on OpenRouter:
- Paid tier: $0.14/M input tokens, $2.49/M output tokens (via Targon)
- Free tier: $0 for both input/output (via Chutes, Parasail) with rate limits
- Context: Up to 65,536 tokens on free tier
Note: While marketed as aggressive pricing, these rates are close to Moonshot's direct pricing of $0.15/$2.50.
(OpenRouter Kimi K2 Page, Free Endpoint)
I think everyone's going to race to subsidize Kimi K2 to get developers hooked into their ecosystem. This is a fairly standard strategy once a new model comes out: get users locked in, then raise prices later. But we'll see what happens - OpenRouter has been pretty awesome in the past.
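If you'd rather go through OpenRouter, here's the same kind of sketch pointed at their OpenAI-compatible API. The slugs `moonshotai/kimi-k2` (paid) and `moonshotai/kimi-k2:free` (rate-limited free endpoint) are my assumptions - confirm them on the OpenRouter model page.

```ts
// Minimal sketch: calling Kimi K2 through OpenRouter.
import OpenAI from "openai";

const openrouter = new OpenAI({
  apiKey: process.env.OPENROUTER_API_KEY,
  baseURL: "https://openrouter.ai/api/v1",
});

const response = await openrouter.chat.completions.create({
  model: "moonshotai/kimi-k2:free", // assumed free-tier slug; drop ":free" for the paid tier
  messages: [
    { role: "user", content: "Summarize the tradeoffs of mixture-of-experts models in two sentences." },
  ],
});

console.log(response.choices[0].message.content);
```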
🛠️ Developer Tooling & Ecosystem Updates
Gemini Embeddings Reach General Availability
Google's Gemini Embedding model (gemini-embedding-001) became generally available in July 2025 following an experimental phase since March. The model leads the MTEB multilingual leaderboard with a score of 68.32, a margin of +5.81 over competing models.
Key specifications:
- 100+ languages supported with 2048 max input tokens
- Matryoshka Representation Learning - scalable from 3072 down to 768 dimensions
- Pricing: $0.15 per million input tokens (50% less for batch mode)
- MTEB score: 68.32 on multilingual benchmark
- Free tier available via Google AI Studio
(Google Developers Blog, Google AI Documentation, TechCrunch Coverage)
The pricing is aggressive - $0.15 per million tokens makes it very competitive. I'm already seeing developers switching from OpenAI's embeddings for cost-sensitive applications.
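For anyone who wants to kick the tires, here's a minimal sketch of generating an embedding with gemini-embedding-001 and using Matryoshka Representation Learning to shrink the output to 768 dimensions. It assumes the `@google/genai` SDK and a `GEMINI_API_KEY` environment variable; treat the exact config shape as an assumption and confirm it against the current docs.

```ts
// Minimal sketch: Gemini embeddings with a reduced output dimensionality.
import { GoogleGenAI } from "@google/genai";

const ai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });

const result = await ai.models.embedContent({
  model: "gemini-embedding-001",
  contents: "How do I wire an MCP server into Claude Code?",
  config: { outputDimensionality: 768 }, // default is 3072; smaller vectors are cheaper to store
});

console.log(result.embeddings?.[0]?.values?.length); // expect 768
```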
Deepgram Saga: Voice Control for Your Entire Stack
Deepgram launched Saga on July 8, 2025 - marketed as "The Voice OS for Developers." Important clarification: This is NOT an open-source model but a commercial voice interface built on Deepgram's proprietary APIs.
Core capabilities:
- Voice interface for development workflows via MCP
- Tool integration with Cursor, Windsurf, Linear, Asana, Jira, Slack
- Commercial product using Deepgram's speech-to-text and text-to-speech APIs
- Pricing: Pay-as-you-go based on WebSocket connection time
- $200 free credit for new users
(Deepgram Official Announcement, SiliconANGLE Coverage, Product Page)
I tested this out for a while, and I'm excited about their ambitious goals, but it still feels like alpha software - nowhere near polished enough to replace SuperWhisper for me. I love the idea of live-typing your raw dictation while potential rewrites appear in panels, so you can dictate while watching rewrites come in and keep either your raw text or a polished version. I also love the idea of hooking into MCPs so you're essentially building your own voice assistant. Unfortunately, everything was so buggy for me that I gave up on it for now. We'll see how it evolves, and I'll keep checking in on it.
Claude Code vs Kimi K2: The Cost Showdown
A head-to-head comparison pitted Claude 4 against Kimi K2 on identical tasks, and the results are eye-opening:
- Claude 4: 2 rounds, $0.88 spent
- Kimi K2: One-shot completion, $0.05 spent
- Cost difference: roughly 18x cheaper on this task
For reference, current pricing comparison:
- Kimi K2: $0.15/M input, $2.50/M output tokens
- Claude Opus 4: $15/M input, $75/M output tokens (100x/30x more expensive)
- GPT-4.1: $2/M input, $8/M output tokens (13x/3x more expensive)
I know Kimi is still super new, and I'll be watching all the benchmarks and cost comparisons, but this result stuck out to me even if it's just a small sample size. If the trend holds, I'm definitely looking forward to saving money.
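To make those per-million-token rates concrete, here's a tiny back-of-the-envelope helper. The token counts below are made-up example numbers, not measurements from the comparison above.

```ts
// Back-of-the-envelope request cost (USD) from per-million-token rates.
const rates = {
  "kimi-k2": { input: 0.15, output: 2.5 },
  "claude-opus-4": { input: 15, output: 75 },
  "gpt-4.1": { input: 2, output: 8 },
} as const;

const costUSD = (model: keyof typeof rates, inputTokens: number, outputTokens: number) =>
  (inputTokens * rates[model].input + outputTokens * rates[model].output) / 1_000_000;

// Example: a 20K-token prompt that produces a 4K-token answer.
console.log(costUSD("kimi-k2", 20_000, 4_000).toFixed(4)); // 0.0130
console.log(costUSD("claude-opus-4", 20_000, 4_000).toFixed(4)); // 0.6000
```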
Gemini CLI Public Roadmap: Transparency in Action
The Gemini CLI team made their development roadmap public on July 15, 2025, revealing their Q3 2025 priorities. With 60,000 stars and 1,000 open issues, the community engagement has been remarkable.
Key workstreams for Q3 2025:
- Community and Automation: Repository workflow automation and OpenTelemetry observability support
- Engineering Excellence: Enhanced testing frameworks and release processes
- Extensibility: Internal and external plugin systems
- Model Quality: 7 initiatives focused on improving model performance (25% complete)
- Performance Optimization: Speed improvements across the CLI
(Public Roadmap, Mark McD Announcement)
This level of transparency is refreshing. Making roadmaps public forces accountability and lets the community see where priorities really lie. With 1,000 open issues, they're clearly listening to feedback and iterating quickly.
💼 AI Ecosystem & Business Moves
AI Companies Score Major Defense Contracts
Anthropic and xAI both secured $200M ceiling contracts with the Pentagon on July 14, 2025, through the Chief Digital and Artificial Intelligence Office (CDAO). These two-year prototype agreements mark a significant shift in government AI adoption.
Contract details:
- Amount: Up to $200M over two years per company
- Type: Prototype other transaction agreements
- Anthropic: Claude Gov models for classified networks, risk-forecasting research (follows their June policy change allowing intelligence applications)
- xAI: Launched "Grok for Government" offering Grok 4, Deep Search models, and custom models for classified environments
- Applications: Healthcare, fundamental science, national defense, and intelligence
(Bloomberg Report, DefenseScoop Coverage, Washington Post Analysis, xAI Official Announcement)
I honestly don't know what to say here. In fact, I'm not sure my opinion even matters... I think we've all seen enough sci-fi movies to know how this all plays out.
⚡ Quick Updates
- OpenAI delays open-weight model: Sam Altman announced delays for safety testing, no timeline given (Tweet)
- Steve's Guide to Claude Code: Builder.io's Steve published an awesome guide to Claude Code (Thread)
- zshy bundler-free TypeScript: colinhacks (of "Zod" fame) released zshy for bundler-free TypeScript library builds (Launch)
- Superwhisper v2.0: New design, Parakeet voice model (I ABSOLUTELY ADORE THIS MODEL, IT'S SO FAST!) (Update)
- n8n MCP integration: Prajwal Tomar's guide to n8n MCP for workflow automation (Thread)
🚨 REGISTER NOW FOR FRIDAY'S WORKSHOP 🚨
✨ Live Workshop: Unlock Cursor's Full Potential ✨
Full details: https://egghead.io/workshop/cursor
Learn the latest power-user workflows to maximize your productivity. Join me live as I walk through the current best practices in the AI dev landscape.
- When: Friday, July 18, 2025
- 5:00 AM - 10:00 AM (PDT)
- 🇬🇧 1:00 PM - 6:00 PM (UTC+1)
- 🇪🇺 2:00 PM - 7:00 PM (UTC+2)
- Where: Zoom
- Investment: $249.00
➡️ Register Now
Limited spots available. Secure yours today!
Thanks for reading! If you have any questions or feedback, feel free to reply directly to this email.
Read this far? Share AI Dev Essentials with a coworker or friend!
John Lindquist
egghead.io