Migrating to AI SDK v5: A Story of Tool Streaming, Caching, and Type Safety
We migrated BrainGrid's core AI infrastructure from AI SDK v4 to v5. Here's how it went.
Every developer knows that SDK migrations are like home renovations: they always take longer than expected, uncover hidden problems, and make you question your life choices halfway through. But sometimes, the end result makes it all worthwhile.
This month, we migrated BrainGrid's core AI infrastructure from AI SDK v4.3.16 to v5.0.0-beta.25. We knew it wouldn't be a walk in the park. SDK migrations never are, especially when you're dealing with beta versions. But three features made this migration impossible to ignore:
- Tool streaming - Our users were tired of staring at blank screens while AI agents worked
- Provider options for tool caching - Faster responses and lower costs? Yes please
- Better TypeScript support - Even if it meant touching every file
For context, BrainGrid is an AI-powered platform that helps developers turn messy ideas into crystal-clear specs that AI coding assistants can actually implement. We analyze your codebase, ask the right clarifying questions, and break requirements down into atomic, verifiable, AI-ready tasks. Each task becomes a precise prompt with full context, so your AI IDE (Cursor, Claude Code, etc.) gets it right the first time. Behind the scenes, we're orchestrating multiple specialized agents with dozens of tools—which is why this SDK migration touched everything.
Anyhow, here's how the migration went.
Why We Knew This Would Be Hard (But Did It Anyway)
Let's be honest: nobody migrates to a beta SDK for fun. We had 14 tool definitions, several streaming event handlers, and a complex agent system that our users depend on every day. Breaking any of it wasn't an option.
But our users had a legitimate complaint. When the BrainGrid agent started thinking or composing a requirement, they'd see... nothing. Just a blinking cursor. Was it thinking? Had it crashed? Was it writing War and Peace? Nobody knew until the tool input was finally done.
The agent experience while waiting for a tool call to finish
Meanwhile, there was money on the table. Every tool-enabled request meant sending the full tool definitions to the API. With complex tools, that's thousands of tokens per request. Anthropic's new cache control feature promised to fix this, but it required v5's provider options support.
So we made the call: temporary migration pain for permanent user gains.
The Migration Journey
1. The Great API Rename
The first surprise came immediately. Every single tool definition needed surgery:
```typescript
// Before (v4)
const readWebPageTool = tool({
  description: 'Reads and analyzes web content',
  parameters: z.object({
    url: z.string().url(),
    extractImages: z.boolean().optional(),
  }),
  execute: async args => {
    // Tool logic here
  },
});

// After (v5)
const readWebPageTool = tool({
  description: 'Reads and analyzes web content',
  inputSchema: z.object({ // 👈 renamed from 'parameters'
    url: z.string().url(),
    extractImages: z.boolean().optional(),
  }),
  execute: async args => {
    // Tool logic here
  },
});
```
Not catastrophic, but we had 14 tools across our agent system. That's 14 careful edits, 14 places to potentially break something. In reality, though, this was the easy part.
Then came the tool calls themselves:
```typescript
// Before
if (chunk.type === 'tool-call') {
  const toolArgs = chunk.args; // 👈 'args'
  // Process tool call
}

// After
if (chunk.type === 'tool-call') {
  const toolArgs = chunk.input; // 👈 now 'input'
  // Process tool call
}
```
And don't forget about token limits:
```typescript
// Before
streamText({
  model: anthropic('claude-4-sonnet'),
  maxTokens: 4096, // 👈 'maxTokens'
  // ...
});

// After
streamText({
  model: anthropic('claude-4-sonnet'),
  maxOutputTokens: 4096, // 👈 'maxOutputTokens'
  // ...
});
```
Each change was small, but they added up. Fast.
2. Type System Overhaul
This is where things got interesting. The v5 SDK introduced stricter, more accurate types. Great for catching bugs, painful for migration.
Our entire conversation system was built on the old message types:
```typescript
// Before (v4)
import { Message as AIMessage } from 'ai';

interface Conversation {
  messages: AIMessage[];
}

// After (v5)
import { ModelMessage } from 'ai';

interface Conversation {
  messages: ModelMessage[];
}
```
But that was just the beginning. The new `ModelMessage` type revealed a fundamental assumption in our code: we assumed message content was always a string.
```typescript
// Our token calculator before migration
function calculateTokens(message: AIMessage): number {
  const content = message.content as string; // 🚨 Danger!
  return tokenizer.encode(content).length;
}
```
In v5, message content can be:
- A simple string: `"Hello world"`
- An array of parts: `[{ type: 'text', text: 'Hello' }, { type: 'image', image: '...' }]`
- Complex content objects
Our token calculator would crash on anything but strings. We built a helper to handle all cases:
```typescript
export function extractTextContent(content: unknown): string {
  if (typeof content === 'string') {
    return content;
  }
  if (Array.isArray(content)) {
    return content
      .filter(part => part.type === 'text')
      .map(part => part.text)
      .join(' ');
  }
  if (content && typeof content === 'object' && 'text' in content) {
    // Guard the property type so TypeScript lets us return it as a string
    return typeof content.text === 'string' ? content.text : '';
  }
  return '';
}
```
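With that in place, the token calculator from earlier becomes safe for every content shape. Here's a minimal sketch of the fixed version, assuming the same `tokenizer` as in the v4 snippet above:

```typescript
import { ModelMessage } from 'ai';

// The fixed calculator: every content shape goes through extractTextContent
// instead of an unsafe cast. `tokenizer` is the same encoder assumed in the
// earlier v4 snippet.
function calculateTokens(message: ModelMessage): number {
  const text = extractTextContent(message.content);
  return tokenizer.encode(text).length;
}
```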
But the stricter typing went beyond content handling. The v5 SDK also made `streamText` much stricter about message interfaces. Before, we could accidentally pass malformed message objects and get cryptic runtime errors, including one memorable bug where we were accidentally sending the tool name instead of the expected message content. The old SDK would accept it and produce bizarre, hard-to-debug behavior.
Now, TypeScript catches these interface mismatches at compile time:
```typescript
// This would have failed silently in v4, causing weird runtime bugs
const messages: ModelMessage[] = [
  {
    role: 'assistant',
    content: toolName, // 🚨 TypeScript now catches this mistake
  },
];

// v5 forces us to be explicit and correct
const messages: ModelMessage[] = [
  {
    role: 'assistant',
    content: message.content, // ✅ Proper message content
  },
];
```
The silver lining? This exposed multiple real bugs. We'd been undercounting tokens for complex messages for months, and had subtle message formatting issues that occasionally caused confusing AI responses.
3. Streaming Protocol Redesign
Remember those users staring at blank screens? This is where we fixed that. But first, we had to rewrite how we handled streaming.
Every chunk type changed:
```typescript
// Before (v4)
for await (const chunk of stream) {
  if (chunk.type === 'text-delta') {
    content += chunk.textDelta;
  }
}

// After (v5)
for await (const chunk of stream) {
  if (chunk.type === 'text') {
    content += chunk.text;
  }
}
```
But the real win was tool streaming. Now we could show tool cards the instant an agent started using a tool:
```typescript
// When a tool-call chunk arrives
if (chunk.type === 'tool-call') {
  setTemporaryStreamMessage(prev => [
    ...prev,
    {
      type: 'tool_call',
      tool_call: {
        id: chunk.toolCallId,
        name: chunk.toolName,
        arguments: chunk.input,
        loading: true, // Shows spinner immediately
      },
    },
  ]);
}
```
Users now see a card appear instantly when the agent starts using a tool. No more mystery. No more "is it frozen?" support tickets.
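And when the tool finishes, the matching chunk lets us flip the spinner off. A sketch of the companion handler, assuming v5's `tool-result` chunks carry the renamed `output` field (adjust to your own state shape):

```typescript
// When the matching tool-result chunk arrives, find the card by its
// toolCallId and switch the spinner off.
if (chunk.type === 'tool-result') {
  setTemporaryStreamMessage(prev =>
    prev.map(msg =>
      msg.type === 'tool_call' && msg.tool_call.id === chunk.toolCallId
        ? {
            ...msg,
            tool_call: { ...msg.tool_call, result: chunk.output, loading: false },
          }
        : msg,
    ),
  );
}
```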
4. Control Flow Changes
Here's where we almost shot ourselves in the foot. The old `maxSteps` parameter got a makeover:
```typescript
// Before (v4)
const result = await generateText({
  model: anthropic('claude-4-sonnet'),
  maxSteps: 25,
  // ...
});

// After (v5)
const result = await generateText({
  model: anthropic('claude-4-sonnet'),
  stopWhen: stepCountIs(25),
  // ...
});
```
Looks simple enough. But this change hid a critical shift in behavior. It turns out `stepCountIs(n)` doesn't set a maximum number of steps; it requires the agent to run for exactly `n` steps. What was `maxSteps` now behaved like `minSteps`.
This turned out to be a blessing in disguise. By forcing us to be explicit about the step count, we fixed a subtle issue where agents could occasionally run longer than needed. After adjusting our default step counts, the agents' behavior became smoother and more predictable, which was a great improvement for our complex workflows.
```typescript
// In our BaseAgent class
maxSteps = 5, // 👈 Used to be 25
```
The new `stopWhen` API is actually more powerful. We can now stop on specific conditions:
```typescript
stopWhen: [
  stepCountIs(maxSteps),
  hasToolCall('generate_clarifying_questions'), // Stop when clarification needed
],
```
This feature pleasantly surprised us: we'd previously had to do meticulous prompt engineering to ensure the agent stopped immediately after invoking the `generate_clarifying_questions` tool.
5. Enabling Tool Definition Caching
This was the feature that made our customers happy. With v5's provider options, we could finally use Anthropic's cache control on tool definitions:
```typescript
const readWebPageTool = tool({
  description: 'Reads and analyzes web content',
  inputSchema: z.object({
    url: z.string().url(),
    extractImages: z.boolean().optional(),
  }),
  providerOptions: {
    anthropic: {
      cacheControl: { type: 'ephemeral' }, // 👈 Cache this tool definition
    },
  },
  execute: async args => {
    // Tool logic
  },
});
```
For frequently-used tools, this means the tool definition is cached on Anthropic's servers. Instead of sending thousands of tokens for complex tool definitions on every request, we send them once and they are cached automatically.
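You can also verify the cache is actually being hit. Here's a hedged sketch; we're assuming the Anthropic provider surfaces cache metrics under `providerMetadata.anthropic` as its docs describe, and the exact field names may shift between beta releases:

```typescript
import { generateText } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';

// Illustrative only: field names under providerMetadata.anthropic are
// assumptions and may differ between beta releases.
const result = await generateText({
  model: anthropic('claude-4-sonnet'),
  tools: { read_web_page: readWebPageTool },
  prompt: 'Summarize the key points of https://example.com',
});

// First call should report cache writes; subsequent calls, cache reads.
console.log(result.providerMetadata?.anthropic?.cacheCreationInputTokens);
console.log(result.providerMetadata?.anthropic?.cacheReadInputTokens);
```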
The Results
First and Foremost: Everything Still Works
Let's celebrate the most important achievement: after all these changes, BrainGrid works exactly as it did before. Every agent, every tool, every workflow operates seamlessly. No regressions. No "we'll fix that in v2" compromises.
This might sound like table stakes, but if you've done major migrations, you know it's not. Maintaining 100% compatibility while overhauling the foundation is like changing a car's engine while driving.
But Now It's Better
Here's what our users notice:
- Instant feedback: Tool cards appear the moment an agent starts working. No more guessing.
- Faster responses: Cached tool definitions mean less data to send, faster processing.
- More reliable: The stricter types caught edge cases we didn't know existed.
The agent now shows the tool card right away, so the user knows what's going on.
When multiple tool calls happen, the agent still shows the cards right away
The numbers tell the story:
- Support tickets about "frozen" UI: 0 (down from 1-3 per week)
- Average tool execution time: 17% faster
- API costs from tool definitions: reduced by 7%
- Type-related bugs caught: 7 (including that token calculator)
Lessons for Fellow Engineers
After a couple days of migration work, here's what we learned:
1. Pin Your Beta Versions
"ai": "5.0.0-beta.25" // Not "^5.0.0-beta.25"
Beta versions can have breaking changes between releases. Pin the exact version and upgrade deliberately.
2. Read the Source, Not Just the Docs
The migration guide covered the basics, but real apps have edge cases. When in doubt, read the SDK source code. It's surprisingly readable and answered questions the docs didn't.
3. Test with Production-Like Scenarios
Our unit tests passed. Our integration tests passed. But they all used simple, single-tool scenarios. We could easily have missed the `maxSteps` issue. When AI is in the loop, always run manual tests before shipping to production.
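For us, that now means a smoke test that drives a real multi-step run and asserts on the control flow. A rough sketch (the model name and the `agentTools` import are placeholders for our real setup):

```typescript
import { generateText, stepCountIs } from 'ai';
import { anthropic } from '@ai-sdk/anthropic';
import { agentTools } from './tools'; // placeholder for our real tool map

// Smoke test: drive a real multi-step, multi-tool run and check that the
// stop condition actually held.
const result = await generateText({
  model: anthropic('claude-4-sonnet'),
  tools: agentTools,
  stopWhen: stepCountIs(5),
  prompt: 'Research and break down a small feature request',
});

// v5 exposes the executed steps, so we can assert on real control flow.
if (result.steps.length > 5) {
  throw new Error(`Expected at most 5 steps, got ${result.steps.length}`);
}
```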
4. Migration Guides Show the Happy Path
Real migrations are messier. Budget time for:
- Edge cases the guide doesn't mention
- Updating related code that depends on the old behavior
- Testing scenarios you forgot existed
- Rolling back if something goes catastrophically wrong
5. Document Everything
We kept a migration log of every change, every surprise, every "wait, why does this work now?" moment. This blog post started as those notes. Your future self will thank you.
Was It Worth It?
Absolutely.
Our users get instant feedback when agents work. Our infrastructure costs dropped noticeably. Our code is more type-safe and maintainable.
Yes, it took a couple of days instead of an afternoon. Yes, we discovered bugs we didn't know existed. Yes, we questioned our sanity around the second day.
But that's engineering. We don't migrate SDKs because it's easy. We do it because our users deserve better, our infrastructure demands it, and sometimes the beta version has exactly what we need.
Just remember to pin your dependencies.
BrainGrid is the AI-powered planning platform that helps developers turn messy thoughts into AI-ready requirements and agent tasks. We migrated to AI SDK v5 so our agents could show their work in real time. See it in action.
Ready to stop babysitting your AI coding assistant?
Join the waitlist for BrainGrid and experience AI-powered planning that makes coding agents actually work.
Join the Waitlist