11/14/2025


🎉

We've been busy shipping updates to make testing, observability, and reviews more powerful in Freeplay. From deeper trace visibility to AI-assisted reviews, here's what's new.

Here's the full rundown:

Advanced Traces with Tool & Agent Spans

Granular trace visibility: Freeplay now captures individual tool calls and agent steps as distinct spans with configurable kind and name properties, timestamps, and full input/output bodies.
Better agent debugging: See exactly what's happening at each step of your agent workflows, making it easier to identify bottlenecks and optimize performance.
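To make the span fields concrete, here is a minimal Python sketch of what a tool or agent span could look like as a plain record. It is an illustration only and does not use the Freeplay SDK: the `Span` class, its field names, and the `slowest_span` helper are hypothetical, modeled on the properties listed above (kind, name, timestamps, input and output bodies).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Any, Literal


@dataclass
class Span:
    """Hypothetical record for one step in a trace (not Freeplay's actual schema)."""
    kind: Literal["tool", "agent"]  # assumed set of kinds, for illustration
    name: str
    start_time: datetime
    end_time: datetime
    input: Any
    output: Any

    @property
    def duration(self) -> timedelta:
        # Elapsed wall-clock time for this step.
        return self.end_time - self.start_time


def slowest_span(spans: list[Span]) -> Span:
    """Return the span with the longest duration, a first guess at the bottleneck."""
    return max(spans, key=lambda s: s.duration)


# Example trace: an agent planning step followed by a slower tool call.
t0 = datetime(2025, 11, 14, 12, 0, 0)
trace = [
    Span("agent", "planner", t0, t0 + timedelta(milliseconds=300),
         {"question": "What changed this week?"}, {"plan": ["web_search"]}),
    Span("tool", "web_search", t0 + timedelta(milliseconds=300),
         t0 + timedelta(milliseconds=1500),
         {"query": "release notes"}, {"results": []}),
]

bottleneck = slowest_span(trace)
print(bottleneck.kind, bottleneck.name, bottleneck.duration.total_seconds())
# prints: tool web_search 1.2
```

Because each step carries its own timestamps, ranking spans by duration is one simple way to spot where an agent workflow spends its time.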

New Platform Integrations

Streamlined logging across frameworks. We've launched integrations for:

LangGraph: Native support for LangGraph workflows
Vercel AI SDK: Seamless integration with Vercel's AI toolkit
Google ADK: Direct logging from Google's Agent Development Kit

These integrations make it easier to get your AI applications into Freeplay with minimal setup.

Multimodal Dataset History

Multi-turn conversations with images: Datasets now support multimodal outputs, letting you create test cases with images and other media across multiple conversation turns.
Richer test scenarios: Build comprehensive test suites that reflect real-world multimodal interactions.

Smarter Review Queues

We've supercharged review queues with workflow improvements and AI assistance:

Enhanced workflow controls (video walkthrough)

Auto-assignment of review tasks to team members
Auto-updating status tracking
Automatic completion marking
New keyboard shortcuts for faster reviews

Easier data curation (video walkthrough)

Add to review queues from session view: Click any completion or agent interaction and add it directly to a review queue with assignee and status selection.
Add to datasets with full curation: When adding completions to datasets, you can now update data, add multimodal inputs/outputs, and curate golden answers, all in one flow.

🤖 Review Agent (Beta)

AI-powered review insights: Our Review Agent analyzes your review queues to automatically identify common themes and patterns in your data.
Currently available to select teams: We're testing this with early customers and are excited to roll it out more broadly soon. Interested? Let us know!

Under the Hood

Released freeplay-python SDK v0.5.4 with improved package management and documentation
Enhanced multimodal data handling across the platform
Performance improvements for trace processing

Questions, or want early access to the Review Agent? Reach out. We'd love to hear your feedback!