Google Agent Observability and Evaluation with Freeplay
See Google’s guide here. We provide an end-to-end integrated workflow for building and optimizing AI agents with Google’s Agent Development Kit (ADK). With ADK and Freeplay, your whole team can easily collaborate to iterate on agent instructions (prompts), experiment with and compare different models and agent changes, run evals both offline and online to measure quality, monitor production, and review data by hand.

Key benefits of connecting Freeplay:
- Simple observability - focused on agents, LLM calls, and tool calls for easy human review
- Online evals/automated scorers - for error detection in production
- Offline evals and experiment comparison - to test changes before deploying
- Prompt management - supports pushing changes straight from the Freeplay playground to code
- Human review workflow - for collaboration on error analysis and data annotation
- Powerful UI - makes it possible for domain experts to collaborate closely with engineers
Getting Started
Below is a guide for getting started with Freeplay and ADK. You can also find a full sample ADK agent repo here.

Create a Freeplay Account
Sign up for a free Freeplay account. After creating an account, you can define the following environment variables:
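The variable names below follow common Freeplay SDK conventions and are an assumption; pull the real values from your Freeplay settings page. GOOGLE_API_KEY is ADK’s standard credential for Gemini models.

```
FREEPLAY_API_KEY=<your Freeplay API key>
FREEPLAY_PROJECT_ID=<your Freeplay project ID>
FREEPLAY_API_BASE=https://<your-subdomain>.freeplay.ai/api
GOOGLE_API_KEY=<your Google AI Studio API key>
```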
Use Freeplay ADK Library

Install the Freeplay ADK library:
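For example, with pip. Here google-adk is Google’s ADK distribution and freeplay is Freeplay’s Python SDK; the exact name of the ADK wrapper package may differ, so check the sample repo linked above for the authoritative dependency list.

```
pip install google-adk freeplay
```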
Observability

Freeplay’s Observability feature gives you a clear view into how your agent is behaving in production. You can dig into individual agent traces to understand each step and diagnose issues.
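Traces flow in as your agent runs through ADK’s normal runner loop with the Freeplay integration wired up. A minimal sketch of that loop, assuming a root_agent already configured with the Freeplay library (the app name, user ID, and my_agent import are placeholders):

```python
import asyncio

from google.adk.runners import Runner
from google.adk.sessions import InMemorySessionService
from google.genai import types

from my_agent import root_agent  # hypothetical module exposing your ADK agent


async def main() -> None:
    # Standard ADK plumbing: a session service plus a runner for the agent.
    session_service = InMemorySessionService()
    runner = Runner(agent=root_agent, app_name="demo_app", session_service=session_service)
    session = await session_service.create_session(app_name="demo_app", user_id="user_1")

    # Each run produces a stream of events; with Freeplay connected, the
    # resulting agent, LLM, and tool spans show up as a trace in the Freeplay UI.
    message = types.Content(role="user", parts=[types.Part(text="Research trending running shoes")])
    async for event in runner.run_async(user_id="user_1", session_id=session.id, new_message=message):
        if event.is_final_response():
            print(event.content.parts[0].text)


asyncio.run(main())
```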

Prompt Management (optional)
Freeplay offers native prompt management, which simplifies versioning and testing different prompt versions. It allows you to experiment with changes to ADK agent instructions in the Freeplay UI, test different models, and push updates straight to your code, similar to a feature flag. To leverage Freeplay’s prompt management capabilities alongside ADK, you’ll want to use the Freeplay ADK agent wrapper. FreeplayLLMAgent extends ADK’s base LlmAgent class, so instead of having to hard-code your prompts as agent instructions, you can version prompts in the Freeplay application.

System Message
This corresponds to the “instructions” section in your code.

Agent Context Variable
Adding the following to the bottom of your system message will create a variable for the ongoing agent context to be passed through:
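For example, Freeplay templates use mustache-style {{…}} variables, so a line like this at the end of your system message creates the slot (the variable name agent_context is illustrative):

```
{{agent_context}}
```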
History Block

Click “new message” and change the role to ‘history’. This will ensure the past messages are passed through when present.
In your code, define the agent with FreeplayLLMAgent:
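A sketch of the wrapper in use, assuming it mirrors LlmAgent’s constructor and accepts the name of the Freeplay prompt template to pull instructions from; the import path and the template_name parameter are assumptions, so treat the sample repo as authoritative:

```python
from google.adk.tools import google_search

# Import path is an assumption; see the Freeplay ADK library for the real one.
from freeplay_adk import FreeplayLLMAgent

# Instead of a hard-coded `instruction`, the agent references a prompt
# template versioned in Freeplay and fetched at invocation time.
social_product_researcher = FreeplayLLMAgent(
    name="social_product_researcher",
    model="gemini-2.0-flash",
    description="Researches trending products across social platforms.",
    template_name="social_product_researcher",  # hypothetical parameter
    tools=[google_search],
)
```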
When social_product_researcher is invoked, the prompt will be retrieved from Freeplay and formatted with the proper input variables.
Evaluation
Freeplay enables you to define, version, and run evaluations from the Freeplay web application. You can define evaluations for any of your prompts or agents by going to Evaluations -> “New evaluation”.
Dataset Management
As you get data flowing into Freeplay, you can use these logs to start building up datasets to test against on a repeated basis. Use production logs to create golden datasets or collections of failure cases that you can use to test against as you make changes.

