- Create datasets - Datasets allow you to repeatedly test against the same inputs and compare results against both a ground-truth expected output and evaluation scores
- Create evaluations - Evaluations allow you to score and label records manually (human labels) or automatically (auto-categorization or LLM-as-judge evaluations)
- Run a test - Tests leverage your datasets and evaluations to compare changes to prompts and model settings (a minimal sketch of this loop follows the list).
- Deploy - When you make a change to a prompt template in Freeplay and deploy it, observability sessions allow you to track that version's performance and even compare it against other versions (a sketch of version-tagged recording follows the list).
- Review - Review sessions allow you to compare the performance of different prompt template versions and identify areas for improvement.
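
To make the dataset → evaluation → test loop concrete, here is a minimal local sketch of what a test run does conceptually: iterate over dataset rows, generate an output for each input with a candidate prompt/model, score the output against the ground-truth expected output, and aggregate the scores so two versions can be compared. All names here (`DatasetRow`, `run_test`, `exact_match_eval`, the toy dataset, and the fake model calls) are hypothetical illustrations, not Freeplay's SDK; Freeplay runs this workflow for you.

```python
# Conceptual sketch of a test run: dataset rows + an evaluation -> a comparable score.
from dataclasses import dataclass
from typing import Callable


@dataclass
class DatasetRow:
    inputs: dict          # variables fed into the prompt template
    expected_output: str  # ground-truth answer to compare against


def exact_match_eval(output: str, expected: str) -> float:
    """Trivial stand-in for an evaluation (human label or LLM-as-judge)."""
    return 1.0 if output.strip().lower() == expected.strip().lower() else 0.0


def run_test(
    rows: list[DatasetRow],
    call_model: Callable[[dict], str],
    evaluate: Callable[[str, str], float] = exact_match_eval,
) -> float:
    """Run every dataset row through the candidate prompt/model and average the scores."""
    scores = [evaluate(call_model(row.inputs), row.expected_output) for row in rows]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    dataset = [
        DatasetRow(inputs={"question": "2 + 2?"}, expected_output="4"),
        DatasetRow(inputs={"question": "Capital of France?"}, expected_output="Paris"),
    ]

    # Two fake "prompt versions" standing in for real LLM calls.
    v1 = lambda inputs: "4" if "2 + 2" in inputs["question"] else "Lyon"
    v2 = lambda inputs: "4" if "2 + 2" in inputs["question"] else "Paris"

    print("v1 score:", run_test(dataset, v1))  # 0.5
    print("v2 score:", run_test(dataset, v2))  # 1.0 -> v2 wins this comparison
```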

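For the deploy step, the key idea is that each production completion is recorded against the prompt-template version that produced it, which is what makes version-to-version comparison possible later. The sketch below shows that pattern with a generic HTTP call; the endpoint URL, payload fields, and credential are assumptions for illustration, not Freeplay's actual recording API — consult the Freeplay SDK/API docs for the real interface.

```python
# Sketch: record a completion tagged with its template version so versions can be compared.
import requests

API_KEY = "fp-..."                                          # hypothetical credential
RECORDINGS_URL = "https://api.example.com/v1/recordings"    # placeholder endpoint


def record_completion(session_id: str, template_version: str, inputs: dict, output: str) -> None:
    """Send one completion, tagged with its prompt-template version, to an observability endpoint."""
    payload = {
        "session_id": session_id,
        "prompt_template_version": template_version,  # enables later version comparison
        "inputs": inputs,
        "output": output,
    }
    resp = requests.post(
        RECORDINGS_URL,
        json=payload,
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=10,
    )
    resp.raise_for_status()


# Example: log a single completion produced by the newly deployed template version.
# record_completion("session-123", "v7", {"question": "Capital of France?"}, "Paris")
```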
