  • AI Insights: Surface patterns and root causes from your evaluation data, human reviews, and test runs
  • Model-Graded Evaluations: Score individual completions and traces using LLMs to evaluate your AI outputs at scale
  • Eval Creation Assistant: Create better evaluation criteria with AI-powered suggestions and prompt drafts for LLM judges
  • Auto-Categorization: Classify logs to reveal usage patterns and understand how users interact with your AI
  • Prompt Optimization: Get AI-generated suggestions for improved prompts based on your production data

Overview

All AI features in Freeplay work by calling LLM APIs to analyze your data. They are designed to work across different models and to use the API keys and model preferences configured in your account settings.

Managing AI feature settings

Disabling specific features

Individual AI features can be controlled through their respective configuration:
  • Model-graded evaluations: Disable per evaluation by turning it off or setting its sample rate to zero
  • Eval Creation Assistant: An on-demand feature that only runs when you create evals
  • Auto-categorization: Disable per auto-category by turning it off or setting its sample rate to zero
  • Prompt optimization: An on-demand feature that only runs when you trigger it
  • Review Insights: Runs automatically during review; disable via the Insights toggle in Project Settings > AI Features
  • Evaluation Insights: Runs weekly; disable via the Insights toggle in Project Settings > AI Features
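To make the toggle and sample-rate behavior above concrete, here is a minimal sketch (not Freeplay's actual implementation) of how a sampled feature decides whether to run on a given log. The function name and signature are illustrative assumptions:

```python
import random

def should_run(enabled: bool, sample_rate: float, rng=random.random) -> bool:
    """Decide whether a sampled AI feature runs on a single log.

    Illustrative only. A sample_rate of 0.0 disables the feature
    entirely, matching the "turn off or set sample rate to zero"
    behavior described above.
    """
    if not enabled or sample_rate <= 0.0:
        return False
    # Sample a fraction of logs; rng is injectable for testing.
    return rng() < sample_rate

# A disabled feature, or a zero sample rate, never runs:
assert should_run(False, 1.0) is False
assert should_run(True, 0.0) is False
```

On-demand features (the Eval Creation Assistant and prompt optimization) skip sampling entirely and run only when explicitly triggered.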

Cost considerations

AI features consume tokens from the selected LLM provider. Costs depend on:
  • Which features you use and how frequently
  • The volume of data being analyzed
  • The models being used (more capable models typically cost more)
When Freeplay Keys are enabled, Freeplay covers the cost of AI feature usage. When using your own API keys, costs are billed directly to your provider account.

Token usage for AI features is tracked separately from your application’s LLM usage and is visible in the Usage dashboard. If you’re using your own API keys, monitor this usage and consider lowering feature sample rates if costs are higher than expected.
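The cost factors above combine multiplicatively, so a rough estimate is easy to sketch. This is a back-of-the-envelope calculation, not a Freeplay billing formula; all numbers (log volume, tokens per call, provider pricing) are hypothetical placeholders:

```python
def estimate_monthly_cost(
    logs_per_month: int,
    sample_rate: float,            # fraction of logs the feature analyzes
    tokens_per_call: int,          # prompt + completion tokens per analysis call
    cost_per_million_tokens: float # provider's blended price, in dollars
) -> float:
    """Rough monthly cost of one sampled AI feature (illustrative only)."""
    calls = logs_per_month * sample_rate
    total_tokens = calls * tokens_per_call
    return total_tokens / 1_000_000 * cost_per_million_tokens

# e.g. 100k logs/month, 10% sampling, ~2k tokens per call, $5 per 1M tokens:
cost = estimate_monthly_cost(100_000, 0.10, 2_000, 5.00)
print(f"${cost:.2f}")  # $100.00
```

Because cost scales linearly with the sample rate, halving a feature's sample rate halves its token spend, which is why adjusting sample rates is the first lever to pull when costs run high.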