Create, iterate, and test prompts in the UI

Freeplay's UI allows you to iterate, evaluate, and test prompts without altering your application

Getting Started from Freeplay's UI

This guide will walk you through creating your first prompt in Freeplay and running ad hoc test cases.


1. Create Your Project

Start with the default project, or create a new project in Freeplay. Projects help you organize prompts, datasets, and evaluations by feature or use case.

To create a project, log into Freeplay, click "Create Project", and give it a descriptive name like "Customer Support" or "Content Assistant". Once created, note your project ID from the project overview; you'll need it for SDK integration later.
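If you plan to wire the project up from code later, a minimal sketch with the Python SDK might look like the following. Treat the constructor arguments and the `prompts.get_formatted` call as assumptions based on the SDK's documented pattern, and the template name as hypothetical; check the SDK reference for exact names.

```python
import os

from freeplay import Freeplay  # pip install freeplay

# Client setup; argument names here are assumptions from the SDK docs.
client = Freeplay(
    freeplay_api_key=os.environ["FREEPLAY_API_KEY"],
    api_base="https://app.freeplay.ai/api",
)

# Fetch a formatted prompt using the project ID noted above.
# "customer-support" is a hypothetical template name.
prompt = client.prompts.get_formatted(
    project_id="your-project-id",
    template_name="customer-support",
    environment="latest",
    variables={"question": "Where is my order?"},
)
```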


2. Create A Prompt Template

A prompt template is made up of a model, prompt messages, input variables, tools, and an optional schema. The prompt playground allows you to iterate on these components and quickly test the results against test cases. It also lets you compare versions side by side and view changes between them.

Launch the Prompt Editor

From your project, click "Create prompt template" to open the prompt playground, or select "Prompt templates" in the main navigation and create a new template to open the prompt editor.

Select Your Model

For Freeplay App users, the model selector at the top of the page provides options like OpenAI (GPT-4, GPT-3.5), Anthropic (Claude), and Google (Gemini). On self-hosted instances, or if you run out of credits, you may need to configure models in the application settings.

Add Messages and Variables

Freeplay prompt templates consist of messages (which are primarily static) and dynamic input variables. Separating variables from prompt text in this manner provides benefits that are unique to Freeplay (see the sketch after this list):

  • Rapidly build datasets by converting live observability sessions into test cases
  • Reference input variables in online and offline evaluations
  • Quickly test prompt performance against various test cases in the playground
  • Run tests against multiple prompt and model changes
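To make the separation concrete, here is a plain-Python sketch (illustrative only, not Freeplay's internal format) of a template whose messages stay static while the variables travel separately; because the inputs are stored on their own, a logged session can become a dataset row just by saving its variables.

```python
# Static template: messages reference variables by name (Mustache syntax).
template_messages = [
    {"role": "system", "content": "You are a helpful support assistant."},
    {"role": "user", "content": "Summarize this ticket: {{ticket_text}}"},
]

# Dynamic inputs: captured per request. Saving this record alone is enough
# to turn a live session into a reusable test case.
test_case = {"ticket_text": "My order #1234 arrived damaged..."}
```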

Message Types

Freeplay provides several message types, each serving a different purpose. Messages also support advanced templating via Mustache syntax.

System Message

Sets the AI's behavior and personality. Think of this as your base instruction that defines how the AI should act throughout the conversation. For example: "You are a helpful AI assistant that provides clear and accurate information."

User Message

Represents the input from your end user. This is where you'll typically use variables (which are specified in mustache syntax) to make your prompt dynamic.

Example user message:
Create a fictitious album name for {{artist}}
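To see how the variable gets filled in at runtime, here is a sketch using chevron, an off-the-shelf Python implementation of Mustache; Freeplay performs the equivalent substitution for you when it formats the prompt.

```python
import chevron  # pip install chevron

template = "Create a fictitious album name for {{artist}}"

# Render the template with a concrete value for the {{artist}} variable.
rendered = chevron.render(template, {"artist": "The Rolling Stones"})
print(rendered)  # -> Create a fictitious album name for The Rolling Stones
```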

Assistant Message

Pre-filled AI responses that serve as few-shot examples. Use these to show the AI the style and format you want in its responses.
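For example, a message list with one pre-filled assistant turn acting as a few-shot example might look like this, shown in generic chat-message form rather than Freeplay's internal format:

```python
messages = [
    {"role": "system", "content": "You name albums in a playful style."},
    # Few-shot example: a user/assistant pair demonstrating the desired format.
    {"role": "user", "content": "Create a fictitious album name for Daft Punk"},
    {"role": "assistant", "content": '"Voltage Dreams"'},
    # The real request, which follows the pattern the example established.
    # The {{artist}} variable is substituted at runtime.
    {"role": "user", "content": "Create a fictitious album name for {{artist}}"},
]
```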

History

A special message type within Freeplay that represents conversation history. This makes it much easier to review and understand the context flowing through your LLM system, especially in multi-turn conversations.
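Conceptually, the history message is a placeholder that gets expanded with the prior turns at call time. A rough sketch of the idea (illustrative only, not Freeplay's implementation):

```python
def build_messages(system_prompt, history, new_user_message):
    """Splice prior conversation turns in where the history slot sits."""
    return (
        [{"role": "system", "content": system_prompt}]
        + history  # alternating user/assistant dicts from earlier turns
        + [{"role": "user", "content": new_user_message}]
    )
```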

Configure Advanced Settings

Beyond messages, you can fine-tune your prompt's behavior. Set temperature, max tokens, and other model parameters as needed. You can also add tools for function calling or enable structured outputs to ensure consistent response formats.
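As an illustration, a configuration with sampling parameters and a structured-output schema might look like the following; the parameter names follow the common OpenAI-style convention, and Freeplay exposes equivalents in the editor.

```python
model_config = {
    "model": "gpt-4",
    "temperature": 0.7,  # higher values produce more varied output
    "max_tokens": 512,   # cap on response length
    # Structured outputs: constrain responses to a JSON schema so every
    # reply parses the same way downstream.
    "response_format": {
        "type": "json_schema",
        "json_schema": {
            "name": "album_idea",
            "schema": {
                "type": "object",
                "properties": {
                    "album_name": {"type": "string"},
                    "genre": {"type": "string"},
                },
                "required": ["album_name", "genre"],
            },
        },
    },
}
```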

Test in the Playground

Before saving anything, test your prompt right in the playground by entering values for your variables in the left column and clicking "Run". Each of these test runs can be saved to create your first dataset. This tight feedback loop lets you iterate quickly and refine your prompt until it feels right.

Save Your Prompt

Once you're satisfied with your initial version, click "Save" and name your prompt template. Add a version name and description to help your team understand what changed—this becomes especially valuable as you iterate over time.

From here, if you wish to continue in the UI, most teams jump straight into creating a dataset, building an evaluation, and then running a test to measure performance.

Alternatively, you can set up observability to understand how your application is performing for users and create datasets from actual user and agent sessions.