Getting Started
Integrate Freeplay into your application in less than 10 minutes
Welcome to Freeplay! This guide will help you set up Freeplay and start analyzing your LLM interactions in minutes.
Prerequisites
Before diving in, make sure you have:
1. A Freeplay Account
   - New users: Sign up at app.freeplay.ai
   - Enterprise users: Access your dedicated instance at <subdomain>.freeplay.ai
Have setup questions? Contact us at [email protected] and we'll help you get set up.
Create Your First Project
Projects in Freeplay organize your prompts and LLM interactions. Think of a project as a container for all the prompts that power a specific product or set of features in your application.
Step 1: Navigate to Projects
From your dashboard, click the New Project button in the top right corner.

Step 2: Configure Your Project
- Project Name: Choose something descriptive (e.g., "Customer Support Bot", "Product Description Generator")
- Visibility:
  - Private: Only you and invited team members can access
  - Public: All organization members can view and contribute
Best Practice: Create separate projects per product. For example, keep your "Email Generator" separate from your "Code Review Assistant" for better organization and cleaner analytics.
Integrate Your Project
Install the Freeplay SDK
Freeplay offers native SDKs for Python, Node.js, and Java (for use with any JVM language). Don't see an SDK you need? Please reach out at [email protected].
Install the SDK using the command for your language:
Python:
pip install freeplay
Node.js:
npm install freeplay
Java (Maven):
<!-- Add the Freeplay SDK to your pom.xml -->
<dependency>
    <groupId>ai.freeplay</groupId>
    <artifactId>client</artifactId>
    <version>x.x.xx</version>
</dependency>
3 ways to integrate
There are three ways to get started with Freeplay:
- Freeplay prompt management - set up prompts in the Freeplay UI and download them to your application. (Recommended)
- Manage prompts in code - store prompts in code and sync to Freeplay (Flexible)
- Lightweight observability - record your LLM calls as they are today, without moving your prompts (Fastest)
Freeplay works best when the structure of your prompts is reflected in the platform. Prompts provide the structure for building datasets, writing targeted evaluations, and running experiments. Options 1 and 2 therefore unlock the greatest number of features up front, while option 3 gets data flowing into the system quickly, though you'll need to add prompt structure later to realize the platform's full potential.
Regardless of how you get started, you can decide later how you want to handle prompt management in the longer term.
Here's more detail on each option:
Option 1: Freeplay prompt management
Step 1: Create the prompt template
To create your first prompt template in the UI, go to Prompts in the main menu and click “Create Prompt Template”.
This opens a prompt editor where you can draft your first prompt.
Freeplay leverages mustache syntax to provide a templating structure for writing prompts.
(more details on the components of prompt templates here)
In this case I’ve created a prompt with a single input variable: artist.
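For example, a system message using that variable might read (hypothetical content; the mustache-wrapped variable is the only requirement):

generate an album name for {{artist}}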

[Optional] Step 2: Hook into observability
If you already have an application you can start to hook in Freeplay for prompt management and observability. If not feel free to skip this step!
The basic steps of a Freeplay integration look like this.
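First, initialize the Freeplay client. A minimal sketch, assuming your API key is stored in an environment variable (the constructor parameter names and api_base value shown here may vary by SDK version and account type, so check the SDK reference):

import os
from freeplay import Freeplay

# Initialize the Freeplay client once at application startup
fp_client = Freeplay(
    freeplay_api_key=os.environ["FREEPLAY_API_KEY"],
    api_base="https://app.freeplay.ai/api",
)
project_id = os.environ["FREEPLAY_PROJECT_ID"]  # your project's ID from the Freeplay UI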
Fetch the prompt from Freeplay.
formatted_prompt = fp_client.prompts.get_formatted(
    project_id=project_id,
    template_name="album-bot",
    environment="latest",
    variables={"artist": "Taylor Swift"}
)
Using the prompt as a helpful data object, call your LLM provider directly.
import time

# Capture start/end timestamps for the CallInfo record below
start = time.time()
chat_response = openai_client.chat.completions.create(
    model=formatted_prompt.prompt_info.model,
    messages=formatted_prompt.llm_prompt,
    **formatted_prompt.prompt_info.model_parameters
)
end = time.time()
And finally record the interaction back to Freeplay.
from freeplay import CallInfo, RecordPayload, UsageTokens  # import path may vary by SDK version

# Build the full message history: the formatted prompt plus the model's response
all_messages = formatted_prompt.llm_prompt + [
    {"role": "assistant", "content": chat_response.choices[0].message.content}
]

session = fp_client.sessions.create()
payload = RecordPayload(
    project_id=project_id,
    all_messages=all_messages,
    inputs={"artist": "Taylor Swift"},
    session_info=session,
    prompt_version_info=formatted_prompt.prompt_info,
    call_info=CallInfo.from_prompt_info(
        formatted_prompt.prompt_info,
        start_time=start,
        end_time=end,
        usage=UsageTokens(chat_response.usage.prompt_tokens, chat_response.usage.completion_tokens)
    ),
)
# Record the LLM interaction
fp_client.recordings.create(payload)
From there you’ll be able to see data flowing into Freeplay in the Observability tab.
For full integration details see our SDK documentation.
Option 2: Sync prompts to Freeplay from code
Step 1: Push your prompt to Freeplay programmatically
First, push your prompt to Freeplay with the following API call:
curl -X POST "https://api.freeplay.ai/api/v2/projects/<project-id>/prompt-templates/name/<template-name>/versions" \
  -H "Authorization: Bearer <YOUR_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "template_messages": [
      {
        "role": "system",
        "content": "some content here with mustache {{variable}} syntax"
      }
    ],
    "provider": "openai",
    "model": "gpt-4.1",
    "llm_parameters": {
      "temperature": 0.2,
      "max_tokens": 256
    },
    "version_name": "dev-version",
    "version_description": "Development test version with mustache variable"
  }'
This will create a new prompt template in the system. You can view it by going to the Prompts section in your project.
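For reference, here's the same call made from Python with the requests library (a sketch of the identical request; the project ID, template name, and API key placeholders are yours to fill in):

import requests

# Same endpoint and payload as the curl example above
resp = requests.post(
    "https://api.freeplay.ai/api/v2/projects/<project-id>/prompt-templates/name/<template-name>/versions",
    headers={
        "Authorization": "Bearer <YOUR_API_KEY>",
        "Content-Type": "application/json",
    },
    json={
        "template_messages": [
            {"role": "system", "content": "some content here with mustache {{variable}} syntax"}
        ],
        "provider": "openai",
        "model": "gpt-4.1",
        "llm_parameters": {"temperature": 0.2, "max_tokens": 256},
        "version_name": "dev-version",
        "version_description": "Development test version with mustache variable",
    },
)
resp.raise_for_status()
print(resp.json())  # inspect the API response for the new version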

[Optional] Step 2: Hook into Observability
If you already have an application you can start to hook in Freeplay for prompt management and observability. If not feel free to skip this step!
If you want to switch over to Freeplay for prompt management follow the observability steps above!
If you’d prefer to manage your prompts in code, continue building your prompts as you have been, but add a record call back to Freeplay after each LLM call.
fp_client.recordings.create(
    RecordPayload(
        project_id=project_id,
        all_messages=messages,
        inputs={'keyA': 'valueA'},
        prompt_info=PromptInfo(
            prompt_template_id=template_id,
            prompt_template_version_id=new_version_id
        )
    )
)
Option 3: Lightweight Observability
If you don’t want to move your prompts into Freeplay just yet, you can still start to get data flowing into the system.
Step 1: Hook into Observability
Leave your LLM application just as it is, but after each LLM interaction make a record call to Freeplay.
# Continue with your application as is
prompt = [{"role": "system", "content": "generate an album name for Taylor Swift"}]
chat_response = openai_client.chat.completions.create(
    model="gpt-4.1",
    messages=prompt
)

# Collect the full message history: the prompt plus the model's response
messages = prompt + [{"role": "assistant", "content": chat_response.choices[0].message.content}]

# Add a record call to Freeplay
fp_client.recordings.create(
    RecordPayload(
        project_id="<your-project-id>",
        all_messages=messages,
        inputs={'artist': 'Taylor Swift'},  # Optional
        call_info=CallInfo(provider='openai', model='gpt-4.1')  # Optional
    )
)
The minimum you need to record to Freeplay is the messages and a project ID. However, the more information you provide, the more useful the observability data will be (see the full set of options here). In this case we’ve added two additional fields:
- Inputs: specifying which parts of the prompt are dynamic inputs adds structure for the next step
- Call Info: specifying the model and the provider
Go to the Observability tab in the Freeplay UI to view your recorded data!
[Optional] Step 2: Convert the logged data to a prompt template
While this step is technically optional, it is necessary to set up the proper structure for the next section: running your first test.
Find one of your logs in the Observability tab and open it.
You’ll see a message in the UI to convert the session to a prompt template in Freeplay.

Click “Save to template” to open the prompt editor with the logged completion loaded in. We can even re-run it from right here in the editor.

We now want to turn the messages into a templated prompt. In this case, Taylor Swift is a dynamic value that will change each time we invoke the prompt, so we’ll replace the hard-coded text with a dynamic variable called artist.

Now, instead of the prompt being hard-coded to Taylor Swift, we’ve turned it into a template that can take any artist (like Justin Bieber). Once you’ve created the prompt template, hit Save; then update your observability record calls to link to that prompt template ID, as shown below.
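Concretely, the conversion swaps the hard-coded name for a mustache variable:

# Before: hard-coded to one artist
{"role": "system", "content": "generate an album name for Taylor Swift"}
# After: templated, so any artist can be passed in at request time
{"role": "system", "content": "generate an album name for {{artist}}"}

The updated record call then references the template's IDs: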
fp_client.recordings.create(
    RecordPayload(
        project_id=project_id,
        all_messages=messages,
        inputs={'artist': 'Taylor Swift'},  # Optional
        call_info=CallInfo(provider='openai', model='gpt-4.1'),  # Optional
        prompt_info=PromptInfo(
            prompt_template_id='9b3441d8-f07d-4f0c-cc68-f04b21f92496',
            prompt_template_version_id='8a3441d8-f07d-4f0c-b72d-f04b21f92496'
        )
    )
)
Running Your First Test
The power of Freeplay really shines when you have a repeatable iteration loop. Tests are a key part of that loop. Let’s set up our first test.
Step 1: Create a Dataset
Datasets in Freeplay give you a repeatable collection of inputs to test prompt changes against.
To create a new dataset navigate to the Datasets tab and click “Create Dataset” in the top right.

You’ll give your dataset a name and description, as well as decide which prompt(s) you want this dataset to be compatible with.
Once you’ve created a dataset, there are a number of ways to populate it with examples, including:
- Save from observability
- File upload
- Programmatic upload
- Create in the UI
Step 2: Create an Eval
Evaluations in Freeplay give you a mechanism to score the quality of your LLM systems. Freeplay’s evaluation offering is extensive and flexible (see this guide for full details).
For now, we’ll create a single LLM-as-Judge evaluator by going to the Evaluations tab and clicking “New Evaluation” in the top right.
Select a target and the type of evaluation you want to create.

Give your evaluation a name and description and decide what scoring scale you want to use.

Freeplay’s agent will draft an evaluation for you, but it’s fully customizable from here. Once you’re happy with it hit Save.

Step 3: Run a Test
Test Runs in Freeplay give you a repeatable way to quantify changes to your LLM system. Tests can be executed via the UI or via the SDK.
To run a test from the UI, go to Tests and click “New Test”. Select the prompt version and dataset you want to test.

The test will break down prompt performance by cost, latency, and evaluation scores.

Tests become even more powerful when comparing multiple versions of a prompt. To add a comparison, click “Add Comparison”. Use this feature to compare how different models or prompt updates perform on your test cases.

For any test you can also dive into the row level details by navigating over to the Test Cases tab within the test.

Now you have the insight to make quantifiable deployment decisions in a repeatable way!