Record LLM interactions to the Freeplay server for observability and evaluation. All methods associated with recordings are accessible via the client.recordings namespace.
Log your LLM interactions with Freeplay. This assumes you have already retrieved a formatted prompt and made an LLM call, as demonstrated in the Prompts Section.
from freeplay import Freeplay, RecordPayload, CallInfo, SessionInfo, UsageTokens
import time

# assumes fp_client, openai_client, and project_id are already configured,
# as demonstrated in the Prompts Section

## PROMPT FETCH
# set the prompt variables
prompt_vars = {"keyA": "valueA"}

# get a formatted prompt
formatted_prompt = fp_client.prompts.get_formatted(
    project_id=project_id,
    template_name="template_name",
    environment="latest",
    variables=prompt_vars
)

## LLM CALL
# make an LLM call to your provider of choice
start = time.time()
chat_response = openai_client.chat.completions.create(
    model=formatted_prompt.prompt_info.model,
    messages=formatted_prompt.llm_prompt,
    **formatted_prompt.prompt_info.model_parameters
)
end = time.time()

# add the response to your message set
all_messages = formatted_prompt.all_messages({
    'role': chat_response.choices[0].message.role,
    'content': chat_response.choices[0].message.content
})

## RECORD
# create a session
session = fp_client.sessions.create()

# build the record payload
payload = RecordPayload(
    project_id=project_id,
    all_messages=all_messages,
    inputs=prompt_vars,
    session_info=session.session_info,
    prompt_version_info=formatted_prompt.prompt_info,
    call_info=CallInfo.from_prompt_info(
        formatted_prompt.prompt_info,
        start_time=start,
        end_time=end,
        usage=UsageTokens(
            prompt_tokens=chat_response.usage.prompt_tokens,
            completion_tokens=chat_response.usage.completion_tokens
        )
    )
)

# record the LLM interaction
fp_client.recordings.create(payload)
Some LLM providers offer a batch method of generating completions. If you are using the Batch API, you can log your results to Freeplay with api_style="batch". This parameter is needed to calculate accurate costs for batch API usage, which are often significantly lower than for regular completions. The general flow for logging batch data looks like this:
1. Create a batch file with Freeplay completion tracking. For each input, format your prompt using Freeplay's template, then create a completion record with api_style='batch' in the CallInfo. Store the returned completion_id so you can update the completion in Freeplay once the batch request has completed.
2. Submit the batch to OpenAI and poll for completion. Upload your batch file to OpenAI, create the batch request, and poll until the batch status is "completed".
3. Update Freeplay with the results. Once complete, read the batch output file and update each Freeplay completion, using the completion_id from the response to match it back to the original request.
Here is a full code example for reference. Note: Only use the batch api_style if you are calling a batch LLM API. Setting this parameter incorrectly may result in incorrect costs displaying in Freeplay!
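To make step 1 concrete, here is a minimal sketch of recording a batch completion up front. It reuses fp_client, project_id, prompt_vars, formatted_prompt, and session from the first example, and it assumes CallInfo.from_prompt_info accepts the api_style parameter described above; the placeholder message and timing values are illustrative only.

# Sketch of step 1: record each batch input with api_style="batch"
# so Freeplay applies batch-rate pricing. Assumes fp_client, project_id,
# prompt_vars, formatted_prompt, and session exist as in the first example.
import time

submit_time = time.time()
record_response = fp_client.recordings.create(
    RecordPayload(
        project_id=project_id,
        all_messages=formatted_prompt.all_messages(
            {'role': 'assistant', 'content': ''}  # placeholder until the batch completes
        ),
        inputs=prompt_vars,
        session_info=session.session_info,
        prompt_version_info=formatted_prompt.prompt_info,
        call_info=CallInfo.from_prompt_info(
            formatted_prompt.prompt_info,
            start_time=submit_time,
            end_time=submit_time,
            api_style="batch"  # assumption: parameter name as described above
        )
    )
)

# keep the completion_id so the completion can be updated with the
# batch output once the provider reports the batch as "completed"
batch_completion_id = record_response.completion_id

Once the batch output file is available, match each line back to its stored completion_id and update the completion, as shown in the update example later in this section.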
You can record tool calls and their associated schemas for both OpenAI and Anthropic. These recorded completions and tool schemas can be viewed in the observability tab.
When using function calling or tool use, pass the tool_schema parameter in your RecordPayload. This should be a list of the tool/function definitions that were available to the model. The schema can be retrieved from formatted_prompt.tool_schema if defined in your prompt template, or you can pass your own tool definitions in the same format you use to pass them to the model.
The example below shows the default approach: tool calls are recorded as part of the completion messages. When you call formatted_prompt.all_messages(), the LLM’s tool call output and subsequent tool results are concatenated into the message history alongside other messages.
For more granular tool observability, you can also create explicit tool spans using traces with kind='tool'. This renders tool arguments and results as separate spans in the trace view, which is useful for debugging complex agent workflows. See Tool Calls for both approaches.
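As an illustration only, here is a rough sketch of that trace-based approach. The create_trace arguments shown (kind, input) and the record_output call are assumptions based on the description above, so refer to the Tool Calls guide for the exact API.

# Illustrative sketch only: an explicit tool span via a trace with kind='tool'.
# The argument names below are assumptions; see the Tool Calls guide.
import json

session = fp_client.sessions.create()

# open a tool span, recording the tool call arguments as its input
tool_trace = session.create_trace(
    input=json.dumps({"query": "latest AI news"}),  # the tool call arguments
    kind="tool"
)

# ... run your tool here ...
tool_result = {"articles": ["..."]}

# close the span by recording the tool's result as the trace output
tool_trace.record_output(project_id, json.dumps(tool_result))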
from freeplay import Freeplay, RecordPayload, CallInfo, UsageTokens
from openai import OpenAI
import time

## FETCH PROMPT
# get your formatted prompt from freeplay, including any associated tool schemas
question = "What is the latest AI news?"
prompt_vars = {"question": question}
formatted_prompt = fp_client.prompts.get_formatted(
    project_id=project_id,
    template_name="NewsSummarizerFuncEnabled",
    variables=prompt_vars,
    environment="latest"
)

## LLM CALL
# make your llm call with your tool schemas passed in
start = time.time()
openai_client = OpenAI(api_key=openai_key)
chat_response = openai_client.chat.completions.create(
    model=formatted_prompt.prompt_info.model,
    messages=formatted_prompt.llm_prompt,
    **formatted_prompt.prompt_info.model_parameters,
    tools=formatted_prompt.tool_schema
)
end = time.time()

# create a session
session = fp_client.sessions.create()

# Append the response to the messages
messages = formatted_prompt.all_messages(chat_response.choices[0].message)

# Provide token usage
token_usage = UsageTokens(
    prompt_tokens=chat_response.usage.prompt_tokens,
    completion_tokens=chat_response.usage.completion_tokens
)

# Provide timing information
call_info = CallInfo.from_prompt_info(
    prompt_info=formatted_prompt.prompt_info,
    start_time=start,
    end_time=end,
    usage=token_usage
)

## RECORD
# Record to Freeplay
record_response = fp_client.recordings.create(
    RecordPayload(
        project_id=project_id,
        all_messages=messages,
        session_info=session.session_info,
        inputs=prompt_vars,
        prompt_version_info=formatted_prompt.prompt_info,
        call_info=call_info,
        tool_schema=formatted_prompt.tool_schema  # Optionally record the tool schema as well
    )
)
Freeplay allows you to update a completion after it has been recorded. This can be useful for attaching client-side eval results or additional messages to the completion. To do this, you will need the project_id and the completion_id; the completion_id is returned by recordings.create. The code example below shows this in action:
#############################################################################
# UPDATE THE COMPLETION - Use the record_response to get the completion_id
#############################################################################
from freeplay.resources.recordings import RecordUpdatePayload

# the completion_id comes back on the response from recordings.create
final_completion_id = record_response.completion_id

# Update the completion with client-side eval results
# (accuracy, precision, recall, and f1 are values computed by your own evals)
fp_client.recordings.update(
    RecordUpdatePayload(
        project_id=project_id,
        completion_id=final_completion_id,
        eval_results={
            "accuracy": accuracy,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        },
    )
)
Freeplay allows you to record LLM interactions from any model or provider, including hosts or formats Freeplay doesn't natively support in our application (see that list here). When calling other models, you'll retrieve a Freeplay prompt template in our common format and reformat it as needed for the LLM you want to use. Here is an example of calling Mistral 7B hosted on Baseten.
from freeplay import Freeplay, RecordPayload, CallInfo, UsageTokens
from freeplay.llm_parameters import LLMParameters
import os
import requests
import time

# retrieve your prompt template from freeplay
prompt_vars = {"keyA": "valueA"}
prompt = fp_client.prompts.get(
    project_id=project_id,
    template_name="template_name",
    environment="latest"
)

# bind your variables to the prompt
formatted_prompt = prompt.bind(prompt_vars).format()

# customize the messages for your provider API
# In this case, mistral does not support system messages,
# so we merge the system message into the initial user message
messages = [{
    'role': 'user',
    'content': formatted_prompt.messages[0]['content'] + '\n' + formatted_prompt.messages[1]['content']
}]

# make your LLM call to your custom provider
# call mistral 7b hosted with baseten
start = time.time()
baseten_url = "https://model-xyz.api.baseten.co/production/predict"
headers = {
    "Authorization": "Api-Key " + baseten_key,
}
data = {'messages': messages}
req = requests.post(
    url=baseten_url,
    headers=headers,
    json=data
)
end = time.time()

# add the response to the ongoing list of messages
res_text = req.json()
messages.append({'role': 'assistant', 'content': res_text})

# create a freeplay session
session = fp_client.sessions.create()

# Construct the CallInfo from scratch
call_info = CallInfo(
    provider="mistral",
    model="mistral-7b",
    start_time=start,
    end_time=end,
    model_parameters=LLMParameters(
        {"paramA": "valueA", "paramB": "valueB"}
    ),
    usage=UsageTokens(prompt_tokens=123, completion_tokens=456)
)

# record the LLM interaction with Freeplay
payload = RecordPayload(
    project_id=project_id,
    all_messages=messages,
    inputs=prompt_vars,
    session_info=session.session_info,
    prompt_version_info=prompt.prompt_info,
    call_info=call_info
)
fp_client.recordings.create(payload)
The majority of critical model parameters, like temperature and max_tokens, can be configured in the Freeplay UI. However, if you are using additional parameters, these can still be recorded during the Record call and will be displayed in the UI alongside the Completion.
from freeplay import Freeplay, RecordPayload, CallInfo

# create a session which will create a UID
session = fp_client.sessions.create()

# build call info from scratch to log additional params
# get the base params
start_params = formatted_prompt.prompt_info.model_parameters

# set the additional parameters
additional_params = {"presence_penalty": 0.8, "n": 5}

# combine the two parameter sets
all_params = {**start_params, **additional_params}

# start and end are the timestamps captured around your LLM call,
# as in the earlier examples
call_info = CallInfo(
    provider=formatted_prompt.prompt_info.provider,
    model=formatted_prompt.prompt_info.model,
    start_time=start,
    end_time=end,
    model_parameters=all_params  # pass the full parameter set
)

# record the results
payload = RecordPayload(
    project_id=project_id,
    all_messages=all_messages,
    inputs=prompt_vars,
    session_info=session.session_info,
    prompt_version_info=formatted_prompt.prompt_info,
    call_info=call_info
)
completion_info = fp_client.recordings.create(payload)