Record LLM interactions to the Freeplay server for observability and evaluation. All methods associated with recordings are accessible via the client.recordings namespace.
Log your LLM interactions with Freeplay. This assumes you have already retrieved a formatted prompt and made an LLM call, as demonstrated in the Prompts Section.
from freeplay import Freeplay, RecordPayload, CallInfo, SessionInfo, UsageTokens
import time

# assumes fp_client, openai_client, and project_id are already configured,
# as demonstrated in the Prompts Section

## PROMPT FETCH
# set the prompt variables
prompt_vars = {"keyA": "valueA"}

# get a formatted prompt
formatted_prompt = fp_client.prompts.get_formatted(
    project_id=project_id,
    template_name="template_name",
    environment="latest",
    variables=prompt_vars
)

## LLM CALL
# make an LLM call to your provider of choice
start = time.time()
chat_response = openai_client.chat.completions.create(
    model=formatted_prompt.prompt_info.model,
    messages=formatted_prompt.llm_prompt,
    **formatted_prompt.prompt_info.model_parameters
)
end = time.time()

# add the response to your message set
all_messages = formatted_prompt.all_messages({
    'role': chat_response.choices[0].message.role,
    'content': chat_response.choices[0].message.content
})

## RECORD
# create a session
session = fp_client.sessions.create()

# build the record payload
payload = RecordPayload(
    project_id=project_id,
    all_messages=all_messages,
    inputs=prompt_vars,
    session_info=session.session_info,
    prompt_version_info=formatted_prompt.prompt_info,
    call_info=CallInfo.from_prompt_info(
        formatted_prompt.prompt_info,
        start_time=start,
        end_time=end,
        usage=UsageTokens(
            prompt_tokens=chat_response.usage.prompt_tokens,
            completion_tokens=chat_response.usage.completion_tokens
        )
    )
)

# record the LLM interaction
fp_client.recordings.create(payload)
Some LLM providers offer a batch method of generating completions. If you are using the Batch API, you can log your results to Freeplay with api_style="batch". This parameter is needed to calculate accurate costs for batch API usage, which are often significantly lower than for regular completions. The general flow for logging batch data looks like this:
1. Create a batch file with Freeplay completion tracking. For each input, format your prompt using Freeplay's template, then create a completion record with api_style='batch' in the CallInfo. Store the returned completion_id so you can update the completion in Freeplay once the batch request has completed.
2. Submit the batch to OpenAI and poll for completion. Upload your batch file to OpenAI, create the batch request, and poll until the batch status is "completed".
3. Update Freeplay with the results. Once complete, read the batch output file and update each Freeplay completion, using the completion_id from the response to match it back to the original request.
Here is a full code example for reference. Note: Only use the batch api_style if you are calling a batch LLM API. Setting this parameter incorrectly may result in incorrect costs displaying in Freeplay!
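To make step 1 concrete, here is a minimal sketch of recording a batch completion up front. It reuses fp_client, project_id, prompt_vars, formatted_prompt, and session from the first example, and it assumes CallInfo.from_prompt_info accepts the api_style parameter described above; the placeholder message and timing values are illustrative only.

# Sketch of step 1: record each batch input with api_style="batch"
# so Freeplay applies batch-rate pricing. Assumes fp_client, project_id,
# prompt_vars, formatted_prompt, and session exist as in the first example.
import time

submit_time = time.time()
record_response = fp_client.recordings.create(
    RecordPayload(
        project_id=project_id,
        all_messages=formatted_prompt.all_messages(
            {'role': 'assistant', 'content': ''}  # placeholder until the batch completes
        ),
        inputs=prompt_vars,
        session_info=session.session_info,
        prompt_version_info=formatted_prompt.prompt_info,
        call_info=CallInfo.from_prompt_info(
            formatted_prompt.prompt_info,
            start_time=submit_time,
            end_time=submit_time,
            api_style="batch"  # assumption: parameter name as described above
        )
    )
)

# keep the completion_id so the completion can be updated with the
# batch output once the provider reports the batch as "completed"
batch_completion_id = record_response.completion_id

Once the batch output file is available, match each line back to its stored completion_id and update the completion, as shown in the update example later in this section.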
You can record tool calls and their associated schemas for both OpenAI and Anthropic. These recorded completions and tool schemas can be viewed in the observability tab.
When using function calling or tool use, pass the tool_schema parameter in your RecordPayload. This should be a list of the tool/function definitions that were available to the model. The schema can be retrieved from formatted_prompt.tool_schema if defined in your prompt template, or you can pass your own tool definitions in the same format you use to pass them to the model.
The example below shows the default approach: tool calls are recorded as part of the completion messages. When you call formatted_prompt.all_messages(), the LLM’s tool call output and subsequent tool results are concatenated into the message history alongside other messages.
For more granular tool observability, you can also create explicit tool spans using traces with kind='tool'. This renders tool arguments and results as separate spans in the trace view, which is useful for debugging complex agent workflows. See Tool Calls for both approaches.
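As an illustration only, here is a rough sketch of that trace-based approach. The create_trace arguments shown (kind, input) and the record_output call are assumptions based on the description above, so refer to the Tool Calls guide for the exact API.

# Illustrative sketch only: an explicit tool span via a trace with kind='tool'.
# The argument names below are assumptions; see the Tool Calls guide.
import json

session = fp_client.sessions.create()

# open a tool span, recording the tool call arguments as its input
tool_trace = session.create_trace(
    input=json.dumps({"query": "latest AI news"}),  # the tool call arguments
    kind="tool"
)

# ... run your tool here ...
tool_result = {"articles": ["..."]}

# close the span by recording the tool's result as the trace output
tool_trace.record_output(project_id, json.dumps(tool_result))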
from freeplay import Freeplay, RecordPayload, CallInfo, UsageTokens
from openai import OpenAI
import time

## FETCH PROMPT
# get your formatted prompt from freeplay, including any associated tool schemas
question = "What is the latest AI news?"
prompt_vars = {"question": question}
formatted_prompt = fp_client.prompts.get_formatted(
    project_id=project_id,
    template_name="NewsSummarizerFuncEnabled",
    variables=prompt_vars,
    environment="latest"
)

## LLM CALL
# make your llm call with your tool schemas passed in
start = time.time()
openai_client = OpenAI(api_key=openai_key)
chat_response = openai_client.chat.completions.create(
    model=formatted_prompt.prompt_info.model,
    messages=formatted_prompt.llm_prompt,
    **formatted_prompt.prompt_info.model_parameters,
    tools=formatted_prompt.tool_schema
)
end = time.time()

# create a session
session = fp_client.sessions.create()

# Append the response to the messages
messages = formatted_prompt.all_messages(chat_response.choices[0].message)

# Provide token usage
token_usage = UsageTokens(
    prompt_tokens=chat_response.usage.prompt_tokens,
    completion_tokens=chat_response.usage.completion_tokens
)

# Provide timing information
call_info = CallInfo.from_prompt_info(
    prompt_info=formatted_prompt.prompt_info,
    start_time=start,
    end_time=end,
    usage=token_usage
)

## RECORD
# Record to Freeplay
record_response = fp_client.recordings.create(
    RecordPayload(
        project_id=project_id,
        all_messages=messages,
        session_info=session.session_info,
        inputs=prompt_vars,
        prompt_version_info=formatted_prompt.prompt_info,
        call_info=call_info,
        tool_schema=formatted_prompt.tool_schema  # Optionally record the tool schema as well
    )
)
Freeplay allows you to update a completion after it has been recorded. This can be useful for attaching client-side eval results or additional messages to the completion. To do this, you will need the project_id and the completion_id; the completion_id is returned by recordings.create. The code example below shows this in action:
#############################################################################
# UPDATE THE COMPLETION - Use the record_response to get the completion_id
#############################################################################
from freeplay.resources.recordings import RecordUpdatePayload

# the completion_id comes back on the response from recordings.create
final_completion_id = record_response.completion_id

# Update the completion with client-side eval results
# (accuracy, precision, recall, and f1 are values computed by your own evals)
fp_client.recordings.update(
    RecordUpdatePayload(
        project_id=project_id,
        completion_id=final_completion_id,
        eval_results={
            "accuracy": accuracy,
            "precision": precision,
            "recall": recall,
            "f1": f1,
        },
    )
)
Freeplay allows you to record LLM interactions from any model or provider, including hosts or formats Freeplay doesn't natively support in our application (see that list here). When calling other models, you'll retrieve a Freeplay prompt template in our common format and reformat it as needed for the LLM you want to use. Here is an example of calling Mistral 7B hosted on Baseten.
from freeplay import Freeplay, RecordPayload, CallInfo, UsageTokens
from freeplay.llm_parameters import LLMParameters
import os
import requests
import time

# retrieve your prompt template from freeplay
prompt_vars = {"keyA": "valueA"}
prompt = fp_client.prompts.get(
    project_id=project_id,
    template_name="template_name",
    environment="latest"
)

# bind your variables to the prompt
formatted_prompt = prompt.bind(prompt_vars).format()

# customize the messages for your provider API
# In this case, mistral does not support system messages,
# so we merge the system message into the initial user message
messages = [{
    'role': 'user',
    'content': formatted_prompt.messages[0]['content'] + '\n' + formatted_prompt.messages[1]['content']
}]

# make your LLM call to your custom provider
# call mistral 7b hosted with baseten
start = time.time()
baseten_url = "https://model-xyz.api.baseten.co/production/predict"
headers = {
    "Authorization": "Api-Key " + baseten_key,
}
data = {'messages': messages}
req = requests.post(
    url=baseten_url,
    headers=headers,
    json=data
)
end = time.time()

# add the response to the ongoing list of messages
res_text = req.json()
messages.append({'role': 'assistant', 'content': res_text})

# create a freeplay session
session = fp_client.sessions.create()

# Construct the CallInfo from scratch
call_info = CallInfo(
    provider="mistral",
    model="mistral-7b",
    start_time=start,
    end_time=end,
    model_parameters=LLMParameters(
        {"paramA": "valueA", "paramB": "valueB"}
    ),
    usage=UsageTokens(prompt_tokens=123, completion_tokens=456)
)

# record the LLM interaction with Freeplay
payload = RecordPayload(
    project_id=project_id,
    all_messages=messages,
    inputs=prompt_vars,
    session_info=session.session_info,
    prompt_version_info=prompt.prompt_info,
    call_info=call_info
)
fp_client.recordings.create(payload)
The majority of critical model parameters, like temperature and max_tokens, can be configured in the Freeplay UI. However, if you are using additional parameters, these can still be recorded during the Record call and will be displayed in the UI alongside the Completion.
from freeplay import Freeplay, RecordPayload, CallInfo

# create a session which will create a UID
session = fp_client.sessions.create()

# build call info from scratch to log additional params
# get the base params
start_params = formatted_prompt.prompt_info.model_parameters

# set the additional parameters
additional_params = {"presence_penalty": 0.8, "n": 5}

# combine the two parameter sets
all_params = {**start_params, **additional_params}

# start and end are the timestamps captured around your LLM call,
# as in the earlier examples
call_info = CallInfo(
    provider=formatted_prompt.prompt_info.provider,
    model=formatted_prompt.prompt_info.model,
    start_time=start,
    end_time=end,
    model_parameters=all_params  # pass the full parameter set
)

# record the results
payload = RecordPayload(
    project_id=project_id,
    all_messages=all_messages,
    inputs=prompt_vars,
    session_info=session.session_info,
    prompt_version_info=formatted_prompt.prompt_info,
    call_info=call_info
)
completion_info = fp_client.recordings.create(payload)