Auto-categorization

Auto-categorization in Freeplay provides automated tagging of your application logs to help you make sense of your data.

Understanding Auto-Categorization

Auto-categorization provides teams with an additional layer of intelligence about their AI systems by automatically tagging incoming logs with specified categories. This feature adds valuable context to your production data, helping product and engineering teams understand how their AI applications are being used and where improvements are needed.

Why Use Auto-Categorization

For teams building AI products, understanding usage patterns and identifying trends is essential. Auto-categorization reveals what types of questions users ask, which product areas generate the most activity, and where your system might need attention.

When combined with evaluations, auto-categorization helps pinpoint exactly which types of inputs challenge your system. Tracking categories over time reveals usage trends, helps identify emerging patterns, and provides product teams with actionable insights about feature adoption and user behavior.

Using Auto-Categorization

Auto-categorization works at both the agent and completion level. Start by defining category types that align with your business needs—such as product areas (API/SDK, Observability, Prompt Management) or user intent types (Technical Support, Billing, Product Information).

Creating Effective Categories

To set up a new category, follow these steps:

  1. Name your categorization scheme - Choose a descriptive name for the overall categorization (e.g., "Product Area")
  2. Select relevant references - Choose which variables (input, output, history) the LLM should consider when categorizing
  3. Define individual categories - Add specific categories with clear, distinct descriptions
  4. Configure multi-category options - Decide whether items can be tagged with multiple categories or just one
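As an illustration only, the four steps above map naturally onto a small configuration object. This is a hypothetical sketch; the structure and field names are assumptions, not Freeplay's actual configuration format:

```python
# Hypothetical sketch of a categorization scheme; field names are
# illustrative assumptions, not Freeplay's actual configuration format.
scheme = {
    # 1. Name of the overall categorization scheme
    "name": "Product Area",
    # 2. Variables the LLM should consider when categorizing
    "references": ["input", "output", "history"],
    # 3. Individual categories with clear, distinct descriptions
    "categories": {
        "API/SDK": "Questions about API endpoints, authentication, or SDK usage",
        "Observability": "Questions about monitoring, dashboards, or logs",
        "Prompt Management": "Questions about creating or versioning prompts",
    },
    # 4. Whether a single item may receive multiple category tags
    "multi_select": False,
}
```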

Best Practices

Keep descriptions clear and distinct - Category names are limited to 32 characters and descriptions to 500 characters. Make each category clearly distinguishable:

  • Category: "API/SDK"
  • Description: "Questions about API endpoints, authentication, SDK installation, code integration, or programmatic access to the platform"
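To make the documented limits concrete, here is a small validation helper. It is purely illustrative (Freeplay enforces these limits in its UI), but it captures the 32-character name and 500-character description constraints:

```python
def validate_category(name: str, description: str) -> list[str]:
    """Check a category against the documented limits:
    names up to 32 characters, descriptions up to 500."""
    errors = []
    if len(name) > 32:
        errors.append(f"name is {len(name)} chars (limit 32)")
    if len(description) > 500:
        errors.append(f"description is {len(description)} chars (limit 500)")
    return errors

# The example category above fits comfortably within both limits.
print(validate_category(
    "API/SDK",
    "Questions about API endpoints, authentication, SDK installation, "
    "code integration, or programmatic access to the platform",
))  # → []
```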

Use variable references strategically - While you can reference variables like input or output in your descriptions, these won't be directly injected. Instead, they guide the LLM's attention to relevant parts of the interaction.
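One way to picture this distinction: the description reaches the LLM verbatim, while the actual variable values arrive separately as context, with no template substitution into the description. The sketch below is an assumption about the general shape of such a prompt, not Freeplay's internals:

```python
def build_categorization_prompt(description: str, variables: dict) -> str:
    """Illustrative only: the description is NOT templated. A mention of
    'input' or 'output' stays as literal guidance text, while the real
    variable values are appended as separate context."""
    context = "\n".join(f"{name}: {value}" for name, value in variables.items())
    return (
        f"Category description: {description}\n"
        f"Interaction to categorize:\n{context}"
    )

prompt = build_categorization_prompt(
    "Apply when the input asks about billing or invoices",
    {"input": "How do I download my invoice?", "output": "Go to Settings > Billing."},
)
# The word 'input' in the description remains literal guidance; the
# user's actual message appears separately in the context section.
```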

Using Categories in Practice

Once configured, categories appear in the evaluation panel of completions and traces. The ✨ icon reveals the LLM's reasoning behind each categorization. Teams can approve correct categorizations to record human validation, or manually override incorrect ones.

Observability & Monitoring

The Observability dashboard transforms your categories into actionable intelligence. Monitor category distributions to understand usage patterns, track feature adoption, or identify areas needing attention. The stacked bar chart visualization makes it easy to see category breakdowns over time.

Creating Targeted Datasets and Review Queues

Auto-categorization streamlines the creation of focused subsets for testing and review. Targeted datasets let teams curate specific types of inputs for testing, while category-scoped review queues let product and engineering teams collaborate on reviewing a specific part of the product. Concentrating review effort this way makes it easier to improve the quality of the system.
To create these review queues and datasets, use Observability to filter by the relevant auto-categorization categories, select the matching completions, and add them to a dataset or review queue.
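Conceptually, this filter-and-collect workflow resembles the sketch below, written over hypothetical log records (Freeplay's actual filtering happens in the Observability UI):

```python
# Hypothetical completion logs, each tagged by auto-categorization.
logs = [
    {"id": "c1", "categories": ["API/SDK"], "input": "How do I authenticate?"},
    {"id": "c2", "categories": ["Observability"], "input": "Dashboard is empty"},
    {"id": "c3", "categories": ["API/SDK", "Observability"], "input": "SDK logging"},
]

def filter_by_category(logs, category):
    """Select completions tagged with a given category."""
    return [log for log in logs if category in log["categories"]]

# Build a focused dataset of API/SDK questions for testing or review.
api_subset = filter_by_category(logs, "API/SDK")
print([log["id"] for log in api_subset])  # → ['c1', 'c3']
```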

Implementation Tips

Start with broad categories - Begin with high-level categorizations before creating more granular subcategories. For example:

  • Documentation - "Questions about finding, understanding, or using product documentation and guides"
  • Account Management - "Login issues, password resets, user permissions, or team access concerns"
  • Performance Issues - "Reports of slow response times, timeouts, high latency, or system availability problems"

Collaborate across teams - Product and engineering teams can jointly define categories to ensure they capture both technical and business-relevant insights.