Improved
10/17/2025
By Morgan Cox
We're helping you close the data flywheel faster. You can now curate golden dataset samples directly from production data, letting you build test cases with verified results right from real-world usage.
We've also shipped new dataset CRUD endpoints to support CI/CD workflows. Through the API, you can create and update datasets, manage test runs, and specify which evals to run against a test set for targeted testing.
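As a rough illustration of how a CI job might use endpoints like these, here is a minimal Python sketch. Everything specific in it is an assumption made for illustration: the base URL, the `/datasets` and `/test-runs` paths, the payload field names, and the bearer-token header are hypothetical placeholders, not the documented API. The sketch only builds the requests and does not send them.

```python
import json
from urllib.request import Request

# Hypothetical base URL: a placeholder, not the product's real endpoint.
BASE_URL = "https://api.example.com/v1"


def build_request(method, path, api_key, payload=None):
    """Build (but do not send) a JSON request against the hypothetical API."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    return Request(
        url=f"{BASE_URL}{path}",
        data=data,
        method=method,
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
        },
    )


def create_dataset_request(api_key, name, samples):
    """Create a dataset from curated samples (hypothetical path and fields)."""
    return build_request(
        "POST", "/datasets", api_key, {"name": name, "samples": samples}
    )


def update_dataset_request(api_key, dataset_id, samples):
    """Update an existing dataset's samples (hypothetical path and fields)."""
    return build_request(
        "PATCH", f"/datasets/{dataset_id}", api_key, {"samples": samples}
    )


def create_test_run_request(api_key, dataset_id, evals):
    """Start a test run limited to the named evals (hypothetical path and fields)."""
    return build_request(
        "POST", "/test-runs", api_key, {"dataset_id": dataset_id, "evals": evals}
    )
```

In a pipeline step, each request object could be passed to `urllib.request.urlopen` once the real paths and payload shapes from the API reference are substituted in.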
Read the details here.
