API Reference

  • Benchmark: Data abstractions for packaging annotated samples into reusable benchmarks and CSV-backed suites (Benchmark, Evaluator, and Critico are pictured together in the evaluation sketch after this list).
  • Critico: Orchestrator that aggregates evaluator results and reports back to the RELAI platform.
  • Evaluator: Base classes and built-in evaluators for rubric, format, style, and annotation scoring.
  • Maestro: Optimization engine that tunes agent configurations and structure based on evaluation feedback (see the tuning-loop sketch below).
  • Mockers: Persona and mock tool definitions for simulating MCP (Model Context Protocol) interactions during tests.
  • Simulator: Decorators and simulator runtimes that replay agent flows in controlled environments (Mockers and Simulator appear together in the simulation sketch below).
  • Types: Core data models (RELAISample, SimulationTape, logs) shared across simulation and evaluation.
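
To make the relationship between Benchmark, Evaluator, and Critico concrete, here is a minimal self-contained sketch of the evaluation flow. Every name in it (`load_benchmark`, `ExactMatchEvaluator`, `aggregate`, `run_suite`, the CSV column names, and the toy `RELAISample` fields) is a hypothetical stand-in for illustration; consult the module pages for the actual classes and signatures.

```python
# Illustrative stand-ins only; the real relai classes and signatures may differ.
import csv
from dataclasses import dataclass


@dataclass
class RELAISample:
    """Toy version of the core sample model from Types."""
    prompt: str
    expected: str


def load_benchmark(path: str) -> list[RELAISample]:
    """Hypothetical CSV-backed suite loader in the spirit of Benchmark."""
    with open(path, newline="") as f:
        return [RELAISample(row["prompt"], row["expected"]) for row in csv.DictReader(f)]


class ExactMatchEvaluator:
    """Toy scorer standing in for the built-in rubric/format/style evaluators."""
    def score(self, sample: RELAISample, output: str) -> float:
        return 1.0 if output.strip() == sample.expected.strip() else 0.0


def aggregate(scores: list[float]) -> float:
    """Critico-style aggregation reduced to a plain average."""
    return sum(scores) / len(scores) if scores else 0.0


def run_suite(agent, samples: list[RELAISample], evaluator) -> float:
    """Wire the pieces together: run an agent over the suite, score, aggregate."""
    return aggregate([evaluator.score(s, agent(s.prompt)) for s in samples])


if __name__ == "__main__":
    samples = [RELAISample("2+2?", "4")]
    print(run_suite(lambda p: "4", samples, ExactMatchEvaluator()))  # -> 1.0
```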
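The Mockers/Simulator pairing can be pictured as a decorator that swaps live MCP tool calls for canned mock tools, so agent flows replay deterministically with no network access. Again, `simulated`, `MOCK_TOOLS`, and the agent signature below are hypothetical illustrations, not the package's actual API.

```python
# Illustrative stand-ins only; the real decorator and mock registry may differ.
import functools
from typing import Callable

# Mockers-style registry: tool name -> canned implementation.
MOCK_TOOLS: dict[str, Callable[[str], str]] = {
    "search": lambda query: f"[mock search result for {query!r}]",
}


def simulated(tools: dict[str, Callable[[str], str]]):
    """Hypothetical Simulator-style decorator: inject mock tools instead of live MCP servers."""
    def wrap(agent_fn):
        @functools.wraps(agent_fn)
        def run(prompt: str) -> str:
            return agent_fn(prompt, tools=tools)
        return run
    return wrap


@simulated(MOCK_TOOLS)
def agent(prompt: str, tools) -> str:
    # A trivial agent flow: one tool call, one answer.
    return tools["search"](prompt)


print(agent("weather in Paris"))  # served by the mock; no live MCP server involved
```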
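Maestro's feedback loop reduces to: propose a configuration change, re-evaluate, keep it if the score improves. The `evaluate` function and config keys below are placeholders (the real engine also optimizes agent structure, not just numeric parameters), and the hill-climbing strategy is a toy choice for illustration.

```python
# Illustrative stand-in only; the real Maestro engine and its config schema may differ.
import random


def evaluate(config: dict) -> float:
    """Placeholder for running a benchmark and aggregating evaluator scores."""
    return -(config["temperature"] - 0.3) ** 2  # pretend 0.3 is optimal


def tune(config: dict, steps: int = 50) -> dict:
    """Toy hill-climbing loop in the spirit of Maestro: keep changes that score better."""
    best, best_score = dict(config), evaluate(config)
    for _ in range(steps):
        candidate = dict(best)
        candidate["temperature"] = min(
            1.0, max(0.0, best["temperature"] + random.uniform(-0.1, 0.1))
        )
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best


print(tune({"temperature": 0.9}))  # drifts toward temperature ~ 0.3
```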