The first step to reliable agents
Eval (Evaluate agent responses) is the entry point to Rippletide. Before adding memory or a decision runtime, start by testing what your agent already does. Rippletide evaluates your agent's responses against expected answers, detects hallucinations, and gives you a clear pass/fail report. Once you know where your agent stands, you can move to the Context Graph for persistent memory and the Decision runtime for deterministic decision-making.

How It Works
Define expected Q&A pairs
Provide questions and their expected answers in a qanda.json file, a Pinecone index, or a PostgreSQL database.

Send questions to your agent
Rippletide sends each question to your agent’s endpoint and collects the response.
Compare and score
Each response is compared against the expected answer. The evaluation engine checks for factual accuracy, hallucinations, and completeness.
View results
Get a summary with total tests, pass/fail count, duration, and a link to the detailed dashboard at trust.rippletide.com.
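The four steps above can be sketched in Python. Everything in this snippet is a hedged illustration: the qanda.json schema, the agent endpoint URL, and the keyword-overlap scoring are stand-ins for Rippletide's actual evaluation engine, not its documented behavior.

```python
import json
import urllib.request

# Hypothetical qanda.json contents -- the real schema may differ.
QA_PAIRS = [
    {"question": "What is the refund window?",
     "expected_answer": "Refunds are accepted within 30 days of purchase."},
]

def ask_agent(question: str) -> str:
    """Send one question to your agent's endpoint (URL is a placeholder)."""
    req = urllib.request.Request(
        "https://your-agent.example.com/chat",
        data=json.dumps({"message": question}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]

def score(response: str, expected: str) -> str:
    """Toy stand-in for the evaluation engine: pass when most of the
    expected answer's keywords appear in the response."""
    keywords = {w.strip(".,").lower() for w in expected.split() if len(w) > 3}
    hits = sum(1 for w in keywords if w in response.lower())
    return "pass" if hits >= len(keywords) * 0.6 else "fail"

def evaluate(qa_pairs, agent=ask_agent):
    """Run every Q&A pair through the agent and tally pass/fail."""
    results = [score(agent(p["question"]), p["expected_answer"]) for p in qa_pairs]
    return {"total": len(results), "passed": results.count("pass"),
            "failed": results.count("fail")}
```

In a real run, `evaluate(QA_PAIRS)` would call your live endpoint and produce the kind of summary the CLI prints and the dashboard displays.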
Evaluation Criteria
Each agent response is evaluated on:

| Criterion | What it checks |
|---|---|
| Factual accuracy | Does the response match the expected answer’s facts? |
| Hallucination detection | Does the response contain information not present in the knowledge base? |
| Completeness | Does the response cover all key points from the expected answer? |
Evaluation report
After evaluation, each response gets:

- A label (pass or fail)
- A justification explaining the verdict
- A list of facts extracted from the response, each labeled as correct or hallucinated
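A single report entry might look like the following. The field names and values here are an assumption for illustration, not the documented schema:

```json
{
  "label": "fail",
  "justification": "The response invents a 60-day refund window; the knowledge base says 30 days.",
  "facts": [
    { "fact": "Refunds are accepted", "status": "correct" },
    { "fact": "The refund window is 60 days", "status": "hallucinated" }
  ]
}
```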
Two Ways to Evaluate
CLI
Run evaluations from the terminal with an interactive UI, real-time progress, and template support. See the CLI guide →

API Endpoints
The evaluation API is separate from the SDK API:

| | SDK API | Evaluation API |
|---|---|---|
| Purpose | Create agents, manage knowledge, chat | Evaluate agent responses |
| Base URL | https://agent.rippletide.com/api/sdk | Evaluation server |
| Auth | x-api-key header | x-api-key header |
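Both APIs authenticate with the same header. As a minimal sketch of building such a request: only the base URL and the x-api-key header come from the table above; the /evaluate path and the JSON body are assumptions for illustration.

```python
import json
import urllib.request

API_KEY = "YOUR_API_KEY"  # your Rippletide API key

# Build a request against the SDK base URL. The "/evaluate" path and
# the payload shape are illustrative, not documented endpoints.
req = urllib.request.Request(
    "https://agent.rippletide.com/api/sdk/evaluate",
    data=json.dumps({"question": "What is the refund window?"}).encode(),
    headers={"x-api-key": API_KEY, "Content-Type": "application/json"},
    method="POST",
)

# urllib.request.urlopen(req) would send it; omitted here so the
# snippet stays runnable without credentials.
```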