The first step to reliable agents

Eval (Evaluate agent responses) is the entry point to Rippletide. Before adding memory or decision runtime, start by testing what your agent already does. Rippletide evaluates your agent’s responses against expected answers, detects hallucinations, and gives you a clear pass/fail report. Once you know where your agent stands, you can move to the Context Graph for persistent memory and Decision runtime for deterministic decision-making.

How It Works

1. Define expected Q&A pairs: Provide questions and their expected answers in a qanda.json file, a Pinecone index, or a PostgreSQL database.
2. Send questions to your agent: Rippletide sends each question to your agent’s endpoint and collects the response.
3. Compare and score: Each response is compared against the expected answer. The evaluation engine checks for factual accuracy, hallucinations, and completeness.
4. View results: Get a summary with total tests, pass/fail count, duration, and a link to the detailed dashboard at trust.rippletide.com.

Evaluation Criteria

Each agent response is evaluated on:
| Criterion | What it checks |
| --- | --- |
| Factual accuracy | Does the response match the expected answer’s facts? |
| Hallucination detection | Does the response contain information not present in the knowledge base? |
| Completeness | Does the response cover all key points from the expected answer? |
A response passes when it is factually accurate and free of hallucinations. It fails when it contains fabricated information or contradicts the expected answer.
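The pass/fail rule above can be expressed directly. This is a hedged sketch of the stated rule, not Rippletide's implementation: `facts` follows the report shape shown below (a list of facts labeled correct or hallucination), and `contradicts_expected` is a hypothetical flag for contradictions with the expected answer.

```python
def verdict(facts, contradicts_expected=False):
    """Sketch of the stated rule: a response passes when it is factually
    accurate and free of hallucinations; it fails when it contains
    fabricated information or contradicts the expected answer."""
    hallucinated = any(f["label"] == "hallucination" for f in facts)
    return "fail" if hallucinated or contradicts_expected else "pass"
```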

Evaluation report

After evaluation, each response gets:
  • A label (pass or fail)
  • A justification explaining the verdict
  • A list of facts extracted from the response, each labeled as correct or hallucinated
{
  "label": "fail",
  "justification": "The response claims free shipping on all orders, which contradicts the knowledge base.",
  "facts": [
    { "fact": "Returns accepted within 30 days", "label": "correct" },
    { "fact": "Free shipping on all orders", "label": "hallucination" }
  ]
}
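A report in this shape is easy to post-process. For example, a small helper (hypothetical, not part of any Rippletide SDK) can pull out just the facts the evaluation flagged as hallucinated:

```python
def hallucinated_facts(report):
    """Return the facts labeled as hallucinations in one per-response
    evaluation report (the JSON structure shown above)."""
    return [f["fact"] for f in report["facts"] if f["label"] == "hallucination"]
```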

Two Ways to Evaluate

CLI

Run evaluations from the terminal with an interactive UI, real-time progress, and template support. See the CLI guide →

API Endpoints

The evaluation API is separate from the SDK API:
|  | SDK API | Evaluation API |
| --- | --- | --- |
| Purpose | Create agents, manage knowledge, chat | Evaluate agent responses |
| Base URL | https://agent.rippletide.com/api/sdk | Evaluation server |
| Auth | x-api-key header | x-api-key header |
See the API Reference for full endpoint documentation.
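Both APIs authenticate with the same x-api-key header. As a sketch only: the base URL and `/evaluate` path below are placeholders (the real endpoints are in the API Reference), but the header construction is what any call to either API would look like.

```python
import json
import urllib.request

def build_eval_request(base_url, api_key, payload):
    """Build an authenticated POST request for the evaluation server.
    `base_url` and the /evaluate path are placeholders; consult the
    API Reference for the actual endpoints."""
    return urllib.request.Request(
        f"{base_url}/evaluate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"x-api-key": api_key, "Content-Type": "application/json"},
        method="POST",
    )
```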

What’s next?

Now that you can evaluate your agent, the next step is to give it memory: