Welcome to OpenSearch AgentHealth

Open-source AI agent observability and evaluation platform. Built on OpenTelemetry standards with zero vendor lock-in.

Building AI agents is easy. Knowing why they fail is hard.

Traditional APM tools were designed for request-response services—not autonomous agents that reason, plan, and execute multi-step workflows. When your agent makes unexpected decisions, standard metrics and traces leave you blind to why.

OpenSearch AgentHealth enables Eval-Driven Development—build, observe, improve, repeat. Instrument your agents with OpenTelemetry from day one, capture full decision sequences (not just outputs), and use AI-as-a-Judge to score trajectories. Teams using this approach ship production agents in weeks instead of months.


Agent Health Architecture

Component | Description
--- | ---
Agent Health | Core platform consisting of UI (web interface for traces and evaluations), CLI/SDK (programmatic instrumentation and evaluation), and Server (orchestration and data routing)
Agent Connectors | Protocol adapters for different agent types (AG-UI, Claude Code, Custom)
OTEL Collector | OpenTelemetry collector that receives traces, metrics, and logs from agents
OpenSearch Cluster | Storage backend for traces, evaluations, and agent data
OpenSearch Dashboards | Visualization layer for exploring data and building custom dashboards

  1. User Interaction — Users interact with the Agent Health Stack via the UI or CLI/SDK
  2. Agent Connection — The Server connects to agents through connectors, using the AG-UI protocol or custom adapters
  3. Telemetry Collection — Agents emit OTLP (OpenTelemetry Protocol) data to the OTEL Collector
  4. Data Storage — The Collector routes data to OpenSearch for storage and indexing
  5. Visualization — OpenSearch Dashboards provides unified visualization across all data sources

  • Tracing-Based Evaluation — Analyze the entire agent journey—every thought, tool call, and decision—with AI Judge scoring

  • Built on OpenTelemetry — Native support for OTEL GenAI semantic conventions. Compatible with any OTEL-instrumented agent

  • Real-Time Streaming — Watch agent execution unfold live via the AG-UI protocol

  • Multi-Agent Comparison — Compare agents side-by-side with A/B testing and regression tracking

  • OpenSearch-Native — One platform, one query language for logs, metrics, and agent observability

  • Truly Open Source — Apache 2.0 licensed with no vendor lock-in
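
The tracing-based evaluation and multi-agent comparison features above can be illustrated with a small sketch. It renders a full trajectory (not just the final output) into a judge prompt, scores it through a pluggable judge function, and compares two agents on the same task. Everything here is hypothetical: the `Step` type, the prompt wording, and the `stub_judge` function, which stands in for a real LLM call, are illustration only and do not reflect the AgentHealth API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Step:
    kind: str     # e.g. "thought", "tool_call", or "response"
    content: str


def render_trajectory(steps: List[Step]) -> str:
    """Turn a trajectory into numbered lines a judge can read."""
    return "\n".join(f"{i}. [{s.kind}] {s.content}" for i, s in enumerate(steps, 1))


def build_judge_prompt(task: str, steps: List[Step], expected: str) -> str:
    return (
        f"Task: {task}\n"
        f"Expected outcome: {expected}\n"
        f"Trajectory:\n{render_trajectory(steps)}\n"
        "Rate from 0 to 1 how well the trajectory achieves the expected outcome."
    )


def score_trajectory(task, steps, expected, judge: Callable[[str], float]) -> float:
    """Ask the judge for a score and clamp it to the range [0, 1]."""
    raw = judge(build_judge_prompt(task, steps, expected))
    return max(0.0, min(1.0, raw))


def compare_agents(task, expected, trajectories: Dict[str, List[Step]], judge):
    """Score every agent's trajectory on the same task."""
    return {
        name: score_trajectory(task, steps, expected, judge)
        for name, steps in trajectories.items()
    }


def stub_judge(prompt: str) -> float:
    # Stand-in for an LLM judge: rewards trajectories that consulted the docs.
    return 1.0 if "[tool_call] search_docs" in prompt else 0.3


trajectories = {
    "agent_a": [
        Step("thought", "I should look up the policy first."),
        Step("tool_call", "search_docs(query='retention policy')"),
        Step("response", "Logs are retained for 30 days per the policy document."),
    ],
    "agent_b": [
        Step("thought", "I think I remember the answer."),
        Step("response", "Logs are probably retained for 90 days."),
    ],
}
scores = compare_agents(
    "How long are logs retained?",
    "Answer cites the retention policy document",
    trajectories,
    stub_judge,
)
best = max(scores, key=scores.get)
```

Swapping `stub_judge` for a function that calls a real model keeps the rest of the flow unchanged, which is the main reason the judge is passed in as a parameter.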


Get Started: Learn how to instrument your agents and run evaluations