Welcome to OpenSearch AgentHealth

OpenSearch AgentHealth is an open-source observability and evaluation platform for AI agents, built on OpenTelemetry standards with no vendor lock-in.

Why OpenSearch AgentHealth?

Building AI agents is easy. Knowing why they fail is hard.

Traditional APM tools were designed for request-response services, not for autonomous agents that reason, plan, and execute multi-step workflows. When your agent makes an unexpected decision, standard metrics and traces show what happened but not why.

OpenSearch AgentHealth enables Eval-Driven Development: build, observe, improve, repeat. Instrument your agents with OpenTelemetry from day one, capture full decision sequences (not just final outputs), and use AI-as-a-Judge to score each trajectory. Teams that work this way can ship production agents in weeks instead of months.
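To make "capture full decision sequences and score trajectories" concrete, here is a minimal sketch in plain Python. It is not the AgentHealth SDK: `Step`, `Trajectory`, and `rule_based_judge` are hypothetical names chosen for illustration, and the judge is a deterministic stand-in for where an LLM-based judge would sit.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Step:
    kind: str      # "thought", "tool_call", or "output"
    content: str


@dataclass
class Trajectory:
    """The full decision sequence for one task, not just the final answer."""
    task: str
    steps: List[Step] = field(default_factory=list)

    def record(self, kind: str, content: str) -> None:
        self.steps.append(Step(kind, content))


# A judge maps a whole trajectory to a score in [0, 1].
Judge = Callable[[Trajectory], float]


def rule_based_judge(expected_tools: List[str]) -> Judge:
    """Hypothetical stand-in for an LLM judge: fraction of expected tools called."""
    def judge(trajectory: Trajectory) -> float:
        if not expected_tools:
            return 1.0
        called = {s.content for s in trajectory.steps if s.kind == "tool_call"}
        hits = sum(1 for tool in expected_tools if tool in called)
        return hits / len(expected_tools)
    return judge


traj = Trajectory(task="Find the cause of the checkout latency spike")
traj.record("thought", "I should look at recent error logs first.")
traj.record("tool_call", "search_logs")
traj.record("output", "Latency rose after the most recent deploy.")

score = rule_based_judge(["search_logs", "query_metrics"])(traj)
print(f"trajectory score: {score:.2f}")  # prints "trajectory score: 0.50"
```

Because the judge sees every step, it can penalize a run that reached a plausible answer without calling the tools the task required, which an output-only check would miss.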

Architecture

[Figure: Agent Health architecture]

Stack Components

Component	Description
Agent Health	Core platform consisting of UI (web interface for traces and evaluations), CLI/SDK (programmatic instrumentation and evaluation), and Server (orchestration and data routing)
Agent Connectors	Protocol adapters for different agent types (AG-UI, Claude Code, Custom)
OTEL Collector	OpenTelemetry collector that receives traces, metrics, and logs from agents
OpenSearch Cluster	Storage backend for traces, evaluations, and agent data
OpenSearch Dashboards	Visualization layer for exploring data and building custom dashboards

Data Flow

User Interaction — Users interact with the Agent Health Stack via the UI or CLI/SDK
Agent Connection — The Server connects to agents via connectors using AG-UI protocol or custom adapters
Telemetry Collection — Agents emit OTLP (OpenTelemetry Protocol) data to the OTEL Collector
Data Storage — The Collector routes data to OpenSearch for storage and indexing
Visualization — OpenSearch Dashboards provides unified visualization across all data sources
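The telemetry step in the flow above can be sketched as follows. This example builds an OTLP/JSON trace export request by hand, using only the standard library, to show roughly what an agent sends to the Collector. The service name, scope name, and tool name are assumptions for illustration. The `gen_ai.*` attribute keys follow the OpenTelemetry GenAI semantic conventions, which are still marked as in development and may change. A real agent would normally use an OpenTelemetry SDK and exporter instead of assembling the payload itself.

```python
import json
import secrets
import time


def otlp_attr(key, value):
    """Encode one attribute in OTLP/JSON's typed key-value form."""
    if isinstance(value, bool):
        typed = {"boolValue": value}
    elif isinstance(value, int):
        typed = {"intValue": str(value)}  # 64-bit integers are JSON strings
    elif isinstance(value, float):
        typed = {"doubleValue": value}
    else:
        typed = {"stringValue": str(value)}
    return {"key": key, "value": typed}


def build_span(trace_id, name, attributes, start_ns, end_ns, parent_span_id=None):
    """Build one span; IDs are hex strings and timestamps are string nanoseconds."""
    span = {
        "traceId": trace_id,
        "spanId": secrets.token_hex(8),
        "name": name,
        "kind": 1,  # SPAN_KIND_INTERNAL
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
        "attributes": [otlp_attr(k, v) for k, v in attributes.items()],
    }
    if parent_span_id is not None:
        span["parentSpanId"] = parent_span_id
    return span


def build_export_request(service_name, spans):
    """Wrap spans in the resourceSpans / scopeSpans envelope."""
    return {
        "resourceSpans": [{
            "resource": {"attributes": [otlp_attr("service.name", service_name)]},
            "scopeSpans": [{"scope": {"name": "example.agent"}, "spans": spans}],
        }]
    }


now = time.time_ns()
trace_id = secrets.token_hex(16)

# Root span for the agent run, plus one child span for a tool call.
root = build_span(trace_id, "invoke_agent demo-agent",
                  {"gen_ai.operation.name": "invoke_agent"}, now, now + 20_000_000)
tool = build_span(trace_id, "execute_tool search_docs",
                  {"gen_ai.operation.name": "execute_tool",
                   "gen_ai.tool.name": "search_docs"},
                  now + 1_000_000, now + 6_000_000,
                  parent_span_id=root["spanId"])

body = json.dumps(build_export_request("demo-agent", [root, tool]))
# To send it, POST `body` with Content-Type: application/json to the
# Collector's OTLP/HTTP traces endpoint (default: http://localhost:4318/v1/traces).
```

The parent and child relationship between the two spans is what lets a trace viewer reconstruct the order of the agent's decisions.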

Key Features

Tracing-Based Evaluation — Analyze the entire agent journey—every thought, tool call, and decision—with AI Judge scoring
Built on OpenTelemetry — Native support for OTEL GenAI semantic conventions. Compatible with any OTEL-instrumented agent
Real-Time Streaming — Watch agent execution unfold live via AG-UI protocol
Multi-Agent Comparison — Compare agents side-by-side with A/B testing and regression tracking
OpenSearch-Native — One platform, one query language for logs, metrics, and agent observability
Truly Open Source — Apache 2.0 licensed with no vendor lock-in
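As an illustration of the side-by-side comparison and regression tracking listed above, the sketch below compares per-test-case scores from two agent versions and flags cases where the candidate scores lower than the baseline by more than a tolerance. The function name `compare_runs`, the test-case names, the scores, and the 0.05 tolerance are all made up for this example and are not part of AgentHealth.

```python
from statistics import mean
from typing import Dict


def compare_runs(baseline: Dict[str, float],
                 candidate: Dict[str, float],
                 tolerance: float = 0.05) -> dict:
    """Compare two runs on the test cases they share and list regressions."""
    shared = sorted(set(baseline) & set(candidate))
    if not shared:
        raise ValueError("the two runs have no test cases in common")
    regressions = [case for case in shared
                   if candidate[case] < baseline[case] - tolerance]
    return {
        "baseline_mean": mean(baseline[case] for case in shared),
        "candidate_mean": mean(candidate[case] for case in shared),
        "regressions": regressions,
    }


# Hypothetical judge scores per test case for two agent versions.
baseline_scores = {"refund": 0.90, "lookup": 0.80, "escalate": 0.70}
candidate_scores = {"refund": 0.95, "lookup": 0.60, "escalate": 0.70}

report = compare_runs(baseline_scores, candidate_scores)
print(report["regressions"])  # prints "['lookup']"
```

Reporting regressions per test case, rather than only comparing means, keeps a large gain on one case from hiding a drop on another.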

Next Steps

Get Started → — Learn how to instrument your agents and run evaluations