
Integrations Overview

OpenSearch AgentHealth integrates with a wide ecosystem of AI tools, frameworks, and platforms. This guide provides an overview of available integrations.

  • Model Providers → Connect to OpenAI, Anthropic, AWS Bedrock, and more.
  • Agent Frameworks → LangGraph, Strands, HolmesGPT, CrewAI, and custom agents.
  • Cloud Providers → AWS, GCP, Azure for deployment and infrastructure.
  • Custom Integrations → Build your own integrations with our SDK.

Native support for Claude models via direct API or AWS Bedrock:

from opensearch_agentops.integrations import AnthropicInstrumentation
AnthropicInstrumentation.instrument()

import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required by the Anthropic Messages API
    messages=[{"role": "user", "content": prompt}]
)
# Automatically traced with token usage

Full instrumentation for OpenAI API calls:

from opensearch_agentops.integrations import OpenAIInstrumentation
OpenAIInstrumentation.instrument()

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)
# Automatically traced

Support for all Bedrock-hosted models:

from opensearch_agentops.integrations import BedrockInstrumentation
BedrockInstrumentation.instrument()

import json

import boto3

bedrock = boto3.client("bedrock-runtime")
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    # Claude 3 models on Bedrock use the Messages API request body
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}]
    })
)
# Automatically traced with model info

Deep integration with LangGraph for stateful agents:

from opensearch_agentops.integrations import LangGraphInstrumentation

# Enable instrumentation
LangGraphInstrumentation.instrument()

from langgraph.prebuilt import create_react_agent

agent = create_react_agent(model, tools)
result = agent.invoke({"messages": [("user", prompt)]})
# All nodes, edges, and state transitions are traced

Captured Data:

  • Node execution with timing
  • State transitions
  • Tool call details
  • Message history
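As a rough picture of how that data might appear on a trace, here is an illustrative node-execution span. The `gen_ai.*` attribute names follow the OpenTelemetry GenAI semantic conventions; every other key and all values are made-up placeholders, not AgentHealth's actual schema:

```python
# Illustrative only: attribute names beyond "gen_ai.*" and all values
# are assumptions, not AgentHealth's documented trace format.
node_span = {
    "name": "agent_node",
    "attributes": {
        "gen_ai.system": "langgraph",          # framework identifier
        "gen_ai.request.model": "gpt-4o",      # model used by this node
        "gen_ai.usage.input_tokens": 412,      # token usage captured per call
        "gen_ai.usage.output_tokens": 96,
    },
    "events": [
        {"name": "state_transition", "from": "plan", "to": "act"},
        {"name": "tool_call", "tool": "search", "duration_ms": 183},
    ],
    "duration_ms": 1240,  # node execution time
}

total_tokens = (node_span["attributes"]["gen_ai.usage.input_tokens"]
                + node_span["attributes"]["gen_ai.usage.output_tokens"])
print(total_tokens)  # 508
```

Aggregating `gen_ai.usage.*` attributes across spans is what powers per-run token accounting in the dashboards.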

Native support for the Strands agent framework:

from opensearch_agentops.integrations import StrandsInstrumentation
StrandsInstrumentation.instrument()

from strands import Agent

agent = Agent(
    model="claude-sonnet-4",
    tools=[tool1, tool2]
)
result = agent.run(prompt)
# Full trajectory captured

Integration with HolmesGPT for root cause analysis:

from opensearch_agentops.integrations import HolmesGPTInstrumentation
HolmesGPTInstrumentation.instrument()

from holmesgpt import HolmesGPT

holmes = HolmesGPT()
investigation = holmes.investigate(issue_description)
# Investigation steps traced

HolmesGPT Benchmarks:

AgentHealth can import HolmesGPT’s 150+ evaluation scenarios:

# Import HolmesGPT benchmarks
benchmark = agentops.import_benchmark(
    source="holmesgpt/evaluations",
    categories=["kubernetes", "prometheus", "logs"],
    difficulty=["easy", "medium", "hard"]
)

Support for multi-agent CrewAI workflows:

from opensearch_agentops.integrations import CrewAIInstrumentation
CrewAIInstrumentation.instrument()

from crewai import Crew, Agent, Task

crew = Crew(agents=[agent1, agent2], tasks=[task1, task2])
result = crew.kickoff()
# All agent interactions traced

Import Bloom behavioral evaluation benchmarks:

# Import Bloom benchmark
bloom = agentops.import_benchmark(
    source="anthropic/bloom",
    behaviors=[
        "sycophancy",
        "self-preservation",
        "self-preferential-bias",
        "instructed-sabotage"
    ]
)

# Run evaluation
run = agentops.run_benchmark(
    benchmark_id=bloom.id,
    agent="my-agent",
    model="claude-sonnet-4"
)

# Analyze results
print(f"Sycophancy elicitation rate: {run.metrics['sycophancy']}")

Bloom Evaluation Process:

  1. Understanding: Analyze behavior definitions
  2. Ideation: Generate diverse test scenarios
  3. Rollout: Execute scenarios in parallel
  4. Judgment: Score transcripts for behavior presence
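The four stages above compose into a simple pipeline. The sketch below is illustrative only: the function bodies are stand-in stubs, not Bloom's actual implementation, and the real system runs rollouts in parallel and uses model-based judges rather than a fixed score:

```python
def understand(behavior):
    # Stage 1: turn a behavior name into a working definition
    return {"behavior": behavior, "definition": f"tendency toward {behavior}"}

def ideate(spec, n=3):
    # Stage 2: generate diverse test scenarios for the behavior
    return [f"{spec['behavior']}-scenario-{i}" for i in range(n)]

def rollout(scenarios):
    # Stage 3: execute each scenario (in parallel in the real system)
    return [{"scenario": s, "transcript": f"<run of {s}>"} for s in scenarios]

def judge(transcripts):
    # Stage 4: score each transcript for behavior presence (stubbed to 0.0)
    return [{**t, "score": 0.0} for t in transcripts]

results = judge(rollout(ideate(understand("sycophancy"))))
print(len(results))  # 3
```

Each stage's output feeds the next, so the elicitation rate reported per behavior is simply an aggregate over the judged transcripts.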

Export evaluations to Braintrust format:

# Export results to Braintrust
agentops.export_to_braintrust(
    run_id="run-001",
    project="my-agent-evals"
)

Connect with clawd.bot for agentic evaluations:

Note: clawd.bot enables multi-turn conversation evaluation, synthetic data generation, and test case creation from real user interactions.

from opensearch_agentops.integrations import ClawdBotInstrumentation

# Enable clawd.bot integration
ClawdBotInstrumentation.configure(
    api_key="your-clawd-bot-key",
    project="agent-evals"
)

# Generate synthetic test cases
synthetic_cases = agentops.clawd.generate_test_cases(
    scenario="customer support",
    num_cases=50,
    include_edge_cases=True
)

# Run multi-turn evaluation
multi_turn_run = agentops.clawd.evaluate_conversation(
    agent="my-agent",
    scenario="troubleshooting-flow",
    max_turns=10
)

clawd.bot Features:

  • Multi-turn evaluation: Test conversational agents over multiple exchanges
  • Synthetic data generation: Create realistic test scenarios automatically
  • User simulation: AI-powered user personas for testing
  • Edge case discovery: Automatically find failure modes
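Conceptually, a multi-turn evaluation drives the agent with a simulated user turn by turn and records the transcript. The sketch below is a minimal illustration with stubs: a real run would call the agent under test and an AI-powered persona instead of these placeholders:

```python
# Minimal sketch of a multi-turn evaluation loop. Both functions are
# stand-in stubs, not clawd.bot's actual API.
def agent_reply(message):
    # Placeholder for the agent under test
    return f"Let's check that: {message}"

# Placeholder for an AI-simulated user persona (here, a fixed script)
persona_script = [
    "My pod keeps crash-looping.",
    "The logs mention OOMKilled.",
    "How do I raise the memory limit?",
]

transcript = []
for turn, user_msg in enumerate(persona_script, start=1):
    reply = agent_reply(user_msg)
    transcript.append({"turn": turn, "user": user_msg, "agent": reply})

print(len(transcript))  # 3
```

The recorded transcript is what gets scored afterwards, turn by turn, which is why multi-turn evaluation can surface failure modes that single-shot prompts miss.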

Native storage backend for all telemetry:

docker-compose.yml

services:
  opensearch:
    image: opensearchproject/opensearch:2.11.0
    environment:
      - discovery.type=single-node
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=admin
    ports:
      - "9200:9200"

Metrics export for dashboards and alerting:

prometheus.yml

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']

Transform and enrich telemetry data:

data-prepper-pipelines.yaml

otel-traces-pipeline:
  source:
    otel_trace_source:
      ssl: false
  processor:
    - service_map_stateful:
    - otel_traces:
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index_type: trace-analytics-raw
Feature comparison with other platforms:

| Integration | AgentHealth         | Langfuse | Arize  | Braintrust | LangSmith |
|-------------|---------------------|----------|--------|------------|-----------|
| OpenAI      | Native OTEL         | Native   | Native | Native     | Native    |
| Anthropic   | Native OTEL         | Native   | SDK    | Native     | Native    |
| AWS Bedrock | Native OTEL         | Manual   | No     | Manual     | Manual    |
| LangGraph   | Auto-instrument     | Auto     | No     | Manual     | Native    |
| Strands     | Auto-instrument     | No       | No     | No         | No        |
| HolmesGPT   | Native + Benchmarks | No       | No     | Manual     | No        |
| Custom OTEL | Native              | Adapter  | No     | No         | No        |
| Self-Hosted | Yes                 | Yes      | No     | No         | No        |

For unsupported frameworks, use the adapter pattern:

from opensearch_agentops import CustomAdapter

class MyFrameworkAdapter(CustomAdapter):
    def execute(self, prompt, context=None):
        with self.tracer.start_as_current_span("my_framework") as span:
            span.set_attribute("gen_ai.system", "my-framework")
            # Your framework logic
            result = my_framework.run(prompt)
            return result

# Register adapter
agentops.register_adapter("my-framework", MyFrameworkAdapter)

Configure AI provider integrations →

Set up framework integrations →

Build your own integration →