
Integrations Overview

OpenSearch AgentHealth integrates with a wide ecosystem of AI tools, frameworks, and platforms. This guide provides an overview of available integrations.

  • Model Providers → Connect to OpenAI, Anthropic, AWS Bedrock, and more.
  • Agent Frameworks → LangGraph, Strands, HolmesGPT, CrewAI, and custom agents.
  • Cloud Providers → AWS, GCP, Azure for deployment and infrastructure.
  • Custom Integrations → Build your own integrations with our SDK.

Native support for Claude models via direct API or AWS Bedrock:

from opensearch_agentops.integrations import AnthropicInstrumentation
AnthropicInstrumentation.instrument()

import anthropic

client = anthropic.Anthropic()
response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,  # required by the Anthropic Messages API
    messages=[{"role": "user", "content": prompt}]
)
# Automatically traced with token usage

Full instrumentation for OpenAI API calls:

from opensearch_agentops.integrations import OpenAIInstrumentation
OpenAIInstrumentation.instrument()

from openai import OpenAI

client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": prompt}]
)
# Automatically traced

Support for all Bedrock-hosted models:

from opensearch_agentops.integrations import BedrockInstrumentation
BedrockInstrumentation.instrument()

import json

import boto3

bedrock = boto3.client("bedrock-runtime")
response = bedrock.invoke_model(
    modelId="anthropic.claude-3-sonnet-20240229-v1:0",
    # Claude 3 models on Bedrock use the Messages API request body
    body=json.dumps({
        "anthropic_version": "bedrock-2023-05-31",
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}]
    })
)
# Automatically traced with model info

Deep integration with LangGraph for stateful agents:

from opensearch_agentops.integrations import LangGraphInstrumentation

# Enable instrumentation
LangGraphInstrumentation.instrument()

from langgraph.prebuilt import create_react_agent

agent = create_react_agent(model, tools)
result = agent.invoke({"messages": [("user", prompt)]})
# All nodes, edges, and state transitions are traced

Captured Data:

  • Node execution with timing
  • State transitions
  • Tool call details
  • Message history
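As a rough picture of how that data might appear on a trace, here is an illustrative node-execution span. The `gen_ai.*` attribute names follow the OpenTelemetry GenAI semantic conventions; every other key and all values are made-up placeholders, not AgentHealth's actual schema:

```python
# Illustrative only: attribute names beyond "gen_ai.*" and all values
# are assumptions, not AgentHealth's documented trace format.
node_span = {
    "name": "agent_node",
    "attributes": {
        "gen_ai.system": "langgraph",          # framework identifier
        "gen_ai.request.model": "gpt-4o",      # model used by this node
        "gen_ai.usage.input_tokens": 412,      # token usage captured per call
        "gen_ai.usage.output_tokens": 96,
    },
    "events": [
        {"name": "state_transition", "from": "plan", "to": "act"},
        {"name": "tool_call", "tool": "search", "duration_ms": 183},
    ],
    "duration_ms": 1240,  # node execution time
}

total_tokens = (node_span["attributes"]["gen_ai.usage.input_tokens"]
                + node_span["attributes"]["gen_ai.usage.output_tokens"])
print(total_tokens)  # 508
```

Aggregating `gen_ai.usage.*` attributes across spans is what powers per-run token accounting in the dashboards.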

Native support for the Strands agent framework:

from opensearch_agentops.integrations import StrandsInstrumentation
StrandsInstrumentation.instrument()

from strands import Agent

agent = Agent(
    model="claude-sonnet-4",
    tools=[tool1, tool2]
)
result = agent.run(prompt)
# Full trajectory captured

Integration with HolmesGPT for root cause analysis:

from opensearch_agentops.integrations import HolmesGPTInstrumentation
HolmesGPTInstrumentation.instrument()

from holmesgpt import HolmesGPT

holmes = HolmesGPT()
investigation = holmes.investigate(issue_description)
# Investigation steps traced

HolmesGPT Benchmarks:

AgentHealth can import HolmesGPT’s 150+ evaluation scenarios:

# Import HolmesGPT benchmarks
benchmark = agentops.import_benchmark(
    source="holmesgpt/evaluations",
    categories=["kubernetes", "prometheus", "logs"],
    difficulty=["easy", "medium", "hard"]
)

Support for multi-agent CrewAI workflows:

from opensearch_agentops.integrations import CrewAIInstrumentation
CrewAIInstrumentation.instrument()

from crewai import Crew, Agent, Task

crew = Crew(agents=[agent1, agent2], tasks=[task1, task2])
result = crew.kickoff()
# All agent interactions traced

Import Bloom behavioral evaluation benchmarks:

# Import Bloom benchmark
bloom = agentops.import_benchmark(
    source="anthropic/bloom",
    behaviors=[
        "sycophancy",
        "self-preservation",
        "self-preferential-bias",
        "instructed-sabotage"
    ]
)

# Run evaluation
run = agentops.run_benchmark(
    benchmark_id=bloom.id,
    agent="my-agent",
    model="claude-sonnet-4"
)

# Analyze results
print(f"Sycophancy elicitation rate: {run.metrics['sycophancy']}")

Bloom Evaluation Process:

  1. Understanding: Analyze behavior definitions
  2. Ideation: Generate diverse test scenarios
  3. Rollout: Execute scenarios in parallel
  4. Judgment: Score transcripts for behavior presence
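The four stages above compose into a simple pipeline. The sketch below is illustrative only: the function bodies are stand-in stubs, not Bloom's actual implementation, and the real system runs rollouts in parallel and uses model-based judges rather than a fixed score:

```python
def understand(behavior):
    # Stage 1: turn a behavior name into a working definition
    return {"behavior": behavior, "definition": f"tendency toward {behavior}"}

def ideate(spec, n=3):
    # Stage 2: generate diverse test scenarios for the behavior
    return [f"{spec['behavior']}-scenario-{i}" for i in range(n)]

def rollout(scenarios):
    # Stage 3: execute each scenario (in parallel in the real system)
    return [{"scenario": s, "transcript": f"<run of {s}>"} for s in scenarios]

def judge(transcripts):
    # Stage 4: score each transcript for behavior presence (stubbed to 0.0)
    return [{**t, "score": 0.0} for t in transcripts]

results = judge(rollout(ideate(understand("sycophancy"))))
print(len(results))  # 3
```

Each stage's output feeds the next, so the elicitation rate reported per behavior is simply an aggregate over the judged transcripts.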

Export evaluations to Braintrust format:

# Export results to Braintrust
agentops.export_to_braintrust(
    run_id="run-001",
    project="my-agent-evals"
)

Connect with clawd.bot for agentic evaluations:

Note: clawd.bot enables multi-turn conversation evaluation, synthetic data generation, and test case creation from real user interactions.

from opensearch_agentops.integrations import ClawdBotInstrumentation

# Enable clawd.bot integration
ClawdBotInstrumentation.configure(
    api_key="your-clawd-bot-key",
    project="agent-evals"
)

# Generate synthetic test cases
synthetic_cases = agentops.clawd.generate_test_cases(
    scenario="customer support",
    num_cases=50,
    include_edge_cases=True
)

# Run multi-turn evaluation
multi_turn_run = agentops.clawd.evaluate_conversation(
    agent="my-agent",
    scenario="troubleshooting-flow",
    max_turns=10
)

clawd.bot Features:

  • Multi-turn evaluation: Test conversational agents over multiple exchanges
  • Synthetic data generation: Create realistic test scenarios automatically
  • User simulation: AI-powered user personas for testing
  • Edge case discovery: Automatically find failure modes
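Conceptually, a multi-turn evaluation drives the agent with a simulated user turn by turn and records the transcript. The sketch below is a minimal illustration with stubs: a real run would call the agent under test and an AI-powered persona instead of these placeholders:

```python
# Minimal sketch of a multi-turn evaluation loop. Both functions are
# stand-in stubs, not clawd.bot's actual API.
def agent_reply(message):
    # Placeholder for the agent under test
    return f"Let's check that: {message}"

# Placeholder for an AI-simulated user persona (here, a fixed script)
persona_script = [
    "My pod keeps crash-looping.",
    "The logs mention OOMKilled.",
    "How do I raise the memory limit?",
]

transcript = []
for turn, user_msg in enumerate(persona_script, start=1):
    reply = agent_reply(user_msg)
    transcript.append({"turn": turn, "user": user_msg, "agent": reply})

print(len(transcript))  # 3
```

The recorded transcript is what gets scored afterwards, turn by turn, which is why multi-turn evaluation can surface failure modes that single-shot prompts miss.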

Native storage backend for all telemetry:

docker-compose.yml

services:
  opensearch:
    image: opensearchproject/opensearch:2.11.0
    environment:
      - discovery.type=single-node
      - OPENSEARCH_INITIAL_ADMIN_PASSWORD=admin
    ports:
      - "9200:9200"

Metrics export for dashboards and alerting:

prometheus.yml

scrape_configs:
  - job_name: 'otel-collector'
    static_configs:
      - targets: ['otel-collector:8889']

Transform and enrich telemetry data:

data-prepper-pipelines.yaml

otel-traces-pipeline:
  source:
    otel_trace_source:
      ssl: false
  processor:
    - service_map_stateful:
    - otel_traces:
  sink:
    - opensearch:
        hosts: ["https://opensearch:9200"]
        index_type: trace-analytics-raw
Feature comparison with other platforms:

| Integration | AgentHealth         | Langfuse | Arize  | Braintrust | LangSmith |
|-------------|---------------------|----------|--------|------------|-----------|
| OpenAI      | Native OTEL         | Native   | Native | Native     | Native    |
| Anthropic   | Native OTEL         | Native   | SDK    | Native     | Native    |
| AWS Bedrock | Native OTEL         | Manual   | No     | Manual     | Manual    |
| LangGraph   | Auto-instrument     | Auto     | No     | Manual     | Native    |
| Strands     | Auto-instrument     | No       | No     | No         | No        |
| HolmesGPT   | Native + Benchmarks | No       | No     | Manual     | No        |
| Custom OTEL | Native              | Adapter  | No     | No         | No        |
| Self-Hosted | Yes                 | Yes      | No     | No         | No        |

For unsupported frameworks, use the adapter pattern:

from opensearch_agentops import CustomAdapter

class MyFrameworkAdapter(CustomAdapter):
    def execute(self, prompt, context=None):
        with self.tracer.start_as_current_span("my_framework") as span:
            span.set_attribute("gen_ai.system", "my-framework")
            # Your framework logic
            result = my_framework.run(prompt)
            return result

# Register adapter
agentops.register_adapter("my-framework", MyFrameworkAdapter)

Configure AI provider integrations →

Set up framework integrations →

Build your own integration →