Monitoring Dashboard
See every decision your agent makes. Not just server metrics — agent quality metrics.
Real-Time Metrics
Request Volume
Requests per minute, hour, and day. See traffic patterns at a glance.
Latency Distribution
p50, p95, p99 response times. Know exactly how fast your agent responds.
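To make the percentile figures concrete, here is a minimal sketch of how p50, p95, and p99 can be derived from raw response times. It uses the nearest-rank method, which is an assumption: the dashboard's actual calculation is not documented here, and the `percentile` helper and the sample latencies are purely illustrative.

```python
import math

def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    # Rank is 1-based; clamp to 1 so very small percentiles stay in range.
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]

# Hypothetical response times in milliseconds for ten requests.
latencies_ms = [120, 135, 150, 180, 210, 240, 300, 450, 900, 2400]

summary = {f"p{p}": percentile(latencies_ms, p) for p in (50, 95, 99)}
print(summary)
```

With only ten samples, p95 and p99 both land on the slowest request, which is why tail percentiles are most useful over larger windows.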
Token Usage
Track LLM token consumption by agent, model, and time period.
Error Rate
Failed requests, timeout rates, and error trends over time.
Tool Call Success
Per-tool success rates and execution times. Catch degrading tools early.
Scaling Events
When instances were added or removed and what triggered the change.
Alerts
Configure alerts to notify you before problems affect users:
- ✓ Error rate exceeds threshold (default: 5%)
- ✓ Response latency exceeds threshold (default: 10s p95)
- ✓ Tool call success rate drops below 95%
- ✓ Daily token spend exceeds 2x the 7-day average
- ✓ Agent health check fails 3 times in a row
Alerts are delivered via email, Slack, or webhook.
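The alert conditions listed above can be read as simple threshold checks. The sketch below is an illustration only, not the product's implementation: the `Snapshot` fields and alert names are hypothetical, the thresholds are the ones listed above, and token spend is approximated by token counts.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass
class Snapshot:
    """Hypothetical point-in-time metrics for one agent."""
    error_rate: float                 # fraction of failed requests, 0.0 to 1.0
    p95_latency_s: float              # p95 response latency in seconds
    tool_success_rate: float          # fraction of successful tool calls
    tokens_today: int                 # tokens consumed so far today
    tokens_last_7_days: list[int]     # daily totals for the previous 7 days
    consecutive_health_failures: int  # failed health checks in a row

def triggered_alerts(s: Snapshot) -> list[str]:
    """Return the names of alert conditions that the snapshot violates."""
    alerts = []
    if s.error_rate > 0.05:                 # error rate above 5%
        alerts.append("error_rate")
    if s.p95_latency_s > 10:                # p95 latency above 10s
        alerts.append("latency_p95")
    if s.tool_success_rate < 0.95:          # tool success below 95%
        alerts.append("tool_success")
    if s.tokens_last_7_days and s.tokens_today > 2 * mean(s.tokens_last_7_days):
        alerts.append("token_spend")        # more than 2x the 7-day average
    if s.consecutive_health_failures >= 3:  # 3 failed health checks in a row
        alerts.append("health_check")
    return alerts

snapshot = Snapshot(
    error_rate=0.02,
    p95_latency_s=12.5,
    tool_success_rate=0.97,
    tokens_today=250_000,
    tokens_last_7_days=[100_000] * 7,
    consecutive_health_failures=1,
)
print(triggered_alerts(snapshot))
```

In this example the latency and token spend conditions fire, while the error rate, tool success, and health check conditions stay within their thresholds.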
Conversation Logs
Every conversation is logged with full detail — input, tool calls made, agent reasoning, and output. Search by date, agent, user, or content. Essential for debugging, quality improvement, and compliance.
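As a rough picture of what a logged conversation and a search over it might look like, here is a small sketch. The `ConversationLog` record and the `search` helper are hypothetical and only mirror the fields and filters named above; the real log schema and query interface may differ.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class ConversationLog:
    """Hypothetical log record with the details listed above."""
    day: date
    agent: str
    user: str
    input: str
    output: str
    tool_calls: list[str] = field(default_factory=list)
    reasoning: str = ""

def search(logs, *, day=None, agent=None, user=None, text=None):
    """Filter logs by date, agent, user, or content; omitted filters match everything."""
    needle = text.lower() if text else None
    results = []
    for log in logs:
        if day is not None and log.day != day:
            continue
        if agent is not None and log.agent != agent:
            continue
        if user is not None and log.user != user:
            continue
        # Content search is a case-insensitive match on input and output.
        if needle and needle not in f"{log.input}\n{log.output}".lower():
            continue
        results.append(log)
    return results

logs = [
    ConversationLog(date(2024, 5, 1), "support-bot", "alice",
                    "Where is my refund?", "Your refund was issued on April 28.",
                    tool_calls=["lookup_order"]),
    ConversationLog(date(2024, 5, 1), "support-bot", "bob",
                    "Reset my password", "A reset link has been sent."),
    ConversationLog(date(2024, 5, 2), "sales-bot", "alice",
                    "What plans do you offer?", "We offer three plans."),
]

refund_logs = search(logs, agent="support-bot", text="refund")
print([log.user for log in refund_logs])
```

Combining filters narrows the result set, so a compliance review could, for instance, pull every conversation for one user on one day.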
Self-Hosted Alternative
Self-hosted:
- Build monitoring from scratch
- $50-100/mo for Datadog/Grafana
- Infrastructure metrics only
- No agent quality tracking
Monitoring Dashboard:
- Built-in, zero configuration
- Included in every plan
- Infra + quality metrics
- Conversation logging included