
How to Monitor Your Paperclip Agents

May 27, 2026 · HostAgentes Team

Your agent is deployed and receiving traffic. But is it working well? Monitoring tells you. Here’s what to track and how to set it up.

Two Types of Monitoring

Infrastructure Monitoring

Is the server running? Is it responsive? This is traditional monitoring — CPU, memory, request latency, error rates. Necessary but insufficient.

Quality Monitoring

Is the agent making good decisions? Are users satisfied? Are tool calls succeeding? This is agent-specific monitoring — and it’s what separates good deployments from great ones.

You need both. Here’s how to approach each.

Infrastructure Metrics

These are the baseline health metrics:

Metric                    Healthy   Warning   Critical
Response latency (p50)    <2s       2-5s      >5s
Response latency (p95)    <5s       5-10s     >10s
Error rate                <1%       1-5%      >5%
CPU utilization           <60%      60-80%    >80%
Memory utilization        <70%      70-85%    >85%
Queue depth               <5        5-20      >20
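
If you export these metrics to your own tooling, the thresholds in the table above can be applied with a small lookup. The following is a minimal sketch, not part of the HostAgentes product: the metric keys (`latency_p50_s`, `error_rate_pct`, and so on) and the `classify` helper are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threshold:
    warning: float   # values at or above this are at least "warning"
    critical: float  # values strictly above this are "critical"


# Numbers come from the table above; the key names are illustrative assumptions.
THRESHOLDS = {
    "latency_p50_s": Threshold(2, 5),
    "latency_p95_s": Threshold(5, 10),
    "error_rate_pct": Threshold(1, 5),
    "cpu_pct": Threshold(60, 80),
    "memory_pct": Threshold(70, 85),
    "queue_depth": Threshold(5, 20),
}


def classify(metric: str, value: float) -> str:
    """Map a metric reading to 'healthy', 'warning', or 'critical'."""
    t = THRESHOLDS[metric]
    if value > t.critical:
        return "critical"
    if value >= t.warning:
        return "warning"
    return "healthy"
```

For example, `classify("latency_p95_s", 7.5)` returns `"warning"`, because 7.5 seconds falls in the 5-10s band.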

How to Monitor on HostAgentes

The built-in dashboard shows all infrastructure metrics in real time:

  • Request volume over time
  • Latency distribution (p50, p95, p99)
  • Error rate trends
  • Active instances and scaling events
  • Token usage by agent

No setup required. It works from the moment you deploy.

Quality Metrics

These measure how well your agent is performing its job:

Conversation Completion Rate

What percentage of conversations reach a natural conclusion (vs. users abandoning mid-conversation)?

  • Good: >80%
  • Needs attention: 60-80%
  • Problem: <60%

Low completion rates suggest your agent isn’t meeting user expectations.
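
Here is one way to compute the rate from your own conversation logs. This sketch assumes each log record carries a boolean `completed` field. That field is hypothetical, and how you decide that a conversation reached a "natural conclusion" is up to you.

```python
def completion_rate(conversations: list[dict]) -> float:
    """Percentage of conversations whose (assumed) 'completed' flag is true."""
    if not conversations:
        raise ValueError("need at least one conversation")
    completed = sum(1 for c in conversations if c["completed"])
    return 100.0 * completed / len(conversations)


def completion_band(rate_pct: float) -> str:
    """Apply the bands listed above: >80% good, 60-80% needs attention, <60% problem."""
    if rate_pct > 80:
        return "good"
    if rate_pct >= 60:
        return "needs attention"
    return "problem"
```

With three completed conversations out of four, `completion_rate` returns `75.0`, which `completion_band` reports as `"needs attention"`.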

Tool Call Success Rate

What percentage of tool calls succeed?

  • Good: >98%
  • Needs attention: 95-98%
  • Problem: <95%

A declining tool call success rate means something in your tool chain is breaking.
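
Breaking the rate down per tool makes it easier to see which link in the chain is failing. The sketch below assumes you can export tool calls as `(tool_name, succeeded)` pairs. That record shape and the function names are assumptions for illustration, not a HostAgentes export format.

```python
from collections import Counter
from typing import Iterable


def tool_success_rates(calls: Iterable[tuple[str, bool]]) -> dict[str, float]:
    """Return the success percentage for each tool name."""
    total: Counter = Counter()
    succeeded: Counter = Counter()
    for name, ok in calls:
        total[name] += 1
        if ok:
            succeeded[name] += 1
    return {name: 100.0 * succeeded[name] / total[name] for name in total}


def failing_tools(calls: Iterable[tuple[str, bool]], threshold_pct: float = 95.0) -> list[str]:
    """Names of tools whose success rate is below the threshold, sorted alphabetically."""
    rates = tool_success_rates(calls)
    return sorted(name for name, rate in rates.items() if rate < threshold_pct)
```

A tool that succeeds 9 times out of 10 has a 90% rate and would be returned by `failing_tools`, while one that succeeds 99 times out of 100 would not.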

User Satisfaction Score

Track explicit feedback (thumbs up/down or ratings) after conversations. The thresholds below assume a 5-point rating scale:

  • Good: >4.0/5.0
  • Needs attention: 3.0-4.0
  • Problem: <3.0

Hallucination Rate

Monitor for factual errors or fabricated information. This requires periodic human review of conversation logs. Flag conversations where the agent made unsupported claims, and report the flagged conversations as a share of those reviewed.
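
Reviewing every conversation rarely scales, so one common approach is to review a random sample. The sketch below is an assumption about how you might organize that review, not a HostAgentes feature: `sample_for_review` picks conversation IDs at random, and `hallucination_rate` turns the reviewers' flags into a percentage.

```python
import random
from typing import Iterable, Optional


def sample_for_review(conversation_ids: Iterable, k: int, seed: Optional[int] = None) -> list:
    """Pick up to k distinct conversation IDs uniformly at random."""
    ids = list(conversation_ids)
    rng = random.Random(seed)  # pass a seed to make the sample reproducible
    return rng.sample(ids, min(k, len(ids)))


def hallucination_rate(review_flags: dict) -> float:
    """Percentage of reviewed conversations flagged for unsupported claims.

    review_flags maps a conversation ID to True when a reviewer flagged it.
    """
    if not review_flags:
        raise ValueError("no conversations were reviewed")
    return 100.0 * sum(review_flags.values()) / len(review_flags)
```

If reviewers flag one conversation out of four, `hallucination_rate` returns `25.0`. Keep in mind that this is an estimate from a sample, so small samples will be noisy.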

Cost Per Conversation

Cost per conversation is the total LLM token cost divided by the number of conversations. Track it over time to catch cost regressions from prompt changes or model switches.
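
The arithmetic is simple enough to script. This sketch assumes your provider bills input and output tokens at separate per-1,000-token prices. The parameter names and the prices in the example are made up for illustration, so substitute your model's actual rates.

```python
def cost_per_conversation(
    input_tokens: int,
    output_tokens: int,
    conversations: int,
    price_in_per_1k: float,
    price_out_per_1k: float,
) -> float:
    """Total token cost divided by the number of conversations."""
    if conversations <= 0:
        raise ValueError("conversations must be positive")
    total_cost = (
        input_tokens / 1000 * price_in_per_1k
        + output_tokens / 1000 * price_out_per_1k
    )
    return total_cost / conversations
```

With 2,000,000 input tokens at a hypothetical 0.50 per 1,000 and 500,000 output tokens at 1.50 per 1,000, the total is 1,750. Spread over 1,000 conversations, that is 1.75 per conversation. Run the same calculation before and after a prompt change to see whether the change made conversations more expensive.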

Setting Up Alerts

On HostAgentes, configure alerts for:

  1. Error rate spike — error rate exceeds 5% for 5 minutes
  2. Latency degradation — p95 latency exceeds 10 seconds
  3. Tool failure — any tool’s success rate drops below 95%
  4. Cost anomaly — daily token spend exceeds 2x the 7-day average
  5. Agent unreachable — health check fails 3 times in a row

Alerts can be sent to email, Slack, or webhook endpoints.
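
To make the conditions concrete, here is how three of the rules above could be expressed as plain functions. This is only a sketch of the logic, with hypothetical function names and input shapes. It does not describe how HostAgentes evaluates alerts internally.

```python
from statistics import mean


def cost_anomaly(today_spend: float, previous_daily_spend: list, factor: float = 2.0) -> bool:
    """True when today's spend exceeds factor x the average of the previous 7 days."""
    last_week = previous_daily_spend[-7:]
    if len(last_week) < 7:
        return False  # not enough history to form a 7-day average
    return today_spend > factor * mean(last_week)


def agent_unreachable(health_checks: list, failures_needed: int = 3) -> bool:
    """True when the most recent checks (oldest first, True = passed) all failed."""
    recent = health_checks[-failures_needed:]
    return len(recent) == failures_needed and not any(recent)


def error_rate_spike(per_minute_error_pct: list, threshold_pct: float = 5.0, minutes: int = 5) -> bool:
    """True when every one of the last `minutes` readings is above the threshold."""
    recent = per_minute_error_pct[-minutes:]
    return len(recent) == minutes and all(r > threshold_pct for r in recent)
```

Requiring the condition to hold across a window (five minutes, three checks in a row) is what keeps a single bad reading from paging you.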

Monitoring Dashboard Walkthrough

The HostAgentes dashboard shows:

Overview Panel

  • Active agents and their status
  • Total requests today / this week
  • Average response quality score
  • Token usage and costs

Agent Detail View

  • Request volume over time (1h, 6h, 24h, 7d)
  • Latency distribution chart
  • Tool call breakdown with success rates
  • Recent conversations with quality indicators

Scaling Events

  • When instances were added or removed
  • Traffic patterns that triggered scaling
  • Current instance count and utilization

Best Practices

  1. Check the dashboard daily — even a 2-minute review catches issues early
  2. Review conversation logs weekly — look for patterns in bad responses
  3. Set up alerts rather than relying on dashboards alone, because alerts notify you of problems while dashboards only show data when you look
  4. Track quality over time — a single metric snapshot isn’t useful; trends are
  5. Compare before/after changes — every agent update should be followed by a quality comparison

Getting Started with Monitoring

Every HostAgentes deployment includes monitoring from day one. Deploy an agent and the dashboard starts tracking immediately.
