Chapter 11
Monitoring and Incident Response
You cannot defend what you cannot see. Implement telemetry across the full agent lifecycle and build incident response procedures that account for AI-specific failure modes. Without visibility, attacks go undetected and incidents spiral.
11.1 Structured Telemetry and Immutable Audit
For each agentic task, log at minimum these four categories:
Identity
- User ID (or pseudonymous ID), role, tenant/organization
- Agent ID and version
Request
- Timestamp, environment, region
- User prompt (sanitized, with PII and secrets redacted)
- High-level context such as retrieved document IDs, not full content
Actions
- Tools called and parameters (sanitized)
- Data domains touched (e.g., which tables, collections, or indices)
- Guardrails triggered and decisions taken
Outcome
- Final agent output (sanitized)
- Status: success, blocked, or error
- Any policy violations or escalations
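The four categories above could be captured in a single structured record per task. The following is a minimal sketch, not a reference schema: the field names, the `TaskRecord` class, and the two redaction patterns are illustrative assumptions, and a production system would use a vetted PII and secret scanner rather than a pair of regular expressions.

```python
import json
import re
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical redaction patterns, for illustration only.
_REDACTIONS = [
    (re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"), "[EMAIL]"),
    (re.compile(r"\b(?:sk|key|token)[-_][A-Za-z0-9]{8,}\b"), "[SECRET]"),
]


def sanitize(text: str) -> str:
    """Replace anything matching a redaction pattern with a placeholder."""
    for pattern, placeholder in _REDACTIONS:
        text = pattern.sub(placeholder, text)
    return text


@dataclass
class TaskRecord:
    # Identity
    user_id: str
    role: str
    tenant: str
    agent_id: str
    agent_version: str
    # Request
    environment: str
    region: str
    prompt: str
    retrieved_doc_ids: list[str] = field(default_factory=list)  # IDs only, not content
    # Actions
    tool_calls: list[dict] = field(default_factory=list)  # {"tool": ..., "params": {...}}
    data_domains: list[str] = field(default_factory=list)
    guardrails_triggered: list[str] = field(default_factory=list)
    # Outcome
    output: str = ""
    status: str = "success"  # success, blocked, or error
    violations: list[str] = field(default_factory=list)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize with prompt, output, and tool parameters sanitized."""
        record = asdict(self)
        record["prompt"] = sanitize(record["prompt"])
        record["output"] = sanitize(record["output"])
        record["tool_calls"] = [
            {
                "tool": call["tool"],
                "params": {k: sanitize(str(v)) for k, v in call.get("params", {}).items()},
            }
            for call in record["tool_calls"]
        ]
        return json.dumps(record, sort_keys=True)
```

Sanitizing at serialization time means the raw prompt never reaches the log sink, at the cost of trusting the redaction step completely.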
Store logs in append-only or tamper-evident storage. Use WORM (write once, read many), hash-chaining, or signed logs. Align retention periods with your regulatory requirements. If someone can modify or delete audit logs after the fact, you have lost your ability to investigate incidents reliably.
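Hash-chaining is the easiest of these options to illustrate. In the sketch below, each entry's hash covers the previous entry's hash plus its own payload, so editing or removing any earlier entry breaks verification. The `HashChainedLog` class and its in-memory list are assumptions for illustration; a real deployment would persist entries to durable storage and anchor the latest hash somewhere an attacker cannot rewrite.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    """SHA-256 over the previous hash and a canonical JSON form of the payload."""
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + body).hexdigest()


class HashChainedLog:
    def __init__(self):
        self._entries = []

    def append(self, payload: dict) -> str:
        """Add an entry linked to the one before it and return its hash."""
        prev_hash = self._entries[-1]["hash"] if self._entries else GENESIS
        entry_hash = _digest(prev_hash, payload)
        self._entries.append(
            {"prev": prev_hash, "payload": payload, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute every hash in order; any edit or deletion breaks the chain."""
        prev_hash = GENESIS
        for entry in self._entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True
```

This makes tampering evident, not impossible: someone with write access could rebuild the whole chain, which is why the most recent hash should also be recorded outside the log itself.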
11.2 Behavioral Monitoring and AI-Specific Detection
Static rules are not enough. You need to establish baselines per agent and watch for deviations.
Establish Baselines
For each agent, track:
- Normal tool usage frequency and mix
- Typical data volumes and classifications accessed
- Usual response lengths, latency, and behavioral patterns
Monitor for Anomalies
- Sudden spikes in high-risk tool calls, bulk data exports, or guardrail violations
- Activity at unusual times or from unexpected geographies
- Sudden shifts in agent behavior - tone changes, altered recommendations, or systematic policy deviations
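One simple way to turn a baseline into an alert is a z-score check on a per-agent metric, such as high-risk tool calls per hour. The sketch below is one possible approach under that assumption, not a recommended detector: the function name and the threshold of 3 standard deviations are illustrative, and real traffic often needs seasonality-aware or robust statistics instead.

```python
import statistics


def is_anomalous(history: list[float], current: float, z_threshold: float = 3.0) -> bool:
    """Flag `current` if it sits more than `z_threshold` standard deviations from the baseline."""
    if len(history) < 2:
        return False  # not enough data to establish a baseline
    mean = statistics.fmean(history)
    stdev = statistics.stdev(history)
    if stdev == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return current != mean
    return abs(current - mean) / stdev > z_threshold


# Hypothetical hourly counts of high-risk tool calls for one agent.
baseline = [10, 12, 11, 9, 10, 11, 12, 10]
print(is_anomalous(baseline, 11))  # within the normal range
print(is_anomalous(baseline, 60))  # a sudden spike
```

The same check can be run per metric (data volume, response length, latency), with a separate history kept for each agent.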
Detect AI-Specific Threats
- Prompt injection and jailbreak attempts - Repeated attempts to override system instructions
- Cross-tenant access attempts - Any probe for data outside the current tenant boundary
- Data exfiltration patterns - Unusually large responses, repeated "list all" requests, or attempts to encode data in output
- Tool abuse - Misuse of code-execution or generic HTTP tools beyond their intended scope
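Some of these threats can be surfaced as simple per-request signals that feed the baselines described earlier. The sketch below is deliberately naive and rests on several assumptions: the phrase list, the `tenant` key on each tool call, and the 20,000-character output limit are all hypothetical, and a static phrase list on its own is easy to evade.

```python
import re

# Hypothetical override phrases, for illustration only.
OVERRIDE_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior|above) instructions", re.I),
    re.compile(r"reveal (your |the )?system prompt", re.I),
]


def detect_signals(prompt, tool_calls, session_tenant, output, max_output_chars=20000):
    """Return a list of signal names raised by one agent request."""
    signals = []
    # Attempts to override system instructions.
    if any(p.search(prompt) for p in OVERRIDE_PATTERNS):
        signals.append("prompt_injection_attempt")
    # Tool calls that name a tenant other than the session's own.
    if any(call.get("tenant") not in (None, session_tenant) for call in tool_calls):
        signals.append("cross_tenant_access")
    # Unusually large responses can indicate bulk exfiltration.
    if len(output) > max_output_chars:
        signals.append("oversized_response")
    return signals
```

Counting these signals per user and per tenant over time gives you the "repeated attempts" view that a single-request check cannot provide.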
11.3 Automated Safeguards
Detection without response is just an expensive logging exercise. Build automatic controls that act on what you detect.
Circuit Breakers for Agents
If error rates or policy violation rates cross defined thresholds, disable the agent automatically or switch it to a degraded mode, such as read-only operation or no tool access. Do not let a malfunctioning agent keep operating at full capability while you figure out what went wrong.
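A circuit breaker of this kind can be sketched as a sliding window over recent task outcomes. The class name, window size, and 20% threshold below are illustrative assumptions. The sketch treats both `error` and `blocked` statuses as failures and stays in degraded mode until someone resets it explicitly, so the agent does not flap back to full capability on its own.

```python
from collections import deque


class AgentCircuitBreaker:
    def __init__(self, window: int = 50, max_failure_rate: float = 0.2, min_samples: int = 10):
        self._outcomes = deque(maxlen=window)  # True means the task failed
        self.max_failure_rate = max_failure_rate
        self.min_samples = min_samples
        self.mode = "full"

    def record(self, status: str) -> str:
        """Record one task outcome and return the agent's current mode."""
        self._outcomes.append(status in ("error", "blocked"))
        if len(self._outcomes) >= self.min_samples:
            rate = sum(self._outcomes) / len(self._outcomes)
            if rate > self.max_failure_rate:
                self.mode = "degraded"  # latched until reset() is called
        return self.mode

    def reset(self) -> None:
        """Return to full capability after a human has reviewed the cause."""
        self._outcomes.clear()
        self.mode = "full"
```

The caller is expected to check `mode` before dispatching tool calls and to refuse write operations or tool access while it reads `degraded`.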
Adaptive Security Posture
When threat levels are elevated:
- Disable risky tools temporarily
- Tighten rate limits
- Force human approval for actions that would normally be automated
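These three adjustments can be expressed as a posture table keyed by threat level. The sketch below assumes two levels and hypothetical tool names (`code_execution`, `http_request`). It enforces the tool and approval rules only; the `requests_per_minute` value is carried for a separate rate limiter to read.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    disabled_tools: frozenset
    requests_per_minute: int  # consumed by a rate limiter, not enforced here
    require_human_approval: bool


# Hypothetical postures; tool names and limits are placeholders.
POSTURES = {
    "normal": Posture(frozenset(), 60, False),
    "elevated": Posture(frozenset({"code_execution", "http_request"}), 20, True),
}


def authorize(tool: str, threat_level: str, approved_by_human: bool = False) -> bool:
    """Decide whether a tool call may proceed under the current posture."""
    posture = POSTURES[threat_level]
    if tool in posture.disabled_tools:
        return False  # risky tools are off entirely, approval or not
    if posture.require_human_approval and not approved_by_human:
        return False
    return True
```

Keeping the postures as data rather than scattered conditionals makes it possible to review and change them without touching the enforcement code.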
Quarantine Modes
For suspicious users or tenants, move them to stricter policies and manual review. This limits blast radius while you investigate, without shutting down the entire system.
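Quarantine differs from a global posture change in that it is scoped to one user or tenant. A minimal sketch, assuming a hypothetical in-memory registry and two policy names (`default` and `strict`):

```python
class QuarantineRegistry:
    def __init__(self):
        self._reasons = {}  # tenant_id -> reason for quarantine

    def quarantine(self, tenant_id: str, reason: str) -> None:
        """Place a tenant under the stricter policy, recording why."""
        self._reasons[tenant_id] = reason

    def release(self, tenant_id: str) -> None:
        """Return a tenant to the default policy once the review is complete."""
        self._reasons.pop(tenant_id, None)

    def policy_for(self, tenant_id: str) -> dict:
        """Look up the policy to apply to this tenant's requests."""
        if tenant_id in self._reasons:
            return {
                "policy": "strict",
                "manual_review": True,
                "reason": self._reasons[tenant_id],
            }
        return {"policy": "default", "manual_review": False, "reason": None}
```

Every other tenant keeps the default policy, which is what limits the blast radius without taking the system offline.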
11.4 AI-Specific Incident Response
AI incidents are real incidents. Treat them as first-class concerns, integrated with your existing security operations.
Common Incident Classes
- Data leakage - PII, secrets, or confidential data exposed through agent outputs
- Tool misuse - Unauthorized changes made through agent tool calls
- RAG or memory poisoning - Corrupted retrieval data or manipulated agent memory
- Unsafe or harmful outputs - Toxic, biased, or dangerous content reaching production users
- Provider compromise or misconfiguration - Issues at the model provider or infrastructure level
For Each Class, Define a Runbook
1. Detection. Which alerts or metrics indicate the problem? Define specific thresholds and signals so your team knows what to look for.
2. Containment. Disable affected agents or tools. Revoke or rotate compromised credentials. Lock down network access if needed. Speed matters here.
3. Triage and analysis. Determine the scope: which tenants, users, and data were affected, over what time period, and through which flows. Identify root cause - was it prompt injection, misconfiguration, a code bug, or infrastructure compromise?
4. Remediation. Fix the code or policies, and patch the affected components. Clean or roll back poisoned memory and RAG indices. Restore systems under stricter observation until you have confidence the fix holds.
5. Communication. Notify internal stakeholders immediately. In regulated industries, or where contracts require it, notify customers and regulators as applicable law and agreements dictate.
6. Learning and improvement. Update your threat models, test suites, guardrails, and runbooks based on what you learned. Every incident should make the system harder to attack next time.