Chapter 11

Monitoring and Incident Response

You cannot defend what you cannot see. Implement telemetry across the full agent lifecycle and build incident response procedures that account for AI-specific failure modes. Without visibility, attacks go undetected and incidents spiral.

11.1 Structured Telemetry and Immutable Audit

For each agentic task, log at minimum these four categories:

Identity

User ID (or pseudonymous ID), role, tenant/organization
Agent ID and version

Request

Timestamp, environment, region
User prompt (sanitized - PII and secrets redacted)
High-level context such as retrieved document IDs, not full content

Actions

Tools called and parameters (sanitized)
Data domains touched (e.g., which tables, collections, or indices)
Guardrails triggered and decisions taken

Outcome

Final agent output (sanitized)
Status: success, blocked, or error
Any policy violations or escalations
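One way to make the four categories concrete is a single record type per agentic task. The sketch below is illustrative only: the field names, the `AuditRecord` class, and the two redaction regexes are assumptions made for this example, and a production system would use a dedicated PII and secret scanner instead of two patterns.

```python
import json
import re
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone

# Hypothetical redaction patterns, for illustration only.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
_SECRET = re.compile(r"(?i)\b(api[_-]?key|token|password)\s*[=:]\s*\S+")


def sanitize(text: str) -> str:
    """Redact e-mail addresses and obvious key=value secrets."""
    text = _EMAIL.sub("[EMAIL]", text)
    return _SECRET.sub(lambda m: f"{m.group(1)}=[REDACTED]", text)


@dataclass
class AuditRecord:
    # Identity
    user_id: str
    role: str
    tenant: str
    agent_id: str
    agent_version: str
    # Request
    environment: str
    region: str
    prompt: str
    context_doc_ids: list[str] = field(default_factory=list)
    # Actions (tool parameters are assumed to be sanitized by the caller)
    tool_calls: list[dict] = field(default_factory=list)
    data_domains: list[str] = field(default_factory=list)
    guardrails: list[dict] = field(default_factory=list)
    # Outcome
    output: str = ""
    status: str = "success"  # success, blocked, or error
    violations: list[str] = field(default_factory=list)
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        """Serialize with the free-text fields sanitized."""
        record = asdict(self)
        record["prompt"] = sanitize(record["prompt"])
        record["output"] = sanitize(record["output"])
        return json.dumps(record, sort_keys=True)
```

Sanitizing at serialization time means the raw prompt never reaches the log sink, at the cost of keeping it in memory on the record object until then.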

Store logs in append-only or tamper-evident storage. Use WORM (write once, read many), hash-chaining, or signed logs. Align retention periods with your regulatory requirements. If someone can modify or delete audit logs after the fact, you have lost your ability to investigate incidents reliably.
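Of the three options, hash-chaining is the easiest to sketch in a few lines. In the minimal example below, each entry stores the SHA-256 hash of the previous entry, so editing or removing any earlier record invalidates every hash after it. The `HashChainedLog` class and its in-memory list are assumptions for illustration; a real deployment would persist entries to durable storage and anchor the latest hash somewhere the log writer cannot modify, since a chain alone does not reveal truncation of the tail.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, payload: dict) -> str:
    # Canonical JSON so the same payload always hashes identically.
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


class HashChainedLog:
    def __init__(self):
        self.entries = []

    def append(self, payload: dict) -> str:
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS
        entry_hash = _digest(prev_hash, payload)
        self.entries.append(
            {"prev": prev_hash, "payload": payload, "hash": entry_hash}
        )
        return entry_hash

    def verify(self) -> bool:
        """Recompute the chain; False means an entry was altered or removed."""
        prev_hash = GENESIS
        for entry in self.entries:
            if entry["prev"] != prev_hash:
                return False
            if entry["hash"] != _digest(prev_hash, entry["payload"]):
                return False
            prev_hash = entry["hash"]
        return True
```

Running `verify()` on a schedule, and comparing the final hash against the externally anchored copy, turns silent tampering into a detectable event.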

11.2 Behavioral Monitoring and AI-Specific Detection

Static rules are not enough. You need to establish baselines per agent and watch for deviations.

Establish Baselines

For each agent, track:

Normal tool usage frequency and mix
Typical data volumes and classifications accessed
Usual response lengths, latency, and behavioral patterns

Monitor for Anomalies

Sudden spikes in high-risk tool calls, bulk data exports, or guardrail violations
Activity at unusual times or from unexpected geographies
Sudden shifts in agent behavior - tone changes, altered recommendations, or systematic policy deviations
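A baseline comparison can be as simple as a z-score over a recent window of one metric, such as high-risk tool calls per hour for one agent. The sketch below is one possible approach, not a recommended detector: the function name, the threshold of 3 standard deviations, and the minimum sample count are all assumptions chosen for illustration, and real traffic with daily or weekly seasonality usually needs a more robust model.

```python
from statistics import mean, stdev


def is_anomalous(history, current, z_threshold=3.0, min_samples=5):
    """Flag `current` if it deviates strongly from the agent's own baseline.

    history: past values of one metric for one agent.
    """
    if len(history) < min_samples:
        return False  # not enough data to form a baseline yet
    mu = mean(history)
    sigma = stdev(history)
    if sigma == 0:
        # A perfectly flat baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > z_threshold
```

Keeping one baseline per agent and per metric matters here: a volume that is routine for a reporting agent may be a strong signal for a support agent.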

Detect AI-Specific Threats

Prompt injection and jailbreak attempts - Repeated attempts to override system instructions
Cross-tenant access attempts - Any probe for data outside the current tenant boundary
Data exfiltration patterns - Unusually large responses, repeated "list all" requests, or attempts to encode data in output
Tool abuse - Misuse of code-execution or generic HTTP tools beyond their intended scope
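Some of these signals can be approximated with cheap per-request checks that feed the alerting pipeline. The following is a rough sketch under stated assumptions: the regex patterns, the `detect_signals` function, its parameters, and the 20,000-character threshold are all hypothetical, and keyword matching of this kind is easy to evade. Treat it as one input to detection, not as a defense on its own.

```python
import re

# Illustrative patterns only; attackers rephrase freely.
INJECTION_PATTERNS = [
    re.compile(r"(?i)ignore (all |any )?(previous|prior|above) instructions"),
    re.compile(r"(?i)reveal (your |the )?system prompt"),
]
LIST_ALL = re.compile(r"(?i)\b(list|dump|export) (all|every)\b")


def detect_signals(prompt, response, session_tenant, resource_tenants,
                   max_response_chars=20000):
    """Return the names of AI-specific threat signals seen in one request.

    resource_tenants: tenant IDs that own the resources the agent touched.
    """
    signals = []
    if any(p.search(prompt) for p in INJECTION_PATTERNS):
        signals.append("prompt_injection")
    if any(t != session_tenant for t in resource_tenants):
        signals.append("cross_tenant_access")
    if LIST_ALL.search(prompt) or len(response) > max_response_chars:
        signals.append("possible_exfiltration")
    return signals
```

A single hit is weak evidence. Counting signals per user or tenant over time is what separates a curious user from the repeated attempts described above.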

11.3 Automated Safeguards

Detection without response is just an expensive logging exercise. Build automatic controls that act on what you detect.

Circuit Breakers for Agents

If error rates or policy violation rates cross defined thresholds, automatically disable the agent or switch it to a degraded mode, for example read-only operation with tool access removed. Do not let a malfunctioning agent keep operating at full capability while you figure out what went wrong.
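A circuit breaker of this kind can be sketched as a sliding window over recent task outcomes. The class name, the window size, and the 20% threshold below are assumptions for illustration; the appropriate values depend on the agent's traffic and risk profile.

```python
from collections import deque


class AgentCircuitBreaker:
    """Trips an agent into degraded mode when recent failures exceed a rate."""

    def __init__(self, window=50, max_failure_rate=0.2, min_calls=10):
        self.results = deque(maxlen=window)  # True = ok, False = error/violation
        self.max_failure_rate = max_failure_rate
        self.min_calls = min_calls
        self.mode = "full"

    def record(self, ok: bool) -> str:
        """Record one task outcome and return the resulting mode."""
        self.results.append(ok)
        if len(self.results) >= self.min_calls:
            failure_rate = self.results.count(False) / len(self.results)
            if failure_rate > self.max_failure_rate:
                self.mode = "degraded"
        return self.mode

    def allows_tools(self) -> bool:
        return self.mode == "full"

    def reset(self) -> None:
        """Manual re-enable, intended to be called after human review."""
        self.results.clear()
        self.mode = "full"
```

The breaker deliberately does not recover on its own: once tripped, it stays in degraded mode until someone calls `reset()`, which keeps a human in the loop before full capability returns.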

Adaptive Security Posture

When threat levels are elevated:

Disable risky tools temporarily
Tighten rate limits
Force human approval for actions that would normally be automated
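These three adjustments can be expressed as a posture table keyed by threat level, so that raising the level is a single configuration change. The sketch below is hypothetical throughout: the tool names, the rate-limit numbers, and the `authorize` function are invented for illustration, and the rate limit is only declared here, with enforcement assumed to live in a separate limiter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Posture:
    disabled_tools: frozenset
    requests_per_minute: int  # consumed by a rate limiter, not enforced here
    require_human_approval: bool


# Hypothetical tool names and limits.
POSTURES = {
    "normal": Posture(frozenset(), 60, False),
    "elevated": Posture(frozenset({"code_exec", "http_request"}), 10, True),
}


def authorize(tool: str, threat_level: str, human_approved: bool = False) -> str:
    """Return "allow", "deny", or "needs_approval" for one tool call."""
    posture = POSTURES[threat_level]
    if tool in posture.disabled_tools:
        return "deny"
    if posture.require_human_approval and not human_approved:
        return "needs_approval"
    return "allow"
```

For example, `authorize("search", "elevated")` returns `"needs_approval"`, while the same call at the `"normal"` level returns `"allow"`.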

Quarantine Modes

For suspicious users or tenants, move them to stricter policies and manual review. This limits blast radius while you investigate, without shutting down the entire system.

11.4 AI-Specific Incident Response

AI incidents are real incidents. Treat them as first-class concerns, integrated with your existing security operations.

Common Incident Classes

Data leakage - PII, secrets, or confidential data exposed through agent outputs
Tool misuse - Unauthorized changes made through agent tool calls
RAG or memory poisoning - Corrupted retrieval data or manipulated agent memory
Unsafe or harmful outputs - Toxic, biased, or dangerous content reaching production users
Provider compromise or misconfiguration - Issues at the model provider or infrastructure level

For Each Class, Define a Runbook

1. Detection. Which alerts or metrics indicate the problem? Define specific thresholds and signals so your team knows what to look for.

2. Containment. Disable affected agents or tools. Revoke or rotate compromised credentials. Apply network lockdown if needed. Speed matters here.

3. Triage and analysis. Determine the scope: which tenants, users, and data were affected, over what time period, and through which flows. Identify the root cause: was it prompt injection, misconfiguration, a code bug, or infrastructure compromise?

4. Remediation. Fix the code or policies, and patch the affected components. Clean or roll back poisoned memory and RAG indices. Restore systems under stricter observation until you have confidence the fix holds.

5. Communication. Notify internal stakeholders immediately. Where regulations or contracts require it, notify customers and regulators within the timelines set by applicable law and agreements.

6. Learning and improvement. Update your threat models, test suites, guardrails, and runbooks based on what you learned. Every incident should make the system harder to attack next time.
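One way to keep runbooks from drifting is to store them as data and check in CI that every incident class covers all six steps. The sketch below assumes runbooks are kept as a dictionary of incident class to phase descriptions; the identifiers and the `missing_runbook_phases` helper are hypothetical names for this example.

```python
# The six runbook steps and five incident classes listed above.
PHASES = (
    "detection", "containment", "triage",
    "remediation", "communication", "learning",
)
INCIDENT_CLASSES = (
    "data_leakage", "tool_misuse", "rag_or_memory_poisoning",
    "unsafe_output", "provider_compromise",
)


def missing_runbook_phases(runbooks: dict) -> dict:
    """Map each incident class to the phases its runbook does not define.

    An empty or absent phase entry counts as missing.
    """
    gaps = {}
    for incident_class in INCIDENT_CLASSES:
        runbook = runbooks.get(incident_class, {})
        missing = [phase for phase in PHASES if not runbook.get(phase)]
        if missing:
            gaps[incident_class] = missing
    return gaps
```

An empty result means every class has every step filled in. It says nothing about whether the steps are any good, which still takes review and exercises.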
