Identical content can be safe or malicious depending on context. Traditional enterprise security wasn’t built to tell the difference.
The wrong question
Enterprise security was built to answer one question: What is the system doing? Regex classifiers scan for prohibited words. DLP flags sensitive patterns. Identity platforms verify who is acting.
None of these asks the harder question: why?
Autonomous AI agents are now in production, approving refunds, querying customer platforms, and triggering workflows across sensitive systems. The security stack wasn’t built for them. It was built to inspect nouns: content, data, identity, and permissions. Agents operate in verbs: plan, reason, retrieve, execute, escalate. Until security can evaluate verbs in context, agentic AI will outrun the static controls built for an earlier era.
From responder to actor
The shift from chatbots to agents is the shift from language to behavior. A chatbot says things. An agent does things. That sounds like a small distinction. Operationally, it’s everything.
Talk to CISOs deploying agents right now, and the realization arrives in stages. First, they discover agents in their environment that they didn’t know existed. Then they realize they have no way to assess what those agents are vulnerable to, no pen testing or red teaming built for this class of system. Then comes the harder admission: even where guardrails exist, they were designed for the prompt era, not the runtime era. The real concern isn’t what data the agent receives or returns; it’s what the agent actually does with that data.
The security question moves from “what did the model output” to “what did the model set in motion.” Nouns describe a state. Verbs describe consequences. Legacy tools inspect the former. The damage happens in the latter.
Why nouns fail: the context problem
The core failure of content-based security in agentic systems is that identical content can be benign or malicious depending on context.
Consider a retail agent built to recommend products. Its job is to suggest a pair of shoes and link the customer to checkout. In one session, that’s exactly what it does. In another, the agent has been poisoned via a manipulated input. The same recommendation pattern, with identical surface-level content, now directs the customer to a phishing link.
The output appears to be a normal recommendation. The intent has been hijacked. No keyword filter catches that, because there are no keywords to catch.
Or consider an agent with database access. Each query it issues passes every existing control. Permissions check out. Schemas check out. Nothing in any individual query trips a rule. Then it deletes records at scale, well beyond anything its task required. We’ve seen this exact scenario across multiple organizations. Every step clears every control. The action that should never have happened happened anyway.
Three failure modes are worth naming; a short detection sketch follows the list:
- Single-prompt blindness. Content filters evaluate fragments. Agents accumulate context across sessions, memory, and tool outputs. The risk lives in the sequence, not the sentence.
- Time-dimension blindness. A data access request that’s legitimate at step one can be a policy violation by step six. Static filters don’t have a clock.
- Aggregation blindness. An agent exfiltrating records through fifty individually benign queries will never fail a keyword check. The whole is harmful; no part is.
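
To make the time-dimension and aggregation failures concrete, here is a minimal sketch of what sequence-level evaluation might look like. It assumes a hypothetical action log and illustrative per-task budgets (the `AgentAction` record, `TASK_BUDGETS` values, and window length are invented for the example, not drawn from any product or from the incidents described above).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class AgentAction:
    operation: str        # e.g. "db.delete", "db.select"
    rows_touched: int
    timestamp: datetime


# Illustrative budgets: how much of each operation a single task plausibly needs.
TASK_BUDGETS = {"db.delete": 50, "db.select": 5_000}


def sequence_violations(actions, window=timedelta(minutes=30)):
    """Flag cumulative activity that exceeds a task budget within a time window,
    even when every individual action passed its per-request checks."""
    violations = []
    for op, budget in TASK_BUDGETS.items():
        recent = []   # actions of this operation inside the sliding window
        total = 0
        for act in sorted((a for a in actions if a.operation == op),
                          key=lambda a: a.timestamp):
            recent.append(act)
            total += act.rows_touched
            # Slide the window forward: drop anything older than `window`.
            while act.timestamp - recent[0].timestamp > window:
                total -= recent[0].rows_touched
                recent.pop(0)
            if total > budget:
                violations.append((op, act.timestamp, total))
    return violations
```

The thresholds themselves are not the point. The point is that the check runs over the accumulated sequence inside a time window, which is exactly what a per-request keyword or permission check never sees.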
This is also why purpose-built guardrail models, even ones explicitly designed for AI safety, leak. Recent red-team testing of a leading open-source guardrail model showed a bypass rate approaching 50% across 1,500 adversarial prompts, with the worst failures in high-risk scenarios. The model wasn’t broken. It was doing exactly what it was trained to do: evaluate content. It just couldn’t see what the content was for.
What “thinking in verbs” actually requires
The solution requires behavioral intent analysis, not as a new product category, but as a new security discipline. Security teams adopting agentic AI need a layer that evaluates four things at once:
- Goal alignment. Does the action serve the user’s stated objective, or has the objective quietly shifted under the agent’s feet?
- Scope adherence. Is the agent operating within the authority its developer defined, or has it drifted into territory it was never authorized to operate in?
- Behavioral consistency. Does this action fit the pattern of how this agent, this user, this workflow normally behaves?
- External-input integrity. Is the agent following the user’s instructions, or has a retrieved document, tool output, or third-party input silently redirected it?
When these four signals agree, the action is probably safe. When they diverge, that divergence is the alert. Not the content. The behavior.
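
What that layer might look like in code, as a rough sketch: each of the four signals becomes a score, and the alert condition is divergence between them rather than anything in the content. The scoring functions are stubbed out here; in practice each would be its own model or heuristic (embedding similarity for goal alignment, policy lookup for scope, a learned baseline for consistency, provenance tracking for input integrity).

```python
from dataclasses import dataclass


@dataclass
class IntentSignals:
    goal_alignment: float          # 0.0 (hijacked) .. 1.0 (serves the stated objective)
    scope_adherence: float         # within the authority the developer defined?
    behavioral_consistency: float  # fits this agent's / user's / workflow's normal pattern?
    input_integrity: float         # following the user, or a poisoned retrieval?


def evaluate_action(signals: IntentSignals, threshold: float = 0.5) -> str:
    """Alert on divergence between signals, not on the content of the action."""
    values = [
        signals.goal_alignment,
        signals.scope_adherence,
        signals.behavioral_consistency,
        signals.input_integrity,
    ]
    # If every signal is healthy, allow. If one signal collapses while the
    # others look fine, that divergence -- not a keyword match -- is the alert.
    if min(values) >= threshold:
        return "allow"
    if max(values) - min(values) > 0.4:
        return "alert: signals diverge"
    return "review: uniformly low confidence"


# Example: the output looks like a normal recommendation, but a retrieved
# document has redirected the agent, so input integrity collapses.
print(evaluate_action(IntentSignals(0.9, 0.9, 0.8, 0.1)))  # -> "alert: signals diverge"
```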
Behavioral baselines aren’t instant. A useful baseline depends on volume: an agent that performs the same operation thousands of times a day produces enough data to flag the one operation that doesn’t fit. An agent that runs three times a week doesn’t. The honest answer to “how long until the baseline is useful?” is “it depends on how heavily the agent is used.” That’s a different conversation than security teams are used to having, and it’s the right one.
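
A small illustration of why volume matters, assuming a deliberately simple statistical baseline: the interesting part is not the anomaly test but the gate that makes the system abstain until it has seen enough. The `MIN_SAMPLES` figure is illustrative only.

```python
import statistics

MIN_SAMPLES = 500  # illustrative: below this, the baseline abstains


def check_against_baseline(history: list[float], new_value: float) -> str:
    """Return a verdict only once the agent has produced enough history."""
    if len(history) < MIN_SAMPLES:
        # An agent that runs three times a week never crosses this gate quickly,
        # so the honest answer is "not enough data yet", not a confident verdict.
        return "baseline immature: monitor only"
    mean = statistics.mean(history)
    stdev = statistics.stdev(history) or 1e-9
    z = (new_value - mean) / stdev
    return "anomalous" if abs(z) > 3 else "consistent with baseline"
```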
Where legacy still works, and where it stops
None of this means content scanning is obsolete. For structured, well-defined contexts, where access rules are clean and the question is binary, traditional controls still do their job. The distinction isn’t “legacy bad, behavioral good.” It’s structural.
Where content-based controls hold up: enforcing access rules with clear allow/deny logic, blocking known-bad payloads, scanning for regulated data classes leaving defined boundaries.
Where they break: when the question becomes “should this user be asking this right now,” or “is this agent allowed to perform this operation in this sequence.” Standard access tools don’t carry that context. They were never built to.
Agentic AI sits almost entirely on the second side of that line.
The honest part
There’s a temptation, every time a new attack surface emerges, to claim the industry has the answer. We don’t. The market is moving faster than any of us can fully map. What’s true today about agentic security may be incomplete in three to six months, and irrelevant in twelve. That’s the operating reality. Pretending otherwise is the failure mode.
So the contrarian take isn’t a prediction about where we’ll be in eighteen months. It’s a refusal to plan that far. The teams that will get this right are the ones that stay honest about what they don’t yet know, work in tighter loops, and treat agentic security as a discipline still being defined, not a solved category.
For CISOs reading this on a Monday morning, the practical first step is refreshingly focused: get specific about what you don’t yet know. Run a discovery exercise to identify every autonomous agent already running in your environment. For each one, ask a harder question than usual, not just “what is it doing?” but “why is it doing it, and should it be?” If you can’t answer with confidence, that’s your gap. Start there. Pilot a behavioral monitoring layer on your highest-risk agents, measure the divergence signals, and iterate quickly.
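
One way to force those questions into the open is to make the discovery exercise produce a record per agent, with the unanswerable fields left visibly empty. A hypothetical sketch, with illustrative field names:

```python
from dataclasses import dataclass, field


@dataclass
class AgentInventoryEntry:
    name: str
    owner: str                        # who is accountable for this agent
    stated_purpose: str               # what it is supposed to do
    observed_actions: list[str] = field(default_factory=list)  # what it actually does
    authorized_scope: str = "unknown"  # the authority its developer defined
    why_confident: str = ""            # evidence the behavior matches the purpose

    def gaps(self) -> list[str]:
        """Anything you cannot answer with confidence is the gap to start with."""
        missing = []
        if self.authorized_scope == "unknown":
            missing.append("scope undefined")
        if not self.why_confident:
            missing.append("intent unverified")
        if not self.observed_actions:
            missing.append("no runtime visibility")
        return missing
```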
The teams that will define agentic security aren’t waiting for the perfect solution. They are the ones building visibility and control in real time, while the rest are still producing green dashboards as agents take unapproved actions. Security built for nouns will keep missing the verbs, and the shift is already overdue.