
CoSAI at RSAC 2026: Leading the Conversation on Secure AI
February 11, 2026Our latest paper maps the security challenges of a world where AI doesn’t just answer questions — it acts.
Picture this: it’s 2 a.m., an outage hits your production infrastructure, and no human has noticed yet. Within seconds, an autonomous AI agent wakes up, pulls the relevant logs, creates a Slack channel, checks the on-call rotation, and starts triaging the root cause. By the time your engineer picks up their phone ten minutes later, there’s already a pull request waiting for review.
This isn’t a vendor pitch or a thought experiment. It’s a composite based on how some of the most technically advanced companies actually operate today, and it’s the opening scenario of our new paper, The Future of Agentic Security: From Chatbots to Autonomous Swarms.
KEY TAKEAWAYS
- The attack surface has shifted to the semantic layer, where agents negotiate intent and delegate tasks in plain language.
- Legacy controls (static RBAC, regex DLP, OS-level EDR) are blind to how agentic systems operate.
- Visibility is not control. Reading an agent’s chat history is not the same as verifying code integrity.
- Two unsolved problems, intent-based authorization and the semantic mosaic effect, have no proven fix yet.
- Organizations that deploy agents without purpose-built security infrastructure will face risks that are hard to retrofit later.
WHAT SECURITY LEADERS SHOULD DO NOW
- Treat agents as ephemeral infrastructure. Spin up, execute, tear down. No persistent footholds.
- Replace static service tokens with dynamic, context-aware credentials scoped to each task.
- Enforce GitOps boundaries. Agents write pull requests; humans and pipelines handle deployment.
- Invest in Agent Detection and Response (ADR). Traditional EDR is blind to the LLM cognitive layer.
- Design review interfaces that surface agent uncertainty and resist automation bias.
- Begin building Semantic-Layer DLP. Pattern matching cannot catch inference-based data leakage.
From Tool to Colleague
Most organizations are still in the early stages of AI adoption: someone opens a chat window, types a question, and gets a response. Useful, but relatively contained. The shift we’re tracking is more fundamental. AI is increasingly being deployed not as a tool you invoke but as an autonomous participant in your operations, something that watches for events, makes decisions, coordinates with other AI systems, and takes action, all without waiting to be asked.
We use the term “agentic swarms” to describe these systems, and the name is apt. In the 2 a.m. scenario, multiple specialized AI agents collaborate in a shared Slack channel: one handles triage, another drafts the code fix, a third searches internal documentation for the right contacts. They delegate tasks to each other in plain English. They say “thank you” and “you’re welcome.” And they get the job done faster than any human team could at that hour.
The operational appeal is obvious. The security implications are profound and largely unsolved.
The Core Problem: The Attack Surface Has Moved
Traditional security tools are built around a fairly stable set of assumptions. Access control lists, network perimeters, endpoint monitoring, all of these are designed for a world where humans initiate actions and machines execute them in predictable ways.
Autonomous AI agents break those assumptions in several ways at once.
First, these systems need access to do their jobs. An on-call agent that can’t read logs, check dashboards, or create communication channels isn’t very useful. But broad access in the hands of an autonomous system creates risks that didn’t exist before. If an attacker can slip malicious instructions into a log file the agent reads, a technique called “prompt injection,” they can potentially hijack the agent’s legitimate access for their own purposes. The agent becomes an omniscient corporate oracle for frictionless reconnaissance, doing the attacker’s information gathering for them using access that was entirely legitimate.
Second, when agents communicate with each other in natural language, the usual guardrails don’t apply. A less-privileged agent can ask a more-privileged agent to do something on its behalf, and unless the receiving agent is designed to refuse, the result is an effective end-run around access controls. This is a multi-agent version of the “confused deputy problem,” a classic security failure now appearing in a new and harder-to-detect form.
Third, and perhaps most concerning for executives, there’s what we call the “semantic mosaic effect.” An AI agent with broad read access to company information can synthesize sensitive insights from many innocuous sources, without ever quoting a single protected document. Pattern-based data loss prevention tools, which work by scanning for specific keywords or identifiers, are essentially blind to this kind of leakage. An agent doesn’t have to steal a document to expose what’s in it.
Visibility Is Not Control
Our most pointed observation may be about something companies tend to find reassuring: the chat channel.
Because agents work in Slack (or a similar platform) alongside humans, there’s a natural sense that people are keeping an eye on things. The human engineer can scroll through the conversation, see what the agents did, and approve the final pull request. It feels like oversight.
It isn’t, necessarily. Reading a chat history is not the same as verifying the integrity of the code being proposed. If a compromised agent has been operating in sub-channels the human never saw, the summary in the main channel may be a curated version of events. And at 2 a.m., after a rushed scroll through an AI-generated incident timeline, how thoroughly is anyone really reviewing the proposed code change?
We use the term “automation bias” for the tendency to trust AI outputs more than we should, especially when those outputs are presented with confidence and clarity. Designing review interfaces that actively surface uncertainty — rather than presenting every recommendation as equally authoritative — is one of the open research problems we flag in the paper.
What Good Security Looks Like in This World
The paper isn’t just a catalog of problems. We also outline what a more secure agentic architecture looks like. A few principles stand out.
Treat agents like ephemeral infrastructure. Just as modern cloud deployments avoid long-lived servers in favor of containers that spin up and tear down, agent processes should run in isolated, temporary environments that are destroyed when the task completes. A compromised agent that leaves no persistent foothold is a far smaller problem than one that lingers in the environment.
Stop relying on static credentials. Long-lived service tokens embedded in agent environments are a significant vulnerability. The alternative is dynamic, context-aware credential management — granting access based on what the specific task requires, automatically constraining that access when the situation changes, and revoking it when the task is done. SPIFFE/SPIRE has emerged as the foundational approach for AI agent identity, with Google Cloud now offering native agent identity support. W3C DIDs, Agent Cards (well-known/agent-card.json), and OAuth 2.1 with delegation patterns are also converging.
Keep code and deployment separated. An agent that can write code should not be able to deploy it. We advocate for immutable GitOps enforcement: agents submit pull requests, humans review them, and a formal pipeline handles deployment. This mirrors controls that exist in well-run software development organizations and for similar reasons.
Build oversight into the architecture. The paper describes an “Oversight Agent” concept — a specialized system whose only job is monitoring the behavior of other agents and flagging or containing anything that looks wrong. Think of it as a security operations center running at machine speed.
Recognize that traditional security tools have a blind spot. Endpoint detection and response systems watch what happens at the operating system level. They see a container spin up. They don’t see what the AI inside that container is reasoning about, what instructions it just received, or what it’s about to do. A new category of tooling — Agent Detection and Response, or ADR — is emerging to fill this gap, but it’s early and the standards are still being written.
The Limits of Current Approaches
We want to be candid about what remains unsolved, because the honest answer matters more than a reassuring one. Two problems in particular are fundamental rather than merely difficult.
The first is intent-based authorization. Traditional access control works by granting specific permissions to specific identities. But when agents communicate in natural language, the question isn’t just who is asking — it’s what they’re actually trying to accomplish. There is currently no reliable way to evaluate the semantic intent of a natural language request and authorize or deny it accordingly. This is the most basic access control question in an agentic world, and it remains early-stage and unproven at scale.
The second is the semantic mosaic effect. No existing data loss prevention architecture can reliably detect when an AI agent is inferring and communicating sensitive information through paraphrase, summary, or inference rather than direct quotation. This is not a gap that better pattern-matching will close. It requires a fundamentally different approach to detecting data exposure, and that approach doesn’t yet exist.
Why This Matters Beyond the Security Team
This paper is addressed primarily to security and technology professionals, but its implications reach into the boardroom. The efficiency gains from autonomous AI agents are real and compelling. Organizations that can respond to incidents at 2 a.m. without waking anyone up, draft and review code changes in minutes, and coordinate complex workflows across systems and teams will have genuine operational advantages over those that can’t.
The question we’re really asking is whether organizations will build the security infrastructure to support that capability safely, or whether they’ll race to deploy these systems and discover the risks after the fact.
This work connects to research we’ve been publishing over the past year. Our January paper on Model Context Protocol security addressed the risks in the technical plumbing that connects AI agents to external tools and data. Our AI incident response framework, released last fall, tackled how organizations should respond when AI systems behave unexpectedly. This new paper extends that body of work into fully autonomous, multi-agent territory — which is where the frontier companies already are.
The window to define the right security architecture for these systems is narrowing. The more autonomous AI expands into enterprise operations, the harder it becomes to retrofit controls after the fact. That’s a message for security teams, and it’s equally a message for every executive deciding how quickly and broadly to deploy AI agents across their organization.
Read the full paper, “The Future of Agentic Security: From Chatbots to Autonomous Swarms,” at coalitionforsecureai.org. CoSAI is an OASIS Open Project bringing together AI and security experts from industry-leading organizations to develop practical guidance for safe AI deployment.




