X just launched XMCP — a Model Context Protocol server that lets AI agents post, reply, search, and act autonomously on the platform. Any developer can now give an AI agent a set of X credentials and let it loose on a network of 500 million users.
This is not a feature launch. It's a governance gap opening in real time.
XMCP gives AI agents native access to X's core actions: posting, replying, sending DMs, searching, reading timelines. Combine that with the xAI credits bundled into X API access, which strongly suggests Grok-powered agents are the intended use case, and the platform is effectively inviting autonomous agents to participate in public discourse at scale.
The technical capability is real and it's impressive. An agent can now monitor a brand's mentions, draft and post replies, escalate issues to a human, and log everything — all without a human in the loop for each action.
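For concreteness, here is roughly what driving such an agent looks like from the client side of an MCP connection, sketched with the TypeScript MCP SDK. The server command, environment variable, and tool name below are illustrative assumptions; XMCP's actual tool surface may differ.

```typescript
import { Client } from "@modelcontextprotocol/sdk/client/index.js";
import { StdioClientTransport } from "@modelcontextprotocol/sdk/client/stdio.js";

// Hypothetical: launch an XMCP-style server over stdio. The command,
// env var name, and tool name are assumptions for illustration only.
const transport = new StdioClientTransport({
  command: "xmcp-server",
  env: { X_API_KEY: process.env.X_API_KEY ?? "" },
});

const client = new Client({ name: "brand-monitor-agent", version: "0.1.0" }, { capabilities: {} });
await client.connect(transport);

// Discover which actions the server exposes to the agent.
const { tools } = await client.listTools();
console.log(tools.map((t) => t.name));

// Invoke one action. Note that nothing in this call path asks what the
// agent is *authorized* to do -- that gap is the subject of this piece.
const result = await client.callTool({
  name: "post_tweet", // assumed tool name
  arguments: { text: "Thanks for the report. Our team is on it." },
});
console.log(result.content);
```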
The question nobody is asking loudly enough: what happens when it goes wrong?
When a human posts something harmful on X, the accountability chain is clear. When an AI agent does it — who's responsible?
The developer who built the agent? The company that deployed it? The platform that gave it API access? The enterprise buyer who hired the agent to manage their social presence?
Without a contract governing the agent's authorized actions, there is no answer. There is no scope definition agreed upfront. There is no audit trail proving what the agent was instructed to do versus what it actually did. There is no dispute resolution path when something goes wrong.
Pedro Franceschi, CEO of Brex, described this problem directly when he built "Crab Trap" — a second AI system to monitor his first AI agent in real time. He built it by hand, for one agent, because there was no platform to do it for him. His agent had no behavioral baseline. Crab Trap was surveillance without a contract. It watched, but it didn't know what "normal" was supposed to look like.
XMCP makes this problem 500 million users larger.
X shipped the rails. Nobody shipped the contract.
Before any enterprise deploys an XMCP-powered agent, they need answers to questions that XMCP doesn't address:
What is this agent authorized to do? Post only? Reply only? DMs? Follow/unfollow? Each of these carries a different risk profile and different liability implications. Without a defined scope, "the agent did it" is the only answer available when something goes wrong. (A sketch of what that scope looks like as data follows these questions.)
What is this agent's behavioral baseline? An agent that posts on-brand content in week one but starts drifting into promotional content by week four has changed — but without a fingerprint of what it was supposed to do, that drift is invisible until the lawsuit.
What happens when the agent acts outside its brief? If an enterprise hires an AI agent to monitor customer service mentions and it starts posting political commentary, where is the contract that defines that as a violation? What's the remedy? How is the harm proven?
Is there an audit trail? Not a log file that can be modified. An immutable, contemporaneous record of every action the agent took, written at the moment it happened, that can be produced in discovery.
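Those four questions have a concrete shape. Here is a minimal sketch of what an agent's declared scope could look like as data before deployment; the field names and structure are illustrative, not a published schema.

```typescript
// Illustrative shape for an agent's authorized scope -- the artifact an
// enterprise should be able to point to when asked "what was it allowed to do?"
interface AgentScope {
  agentId: string;
  allowedActions: Array<"post" | "reply" | "search" | "read_timeline">; // DMs and follows deliberately excluded
  contentConstraints: {
    allowedTopics: string[];
    forbiddenTopics: string[];
  };
  rateLimits: { postsPerDay: number; repliesPerHour: number };
  humanApprovalRequired: Array<"post" | "reply">; // action types gated on a human
}

const scope: AgentScope = {
  agentId: "acme-support-bot",
  allowedActions: ["reply", "search", "read_timeline"],
  contentConstraints: {
    allowedTopics: ["customer support", "shipping", "returns"],
    forbiddenTopics: ["politics", "legal advice", "competitor pricing"],
  },
  rateLimits: { postsPerDay: 0, repliesPerHour: 20 },
  humanApprovalRequired: ["reply"],
};
```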
This is not a theoretical problem. The infrastructure to solve it exists.
A Standard AI Services Agreement (SAISA) Exacts the agent's authorized scope into a Paper before deployment — what it can post, what it cannot, content constraints, rate limits, human approval requirements for specific action types. Exacting creates a behavioral fingerprint: a cryptographic hash of what the agent declared it would do. That fingerprint is the AI Provider's warranty.
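One plausible reading of "behavioral fingerprint" is a hash over a canonical serialization of that declared scope. The sketch below shows the idea with Node's built-in crypto; it illustrates the concept, not exact.works' actual Exacting implementation.

```typescript
import { createHash } from "node:crypto";

// Serialize the scope with keys sorted so the same declaration always
// produces the same bytes, then hash it. The digest is the fingerprint
// both parties can sign off on before deployment.
function canonicalize(value: unknown): string {
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  if (value !== null && typeof value === "object") {
    const entries = Object.entries(value as Record<string, unknown>)
      .sort(([a], [b]) => a.localeCompare(b))
      .map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
    return `{${entries.join(",")}}`;
  }
  return JSON.stringify(value);
}

export function fingerprint(scope: object): string {
  return createHash("sha256").update(canonicalize(scope)).digest("hex");
}

// Any later change to the scope, or any after-the-fact claim of a different
// scope, produces a different digest and is immediately detectable.
```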
At runtime, the Runtime layer continuously scores the agent's actual behavior against the Exacted fingerprint. Not in a post-session report. During execution. A Grok-powered agent that starts posting outside its contracted scope gets flagged, its session suspended, and escrow held pending review — automatically, without a human having to notice first.
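In simplified form, that gate looks something like the sketch below: every proposed action is checked against the contracted scope before it reaches the platform, and the first violation suspends the session. The real Runtime layer scores behavior continuously; this only illustrates the gating logic, assuming the scope shape sketched earlier.

```typescript
// Minimal runtime gate. The scope shape mirrors the earlier AgentScope sketch.
interface ScopeForGating {
  allowedActions: string[];
  contentConstraints: { forbiddenTopics: string[] };
}

interface ProposedAction {
  type: string; // "post" | "reply" | "dm" | ...
  text?: string;
}

function checkAction(scope: ScopeForGating, action: ProposedAction): { allowed: boolean; reason?: string } {
  // 1. The action type must appear on the contracted allow-list.
  if (!scope.allowedActions.includes(action.type)) {
    return { allowed: false, reason: `"${action.type}" is outside the Exacted scope` };
  }
  // 2. Crude content check -- a real scorer would use a classifier, not substring matching.
  const text = (action.text ?? "").toLowerCase();
  const hit = scope.contentConstraints.forbiddenTopics.find((t) => text.includes(t.toLowerCase()));
  if (hit) return { allowed: false, reason: `content touches forbidden topic "${hit}"` };
  return { allowed: true };
}

// Gate every action *before* it executes; a violation suspends the session
// (and, in the full system, holds escrow pending review).
function dispatchGoverned(scope: ScopeForGating, action: ProposedAction, send: (a: ProposedAction) => void): void {
  const verdict = checkAction(scope, action);
  if (!verdict.allowed) throw new Error(`session suspended: ${verdict.reason}`);
  send(action);
}
```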
Every action is written to an immutable Trace record. Not a log. A contemporaneous audit record that cannot be altered, that chains every session to its originating contract, and that becomes the evidentiary record if a dispute reaches arbitration.
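The difference between a log file and a Trace can be made concrete with a hash chain: each record commits to the previous record and to the originating contract fingerprint, so any after-the-fact edit breaks the chain. Again, this is an illustration of the property, not the actual Trace format.

```typescript
import { createHash } from "node:crypto";

interface TraceRecord {
  contractFingerprint: string; // the Exacted fingerprint this session runs under
  timestamp: string;
  action: string;              // e.g. "reply", plus whatever payload detail is kept
  prevHash: string;            // hash of the previous record ("" for the first)
  hash: string;                // hash over all of the above
}

function appendTrace(chain: TraceRecord[], contractFingerprint: string, action: string): TraceRecord {
  const prevHash = chain.length ? chain[chain.length - 1].hash : "";
  const timestamp = new Date().toISOString();
  const hash = createHash("sha256")
    .update([contractFingerprint, timestamp, action, prevHash].join("|"))
    .digest("hex");
  const record: TraceRecord = { contractFingerprint, timestamp, action, prevHash, hash };
  chain.push(record);
  return record;
}

// Verification recomputes every hash; altering any record invalidates it
// and every record that follows it.
function verifyTrace(chain: TraceRecord[]): boolean {
  return chain.every((r, i) => {
    const prevHash = i === 0 ? "" : chain[i - 1].hash;
    const expected = createHash("sha256")
      .update([r.contractFingerprint, r.timestamp, r.action, prevHash].join("|"))
      .digest("hex");
    return r.prevHash === prevHash && r.hash === expected;
  });
}
```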
When something goes wrong and the enterprise buyer disputes the agent's output, a tricameral AI panel — Claude, GPT-4o, and Gemini — evaluates the Trace evidence against the Exacted SAISA criteria and produces a defensible, auditable compound posterior verdict. Not a support ticket. Not a platform decision. A contractually grounded determination with an appeals path.
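In its simplest form, a compound posterior can be read as three independent breach estimates pooled into one number, for example with a naive Bayes style log-odds sum. The aggregation below is an assumption for illustration; the weighting exact.works actually applies to the panel is not described here.

```typescript
// Each panelist (e.g. Claude, GPT-4o, Gemini) returns an independent estimate
// that the agent breached its Exacted scope, given the Trace evidence.
interface PanelVote {
  model: string;
  pBreach: number; // probability in (0, 1)
}

// Naive Bayes style pooling: treat the votes as independent evidence and
// sum their log-odds around a shared prior. Purely illustrative.
function compoundPosterior(votes: PanelVote[], prior = 0.5): number {
  const logit = (p: number) => Math.log(p / (1 - p));
  const total = logit(prior) + votes.reduce((sum, v) => sum + (logit(v.pBreach) - logit(prior)), 0);
  return 1 / (1 + Math.exp(-total));
}

const posterior = compoundPosterior([
  { model: "claude", pBreach: 0.92 },
  { model: "gpt-4o", pBreach: 0.85 },
  { model: "gemini", pBreach: 0.4 },
]);
// A contractually agreed threshold (plus an appeals path) turns this
// number into a determination.
console.log(posterior.toFixed(3)); // ~0.977 for these votes
```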
X bundling xAI credits with API access is a tell. The intended deployment pattern is: Grok agent, XMCP server, X platform. xAI builds the agent, X provides the surface, XMCP is the bridge.
exact.works is AI-provider agnostic by design. A Grok-powered agent needs a SAISA just as much as a Claude-powered agent or a GPT-4o-powered agent. The governance layer sits above the model layer — it doesn't care which LLM is doing the reasoning. It cares whether the agent is doing what it was contracted to do.
As the agentic economy converges on consumption-based pricing — X's pay-per-action model, AWS metering, xAI credits — exact.works' infrastructure metering model is designed for exactly this granularity. Every XMCP action is a governed, traceable, contractually bounded transaction. Every deviation is detectable. Every dispute is resolvable.
First-mover advantage in governance infrastructure is real and it's narrow. The XMCP announcement will generate a wave of enterprise interest in deploying X-native agents. Most of those enterprises have legal and compliance teams that will ask the accountability questions before they deploy.
Right now, there is no good answer to those questions except: exact.works.
The contract layer for AI agents acting on X isn't a future product. It's infrastructure that exists today, built on 17 sprints of governance architecture, with SAISA, Runtime, Trace, and settlement rails already operational.
Pedro built Crab Trap for one agent.
exact.works governs them all.
Every AI agent needs a contract.
exact.works →