Whose Leak Is It? DLP When an AI Agent Holds Your OAuth Token
I keep four or five Claude agents running across my projects, and a couple are wired straight into the business systems I actually use — Xero for the books, HubSpot for the pipeline — over my own OAuth token. The day I connected the first one, I sat at the consent screen and made the call every operator makes: which scopes to grant. The official Xero connector will ask for the lot if you let it, payroll among them, so I trimmed the grant to the handful an accounts agent could actually use.
Here is what I told myself while I did it, because it felt obviously true: the agent calls the same API my browser would, with my token, so it can only ever see what I can see. The boundary is the vendor's — Xero defines it, Xero enforces it — so data loss prevention is Xero's problem, not mine. I'd done the responsible thing and the rest was somebody else's RBAC.
Most of that holds up. The agent really does inherit my access boundary. But that sentence answered only one question — who is allowed to read this data? — and quietly skipped another: given that the agent can read it, where is it allowed to go next?
The question I'd actually answered
That second question is the whole of data loss prevention. Proofpoint defines DLP as the policies and processes that "identify and prohibit unauthorized exposure, sharing, or transfer of sensitive data" (Proofpoint). Exposure, sharing, transfer — every word is about what leaves. A browser session blurred reading and acting into one human act; an agent splits them. It reads — access, the vendor's domain, working as designed — and then it can transmit onward through a tool call, an outbound request, a created record. That second capability is egress, and it lives downstream of the token, on a stretch of the path the vendor was never positioned to see.
There's a sharper twist, from a vendor's own documentation. HubSpot's OAuth quickstart is explicit that access tokens "do not reflect the permissions or limitations of what a user can do" — a user who can view "only owned contacts" but authorizes the read scope mints a token that can "view all contacts in the account" (HubSpot docs). So even my comfortable premise — "it can only see what I can see" — isn't reliably true: the token's reach and the user's reach are two independent systems. And the egress half isn't on the vendor's side of the line at all.
The heist that broke nothing
In May 2025, Invariant Labs published an attack against the official GitHub MCP server, and it maps onto my setup uncomfortably well: a developer's own personal access token, an official MCP server, an AI assistant doing ordinary work. The token reaches both a public repo and a private one — normal. An attacker buries a prompt-injection payload in a public-repo issue. Later the developer asks their assistant to look at the open issues; the agent loads the malicious one into context, the planted instructions take over, and — using the same legitimate token — it reads files from the private repo and exfiltrates them by opening a pull request to the public one, where anyone can now read them. Invariant's researchers walked away with real specifics about their test user: private repository names, a plan to relocate to South America, even a salary (Invariant Labs).
No token stolen, no tool poisoned, no authentication broken. Every call authorised. And the line that should be pinned above every agent deployment is theirs:
"This is not a flaw in the GitHub MCP server code itself, but rather a fundamental architectural issue that must be addressed at the agent system level. This means that GitHub alone cannot resolve this vulnerability through server-side patches."
The same pattern produced EchoLeak against Microsoft 365 Copilot — a zero-click exfiltration Microsoft rated 9.3 critical, where a crafted email coaxed Copilot into leaking internal data through an auto-fetched image URL. In both cases the vendor's access layer saw nothing anomalous, because nothing was: the breach lived in the gap between an authorised read and an authorised egress — the gap I'd handed entirely to the vendor in my head.
Why no patch closes it
This can't be patched away, and the reason has a name. Simon Willison called it the lethal trifecta in June 2025: an agent that combines access to private data, exposure to untrusted content, and the ability to externally communicate is exploitable, full stop. My Xero agent has all three by design — it reads private financial data, ingests untrusted content the moment it touches an invoice PDF or supplier email, and can make outbound calls. Three legs, every one load-bearing, none removable without removing the point of the agent.
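The trifecta is easy to state as a check. Here is a minimal sketch, assuming you describe each agent deployment with three capability flags; the `AgentCapabilities` class and its field names are hypothetical, invented for illustration and not taken from Willison's post or any library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentCapabilities:
    """Hypothetical capability flags for one agent deployment."""
    reads_private_data: bool
    ingests_untrusted_content: bool
    can_communicate_externally: bool


def has_lethal_trifecta(caps: AgentCapabilities) -> bool:
    """True only when all three legs are present at the same time."""
    return (
        caps.reads_private_data
        and caps.ingests_untrusted_content
        and caps.can_communicate_externally
    )


# An accounts agent like the one described above: all three legs by design.
xero_agent = AgentCapabilities(True, True, True)
# A hypothetical agent with no outbound channel is missing the third leg.
offline_summariser = AgentCapabilities(True, True, False)
```

The check flags a risk profile; it does not prevent anything. Its only use is making explicit which deployments carry all three legs at once.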
The easy misreading is "so MCP is broken, don't use it." That's wrong: the MCP authorization spec is genuinely good, mandating audience-bound tokens and forbidding token passthrough — "MCP servers MUST NOT accept any tokens that were not explicitly issued for the MCP server" (MCP Security Best Practices). It closes the access-layer holes with rigour. What it cannot close is the egress hole, because that's on the operator's side of the line — and there's no clean technical fix waiting to be installed. The guardrail products sold to plug it advertise "95% of attacks" caught, which in web application security "is very much a failing grade," and three years in, OpenAI's own CISO Dane Stuckey says flatly that "prompt injection remains a frontier, unsolved security problem" (via Simon Willison). So if the vendor can't fix it and no product reliably will, the only useful question left is who owns which half.
The split
Think about a contractor with a building pass for third-floor work that also, because the policy was drawn broad, opens the records room on two. He photographs an open filing cabinet and walks out. No alarm: the pass was legitimate, his presence authorised, and nothing at the door was watching what left in his pocket. That sorts cleanly into three duties.
The vendor owns access. Honour the scopes, validate the token audience, and — the part that's actively improving — ship granular scopes so the pass opens fewer rooms. Xero is mid-migration: as of March 2026 newly created apps must use granular scopes, with broad legacy ones deprecated. Its developer blog frames the rationale — "instead of broad access, your app now requests only the exact permissions it needs." My opening instinct was right here. The good vendors are doing their job.
The operator owns egress. The half my original model skipped, and it starts with the one lever I fully control: least-privilege token issuance. The official Xero MCP server's bearer-token scope list is broad — roughly twenty distinct scope strings spanning invoices, payments, bank transactions, reports, contacts, settings, and payroll (payroll.settings, payroll.employees, payroll.timesheets). The README's own Claude Desktop example narrows it to three: accounting.invoices accounting.contacts accounting.settings (XeroAPI/xero-mcp-server). Twenty versus three. An agent that reconciles invoices has no business holding a token that can read employee payroll, and the difference is one environment variable I set. OWASP puts the duty exactly here: execute actions "in the context of that specific user, and with the minimum privileges necessary" (OWASP).
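One way to keep that grant from drifting is a start-up check. This is a sketch under stated assumptions: the allowlist is the three-scope README example quoted above, and `excess_scopes` and `assert_least_privilege` are hypothetical helpers of my own, not part of the Xero MCP server. It assumes scopes arrive as a single space-separated string, as they appear in the README example.

```python
from __future__ import annotations

# The three scopes from the README example quoted above.
ALLOWED_SCOPES = frozenset({
    "accounting.invoices",
    "accounting.contacts",
    "accounting.settings",
})


def excess_scopes(requested: str, allowed: frozenset[str] = ALLOWED_SCOPES) -> list[str]:
    """Return the requested scopes that fall outside the allowlist, sorted."""
    return sorted(set(requested.split()) - allowed)


def assert_least_privilege(requested: str) -> None:
    """Refuse to proceed if the grant is broader than the allowlist."""
    extra = excess_scopes(requested)
    if extra:
        raise PermissionError(
            "refusing to start agent, excess scopes: " + ", ".join(extra)
        )
```

Calling `assert_least_privilege("accounting.invoices payroll.employees")` raises, because `payroll.employees` is not on the list. The check is only as good as the scope string it is handed, so it belongs wherever that string is actually configured.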
But scope minimisation isn't a solution; it shrinks the blast radius without closing the trifecta. A three-scope agent that reads a poisoned invoice can still be steered into sending what it read somewhere else: the scope is tight, and the data has still left the building. The rest of the operator's job (constraining outbound destinations, keeping a human in the loop on writes and sends) is real work in a domain where no tool does it reliably yet. Necessary, not sufficient.
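The two operator-side controls named here, a destination allowlist and human approval on writes, can be sketched as a single gate in front of every tool call. Everything in this block is an assumption for illustration: the tool names are hypothetical, the allowlist holds one example host, and `gate_tool_call` is not an API of any MCP client. It reduces what can leave without defeating prompt injection.

```python
from urllib.parse import urlsplit

# Example allowlist: the only host this hypothetical agent may contact.
ALLOWED_HOSTS = frozenset({"api.xero.com"})
# Hypothetical names for tools that write or send rather than read.
WRITE_TOOLS = frozenset({"create_invoice", "send_email"})


def egress_allowed(url: str, allowed_hosts: frozenset = ALLOWED_HOSTS) -> bool:
    """Allow only HTTPS requests whose exact hostname is on the allowlist."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in allowed_hosts


def gate_tool_call(tool: str, url: str, approved_by_human: bool = False) -> str:
    """Return 'deny', 'needs_approval', or 'allow' for one outbound call."""
    if not egress_allowed(url):
        return "deny"
    if tool in WRITE_TOOLS and not approved_by_human:
        return "needs_approval"
    return "allow"
```

Matching on the exact hostname rather than a prefix matters: a lookalike such as `api.xero.com.evil.example` is denied. A read of an allowlisted host passes, and a write to the same host waits for a person.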
Two of the three duties can be designed for. The third can't be delegated at all — and that's a legal claim, not engineering.
The part that doesn't split
You can delegate the work to an agent. You cannot delegate the accountability — it's written into the law on both sides of the world I work between. Under GDPR, the controller "shall be responsible for, and be able to demonstrate compliance with" the regulation's principles (Article 5(2)) and must implement "appropriate technical and organisational measures" (Article 24); hiring a processor, the European Data Protection Board is explicit, "cannot relieve the controller from its accountability" (EDPB Guidelines 07/2020).
From where I sit in Singapore, the PDPA says it as plainly as anywhere. An AI agent operating on my behalf is close kin to a data intermediary, and the statute refuses to let the delegation launder responsibility: an organisation "has the same obligation under this Act in respect of personal data processed on its behalf... by a data intermediary as if the personal data were processed by the organisation itself" (sections 4(3) and 11(3)). As if you'd processed it yourself.
What makes this bite for agents is that, as Willison puts it, "an AI system cannot be held accountable for its actions" (Simon Willison). The agent cannot be liable; the vendor honoured its contract. So when the data leaves, accountability falls — by elimination and by statute — on the human who granted the token. And the gap between that assignment and any governance is wide: Okta found that "while 91% of organizations deploy AI agents, only 10% have a management strategy" (Okta) — exactly the wall most AI agents can't climb to reach production.
The agent didn't take on any access I didn't already have. It took on an exit I'd never once had to think about.