
AI Agent Secrets Management: the category, explained

A new infrastructure category is forming at the intersection of secrets management and AI agents. Naming it, mapping it, and separating it from the adjacent categories it gets confused with.

By Jesús E. Viera · April 4, 2026 · 11 min read

Secrets management is not a new field. Vault came out in 2015. AWS Secrets Manager in 2018. Every serious cloud has one, every serious team uses one, every serious postmortem includes the phrase "the secret was rotated." The field is solved — for humans and for services.

It is not solved for AI agents, because AI agents are neither humans nor services. They are a third thing, and the third thing has a different threat model. This post is an attempt to name that third thing and map the category it deserves.

A category is a shared threat model

Categories in infrastructure are useful when they collapse a recurring set of problems into a recognisable shape. "Observability" became a category when teams realised that metrics, logs, and traces were three views of the same underlying question: what is my system doing right now? "Infrastructure as code" became a category when teams realised that CloudFormation, Terraform, and Ansible were answering the same question with different syntaxes.

AI agent secrets management is the same kind of collapse. Once you see the shared shape, you stop reaching for the wrong tool. Here is the shape:

A non-deterministic process, authored by a third party (the model vendor), running in the user's environment with access to the user's credentials, composing and executing arbitrary commands against arbitrary destinations, and producing a transcript that is visible to the user, the model vendor, and any downstream analysis pipeline.

Every word in that paragraph is load-bearing. Remove "non-deterministic" and you have a script, which is the services-side secrets-management problem. Remove "authored by a third party" and you have an insider-threat problem. Remove "producing a transcript" and you have the services problem again. It is the specific combination that does not fit the categories we already have.

Three axes that distinguish agent secrets

When you lay agent secrets side-by-side with the prior art, three axes separate them.

Axis 1 — Who uses the secret

In traditional secrets management, the consumer is a service you wrote. Your code calls Vault, Vault returns the secret, your code uses it. The trust boundary is the service process; everything inside the process is trusted because you wrote it.

In agent secrets, the consumer is a model you did not write. The model is composing the call-site at inference time. You cannot pre-audit the code path; it does not exist until the model emits it. The trust boundary has to sit outside the consumer, not inside it.

Axis 2 — Who knows the secret

Services-era secrets management worked from a simple rule: "the secret exists in the service's memory for the duration of the request." That rule is tolerable because the service's memory is yours. Leaking between services is the problem Vault solves; the value itself is allowed to exist in RAM.

Agent-era secrets add a second rule that has no precedent: "the secret must not exist in the model's context, ever." Not in the prompt. Not in a tool result. Not in the transcript that gets replayed on the next turn. Not in a file the agent reads back. Not in an error message the agent surfaces.

The model is simultaneously the consumer of the secret's effect and an adversary with respect to the secret's value. This is the weird property that breaks every prior tool. Nothing in the services era was designed for a consumer that you also have to blind.

Axis 3 — When the secret is used

Traditional secrets are fetched in predictable places: a config loader at boot, a connection pool initialiser, a per-request middleware. You can enumerate the call-sites. You can audit them. You can attach policy to them.

Agent secrets are fetched at unpredictable times, for unpredictable destinations, as a side effect of the model trying to accomplish a user-stated goal. "Go deploy the preview" might fetch a GitHub token on turn 2, a Vercel key on turn 4, and an AWS session on turn 7, or it might fetch none of them, or it might fetch an OpenAI key because the model decided to benchmark something. You cannot enumerate. You can only intercept.

Why existing categories miss

Three adjacent categories deserve specific treatment, because people reach for them first.

Traditional secrets managers (Vault, AWS Secrets Manager, GCP Secret Manager, Doppler)

These were built for axis-1's services case. Their API shape is "return the secret value to the authenticated caller." That is exactly the wrong shape for agent secrets. The moment the agent authenticates to Vault and receives the secret value back, that value is in the agent's context, which means it is in the transcript, the logs, the Anthropic API payload, and the observability pipeline that reads all three.

You can layer a local proxy on top of Vault to avoid this, but then you are building the agent-secrets tool. The core product does not protect you.

Agent auth startups (Composio, Arcade, Nango)

These are the closest adjacency. They correctly identify that agents need to authenticate to third-party APIs on the user's behalf, and they solve an OAuth-shaped problem: "the agent wants to post to Slack; let it post to Slack without handing it Slack's bot token."

But OAuth has an escape hatch: the access token. For a SaaS API the token is typically a short-lived bearer token; handing it to the agent's execution environment is acceptable because it expires within minutes or hours. For an arbitrary secret (a database password, a Stripe secret key, a private signing key, a CI runner token), there is no OAuth-issued ephemeral equivalent. You have a static high-value credential, and the agent needs to use it without seeing it. Agent auth does not answer that.

Credential managers (1Password, Bitwarden)

These are human-facing by design. They are optimised for a human pasting into a browser, or a CLI shelling out to copy a value to the clipboard. They have no concept of a tool call boundary, no concept of an execution transcript, no scrubber between the consumer and the audit surface. Using 1Password's CLI from inside a Claude Code session puts the secret directly in Claude's context — the very thing you were trying to prevent.

1Password is not wrong for humans. It is wrong for agents.

The four requirements for an agent-shaped solution

A tool that actually fits the category meets four requirements. They are not negotiable; a tool that meets three is leaky.

1. Never-reveal as a hard invariant

The product must make it impossible for the model to see the plaintext value, not merely unlikely. "Impossible" here means the API surface has no verb for returning a value, the execution pipeline substitutes values into child-process memory without round-tripping them through the model's context, and output is scrubbed on the way back. We document this precisely in our threat model; the short version is that if there is any single path where a plaintext value reaches Claude, that path is a P0 bug.
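One way to picture "no verb for returning a value" is a tool registry in which every handler returns metadata only. The sketch below is illustrative Python under stated assumptions, not ClauLock's code: BlindVault, SecretMeta, build_tools, and secret_exists are hypothetical names introduced here.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class SecretMeta:
    """What the model is allowed to learn about a secret."""
    name: str
    kind: str
    project: str


class BlindVault:
    """Hypothetical vault: values go in, only metadata comes out."""

    def __init__(self) -> None:
        self._values = {}  # name -> plaintext; never returned by any tool
        self._meta = {}    # name -> SecretMeta

    def add(self, name: str, value: str, kind: str, project: str) -> None:
        self._values[name] = value
        self._meta[name] = SecretMeta(name, kind, project)

    def list_meta(self) -> list:
        return [asdict(m) for m in self._meta.values()]

    def exists(self, name: str) -> bool:
        return name in self._meta


def build_tools(vault: BlindVault) -> dict:
    """Tool surface exposed to the model. There is deliberately no getter."""
    return {
        "secret_list": lambda: vault.list_meta(),
        "secret_exists": lambda name: vault.exists(name),
    }
```

Python's underscore-prefixed attributes are only a convention, so this shows the shape of the API rather than an enforcement mechanism. Actual isolation would need the values to live in a separate process from anything the model can call.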

2. An API shape the model can actually use

Models are good at composition. Give them a verb like secret_list and they will list secrets; give them a placeholder like {{GITHUB_TOKEN}} and they will compose it into a curl. The API must be rich enough that the model can solve the user's stated task — rotate a key, check expiry, classify a new value, tag it for a project — without ever needing a verb that returns a raw value. If the model has to invent workarounds (copying to a file and reading it back), the API has already failed.

3. Audit that includes agent context

Traditional audit logs answer "which service read this secret at 10:03am?" Agent audit logs have to answer a richer question: "which tool call, inside which agent turn, inside which user session, inside which project, used this secret for which destination?" Without the agent context, you cannot attribute anomalies. A secret used outside its normal pattern — a GitHub token hitting an AWS endpoint, a Stripe key called during an unrelated task — has to be visible, and the audit surface is the only place it can be.
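A rough sketch of what "audit with agent context" could look like as data, and how the off-pattern case might be flagged. The field names, the EXPECTED_HOSTS baseline, and find_anomalies are assumptions made for illustration; they are not a documented schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuditRecord:
    secret_name: str
    session_id: str
    project: str
    agent_turn: int
    tool_call_id: str
    destination_host: str


# Hypothetical baseline: the hosts each secret is normally used against.
EXPECTED_HOSTS = {
    "GITHUB_TOKEN": {"api.github.com"},
    "STRIPE_SECRET_KEY": {"api.stripe.com"},
}


def find_anomalies(records, expected_hosts=EXPECTED_HOSTS):
    """Return records whose destination is outside the secret's usual hosts.

    A secret with no baseline entry is treated as anomalous everywhere.
    """
    return [
        r for r in records
        if r.destination_host not in expected_hosts.get(r.secret_name, set())
    ]
```

Because each record carries the turn and tool call, a flagged entry points at the exact step in the session where the secret left its normal pattern, which a service-level "who read this" log cannot do.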

4. Local-first trust posture

The agent runs on the user's laptop. The vault should too. Cloud secrets managers force a contract: you trust the cloud with your secrets in exchange for availability. For a developer running Claude Code on a MacBook, that contract is upside down — the secret needs to be present on the laptop regardless, and introducing a cloud round-trip adds a failure mode (network down, region degraded, subscription expired) without adding any security benefit. Sync across a developer's own devices is valuable; sync through a vendor-custodied cloud is not.

"Local-first" does not mean "no cloud." It means the laptop is the authority, not the cloud. The cloud may cache; the cloud must not own.

The shape of a ClauLock-class tool

We built ClauLock to meet all four requirements, and the concrete shape the design collapses to is instructive even if you do not use our implementation.

The vault is local. A single .clsec-vault file, XChaCha20-Poly1305 per-secret encryption, Argon2id KEK derivation, held by a user-level daemon in mlocked memory. No network required for any operation that involves a value.

The MCP surface is deliberately blind. The tools exposed to Claude return handles, names, metadata, and operation results. There is no secret_get. This is the single most important design decision; it closes the axis-2 problem at the API level, not in policy.

Substitution happens outside the transcript. When Claude writes curl ... -H "Authorization: Bearer {{GITHUB_TOKEN}}", a PreToolUse hook resolves the placeholder into the child process environment before the subprocess executes. The transcript records the placeholder form; the real bytes exist only in the child's address space for the duration of the call.
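A minimal sketch of the substitution step, assuming the hook rewrites each placeholder into a shell variable reference and passes the real value through the child's environment. That mechanism, and the names prepare_command and run_with_secrets, are assumptions for illustration; the actual hook may work differently.

```python
import os
import re
import subprocess

# Matches placeholders such as {{GITHUB_TOKEN}}.
PLACEHOLDER = re.compile(r"\{\{([A-Z][A-Z0-9_]*)\}\}")


def prepare_command(command: str, vault: dict) -> tuple:
    """Swap each {{NAME}} for a shell reference ${NAME} and collect the
    values that must be placed in the child's environment."""
    extra_env = {}

    def to_env_ref(match):
        name = match.group(1)
        if name not in vault:
            raise KeyError(f"unknown secret placeholder: {name}")
        extra_env[name] = vault[name]
        return "${" + name + "}"

    return PLACEHOLDER.sub(to_env_ref, command), extra_env


def run_with_secrets(command: str, vault: dict) -> subprocess.CompletedProcess:
    """Run the rewritten command. The values travel in the child's
    environment, not in the command text that gets recorded."""
    rewritten, extra_env = prepare_command(command, vault)
    return subprocess.run(
        ["sh", "-c", rewritten],
        env={**os.environ, **extra_env},
        capture_output=True,
        text=True,
    )
```

The rewritten command still contains no secret bytes, so it is safe to log. One limitation of this particular approach: the shell does not expand variables inside single quotes, so a placeholder written there would reach the child as the literal text ${NAME}.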

Output is scrubbed on the way back. A PostToolUse hook performs literal-byte replacement of any resolved value with [REDACTED:NAME] before the tool output reaches Claude. This catches the curl -v case where the child itself echoes the value.
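The scrubbing step can be sketched in a few lines. This is an illustrative version of literal-byte replacement, with scrub as a hypothetical function name; only the [REDACTED:NAME] marker format comes from the description above.

```python
def scrub(output: bytes, resolved: dict) -> bytes:
    """Replace every literal occurrence of a resolved secret value with
    a [REDACTED:NAME] marker before the output is shown to the model.

    `resolved` maps secret names to the byte values substituted for
    this tool call.
    """
    # Longest values first, so a value that contains a shorter one is
    # replaced whole rather than partially.
    by_length = sorted(resolved.items(), key=lambda kv: len(kv[1]), reverse=True)
    for name, value in by_length:
        if value:  # an empty value would match everywhere
            marker = b"[REDACTED:" + name.encode() + b"]"
            output = output.replace(value, marker)
    return output
```

For example, a verbose header line such as "> Authorization: Bearer ghp_example" comes back as "> Authorization: Bearer [REDACTED:GITHUB_TOKEN]". A sketch like this matches exact bytes only; an encoded or truncated copy of the value would pass through unchanged.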

The audit log is rich. Each secret access records the secret name, the kind, the tool call, the agent turn, the project, and the destination host if it can be determined from the rewritten command. The model can query this via secret_audit_log — but only the metadata, never the value. Users see the same log in the Tauri UI.

What this changes for teams

Once the category is named and the requirements are met, a few patterns become natural.

Per-agent scoping. You can grant Claude access to a subset of your secrets without granting it access to all of them. Scopes can be per-project (the "billing-integration" scope includes STRIPE_SECRET_KEY; the "blog" scope does not), per-destination (GITHUB_TOKEN only for api.github.com), or per-session.
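A small sketch of how per-project and per-destination scoping could combine into one check. Scope and may_resolve are hypothetical names, and the rule that a secret without a host list is unrestricted by destination is an assumption of this example.

```python
from dataclasses import dataclass, field
from urllib.parse import urlsplit


@dataclass
class Scope:
    project: str
    secrets: set
    # Optional per-secret host allowlist, e.g. {"GITHUB_TOKEN": {"api.github.com"}}
    allowed_hosts: dict = field(default_factory=dict)


def may_resolve(scope: Scope, project: str, secret: str, url: str) -> bool:
    """True if `secret` may be resolved for a call to `url` made while the
    agent is working in `project`."""
    if project != scope.project or secret not in scope.secrets:
        return False
    hosts = scope.allowed_hosts.get(secret)
    if hosts is None:
        return True  # no per-destination restriction for this secret
    return urlsplit(url).hostname in hosts
```

With a "billing-integration" scope that contains STRIPE_SECRET_KEY and a "blog" scope that does not, a request for the Stripe key from a blog session is refused before any value is touched.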

Rotation tied to agent runs. Traditional rotation is calendar-driven. Agent-era rotation can be event-driven: after a completed agent task, rotate the subset of secrets the task used. The audit log tells you which subset.
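Event-driven rotation reduces to a query over the audit log followed by a rotation call per secret. In this sketch the log is a list of dictionaries with hypothetical task_id and secret_name keys, and rotate is whatever callable performs the rotation; none of these are a documented interface.

```python
def secrets_to_rotate(audit_log, task_id):
    """Names of the secrets a task used, deduplicated, in first-use order."""
    seen = {}
    for entry in audit_log:
        if entry["task_id"] == task_id:
            seen.setdefault(entry["secret_name"], None)
    return list(seen)


def rotate_after_task(audit_log, task_id, rotate):
    """Call rotate(name) once for every secret the finished task touched."""
    names = secrets_to_rotate(audit_log, task_id)
    for name in names:
        rotate(name)
    return names
```

A task that used GITHUB_TOKEN twice and a Vercel key once triggers exactly two rotations, and secrets used only by other tasks are left alone.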

Policy as code. Because the execution boundary is already in the substitution layer, you can attach policy there: "secrets tagged prod require manual confirmation every use," "secrets of kind payment require a second factor," "secrets only resolve during agent turns matching a project tag." This is the place policy can live; for agent secrets there is no other obvious home.
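The three example policies can be written as a single decision function evaluated at the substitution layer. This is a sketch under assumptions: Decision, Secret, and decide are hypothetical names, and the precedence (project mismatch, then payment kind, then prod tag) is one plausible ordering rather than a documented one.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    CONFIRM = "require_manual_confirmation"
    SECOND_FACTOR = "require_second_factor"
    DENY = "deny"


@dataclass
class Secret:
    name: str
    kind: str
    tags: set = field(default_factory=set)
    project: str = ""  # empty means "not bound to a project"


def decide(secret: Secret, turn_project: str) -> Decision:
    """Evaluate policy for one resolution attempt during an agent turn."""
    if secret.project and secret.project != turn_project:
        return Decision.DENY           # only resolve in the matching project
    if secret.kind == "payment":
        return Decision.SECOND_FACTOR  # payment secrets need a second factor
    if "prod" in secret.tags:
        return Decision.CONFIRM        # prod secrets need manual confirmation
    return Decision.ALLOW
```

Because the function runs before any value is resolved, a DENY or an unanswered confirmation means the plaintext is never placed in a child process at all.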

Shared vaults without shared plaintext. Team vaults can re-encrypt each secret against each member's public key, so a secret added by Alice is decrypted locally on Bob's laptop without plaintext ever passing through a cloud. The server sees only ciphertext; losing the server loses no values.

What is still honestly unsolved

A category worth naming has honest edges. Two remain open.

Destination observability. When Claude sends a bearer token to an API, the API sees the plaintext. That is the point of using it. If the API logs the token — which it should not, but some do — we have no way to scrub that side. The only mitigation is per-secret destination allowlists: you cannot prevent the value from reaching the intended API, but you can prevent the model from sending it to an unintended destination.

Semantic leak classes. Scrubbing handles literal byte-string leaks. It does not handle a model that narrates "the secret starts with sk-proj- and is 48 characters long." That is a semantic leak, and the only real defence against it is a model well-trained enough to refuse. The category will mature when there is a standard prompt-level contract between vendors and agents about what the agent may say about a secret it cannot see.

Why name the category now

Categories look late-named in retrospect and feel early-named in practice. It was obvious in 2018 that "observability" was a category; it was less obvious in 2013 when Charity Majors was writing about it at Parse. The value of early naming is that it lets adjacent communities (security engineers, platform teams, SREs) reach for the right vocabulary when they hit the problem.

The problem is here. Any team running Claude Code, Cursor, Cline, or another agent-class tool on a developer's machine with real credentials, and with nothing keeping those values out of the transcript, is exposed today to silently leaking plaintext into model context. Most of them have not noticed because the leak is invisible until it is not.

AI agent secrets management is the category that stops it. The shape is the four requirements. The implementation we built is one; the category admits others. The important thing is that a team evaluating a tool for this problem now has a checklist that distinguishes an actual fit from a near-miss.

ClauLock is an AI-agent secrets manager for Claude Code. Auditable Apache-2.0 crypto; source-available BUSL-1.1 product; signed binaries. See the comparison page or install it.