security & data access

What we touch.
What we keep.

Telemetry is read-only. Raw events stay in your store; Perfloop keeps only derived aggregates and snapshots. On code, the only writes are case branches and the pull requests they carry. The merge is always yours.

every data-access claim here is verifiable against the access you granted and your own audit logs

The data contract.

The contract is the same for every customer. One connection is live today: the code host. The telemetry connectors are in development, so the telemetry rows below are the contract for when each ships, not a connection you can make yet. Which datasets and fields the contract binds to is something you configure with your telemetry source's own primitives, before any discovery runs.

Data access by category
category	what crosses the boundary	what perfloop keeps	retention
metrics (read-only)	aggregated series with their labels: metric names, label keys and values, percentiles, rates, counts	snapshot-level summaries	life of workspace
traces (read-only)	span structure: names, durations, status, service labels. other attributes only as keys you approve per case; default none	derived topology + timing summaries	life of workspace
logs (read-only)	aggregate query results only: counts, distributions, percentiles, over views you author. the connector never selects raw columns, and every query it runs is recorded verbatim in your audit logs	aggregates	life of workspace
profiles (read-only)	stack frames and sample counts; code structure, no payloads	derived flame summaries	life of workspace
source code (read + pr write)	code in repos you connect; the writes are case branches and the pull requests you review, never a merge	repo structure (symbols, references, packages, files, entry points)	life of connection
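
The logs row can be pictured as a guard that runs before any query leaves the connector. This is a minimal sketch under stated assumptions, not Perfloop's connector: the `LogQuery` shape, the `ALLOWED_AGGREGATES` set, and the view name are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical set of aggregate functions; not Perfloop's actual list.
ALLOWED_AGGREGATES = frozenset({"count", "percentile", "histogram", "rate"})


@dataclass(frozen=True)
class LogQuery:
    """Hypothetical structured query against a view the customer authored."""
    view: str
    aggregates: tuple[str, ...]        # e.g. ("count", "percentile")
    group_by: tuple[str, ...] = ()
    raw_columns: tuple[str, ...] = ()  # any entry here would be a row read


def is_aggregate_only(query: LogQuery) -> bool:
    """True only if the query computes known aggregates and selects no raw columns."""
    if query.raw_columns:
        return False
    if not query.aggregates:
        return False
    return all(name in ALLOWED_AGGREGATES for name in query.aggregates)
```

A query such as `LogQuery("checkout_errors", ("count",), group_by=("service",))` passes; the same query with `raw_columns=("message",)` is rejected before it is sent.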

never collected

log events & message bodies
The connector issues only aggregate queries, never row reads; views you author exclude message bodies; every query text is recorded in your audit logs. The strictest mode grants no log permission at all.
request/response payloads
Not read from spans, not extracted into metrics, not present in any category the product queries.
trace attributes beyond approved keys
The connector reads names, durations, status, and service labels. The attribute allowlist starts empty and grows only with your approval.
end-user pii as a category
Nothing in the contract collects it. The hygiene rules in each guide exist to keep it out of metric labels and span attributes on your side.
credentials & secrets
Perfloop doesn't harvest secrets from your code or telemetry, and egress DLP blocks any that surface from leaving the sandbox.
deploy events
Nothing is pushed to Perfloop. Before/after verification keys off the version dimension already in your telemetry (service.version or your equivalent).
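
The trace attribute allowlist described above can be sketched as a filter that starts from an empty set of approved keys. The span shape and field names (`name`, `duration_ms`, `status`, `service`, `attributes`) are assumptions for illustration, and keeping the value of an approved key is also an assumption; the contract only states that nothing beyond approved keys is read.

```python
# Hypothetical names for the structural fields the contract lists:
# names, durations, status, and service labels.
BASE_FIELDS = ("name", "duration_ms", "status", "service")


def filter_span(span: dict, approved_keys: frozenset = frozenset()) -> dict:
    """Keep structural fields plus only those attributes whose keys were approved."""
    kept = {key: span[key] for key in BASE_FIELDS if key in span}
    attributes = span.get("attributes", {})
    # With the default empty allowlist, no attribute survives.
    kept["attributes"] = {
        key: value for key, value in attributes.items() if key in approved_keys
    }
    return kept
```

With no approvals, a span carrying `http.route` and `user.email` attributes comes out with an empty `attributes` map; approving `http.route` for a case lets that one key through and nothing else.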

Connections and residency.

What you connect and where Perfloop runs are separate, independent choices. Each connection is its own grant and revokes on its own. Nothing is connected by default.

what you connect

code host
Read access plus PR-write on repositories you select: the only writes are case branches and the PRs they carry, never a merge. Stands alone; no telemetry required.
repo read grant · benchmarks run in perfloop's sandbox · writes: case branches + prs, never a merge
telemetry source
A separate grant. Aggregates are the only mode it has.
aggregate reads are all the product makes · enforced by source permissions where they exist, by audited connector policy where they don't · each guide states which

where perfloop runs

our cloud
One workspace per customer on shared infrastructure.
the default · your data stays in this deployment
your cloud
The derived model never leaves your boundary.
byoc · enterprise · same contract, your residency

soc 2 attests our controls and applies regardless of what you connect or where we run · status below

Who enforces the boundary.

Where a source's permission model can scope Perfloop's access, it does the enforcing: GCP IAM, GitHub's app permissions. Where a source's permissions are coarser, the aggregate-only boundary is connector policy, checkable in your audit logs. Every guide lists its grants verbatim and states which kind of enforcement you're getting, before anyone connects anything.

datadog · axiom · grafana · gitlab · bitbucket: guides ship with their connectors · all connection guides

Agent containment.

The agent runs your code and reads your telemetry, so the architecture treats it as compromised and contains it.

the agent is treated as compromised

sandboxed sessions
Every session runs in its own hardened, ephemeral sandbox (gVisor), destroyed when the session ends. Its only credential is a short-lived token that tells the proxy which session it is; the token grants nothing by itself, and the sandbox can't reach the infrastructure it runs on.
zero secrets inside
The sandbox holds no credentials for GitHub or your telemetry. Requests leave through the proxy, which attaches credentials only after policy checks pass. The agent never sees a token.
one path out
Deny-by-default networking. The only route out of the sandbox, DNS included, is the proxy.
permissions enforced in the proxy
Host, method, and path allowlists plus per-session grants, checked deterministically outside the model. GitHub writes are bound to the approved branch and PR; merge, approve, and mark-ready are denied at the proxy.
egress dlp
On the permitted egress paths, outbound bodies are scanned and detected secrets are blocked or replaced with non-reversible aliases before they leave. This is a secondary scrub behind the destination allowlist: defense in depth, not the boundary, and not a guarantee.
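
The proxy checks and the DLP scrub above can be sketched as two small deterministic functions. Everything named here is an assumption for illustration: `SessionGrant`, `is_allowed`, `scrub`, the single-host allowlist, and the one secret pattern are hypothetical, and the paths follow GitHub's public REST layout rather than any Perfloop rule set.

```python
import hashlib
import re
from dataclasses import dataclass

# Hypothetical host allowlist; a real deployment would list its own destinations.
ALLOWED_HOSTS = frozenset({"api.github.com"})

# Illustrative secret shape only (GitHub classic personal access token).
SECRET_PATTERN = re.compile(r"ghp_[A-Za-z0-9]{36}")


@dataclass(frozen=True)
class SessionGrant:
    """Hypothetical per-session grant: one repo, one approved branch, one PR."""
    repo: str       # "owner/name"
    branch: str
    pr_number: int


def is_allowed(grant: SessionGrant, host: str, method: str, path: str) -> bool:
    """Deny by default; allow reads in the granted repo and writes to its branch and PR."""
    if host not in ALLOWED_HOSTS or ".." in path:
        return False
    prefix = f"/repos/{grant.repo}/"
    if not path.startswith(prefix):
        return False
    resource = path[len(prefix):]
    if method == "GET":
        return True
    writable = {
        f"git/refs/heads/{grant.branch}",  # move the approved case branch
        f"pulls/{grant.pr_number}",        # edit the approved PR
    }
    # PUT .../pulls/N/merge and POST .../pulls/N/reviews match nothing here,
    # and POST /graphql fails the repo prefix check, so all three are denied.
    return method == "PATCH" and resource in writable


def scrub(body: str) -> str:
    """Replace detected secrets with a truncated SHA-256 alias."""
    def alias(match: re.Match) -> str:
        digest = hashlib.sha256(match.group(0).encode()).hexdigest()[:12]
        return f"[secret:{digest}]"
    return SECRET_PATTERN.sub(alias, body)
```

The point of the shape is that neither function consults the model: a request either matches the session's grant or it does not, and the scrub runs only on traffic the allowlist has already let through.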

The control plane.

Containing the agent is half of it. The platform that holds your derived data is itself least-privilege.

least privilege by default

private database
The product database has no public address and refuses unencrypted connections. Access is over TLS from inside the private network only.
encrypted at rest
All stored data is encrypted at rest with platform-managed keys.
scoped credentials
Third-party credentials are held only by the service that uses them, never by the agent. The code-host key, for instance, lives in the egress proxy, which mints short-lived per-request tokens at the boundary, so no session ever holds a raw token.
least privilege between services
Control-plane services are deny-by-default on the network, every path explicitly allowed. The session controller can verify callers and manage sandboxes, and nothing more: no exec into workloads, no privilege escalation, no impersonation.
managed secret store
Master keys and credentials live in a managed secret store, fetched per service at runtime under least privilege and held only in memory, never in an image or on disk.
audit retention floor
Audit logs are retained at least 13 months, enforced by a database floor that refuses a shorter window. The cleanup job can only invoke that function, never read or alter audit rows.
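
The retention floor can be illustrated as a function that refuses any window shorter than 13 months before it computes a purge cutoff. This is a sketch in Python for readability only; the page says the real floor lives in the database, and `purge_cutoff` and `RetentionFloorError` are hypothetical names.

```python
from datetime import date

AUDIT_RETENTION_FLOOR_MONTHS = 13  # the floor stated above


class RetentionFloorError(ValueError):
    """Raised when a caller asks for a window shorter than the floor."""


def purge_cutoff(today: date, retention_months: int) -> date:
    """Return the date before which audit rows may be purged."""
    if retention_months < AUDIT_RETENTION_FLOOR_MONTHS:
        raise RetentionFloorError(
            f"retention of {retention_months} months is below the "
            f"{AUDIT_RETENTION_FLOOR_MONTHS}-month floor"
        )
    # Count months since year 0, step back, and snap to the first of that
    # month so the cutoff errs toward keeping more, never less.
    total_months = today.year * 12 + (today.month - 1) - retention_months
    year, month_index = divmod(total_months, 12)
    return date(year, month_index + 1, 1)
```

Calling `purge_cutoff(date(2025, 3, 15), 13)` yields 1 February 2024, while a request for 12 months raises instead of returning a date.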

The questions your security team will ask.

Answered up front, against ground truth.

01
Does our raw data ever reach an LLM?
Today, connected source code enters model context. Telemetry connectors are in development; when they ship, only aggregates will enter it. Inference runs on Vertex AI inside the same Google Cloud boundary that hosts the product: your data is not used to train models and never reaches a vendor outside GCP. By design, raw log events never enter model context: the planned telemetry path reads aggregates, not rows.
02
What happens if your agent is prompt-injected?
Assume it happens; the architecture does. The session holds no secrets and has one network path: the proxy, where destination allowlists, per-session permissions, and egress DLP are enforced outside the model. A hijacked agent gains no credentials, no new destinations, and no way to merge anything, and every request it makes passes through the proxy's checks.
03
Who are the subprocessors?
Three. Google Cloud for hosting and for model inference through Vertex AI: your data stays inside GCP, is not used to train models, and the model's maker never receives it. WorkOS for user authentication: it verifies identity and stores account data (name, email, IP address) for anyone who logs into Perfloop, and never receives your telemetry or source code. Axiom for Perfloop's own operational telemetry: service logs and metrics about Perfloop itself, which exclude customer code and customer telemetry content.
04
Can Perfloop employees see our data?
Operating the product on your behalf can require a Perfloop engineer to run a support session against your workspace. That access is SSO-authenticated, runs over the same proxy-mediated, revocable grant your own sessions use, and uninstalling cuts it. The formal controls around it (per-incident scoping, time-boxing, and access logging) are in progress, and we'll walk your team through exactly where they stand during a security review.
05
What happens when we revoke?
You revoke on your side, not by asking us. Uninstall the GitHub App and access dies immediately: the proxy mints every token per request from that App, so once it's gone there is nothing left to mint. No ticket, no waiting on us.
06
How do you know what data you need before seeing our telemetry?
We don't need to; that is the point of the contract. The categories are fixed and customer-invariant. The mapping to your datasets and fields happens inside grants you've already scoped: Perfloop discovers within them and proposes expansions per case, with stated justification, for you to approve. Access always precedes discovery.

Status.

This page documents what Perfloop accesses and how that access is controlled, in claims you can verify today. Formal attestations are in progress; their status is below.

architecture review: available now. This page, the full data-access specification, and a founder walkthrough with your security team.
soc 2 type i: not started. It hasn't begun. This row will say so the day it does.
dpa: on request. Including the subprocessor list above.
byoc: enterprise. Your cloud: the data plane in an account you provision.

For your
security team.

full specification + security questions: security@perfloop.ai →
