Proofpane is the evidence layer for AI in regulated teams. Three contracts:
(1) every call passes a policy gate and lands in a tamper-evident audit log.
(2) every production default — which prompt ships, which model, which memory strategy —
comes from a statistically significant experiment that also clears an inter-rater reliability floor, not a hunch.
(3) the whole record exports as a signed Evidence Pack your auditor verifies offline.
For CISOs, CAEs, and Risk Officers. Not a policy-doc GRC checklist; not a log aggregator; not a prompt-injection scanner — see how we’re different.
One-click demo. No signup. No card. Populated org with real frozen verdicts.
Proofpane is a thin governance layer beneath the tools your team already loves — Claude Desktop, Cursor, Continue, your in-house MCP servers, your own agent code. Members keep their workflows; you keep the hash-chained audit, the cost cap, the policy gate, the Evidence Pack. No wrappers. No vendor lock-in. No "you must use our IDE."
Available reach: 10,000+ apps and 40,000+ actions. Every tool above speaks MCP — wire Proofpane in once, govern any of them. One integration, the whole ecosystem.
Every production default — which prompt variant ships, which memory strategy lives, which provider is the baseline — passes (a) a statistical significance gate over a content-hashed fixture, then (b) an inter-rater reliability floor (Krippendorff α with bootstrap CI). The verdict, the confidence interval, the fixture hash, the DLP rule-set fingerprint that scrubbed it, the approving operator — all frozen on the audit row and shipped in the Evidence Pack. Your auditor reconstructs why this is the current default from the bundle alone.
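As a rough illustration of the two-stage gate described above, here is a minimal Python sketch. It is not Proofpane's implementation. The paired sign-flip permutation test, the nominal-label form of Krippendorff's α, the percentile bootstrap over units, the 0.05 and 0.667 thresholds, and every function name are assumptions chosen for the example.

```python
import hashlib
import json
import random
from collections import Counter


def fixture_hash(items: list) -> str:
    """Content hash of the fixture, so a verdict is pinned to exact inputs."""
    canonical = json.dumps(items, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def paired_permutation_p(baseline, variant, rounds=5000, seed=0) -> float:
    """Two-sided sign-flip permutation test on paired per-item scores."""
    rng = random.Random(seed)
    diffs = [v - b for b, v in zip(baseline, variant)]
    observed = abs(sum(diffs))
    hits = 0
    for _ in range(rounds):
        flipped = sum(d if rng.random() < 0.5 else -d for d in diffs)
        if abs(flipped) >= observed:
            hits += 1
    return (hits + 1) / (rounds + 1)


def krippendorff_alpha_nominal(units) -> float:
    """Krippendorff's alpha for nominal labels; each unit is a list of ratings."""
    units = [u for u in units if len(u) >= 2]  # single ratings cannot be paired
    n = sum(len(u) for u in units)
    totals = Counter(label for u in units for label in u)
    expected = n * n - sum(c * c for c in totals.values())
    if expected == 0:
        return float("nan")  # undefined when only one label is ever used
    observed = 0.0
    for u in units:
        m = len(u)
        same = sum(c * (c - 1) for c in Counter(u).values())
        observed += (m * (m - 1) - same) / (m - 1)
    d_o = observed / n
    d_e = expected / (n * (n - 1))
    return 1.0 - d_o / d_e


def bootstrap_alpha_ci(units, rounds=1000, level=0.95, seed=0):
    """Percentile bootstrap CI for alpha, resampling units with replacement."""
    rng = random.Random(seed)
    samples = []
    for _ in range(rounds):
        alpha = krippendorff_alpha_nominal([rng.choice(units) for _ in units])
        if alpha == alpha:  # skip undefined (NaN) resamples
            samples.append(alpha)
    if not samples:
        raise ValueError("alpha undefined for every resample")
    samples.sort()
    last = len(samples) - 1
    return samples[int((1 - level) / 2 * last)], samples[int((1 + level) / 2 * last)]


def gate(baseline, variant, ratings, p_max=0.05, alpha_floor=0.667) -> dict:
    """Promote only if the variant wins significantly AND raters agree enough."""
    p = paired_permutation_p(baseline, variant)
    ci_low, ci_high = bootstrap_alpha_ci(ratings)
    improved = sum(variant) > sum(baseline)
    return {
        "p_value": p,
        "alpha": krippendorff_alpha_nominal(ratings),
        "alpha_ci": (ci_low, ci_high),
        "promote": improved and p < p_max and ci_low >= alpha_floor,
    }
```

In this sketch the reliability check uses the lower bound of the confidence interval, not the point estimate, so a high α from a small sample of ratings does not clear the floor by luck. That choice is an assumption, as is the shape of the returned verdict.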
Every AI decision your team makes — every prompt, every multi-agent run, every Cursor session — lands in a cryptographically chained log scoped per tenant, so cross-tenant tampering is structurally detectable. Export as a signed Evidence Pack — a standalone offline verifier ships in the bundle so your auditor reads it without backend access, without a Proofpane account, six years from now.
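To make the idea of a per-tenant hash chain concrete, here is a minimal Python sketch. The row layout, the all-zero genesis value, and the function names are hypothetical; the real log format and the Evidence Pack signature are not described in this text and are not modelled here.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for a tenant's first row


def row_hash(tenant_id: str, prev_hash: str, payload: dict) -> str:
    """Hash the tenant, the previous row's hash, and a canonical payload."""
    body = json.dumps(
        {"tenant": tenant_id, "prev": prev_hash, "payload": payload},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_row(chain: list, tenant_id: str, payload: dict) -> dict:
    """Append a row whose hash commits to everything before it."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    row = {
        "tenant": tenant_id,
        "prev": prev_hash,
        "payload": payload,
        "hash": row_hash(tenant_id, prev_hash, payload),
    }
    chain.append(row)
    return row


def verify_chain(chain: list, tenant_id: str) -> int:
    """Return -1 if the chain is intact, else the index of the first bad row."""
    prev_hash = GENESIS
    for i, row in enumerate(chain):
        if row["tenant"] != tenant_id or row["prev"] != prev_hash:
            return i
        if row["hash"] != row_hash(tenant_id, prev_hash, row["payload"]):
            return i
        prev_hash = row["hash"]
    return -1
```

Because the tenant ID is part of every hashed row, a row copied from another tenant's chain fails verification, and editing any earlier row invalidates every hash after it. An offline verifier would only need the exported rows and a function like `verify_chain`.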
Control library aligned with NIST AI RMF, ISO/IEC 42001, and EU AI Act evidence expectations — pre-mapped per skill, with per-org overrides. A closed-set guard cross-checks every cited control ID against a curated truth set so fabricated references can't pass. Proofpane supports operational evidence; it does not replace legal, regulatory, or certification assessment.
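A closed-set guard of the kind described above can be sketched in a few lines of Python. The IDs below are placeholders, not real framework control identifiers, and the normalization rule and function names are assumptions made for the example.

```python
# Placeholder truth set; a real one would be curated from the control library.
KNOWN_CONTROL_IDS = frozenset({"FW-A:1.1", "FW-A:1.2", "FW-B:4.3"})


def check_citations(cited_ids, truth_set=KNOWN_CONTROL_IDS):
    """Split cited control IDs into (known, unknown) after light normalization."""
    normalized = [c.strip().upper() for c in cited_ids]
    known = [c for c in normalized if c in truth_set]
    unknown = [c for c in normalized if c not in truth_set]
    return known, unknown


def passes_guard(cited_ids, truth_set=KNOWN_CONTROL_IDS) -> bool:
    """An output passes only if every cited ID exists in the truth set."""
    _, unknown = check_citations(cited_ids, truth_set)
    return not unknown
```

The guard is closed-set: it never judges whether an ID looks plausible, only whether it is a member of the curated set, so a fabricated but well-formed reference is rejected the same way as a typo.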
Token budget control is the spine of the architecture, not a dashboard pasted on top. Every call records token count, latency, and cost into the chain, and five layers catch cost explosions before they become invoices:
(1) a pre-call gate refuses LLM calls over the per-org cap (the refusal is audited).
(2) threshold alerts (50% / 80% / 100%) push to Slack + email before you hit the cap.
(3) a per-call anomaly flag fires on any call > N× the recent baseline.
(4) a month-end forecast projects current burn against the cap, so a 2-week overspend is visible 2 weeks early.
(5) provider price-drift detection: a plausibility band catches silent per-token bumps from Anthropic / OpenAI.
Quality runs the same way on a parallel track: closed-set hallucination guard against 259+ framework control IDs, judge-grounded scoring, cross-vendor disagreement (3 providers vote), drift alerts on pass-rate drops. The /cost and /quality dashboards are the views; the design is the contract.
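The five cost layers can each be reduced to a small check. The Python sketch below is illustrative only: using the median as the "recent baseline", a linear burn projection, a 5% price tolerance, and all names here are assumptions, not Proofpane's actual rules.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class Budget:
    cap_usd: float
    spent_usd: float = 0.0


def pre_call_gate(budget: Budget, estimated_cost_usd: float) -> bool:
    """Layer 1: allow a call only if it keeps spend within the cap."""
    return budget.spent_usd + estimated_cost_usd <= budget.cap_usd


def crossed_thresholds(before_usd, after_usd, cap_usd, levels=(0.5, 0.8, 1.0)):
    """Layer 2: alert levels crossed as spend moves from before to after."""
    return [lvl for lvl in levels if before_usd < lvl * cap_usd <= after_usd]


def is_anomalous(cost_usd, recent_costs, factor) -> bool:
    """Layer 3: flag a call costing more than factor x the recent median."""
    if not recent_costs:
        return False  # no baseline yet, nothing to compare against
    return cost_usd > factor * median(recent_costs)


def month_end_forecast(spent_usd, day_of_month, days_in_month) -> float:
    """Layer 4: linear projection of current burn to the end of the month."""
    return spent_usd / day_of_month * days_in_month


def price_drifted(observed_per_token, expected_per_token, tolerance=0.05) -> bool:
    """Layer 5: flag a per-token price outside the plausibility band."""
    return abs(observed_per_token - expected_per_token) > tolerance * expected_per_token
```

For example, with a 100 USD cap and 60 USD spent by day 10 of a 30-day month, `month_end_forecast(60, 10, 30)` projects 180 USD, which flags the overspend well before the cap is actually reached.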
Two reflection loops, same approval contract. The first watches the audit log for drift, hallucination, and low-score signals, and proposes prompt edits against the org's own failure cases. The second tracks curated AI-research feeds and auto-sandboxes proposed updates against production behaviour. In both cases only the changes a human approves ever go live.
Compose governance tasks, multi-agent primitives (consensus and adversarial review), and scheduled triggers on a visual canvas. An AI builder edits the graph for you. Every node execution writes a row into the same audit chain — the canvas is the planning view, the chain is the proof.
Vanta, Drata, Tugboat, Secureframe
Certify that you have a control. Auto-collect SOC 2 / ISO evidence about your infrastructure. Excellent for the certification audit. Gap: Don’t see inside the AI call. Can’t prove the model picked a defensible answer.
CloudTrail, Datadog, Splunk, ELK
Record what happened across infrastructure. Powerful for incident reconstruction. Gap: Plain logs; not hash-chained, not signed, not scored. An auditor still has to take your word that the row wasn’t edited.
Proofpane: evidence layer for AI in regulated teams
Hash-chained audit + significance-gated production defaults + inter-rater reliability floor + signed offline-verifiable Evidence Pack. When the regulator asks why this is your default — six months from now or six years — the answer is one URL. Same hash. Same row.
Complementary, not competitive: most Proofpane customers keep their GRC tool for SOC 2 + their log aggregator for SRE. Proofpane is the missing third layer — the one your auditor opens when they ask about a specific AI decision.