
Sanctum

Ship features to simulated users before real ones

Fall 2025 · Active · 2025 · Website
Developer Tools · B2B · Analytics · Market Research · AI

Report from 27 days ago

What do they actually do

Sanctum turns a team’s real session recordings and analytics into AI “user models,” then runs those simulated users against new code (PRs/staging/ephemeral stacks) to surface breakages and UX friction before release. It delivers recordings, traces, and signals back into developer workflows (e.g., via a GitHub Action, an OpenTelemetry-instrumented SDK, and exports to other tools) so engineers and PMs can debug earlier and ship with more confidence (YC page, site).

Today it targets teams already using session replay/analytics and feature flags (e.g., PostHog, Amplitude, LaunchDarkly) and claims it can scale to large numbers of parallel simulations while generating recordings and metrics useful for diagnosis and profiling. Materials suggest they’re in pilot/early‑customer phase rather than broad commercial rollout (YC page).
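The workflow described above maps naturally onto a pre‑merge CI check: spin up an ephemeral environment for the PR, point the simulated users at it, and fail the build if any journey breaks. A minimal sketch of what such a step could look like is below; the endpoint, client, response shape, and environment variables are hypothetical illustrations, not Sanctum’s actual SDK or GitHub Action.

```typescript
// Hypothetical pre-merge check: ask a simulation service to replay
// production-derived user models against an ephemeral PR environment,
// then fail CI if any simulated journey breaks.
// All names below are illustrative placeholders, not Sanctum's real API.

type SimulationResult = {
  journey: string;                        // e.g. "checkout", "onboarding"
  status: "passed" | "failed" | "stuck";  // outcome of the simulated run
  replayUrl?: string;                     // recording of the simulated session
  traceId?: string;                       // OpenTelemetry trace for debugging
};

async function runSimulatedUsers(targetUrl: string): Promise<SimulationResult[]> {
  // Placeholder endpoint standing in for whatever the vendor actually exposes.
  const res = await fetch("https://simulation.example.com/v1/runs", {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${process.env.SIM_API_KEY}`,
    },
    body: JSON.stringify({
      targetUrl,                           // ephemeral stack spun up for the PR
      population: "last-7-days-sessions",  // user models derived from real sessions
      parallelism: 50,                     // parallel simulated users
    }),
  });
  if (!res.ok) throw new Error(`Simulation request failed: ${res.status}`);
  return (await res.json()) as SimulationResult[];
}

async function main() {
  const results = await runSimulatedUsers(
    process.env.PREVIEW_URL ?? "http://localhost:3000"
  );
  const broken = results.filter((r) => r.status !== "passed");
  for (const r of broken) {
    console.error(
      `Simulated ${r.journey} ${r.status}; replay: ${r.replayUrl}, trace: ${r.traceId}`
    );
  }
  if (broken.length > 0) process.exit(1); // block the merge
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
```

In a GitHub Actions workflow this script would run after the preview deployment step, with the PR’s preview URL and an API key passed in as secrets.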

Who are their target customer(s)

  • Engineering teams at SaaS/web companies shipping via PRs and CI: They miss regressions and UX issues that unit tests don’t catch and only learn about them in production. They want pre‑merge signals from realistic user flows with recordings/traces to debug faster (YC page).
  • Product managers for consumer or mobile apps using session replays/analytics: They rely on slow or risky live experiments and sometimes draw the wrong conclusions after users have a bad experience. They want simulated audiences to pre‑filter ideas and catch UX regressions before A/B tests (YC page).
  • QA and automation engineers owning end‑to‑end coverage: They deal with flaky tests and hard‑to‑reproduce user journeys. They want production‑derived user models that can run against ephemeral environments and surface where agents get stuck or fail (YC page).
  • Platform/DevOps/SRE teams managing staging and ephemeral stacks: Rollbacks and incidents occur because real user flows break in ways observability doesn’t catch pre‑deploy. They want simulated runs that exercise realistic traffic and provide recordings and traces for debugging (YC page).
  • Teams running feature flags and experimentation: Edge‑case UX issues slip through feature‑flagged launches and small dogfooding cohorts. They want to validate flagged code paths with realistic simulated traffic and pipe results into flagging/analytics tools (YC page).

How would they acquire their first 10, 50, and 100 customers

  • First 10: Run white‑glove pilots with teams already using session replay and feature flags: ingest a week of sessions, wire the GitHub Action to a high‑impact PR or staging URL, and deliver a short report with replays of a caught regression. Prioritize PostHog/Amplitude/LaunchDarkly users to reduce integration friction (YC page).
  • First 50: Publish a GitHub Marketplace Action, ship quickstarts for common stacks, and co‑market with replay/feature‑flag vendors via webinars and case studies to drive warm leads from their customer bases and developer communities (YC page).
  • First 100: Productize self‑serve onboarding with limited trial credits for simulated‑prod runs on PRs; add CI/Slack/GitHub notifications for conversion nudges and expand partner integrations to drive organic trials, referrals, and targeted mid‑market outreach (YC page).

What is the rough total addressable market

Top-down context:

Use global software testing as the primary TAM baseline since Sanctum replaces/augments end‑to‑end testing and pre‑deployment validation. Estimates put the 2024 software‑testing market at about $55–56B (GMI; see also KPMG).

Bottom-up calculation:

A practical initial SAM proxy is teams with mature CI/CD workflows already buying automation/replay tools; CI/CD tooling alone is estimated at ~$8B, indicating a multi‑billion dollar pool of likely early adopters and adjacent budget (MRF). Session‑replay budgets (~$0.26B) reinforce that some buyers already invest in the inputs Sanctum consumes (BusinessResearchInsights).

Assumptions:

  • Testing/QA, CI/CD, and session‑replay budgets overlap and are often controlled by the same teams.
  • Early adopters run PR‑centric CI pipelines and already instrument session replay/analytics.
  • Pricing and value capture will resemble other testing/automation tools rather than observability list prices.
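Given those assumptions, the bottom-up calculation would take the following shape; every number in the example comment is a placeholder chosen only to illustrate the arithmetic, not a sourced estimate.

```latex
% Illustrative bottom-up shape; all example values are placeholders, not sourced estimates.
\[
  \text{SAM} \approx N_{\text{teams}} \times \text{ACV},
  \qquad
  \text{SOM} \approx N_{\text{teams}} \times p_{\text{adopt}} \times \text{ACV}
\]
% Example with placeholder inputs: 100{,}000 teams with PR-centric CI
% $\times$ \$15k ACV $\approx$ \$1.5B SAM; at 5\% adoption $\approx$ \$75M SOM.
```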

Who are some of their notable competitors

  • Testim: AI‑first end‑to‑end test automation that records flows and maintains tests for CI. Focuses on automated, self‑healing scripted tests rather than generating diverse simulated audiences from real session data to stress ephemeral PR environments (site).
  • mabl: AI‑powered testing and synthetic‑transaction platform for recording flows and running checks in CI/CD. Emphasizes automated test creation/monitoring vs. production‑derived agent populations for PR/staging runs (site).
  • Datadog Synthetic Monitoring: Scripted browser/API synthetic checks integrated with CI to catch availability/functional regressions. Geared to deterministic checks of key paths, not large, realistic agent populations with replayed interactions for UX debugging (product page).
  • FullStory (session replay): Captures and replays real user sessions for diagnosing production issues. Surfaces what happened post‑release but doesn’t generate and execute simulated users against PRs/ephemeral stacks pre‑merge (platform).
  • Playwright / Cypress (E2E frameworks): Open‑source frameworks for writing/running end‑to‑end tests in CI/GitHub Actions. Provide precise scripted tests but require manual scenario authoring; they don’t automatically convert real session histories into varied agent populations (Playwright CI, Cypress CI).
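To ground the last bullet’s “manual scenario authoring” point: a Playwright end‑to‑end test is hand‑written one scripted path at a time, so coverage grows only as fast as engineers author scenarios. A minimal example is below; the URL and selectors are placeholders.

```typescript
import { test, expect } from '@playwright/test';

// A hand-authored end-to-end scenario: a single scripted path through a
// checkout flow. The URL and selectors are placeholders for illustration.
test('guest checkout completes', async ({ page }) => {
  await page.goto('https://staging.example.com');
  await page.getByRole('button', { name: 'Add to cart' }).click();
  await page.getByRole('link', { name: 'Checkout' }).click();
  await page.getByLabel('Email').fill('test@example.com');
  await page.getByRole('button', { name: 'Place order' }).click();
  await expect(page.getByText('Order confirmed')).toBeVisible();
});
```

Each additional journey, browser state, or user quirk requires another test like this, which is the gap the production‑derived simulated populations described above aim to close.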