What do they actually do
Theorem is building program verification tools that help engineering teams prove that critical systems code behaves as intended. They focus on cases where performance tweaks or rewrites can introduce subtle bugs—like GPU/parallel kernels, cryptographic code, infrastructure components, finance logic, and migrations—so engineers can check equivalence and correctness rather than relying only on tests or manual review (YC company page).
Today, they appear to work closely with early users via hands-on pilots: integrating with existing codebases (e.g., C/C++/Rust) to check properties, surface real defects, and produce actionable reports. Their YC profile describes large speedups in program verification compared with traditional methods, but details on specific product interfaces and maturity are still emerging (YC company page).
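To make the contrast concrete, here is a toy sketch (our illustration, not Theorem's product or method) of the difference between spot-checking with tests and exhaustively checking equivalence over a whole input domain. The "optimized" midpoint below adds in simulated 8-bit registers, so the sum wraps for large inputs, a class of bug a few unit tests can easily miss:

```python
# Toy sketch (not Theorem's tooling): exhaustive equivalence checking vs. testing.

WIDTH = 8
MASK = (1 << WIDTH) - 1  # simulate fixed-width (wrapping) arithmetic

def mid_reference(x: int, y: int) -> int:
    # Mathematically correct midpoint of two 8-bit values.
    return (x + y) // 2

def mid_optimized(x: int, y: int) -> int:
    # "Optimization" that adds in an 8-bit register: the sum wraps when x + y > 255.
    return ((x + y) & MASK) >> 1

def check_equivalence():
    # Exhaustive over the whole 8-bit input space (65,536 pairs): for this bounded
    # domain, this is a proof of equivalence or a concrete counterexample.
    for x in range(256):
        for y in range(256):
            if mid_reference(x, y) != mid_optimized(x, y):
                return (x, y)  # first counterexample found
    return None

cex = check_equivalence()
# cex is (1, 255): reference gives 128, the wrapped version gives 0.
```

Real verification tools work symbolically rather than by enumeration, but the value proposition is the same: a machine-checked answer for every input, not just the ones a test suite happens to cover.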
Who are their target customer(s)
- Systems engineers writing highly optimized GPU or parallel kernels (ML infra, HPC teams): Hand‑tuned kernels can fail in rare edge cases and are hard to debug. They need confidence that low‑level optimizations don’t introduce correctness bugs (YC profile).
- Engineers building cryptography or secure-transport code (TLS, crypto libraries): Security bugs can be catastrophic; current review/verification is slow, specialized, and hard to scale. They need machine‑checkable guarantees that integrate with CI (YC profile).
- SREs and platform teams running public infrastructure (cloud services, browsers, OS components): Tiny logic errors can cause outages or silent failures. Manual review doesn’t scale as systems and AI‑generated changes grow (YC profile).
- Finance engineering teams (trading, payments, accounting backends): Subtle numeric or business‑logic errors risk financial loss and regulatory issues. They need proof that updates or optimizations haven’t changed critical behavior (YC profile).
- Teams doing large legacy migrations or performance-driven rewrites (e.g., Python → Rust, custom kernels): Migrations are risky and slow because it’s hard to guarantee new code behaves exactly like old code without expensive manual work (YC profile).
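A small example of the migration hazard in the last bullet (our illustration, not a Theorem case): Python's `//` floors toward negative infinity, while integer division in C and Rust truncates toward zero, so a line-by-line Python-to-Rust port silently changes behavior on negative operands. The truncating version is modeled here in Python:

```python
# Toy illustration of a Python -> Rust migration hazard: division semantics.

def div_python(a: int, b: int) -> int:
    return a // b  # Python floor division: rounds toward negative infinity

def div_truncating(a: int, b: int) -> int:
    # What the "same" expression computes after a naive port to C or Rust,
    # where integer division truncates toward zero.
    q = abs(a) // abs(b)
    return -q if (a < 0) != (b < 0) else q

# Equivalent on the non-negative inputs a typical test suite exercises...
assert div_python(7, 2) == div_truncating(7, 2) == 3
# ...but divergent on negatives: exactly the edge case an equivalence
# checker is meant to surface before the rewrite ships.
assert div_python(-7, 2) == -4
assert div_truncating(-7, 2) == -3
```

Bugs of this shape are why "behaves exactly like the old code" is hard to establish with review alone.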
How would they acquire their first 10, 50, and 100 customers
- First 10: High‑touch paid pilots with YC/network teams working on GPU kernels, crypto, or infra, where Theorem engineers pair with the customer to integrate the tool, find real bugs, and deliver an actionable report; those pilot customers then serve as references.
- First 50: Publish anonymized case studies and playbooks from pilots; run targeted outbound to similar teams; offer a lightweight self‑service trial with templated onboarding so engineering teams can reproduce pilot outcomes with minimal founder involvement.
- First 100: Form partnerships and integrations with compiler/tool vendors, cloud providers, and security consultancies; hire a small direct sales team and developer advocates to run workshops and produce reproducible content that drives organic trials and partner‑led deals.
What is the rough total addressable market
Top-down context:
Near‑term, Theorem fits into the formal‑verification/program‑verification market, estimated at roughly $0.4B–$1.1B in 2024/2025 (ResearchIntelo, DataIntelo). Adjacent budgets include static analysis/SAST ($1.2B–$1.3B) and broader security testing (~$14.7B) that could become partially addressable over time (MMR static analysis, Grand View Research).
Bottom-up calculation:
Assume 2,000–5,000 teams globally with hard correctness requirements adopt a verification product at $50k–$150k ACV; this implies a core serviceable opportunity of roughly $100M–$750M. With workflow fit into broader AppSec/CI pipelines, the reachable pool could expand toward low single‑digit billions over time, but only a fraction is likely practical to capture initially.
Assumptions:
- There are thousands of active teams globally with recurring, high‑assurance needs (GPU/HPC, crypto, infra, finance, migrations).
- Willingness to pay is comparable to enterprise security/dev‑tool contracts ($50k–$150k+ ACV for critical subsystems).
- Near‑term focus is on formal‑verification buyers; expansion comes from adjacent SAST/AppSec budgets (MMR, Grand View Research).
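The bottom-up sizing above can be restated in code using the same assumed ranges (the team counts and ACVs are the memo's assumptions, not new data):

```python
# Bottom-up serviceable-opportunity sizing, per the assumptions listed above.

low_teams, high_teams = 2_000, 5_000      # teams with hard correctness requirements
low_acv, high_acv = 50_000, 150_000       # annual contract value per team, USD

som_low = low_teams * low_acv             # conservative end: $100M
som_high = high_teams * high_acv          # aggressive end: $750M

print(f"Core serviceable opportunity: ${som_low/1e6:.0f}M - ${som_high/1e6:.0f}M")
```

Multiplying the low ends together and the high ends together reproduces the $100M–$750M range; any expansion toward adjacent SAST/AppSec budgets sits on top of this core figure.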
Who are some of their notable competitors
- Galois: Provides formal‑methods tooling and services (e.g., SAW, Cryptol) to prove program correctness and cryptographic equivalence; relevant for teams needing machine‑checked refinement/equivalence for low‑level code (SAW, tool updates).
- Certora: Automated prover and rule language used widely for smart‑contract verification; closer fit for crypto teams seeking CI‑friendly, automated checks, but less focused on GPU/HPC or general systems code (product, docs).
- Trail of Bits: Security consultancy combining fuzzing, audits, and formal methods for blockchain, TLS, and systems software; competes when teams prefer expert services to author invariants/proofs and assurance artifacts (services, perspective on fuzzing vs. formal).
- AdaCore (SPARK): SPARK/Ada language and toolchain for deductive verification of safety‑ and security‑critical systems; an alternative when customers adopt a language‑centric path to mathematical guarantees rather than verifying existing C/C++/Rust code (SPARK overview, intro course).
- Nethermind: Formal verification services and tooling for cryptography, zk circuits, and protocol code (Lean/ATP tooling), including end‑to‑end verifications for production systems—strong overlap for advanced crypto/zk teams (formal verification).