What do they actually do
Capy provides an AI software engineer that actually runs code and ships work. Teams connect a GitHub repo and a lightweight config, and Capy’s agents triage issues into small tasks, execute them in isolated virtual machines, run tests and checks, and push branches/PRs back to GitHub when checks pass. The system combines orchestration, a safe execution environment (instant VMs/virtual desktops), and GitHub integration so changes are verifiable before review and merge (capy.ai, CodeCapy README, Scrapybara docs, Trigger.dev customer story).
Today, typical runs include reproducing bugs, writing or fixing tests, making small code changes, and validating with automated QA (tests, linting). Work happens in isolated desktops so agents can run apps, browsers, and end‑to‑end tests without touching customer infrastructure. Pricing is seat‑based with explicit concurrency limits ("jams"), indicating a live, paid product used by developer teams (capy.ai).
Who are their target customer(s)
- Mid‑size web engineering teams shipping features and fixes: Too much time is spent reproducing bugs and running end‑to‑end tests. They want something that can run the app, verify behavior, and produce ready PRs instead of just code suggestions (capy.ai, CodeCapy, Scrapybara).
- QA / test engineers responsible for UI and browser tests: Flaky or slow UI tests cause manual debugging and local setup. They need isolated desktops that reliably run the app and tests per change (Scrapybara docs, CodeCapy).
- Teams supporting platform‑specific builds or desktop apps (Windows/macOS): CI often can’t run platform‑only workflows, so engineers test and debug on local machines. They need hosted instances that reproduce those OS environments (CodeCapy README, Scrapybara docs).
- Small startups or feature teams with a growing backlog: Triaging issues and making many small fixes consumes limited time. They want an assistant that converts tickets into runnable changes and pushes vetted PRs to save developer hours (Trigger.dev customer story, capy.ai).
- Engineering managers / platform teams focused on release safety: They worry automation could make unsafe changes or create noisy failures. They need agent runs that execute tests, retry or escalate failures, and expose clear logs before merge (CodeCapy README, Trigger.dev story).
How would they acquire their first 10, 50, and 100 customers
- First 10: Founder‑led pilots with mid‑size engineering/QA teams via YC network, Trigger.dev contacts, and active GitHub repos. Run a live demo on a real repo showing an agent spinning up a VM, running tests, and opening a PR; convert with a short discounted pilot on the Pro plan (capy.ai, CodeCapy, Trigger.dev story).
- First 50: Publish ready‑to‑use CodeCapy templates, one‑hour setup guides, and demo videos of Scrapybara VMs running browsers/tests and producing PRs; promote through GitHub, dev forums, YC/HN, and webinars; convert signups with trial credit and fast onboarding (CodeCapy, Scrapybara docs, capy.ai).
- First 100: Form partnerships with CI/test tooling and platform vendors (e.g., orchestration partners such as Trigger.dev), run structured pilots with success metrics and paid seat rollouts, and use case studies to drive repeatable outbound and group demos (Trigger.dev story, Scrapybara, CodeCapy, capy.ai).
What is the rough total addressable market
Top-down context:
Adjacent categories suggest a large and growing market: AI code tools were ~$6.1B in 2024 and projected to reach ~$26B by 2030, while AI‑powered software testing solutions were ~$4.5B in 2024 and growing, indicating a combined multi‑billion opportunity relevant to agentic coding and E2E test automation (Grand View Research, Valuates via Yahoo). There are also ~47.2M developers worldwide as of 2025, framing the potential seat base for AI developer tools (SlashData).
Bottom-up calculation:
If 50,000 engineering teams adopt an average of 20 seats at a blended $150/seat/month, that implies ~$1.8B in annual spend (50,000 × 20 seats × $150/month × 12 months) addressable by seat‑based agentic coding/testing tools. At 100,000 teams on the same assumptions, the opportunity approaches ~$3.6B/year.
Assumptions:
- Seat‑based pricing blended at ~$150/seat/month between Lite and Pro tiers (capy.ai).
- Targetable teams with needs for agentic coding and E2E testing number 50k–100k globally (subset of the ~47M developer population).
- Adoption starts in mid‑size teams with 20 seats on average; larger orgs and wider use cases expand ARPU over time.
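The bottom‑up figures above reduce to a single multiplication; a minimal sketch, using the stated assumptions (20 seats/team, blended $150/seat/month — the function name and defaults here are illustrative, not from any source):

```python
def bottom_up_tam(teams: int, seats_per_team: int = 20,
                  price_per_seat_month: float = 150.0) -> float:
    """Annual addressable spend for seat-based tools:
    teams x seats x monthly price x 12 months."""
    return teams * seats_per_team * price_per_seat_month * 12

# Low and high ends of the 50k-100k team range:
low = bottom_up_tam(50_000)    # -> 1.8e9  (~$1.8B/year)
high = bottom_up_tam(100_000)  # -> 3.6e9  (~$3.6B/year)
```

Sensitivity is linear in each input, so halving the blended price or average seat count halves the estimate; expanding ARPU in larger orgs (as the assumptions note) scales it up the same way.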
Who are some of their notable competitors
- Devin (Cognition): Markets itself as an AI software engineer that plans and executes tasks end‑to‑end, including testing and creating PRs; a direct competitor at the "autonomous SWE" layer (devin.ai).
- GitHub Copilot (Workspace/Agents): GitHub’s agentic features let developers assign issues to Copilot to autonomously write code and create PRs; Copilot Workspace preview showed plan/build/test flows with one‑click PRs inside GitHub (GitHub blog, Copilot features).
- Sweep AI: AI coding agent focused on JetBrains IDEs; originally offered a GitHub PR bot and now provides an agent/autocomplete that runs tests and makes multi‑file changes in IDEs (GitHub Marketplace note, sweep.dev).
- Sourcegraph (Cody / Agents / Amp): Code understanding platform powering agents and agentic coding (Amp), used to search, understand, and automate changes across large codebases—often paired with AI agents in enterprise environments (Sourcegraph agents, Amp).
- QA Wolf: AI‑native service for end‑to‑end test coverage and parallel test execution, positioned as taking QA “off your plate”; overlaps with Capy’s E2E testing and verification value proposition (qawolf.com).