
Hatchet

Background task orchestration and visibility

Winter 2024 · Active · Website
Developer Tools · Open Source · Infrastructure
Report from 26 days ago

What do they actually do?

Hatchet is an open‑source and managed system for running and observing background work alongside your application. It provides: an orchestration engine (self‑hosted or Hatchet Cloud) that schedules and routes tasks; SDKs for Python, TypeScript, and Go, so developers define small functions as tasks and compose them into workflows; workers that run in your own infrastructure and connect via gRPC; and a dashboard to inspect, replay, and alert on runs (docs overview, architecture, GitHub README).
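The core idea of "small functions as tasks, composed into workflows" can be sketched in a few lines of plain Python. The names below (`task`, `Workflow`, the `fetch`/`parse` steps) are illustrative only, not the real Hatchet SDK API:

```python
# Illustrative sketch of the tasks-composed-into-workflows pattern.
# Not the Hatchet SDK: names and signatures here are hypothetical.
from typing import Any, Callable, Dict, List

TASKS: Dict[str, Callable[..., Any]] = {}

def task(name: str) -> Callable:
    """Register a plain function as a named task."""
    def wrap(fn: Callable) -> Callable:
        TASKS[name] = fn
        return fn
    return wrap

@task("fetch")
def fetch(url: str) -> str:
    # Stand-in for real I/O work.
    return f"<html from {url}>"

@task("parse")
def parse(html: str) -> dict:
    return {"length": len(html)}

class Workflow:
    """Run registered tasks in order, piping each output into the next step."""
    def __init__(self, steps: List[str]) -> None:
        self.steps = steps

    def run(self, payload: Any) -> Any:
        for step in self.steps:
            payload = TASKS[step](payload)
        return payload

result = Workflow(["fetch", "parse"]).run("https://example.com")
print(result)  # {'length': 31}
```

In the real system the registry and composition live in the orchestration engine, which dispatches each step to workers over gRPC rather than calling the next function in-process.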

Today it focuses on practical capabilities: durable execution and checkpoints for long‑running or failure‑prone jobs, concurrency controls and global rate limits, priority/fairness routing, child workflows for fan‑out or agent‑style patterns, low‑latency dispatch, and multiple triggers (API, cron, events/webhooks). It runs on Postgres as the durable store and can be installed via CLI/Helm or used as a managed cloud (durable execution, concurrency & rate limits, child spawning, observability, self‑hosting).
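The durable-execution idea above — checkpoint each step's result so a crashed run can be replayed without redoing completed work — can be illustrated with a small sketch. The `DurableRun` class and its file-based store are hypothetical; Hatchet itself persists state in Postgres:

```python
# Conceptual sketch of durable execution with checkpoints.
# The class and on-disk JSON store are illustrative, not Hatchet's design.
import json
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict

class DurableRun:
    def __init__(self, run_id: str, store_dir: str) -> None:
        self.path = Path(store_dir) / f"{run_id}.json"
        # Reload any checkpoints left by a previous (possibly crashed) run.
        self.done: Dict[str, Any] = (
            json.loads(self.path.read_text()) if self.path.exists() else {}
        )

    def step(self, name: str, fn: Callable[[], Any]) -> Any:
        if name in self.done:            # checkpoint hit: skip re-execution
            return self.done[name]
        result = fn()
        self.done[name] = result         # checkpoint before moving on
        self.path.write_text(json.dumps(self.done))
        return result

store = tempfile.mkdtemp()
run = DurableRun("job-42", store_dir=store)
a = run.step("download", lambda: "payload")
b = run.step("transform", lambda: a.upper())
# If the process dies here, a restarted DurableRun("job-42", store) returns
# the checkpointed values for both steps instead of running them again.
```

The same mechanism is what makes partial replays cheap: only steps without a recorded result re-execute.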

Who are their target customer(s)?

  • ML/AI engineering teams building agents and real‑time ingestion: They need low‑latency orchestration that can spawn and manage many short‑lived tasks, recover on failure without duplicating work, and provide traceable runs to debug agent behavior (use cases, child spawning).
  • Data engineering teams running high‑throughput indexing or ETL pipelines: They struggle to process very large task volumes reliably and need durable retries plus global concurrency, rate limits, and fairness so one noisy job doesn’t starve others (durable execution, concurrency).
  • Platform/infra engineers operating in their own clusters (incl. GPUs): They want workers that run on Kubernetes/containers, predictable dispatch to specific nodes/resources (e.g., GPUs), and an engine they can self‑host with minimal extra infrastructure (architecture, self‑hosting).
  • Engineering leaders at multi‑team SaaS companies: They need visibility, replay, and alerting for background work, plus tenant isolation, quota controls, and enterprise access/compliance to prevent background failures from causing customer incidents (observability, multi‑tenant queues).
  • Product teams running large parallel media/compute jobs: They need durable checkpoints for long runs, easy fan‑out across many workers, and simple partial replays when a subset fails (durable execution, child spawning, usage examples).

How would they acquire their first 10, 50, and 100 customers?

  • First 10: Convert active open‑source users and community advocates already running Hatchet by offering free pilot support, 1:1 migration help, and early access to cloud features; do direct outreach via GitHub contributors, the HN thread, and YC/network intros (GitHub, HN, YC profile).
  • First 50: Publish concrete migration guides, quickstarts, and how‑to posts for agent, ingestion, and ETL use cases; promote in targeted ML/data communities and run short technical webinars. Ship polished install scripts/Helm charts and templates, plus a paid onboarding option for platform teams (Quickstarts, Helm charts, use cases).
  • First 100: Drive product‑led growth with a clear free tier and fast upgrade path on Hatchet Cloud, while targeting mid‑market platform teams with case studies and a short technical audit. Add paid integrations (SSO, observability) to ease enterprise procurement; prioritize reliability and billing/teams features and run targeted outbound to SaaS/data companies (Hatchet Cloud, observability, multi‑tenant queues).

What is the rough total addressable market?

Top-down context:

Workflow/process orchestration is sized in the low billions today (e.g., ~$7.3B in 2024 per Grand View Research; other reports place it in the mid‑teens of billions) and projected to grow, while adjacent serverless/function and AI orchestration categories push a broader envelope into the tens of billions (Grand View Research, MarketResearchFuture, Mordor Intelligence, MarketsandMarkets).

Bottom-up calculation:

Illustrative SAM: assume 20,000–60,000 engineering‑led orgs with background task/ETL/agent needs in reachable markets; 10–20% buy a dedicated orchestration platform; average annual spend of $30k–$60k across cloud and self‑hosted enterprise tiers. That implies roughly $60M–$720M in near‑term serviceable spend (e.g., 30k orgs × 15% × $40k ≈ $180M; 60k × 15% × $50k ≈ $450M), with upside as features expand into broader serverless/AI orchestration.

Assumptions:

  • Targetable org count of 20k–60k across NA/EU and similar markets with meaningful background workloads.
  • Adoption of 10–20% for dedicated orchestration among those orgs (rest use homegrown/basic queues).
  • Blended annual spend of $30k–$60k per paying customer across managed cloud and enterprise self‑host.
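The bottom-up range above follows directly from the three assumptions; a quick arithmetic check (the `sam` helper is just shorthand for orgs × adoption × annual spend):

```python
# Sanity-check the bottom-up SAM figures quoted above.
def sam(orgs: int, adoption: float, annual_spend: int) -> int:
    """Serviceable spend = targetable orgs x adoption rate x annual spend."""
    return int(orgs * adoption * annual_spend)

low  = sam(20_000, 0.10, 30_000)   # most conservative end of each assumption
ex1  = sam(30_000, 0.15, 40_000)   # first worked example in the text
ex2  = sam(60_000, 0.15, 50_000)   # second worked example in the text
high = sam(60_000, 0.20, 60_000)   # most aggressive end of each assumption

print(low, ex1, ex2, high)
# 60000000 180000000 450000000 720000000
```

This reproduces the quoted $60M–$720M envelope and both worked examples ($180M and $450M).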

Who are some of their notable competitors?

  • Temporal: Open‑source durable workflow engine with language SDKs and a managed cloud; closest on durable execution but with a heavier, code‑first platform and its own operational footprint (docs).
  • Celery: Widely used Python task queue over Redis/RabbitMQ; simple for basic jobs but lacks built‑in durable workflow checkpoints, multi‑tenant routing, and rich observability/worker‑routing out of the box (docs, Hatchet vs. Celery).
  • Apache Airflow: Python‑based scheduler/monitor for batch data pipelines; strong for scheduled ETL DAGs but not aimed at low‑latency, high‑throughput short‑lived or agent‑style dispatch patterns (docs).
  • Argo Workflows: Kubernetes‑native workflow engine running each step as a container; good if you want container‑per‑task isolation on K8s, less focused on language SDKs and durable parent/child semantics for app‑level agents (docs).
  • Ray: Distributed compute framework for scaling Python/ML workloads and serving; strong runtime for parallel compute, but not a full durable orchestration + multi‑tenant routing/observability layer (docs).
Hatchet | FYI Combinator