Event Horizon Labs

We are building the Post-Human Trader.

Winter 2024 • Active • 2024

Artificial Intelligence • Search • Infrastructure • AI

Disclaimer

FYI Combinator is not affiliated with Y Combinator. Reports are generated by AI Research Agents and may not be 100% accurate.

Report from 4 months ago

What do they actually do

Event Horizon Labs ships SciPhi, a cloud and open‑source stack for building, deploying, and monitoring retrieval‑augmented generation (RAG) and agentic retrieval systems. Teams use SciPhi Cloud for “one‑click” serverless pipelines and an API that performs multi‑step research across documents and the web, returns citation‑backed answers, and can generate knowledge graphs. The open‑source R2R engine powers ingestion (text, PDFs, images, audio, JSON), hybrid search, and agentic retrieval; the cloud product adds hosting and observability for production use cases (SciPhi docs, R2R GitHub, YC launch).

Today, their users are developers and ML/DevOps teams who need reliable retrieval and multi‑step reasoning with explainability and citations. The product is available as a hosted service with a free tier and as open source, with public documentation, examples, and benchmarks aimed at production adoption (Product Hunt, SciPhi docs).
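
For readers unfamiliar with the "hybrid search" mentioned above, here is a minimal sketch of one common way to merge keyword and vector results: reciprocal rank fusion (RRF). The sources cited here do not say how R2R combines the two, so this is an illustrative assumption, not R2R's implementation. The function name, the document IDs, and the `k=60` constant are all hypothetical choices for the example.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into a single ranking.

    Each document scores sum(1 / (k + rank)) over the lists it appears in,
    with rank starting at 1. k=60 is a commonly used default, assumed here.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by document ID for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists from a keyword index and a vector index.
keyword_hits = ["doc_a", "doc_b", "doc_c"]
vector_hits = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_hits, vector_hits])
print(fused)
```

Documents that rank well in both lists rise to the top, while documents found by only one retriever still appear further down the fused list.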

Who are their target customer(s)

Developer teams building apps that answer complex questions over documents and the web: Ad‑hoc LLM prompts hallucinate and break on multi‑step reasoning; they need concise, citation‑backed answers and a dependable pipeline rather than wiring components from scratch (SciPhi docs).
ML engineers turning messy company data into searchable knowledge: They must ingest many formats (PDFs, audio, JSON), keep search accurate as content changes, and extract structured insights without maintaining a custom pipeline end‑to‑end (R2R GitHub).
DevOps / platform teams running retrieval + reasoning in production: They face scaling, monitoring, and debugging for multi‑step queries and need built‑in observability and serverless deployment to control costs and complexity (SciPhi docs).
Quantitative researchers seeking automated signal‑to‑execution pipelines: Research outputs aren’t consistently testable or actionable; they need end‑to‑end validation, risk controls, and low‑latency execution links to turn signals into trades (YC profile).
Security‑ or compliance‑sensitive teams needing on‑prem control: They require an open, auditable stack that integrates with internal systems so sensitive data stays in‑house, including self‑host options (R2R GitHub).

How would they acquire their first 10, 50, and 100 customers

First 10: Convert open‑source R2R users and Product Hunt/YC signups via a concierge onboarding flow, responsive support, and short pilots with free credits to validate retrieval + multi‑step reasoning on their own data (R2R GitHub, YC launch).
First 50: Publish benchmarked how‑tos and reproducible demos, run office hours and meetups, and ship turnkey configs for popular vector DBs/LLMs to shorten time‑to‑value; convert trials with documented case studies (EHL benchmarking, SciPhi docs).
First 100: Offer paid pilots with clear SLOs (latency, citation quality, maintainability), provide on‑prem bundles and audit help, list in cloud/vendor marketplaces, and use the YC and Pillar networks to source quant pilots, each run by a founding quant/solutions engineer (Pillar VC, Work at a Startup).

What is the rough total addressable market

Top-down context:

SciPhi/R2R sits across RAG, vector databases, and AI/agent orchestration, markets that together are in the single‑ to low‑double‑digit billions today and growing quickly. RAG is in the low billions and scaling, vector DBs were ~$1.6B in 2023, and AI developer/orchestration tools are projected at ~$11B+ by the mid‑2020s, rising toward ~$30B by 2030 (GVR RAG, Vector DB, AI dev tools). The long‑term autonomous trading ambition would add algorithmic/AI trading infrastructure markets, which are measured in the tens of billions over multi‑year horizons (Algorithmic trading, AI trading platforms).

Bottom-up calculation:

Near‑term bottom‑up TAM: assume ~100k global teams will operationalize RAG/agentic retrieval over the next few years; at $10k–$30k ARR per team for hosted + support, that implies a ~$1–3B core TAM. An enterprise‑skewed mix (e.g., 20k larger accounts at $50k–$150k ARR) adds ~$1–3B of upside, for a combined ~$2–6B, which keeps the practical range in the low‑ to mid‑single‑digit billions.

Assumptions:

Roughly 100k software, data, and platform teams will adopt managed RAG/agentic retrieval as a standard enterprise capability.
Average contract values span from self‑serve ($10k–$30k ARR) to enterprise ($50k–$150k ARR) depending on deployment, observability, and compliance needs.
This excludes trading execution revenues; those are treated as a separate, longer‑term market.
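
The bottom‑up arithmetic above can be checked with a few lines of Python. This is only a sketch that multiplies the report's own assumed team counts and ARR ranges; the helper name `tam_range` and the variable names are made up for illustration.

```python
def tam_range(accounts, arr_low, arr_high):
    """Return the (low, high) annual revenue in USD for one customer segment."""
    return accounts * arr_low, accounts * arr_high


# Figures taken from the bottom-up estimate in this report.
core = tam_range(100_000, 10_000, 30_000)        # ~100k teams at $10k-$30k ARR
enterprise = tam_range(20_000, 50_000, 150_000)  # 20k accounts at $50k-$150k ARR
combined = (core[0] + enterprise[0], core[1] + enterprise[1])

for label, (low, high) in [("core", core),
                           ("enterprise", enterprise),
                           ("combined", combined)]:
    print(f"{label}: ${low / 1e9:.0f}B to ${high / 1e9:.0f}B")
```

Both segments come out at $1B to $3B, so the combined range is $2B to $6B, which matches the low‑ to mid‑single‑digit‑billions conclusion.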

Who are some of their notable competitors

LlamaIndex: Open‑source indexing/connectors for RAG. Strong developer toolkit but less focused on managed orchestration and production observability compared to SciPhi’s hosted stack (SciPhi docs).
deepset Haystack: Open‑source framework for production QA/RAG pipelines with retriever/reader focus and enterprise deployment; fewer built‑in agentic multi‑step reasoning and managed cloud features out of the box relative to SciPhi (SciPhi docs).
Weaviate: Managed + OSS vector database with semantic modules and graph‑like schema. Competes at the storage/knowledge‑graph layer but is not a full agentic research/orchestration engine by itself.
Pinecone: Hosted vector database for large‑scale similarity search. Often a component inside RAG stacks; it doesn’t provide end‑to‑end multi‑step reasoning, citation‑backed synthesis, or pipeline observability.
Milvus (Zilliz): Open‑source, high‑performance vector DB used at scale. Competes on vector storage/search and is typically one piece of a custom RAG stack rather than a managed agentic retrieval platform.