
Moss

Real-time semantic search for Conversational AI

Fall 2025 · Active · 2025 · Website

Artificial Intelligence · Developer Tools · SaaS

Report from 27 days ago

What do they actually do

Moss provides a tiny retrieval engine and a managed control plane so apps can run semantic (meaning‑based) search right next to the AI that answers users. The same retrieval code can run in the browser, on a phone/desktop, or on your servers; Moss handles indexing, packaging, and syncing so queries return from local memory instead of going across the network (usemoss.dev). They target sub‑10 ms median retrieval and ship an embeddable runtime described as <20 kB, with official JavaScript and Python SDKs (usemoss.dev, YC profile).

Developers point Moss at data (docs, chat history), build an index via the portal/SDKs, deploy that index to a client or server runtime, and query it during inference; the dashboard supports sync, analytics, and A/B testing of embeddings and index configs (features, docs). Today it’s being used in pilots for voice agents, copilots, and in‑app/document search; Moss reports multiple design partners and early paying customers (YC profile, GitHub demos).
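A rough sketch of that flow in Python is below. It is illustrative only: the `moss` package and the `Client`, `create_index`, `add_documents`, `deploy`, and `query` names are hypothetical placeholders for whatever the official SDK actually exposes, not documented API.

```python
# Hypothetical sketch of the flow above; the "moss" module and every call on
# it are placeholders for the real SDK surface, not documented API.
from moss import Client  # hypothetical import

client = Client(api_key="MOSS_API_KEY")  # hypothetical auth

# 1. Point Moss at data (docs, chat history) and build an index.
index = client.create_index(name="support-docs")
index.add_documents([
    {"id": "doc-1", "text": "How to reset your password ..."},
    {"id": "doc-2", "text": "Billing cycles run monthly ..."},
])

# 2. Deploy the packaged index to a client or server runtime so queries are
#    answered from local memory instead of over the network.
runtime = index.deploy(target="server")  # per the docs, browser and mobile are the other targets

# 3. Query at inference time and hand the hits to the model as context.
hits = runtime.query("How do I change my billing date?", top_k=3)
context = "\n".join(hit["text"] for hit in hits)
```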

Who are their target customer(s)

  • Voice‑agent and real‑time assistant teams: Remote retrieval adds too much latency and failure risk for live conversations; packaging and running search on‑device or at the edge is costly to build and maintain (usemoss.dev, YC profile).
  • Product teams adding in‑app or document search (copilots, knowledge bases, help centers): They need predictable, fast results and simple syncing across servers/clients without building and operating a custom retrieval backend (features, docs).
  • Mobile and web app developers needing tiny local AI support: Standard search runtimes are too large or require constant network access, making offline, private, or low‑bandwidth use cases impractical (usemoss.dev, GitHub).
  • Enterprise AI platform/infra teams: They must add observability, controlled rollouts, and security/compliance around retrieval and context delivery to pass audits and unblock production launches (YC profile, features).
  • Voice/orchestration platform partners and SDK integrators: They want a drop‑in, small retrieval layer they can bundle across customer environments instead of building bespoke integrations each time (GitHub, usemoss.dev).

How would they acquire their first 10, 50, and 100 customers

  • First 10: Convert existing design partners and pilot users into paid deployments by running tightly scoped, latency‑sensitive pilots (voice agents or in‑app copilots) and co‑building a reference integration; publish case studies from those wins (YC profile, GitHub demos).
  • First 50: Publish turnkey starter kits and platform‑specific integrations (browser, mobile, Pipecat/LiveKit) plus 1–2 open‑source example apps to enable fast trials; pair with targeted outreach to teams building voice agents and copilots, amplified via YC/startup networks (features, GitHub).
  • First 100: Stand up a small solutions‑led sales motion for larger customers with paid pilots; add enterprise features (security, offline sync guarantees, observability) to remove procurement blockers; and establish channel partnerships so platforms bundle Moss by default (YC profile, features).

What is the rough total addressable market

Top-down context:

The adjacent markets Moss plugs into sum to roughly $25–30B today: conversational AI (~$11.6B in 2024), enterprise search (~$6.1B in 2024), and on‑device AI (~$8.6B in 2024), with RAG/retrieval infrastructure also emerging as a multi‑billion‑dollar category by 2030 (Grand View Research: Conversational AI, IMARC: Enterprise Search, Grand View: On‑device AI, Grand View: RAG, MarketsandMarkets: RAG). Moss can realistically sell into only a slice of this, given it’s a retrieval layer rather than a full conversational AI platform or knowledge‑management suite.

Bottom-up calculation:

Assume ~15,000 relevant product/agent teams globally need low‑latency, local retrieval, with 30–40% adopting over time at a blended ACV of $25k–$150k (SMB to enterprise), implying roughly $0.5–1.5B serviceable TAM today; adding ~200 platform/SDK partners at $250k–$750k each expands this by ~$50–150M, with upside as real‑time/edge deployments grow.

Assumptions:

  • Number of relevant buyers: ~15k product/agent teams plus ~200 platform/SDK partners.
  • Adoption over time: 30–40% of relevant teams; platform attach limited by integration priorities.
  • Pricing: blended ACV $25k–$150k for teams; $250k–$750k for platform licensing/commit.
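To make the arithmetic explicit, the sketch below simply multiplies out the assumptions above (the team count, adoption rates, and ACV ranges are the report's figures; the exact blending behind the headline $0.5–1.5B range isn't stated, so the computed band is illustrative rather than a reproduction of it).

```python
# Straight multiplication of the assumptions listed above (all figures are
# the report's). The exact blend behind the headline $0.5-1.5B estimate is
# not spelled out, so these ranges show the mechanics, not the headline.

TEAMS = 15_000                    # relevant product/agent teams
ADOPTION = (0.30, 0.40)           # share adopting over time (low, high)
TEAM_ACV = (25_000, 150_000)      # blended ACV in $, SMB to enterprise

PARTNERS = 200                    # platform/SDK partners
PARTNER_ACV = (250_000, 750_000)  # licensing/commit in $ per partner

team_sam = (TEAMS * ADOPTION[0] * TEAM_ACV[0],
            TEAMS * ADOPTION[1] * TEAM_ACV[1])
partner_sam = (PARTNERS * PARTNER_ACV[0],
               PARTNERS * PARTNER_ACV[1])
total = (team_sam[0] + partner_sam[0], team_sam[1] + partner_sam[1])

for label, (low, high) in [("teams", team_sam),
                           ("partners", partner_sam),
                           ("total", total)]:
    print(f"{label:>8}: ${low / 1e9:.2f}B - ${high / 1e9:.2f}B")
```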

Who are some of their notable competitors

  • Pinecone: Managed cloud vector database built for scale and reliability over the network; optimized for server/cloud use rather than tiny on‑device runtimes.
  • Qdrant: Open‑source vector search engine (and hosted cloud) focused on flexible, high‑quality vector search and server‑side scaling/control.
  • Milvus (Zilliz): Open‑source vector database for large datasets and high throughput; runs as server infrastructure (sharding, GPUs) rather than an embeddable client runtime.
  • Weaviate: Vector search engine with built‑in modules for vectorization and enrichment; offers cloud/self‑hosted models but adds more server‑side complexity.
  • Chroma: Developer‑friendly embedding database that’s easy to run locally or hosted, but typically as a heavier server/client process versus a tiny packaged WASM/browser runtime.