What do they actually do
Maitai provides a managed middle layer between your app and LLM providers. Teams route or mirror LLM traffic through Maitai to get real-time output checks (“Sentinels”), optional automatic fixes or fallbacks, and a portal with request logs, latency/cost metrics, test sets, and configuration controls like safe modes and model routing (docs intro, Sentinels, configuration).
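As a conceptual illustration only (not Maitai's actual implementation), a proxy-side output check with an automatic fallback might flow as below; `check_output`, the model names, and the toy rule are all hypothetical:

```python
# Conceptual sketch of a proxy-side output check with fallback.
# check_output(), the model names, and the rule below are hypothetical;
# Maitai's actual Sentinel logic and routing are not public in this form.
from typing import Callable

def check_output(text: str) -> bool:
    """Stand-in for a real-time output check (policy, schema, or accuracy rules)."""
    return "i'm not sure" not in text.lower()  # toy rule for illustration

def complete_with_fallback(prompt: str, call_model: Callable[[str, str], str]) -> str:
    primary = call_model("primary-model", prompt)
    if check_output(primary):
        return primary
    # On a failed check, retry against a fallback model rather than
    # exposing the bad output to the end user.
    return call_model("fallback-model", prompt)

if __name__ == "__main__":
    def stub_call_model(model: str, prompt: str) -> str:
        # Stub provider call so the sketch runs standalone.
        return "I'm not sure." if model == "primary-model" else "Order confirmed."
    print(complete_with_fallback("Confirm the order.", stub_call_model))
```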
It collects failure cases from live traffic, turns them into fine-tuning datasets, runs fine-tunes, and redeploys application-specific models back into the inference path over time (fine-tuning, test sets). Integration is drop-in via an OpenAI-compatible base_url or SDK, with small proxy overhead; YC cites sub-200ms evaluations and roughly 30ms of added proxy latency (Quickstart, YC page).
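A minimal sketch of that drop-in pattern, assuming an OpenAI-compatible endpoint; the base URL and key below are placeholders, not Maitai's documented values (their Quickstart has the real ones):

```python
# Minimal sketch of the OpenAI-compatible drop-in described above.
# The base_url and api_key are illustrative placeholders; Maitai's
# Quickstart documents the actual endpoint and SDK.
from openai import OpenAI

client = OpenAI(
    base_url="https://api.maitai.example/v1",  # hypothetical proxy endpoint
    api_key="MAITAI_API_KEY",                  # placeholder credential
)

# The request shape is identical to a direct OpenAI call; the proxy
# layer can run its checks on the response before returning it.
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Summarize today's order queue."}],
)
print(response.choices[0].message.content)
```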
Today it’s used by engineering/product teams already running LLMs in production. Public examples include a voice-ordering customer on YC’s page and a partnership with Phonely (on Groq hardware) reporting large P90 latency reductions and accuracy near 99% for their agents (YC page, VentureBeat, Phonely blog).
Who are their target customer(s)
- Product or engineering teams running LLMs in production: Need to catch and fix model mistakes without rebuilding their stack; want visibility into failures and safe fallbacks so users aren’t exposed to hallucinations or outages.
- Real-time/voice teams (call centers, voice ordering): Struggle with variable response times and compliance/correctness failures during live calls; need faster, predictable inference and high accuracy.
- Ops, security, and compliance teams: Require automated checks, alerts, and governance to detect and block outputs that violate rules or policies before they reach customers.
- ML/ML-ops engineers seeking app-specific models: Want fine-tuned models without heavy labeling overhead; need tools to collect failures, build datasets, run regressions, and ship new models with minimal manual work.
- Enterprise IT and procurement: Need SLAs, dedicated hosting options, and observability/contract controls to approve replacing general-purpose models with custom deployments.
How would they acquire their first 10, 50, and 100 customers
- First 10: Run tightly scoped, paid pilots with production LLM teams (voice/call centers, chatbots, internal assistants), integrate via proxy or mirror mode, and provide hands-on tuning to show immediate reductions in visible failures and effective fallbacks (Quickstart, Sentinels).
- First 50: Offer a self-serve OpenAI-compatible path with a free/low-cost trial so teams can measure failure rates, run test sets, and try automated corrections; convert trials into paid pilots by demonstrating reduced hallucinations and caught regressions in the portal (test sets, fine-tuning).
- First 100: Leverage channel partners (inference/hardware, observability, cloud integrators) to sell bundled low-latency, SLA-backed deployments and replicate the Phonely/Groq-style wins; introduce an Agent marketplace and enterprise bundles to land-and-expand in larger accounts (Phonely case, homepage).
What is the rough total addressable market
Top-down context:
Enterprise spend on generative AI is already large—Menlo Ventures estimates about $13.8B in 2024, indicating a substantial and growing budget pool for reliability, ops, and model tooling (Menlo Ventures).
Bottom-up calculation:
The closest submarkets for Maitai are enterprise LLM/model APIs (~$6.7B in 2024), MLOps/model-ops (~$1.7B in 2024), and contact-center/voice AI (~$2–3B). Adjusting for overlap and for the share addressable by a middleware layer yields roughly $6–12B today, with rapid growth expected (GMI LLM, GMI MLOps, FBI Call Center AI); a rough arithmetic sketch follows the assumptions below.
Assumptions:
- Market reports overlap; we discount combined figures to avoid double-counting.
- Focus is on buyers already running or scaling production LLM use (near-term serviceable market).
- Only a portion of overall AI spend is captured by third-party reliability/middleware rather than infra or app-layer budgets.
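A rough arithmetic sketch of that bottom-up range; the overlap and addressability factors here are illustrative assumptions, not figures from the cited reports:

```python
# Bottom-up TAM sketch using the 2024 submarket figures cited above.
# The 0.6-1.0 overlap/addressability factors are assumptions for
# illustration, not numbers from the GMI or FBI reports.
llm_apis = 6.7                            # enterprise LLM/model APIs, $B
mlops = 1.7                               # MLOps/model-ops, $B
voice_low, voice_high = 2.0, 3.0          # contact-center/voice AI, $B

raw_low = llm_apis + mlops + voice_low    # 10.4
raw_high = llm_apis + mlops + voice_high  # 11.4

# Discount the combined pool for double-counting across reports and for
# the share a middleware layer can realistically capture.
print(f"raw sum: ${raw_low:.1f}B-${raw_high:.1f}B")
print(f"addressable: ~${raw_low * 0.6:.0f}B-${raw_high * 1.0:.0f}B")  # ~$6B-$11B
```

Under these assumptions the result lands near the $6–12B estimate above.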
Who are some of their notable competitors
- Portkey: LLM gateway/proxy with observability, retries, routing, and caching; similar drop-in base_url approach for reliability and cost controls.
- Helicone: Open-source proxy and dashboard for LLM observability; captures logs, metrics, and usage across providers with minimal integration.
- LangSmith (LangChain): Tracing, evaluation, and dataset tooling for LLM apps built with or without LangChain; supports test sets and regressions for model changes.
- Langfuse: Open-source LLM observability platform for tracing, evaluations, and analytics across model providers and applications.
- Vellum: Prompt and model management with evaluations, A/B testing, and routing; helps teams iterate and productionize LLM workflows.