Plexe

Open-source agents to build predictive ML models from a prompt

Spring 2025 • Active • 2025

Machine Learning • Data Science • AI

Disclaimer

FYI Combinator is not affiliated with Y Combinator. Reports are generated by AI Research Agents and may not be 100% accurate.

Report from 3 months ago

What do they actually do

Plexe lets teams turn a plain‑English description of a predictive task into a working machine‑learning pipeline. You can use it as an open‑source Python library in your own environment or via a managed web console and REST API that Plexe hosts (sources: GitHub README, docs, site).

The workflow: connect your data and describe the goal (e.g., “predict churn”). Plexe then plans preprocessing and features, generates code, runs experiments with common ML libraries (scikit‑learn, PyTorch, TensorFlow), evaluates the results, and iterates. It then provides a deployable endpoint you can call from your app, or lets you download or retrain the model (source: docs).
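The "run experiments, evaluate, iterate" part of that workflow can be illustrated with a toy loop. This is a minimal sketch using only the Python standard library; it is not Plexe's API or implementation. The dataset is synthetic, and every name here (`make_toy_rows`, `fit_majority_baseline`, `fit_login_threshold`, `run_experiments`) is hypothetical.

```python
import random


def make_toy_rows(n=200, seed=0):
    """Generate made-up ((logins, tickets), churned) rows; not real data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        logins = rng.randint(0, 30)   # monthly logins
        tickets = rng.randint(0, 5)   # support tickets
        # Assumed pattern for the toy data: low usage makes churn likelier.
        churn_prob = 0.8 if logins < 8 else 0.1
        churned = 1 if rng.random() < churn_prob else 0
        rows.append(((logins, tickets), churned))
    return rows


def fit_majority_baseline(train):
    """Candidate 1: always predict the most common training label."""
    ones = sum(label for _, label in train)
    majority = 1 if ones * 2 >= len(train) else 0
    return lambda features: majority


def fit_login_threshold(train):
    """Candidate 2: predict churn when logins fall below a learned cutoff."""
    best_cutoff, best_acc = 0, -1.0
    for cutoff in range(0, 32):
        hits = sum((1 if x[0] < cutoff else 0) == y for x, y in train)
        acc = hits / len(train)
        if acc > best_acc:
            best_cutoff, best_acc = cutoff, acc
    return lambda features: 1 if features[0] < best_cutoff else 0


def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in rows) / len(rows)


def run_experiments(rows, candidates, holdout_fraction=0.25):
    """Fit each candidate on the training split and score it on the holdout."""
    split = int(len(rows) * (1 - holdout_fraction))
    train, holdout = rows[:split], rows[split:]
    scores = {name: accuracy(fit(train), holdout)
              for name, fit in candidates.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


candidates = {
    "majority_baseline": fit_majority_baseline,
    "login_threshold": fit_login_threshold,
}
best_name, scores = run_experiments(make_toy_rows(), candidates)
print(best_name, scores)
```

A tool like the one described would presumably search a far larger space of preprocessing steps and model families, but the shape is the same: fit candidates, score them on held-out data, keep the best, repeat.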

Today the product is strongest on tabular business ML tasks and for teams without a dedicated ML function. Enterprise SSO is noted as a roadmap item, and expansion into unstructured data and deeper neural models is described as a future direction rather than a current core capability (sources: YC profile, authentication roadmap, third‑party writeup).

Who are their target customer(s)

Product managers at startups with data but no ML team: They need features like churn or fraud prediction but lack in‑house ML skills and time to build and maintain models. They want a fast path from idea to a production endpoint without staffing an ML team.
Product or business analysts who can query data but can’t ship models: They can find signals in spreadsheets/BI but can’t package them as a reliable, callable API. They need automated preprocessing, model training, and deployment from their datasets.
Small engineering teams or technical founders avoiding ML infra: They don’t want to stand up and maintain training, deployment, and monitoring systems. They prefer either a local library with control over data/compute or a hosted service that handles ops.
Developers who need fast prototypes and experiments: Writing pipelines and experiment code slows iteration. They want automated experiments with clear metrics to evaluate ideas quickly without hand‑coding every step.
Product teams with messy tabular datasets (e‑commerce, fintech, ops): Cleaning data, engineering features, and tuning models consume time and require specialist skills. They need automation around discovery, preprocessing, and model selection to get reliable predictions into apps.

How would they acquire their first 10, 50, and 100 customers

First 10: Founder‑led outreach to YC startups and developer‑led teams that have data but no ML org; offer a hands‑on pilot to connect one dataset and deliver a working API, converting open‑source trials and repo interest into pilots (sources: GitHub, YC, docs).
First 50: Productize the pilot playbook with templates and short videos for common tasks (churn, fraud, lead scoring), host weekly office hours for “bring a CSV” demos, and streamline hosted console onboarding with connectors and sample projects (sources: docs, GitHub).
First 100: Introduce a lightweight paid pilot package and a self‑serve paid tier; hire customer success to run pilots, publish case studies, drive referrals, and pursue data‑tool integrations while prioritizing enterprise basics (e.g., SSO) to close larger accounts (sources: authentication roadmap, docs).

What is the rough total addressable market

Top-down context:

Plexe sits mainly at the intersection of AutoML (≈$3.5B, 2024) and MLOps (≈$1.7–2.2B, 2024), with overlap into predictive analytics (≈$18–19B, 2024). The closest near‑term proxy is AutoML plus a slice of MLOps (sources: AutoML, MLOps, Predictive analytics).

Bottom-up calculation:

If Plexe serves 10k–50k paying teams globally at an average $5k–$10k ARR (mix of hosted console and library‑based usage), that implies a $50M–$500M annual revenue opportunity for its core tabular/AutoML + light MLOps offering.

Assumptions:

Average contract value of $5k–$10k ARR for SMB/mid‑market teams buying hosted/managed predictive features.
Achievable paying customer base of roughly 10k–50k teams globally within the target segments over time.
Focus remains on tabular AutoML plus basic deployment, not full enterprise AI platforms.
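
The bottom-up range can be checked with a few lines of arithmetic. This sketch simply multiplies the team-count and ARR bounds stated in the assumptions above; the function name `revenue_range` is illustrative only.

```python
def revenue_range(teams_low, teams_high, arr_low, arr_high):
    """Return (low, high) annual revenue from customer-count and ARR bounds."""
    return teams_low * arr_low, teams_high * arr_high


# Inputs from the report: 10k to 50k paying teams, $5k to $10k ARR each.
low, high = revenue_range(10_000, 50_000, 5_000, 10_000)
print(f"${low / 1e6:.0f}M to ${high / 1e6:.0f}M")  # $50M to $500M
```

The low end pairs the fewest teams with the lowest ARR and the high end pairs the most teams with the highest ARR, so the tenfold spread reflects both assumptions varying together.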

Who are some of their notable competitors

DataRobot: Enterprise AutoML platform with built‑in deployment and governance; targets larger companies and emphasizes enterprise controls more than an open‑source‑first developer library.
H2O.ai: Open‑source AutoML roots plus commercial products (Driverless AI) that automate feature discovery and model building; strong tabular performance and mature on‑prem/enterprise options.
Obviously AI: No‑code, hosted predictive tool for business users; overlaps with Plexe’s hosted console for non‑technical teams with a click‑to‑predict UX.
MindsDB: Open‑source, SQL‑first approach that connects to databases so users can train/query models via familiar data tools; emphasizes running models where data lives.
AutoGluon (AWS): Open‑source AutoML library for developers across tabular, text, and image; strong developer toolkit but lacks Plexe’s prompt‑driven agent layer and hosted console.