What do they actually do?
Riveter runs many small AI search agents in parallel to browse the public web (search results, sites, PDFs, images) and return structured, tabular outputs. Teams use a web app to define projects or call a public API to run them, then plug the results back into spreadsheets or downstream systems (homepage, API docs).
A typical workflow is: define the fields you want, configure a project (prompts, tools, output schema), and feed input rows via the UI or API. For each row, Riveter spins up an agent that searches, navigates pages, reads documents, and extracts the requested fields, returning consistent table-format results you can join back to your original data (API docs, homepage).
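The per-row workflow above can be sketched in plain Python. This is a hypothetical illustration of the data shapes involved (schema, input rows, one agent per row, join back to the originals); the actual Riveter API endpoints and field names are not documented here, so every name below is illustrative.

```python
# Hypothetical sketch of a row-per-agent enrichment workflow.
# Schema, row fields, and run_agent() are all illustrative stand-ins,
# not the real Riveter API.

# 1. Define the output schema: the fields each agent should extract.
schema = ["employee_count", "hq_city", "latest_funding_round"]

# 2. Input rows, e.g. exported from a spreadsheet.
input_rows = [
    {"id": 1, "company": "Acme Corp"},
    {"id": 2, "company": "Globex"},
]

def run_agent(row, schema):
    """Placeholder for one web-browsing agent: in the real product this
    step would search, navigate pages, read PDFs/images, and extract
    the schema fields. Here it just returns labeled placeholders."""
    return {field: f"<extracted {field} for {row['company']}>" for field in schema}

# 3. Fan out one agent per input row, then merge each result back onto
#    its original row so the output joins cleanly to the source sheet.
results = [{**row, **run_agent(row, schema)} for row in input_rows]

for r in results:
    print(r["id"], r["company"], r["hq_city"])
```

The key property the sketch shows is that output rows keep their original keys (`id`, `company`) alongside the extracted fields, so results can be joined back to the source spreadsheet by row identifier.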
The product is live with a web app and a callable, versioned API (docs published). The team has publicly promoted a free trial that lets users run small spreadsheet-style jobs before committing (API docs, trial post).
Who are their target customer(s)?
- Corporate strategy / M&A teams at growth‑stage companies: They need repeatable, auditable answers about competitors, partners, and market signals across many targets, but current research is manual, slow, and hard to reproduce across stakeholders.
- Product and data teams that maintain catalogs: They must enrich hundreds or thousands of rows (features, specs, images, docs) and often rely on fragile scrapers or manual work that breaks and doesn’t scale.
- Finance and analyst teams doing diligence and market sizing: They need up‑to‑date public facts from web pages and PDFs, but spend too much time copy‑pasting, hiring contractors, or stitching inconsistent sources.
- Sales / GTM / revenue operations teams: They want reliable enrichment (firmographics, signals from news or filings) for lead lists, but current enrichment is slow, incomplete, or requires custom engineering.
- Compliance / KYB teams: They must extract and verify facts from filings, PDFs, and images at scale; today that work is manual, expensive, and hard to audit for provenance.
How would they acquire their first 10, 50, and 100 customers?
- First 10: Use the founders' and YC networks plus inbound trial signups to recruit 10 early teams, then run hands-on pilots in which Riveter configures projects against a real spreadsheet and returns structured results with an evidence trail (Launch YC, trial post, API docs).
- First 50: Target product/data teams with ready-made spreadsheet templates, guides, and a self-serve API/UX so they can run small enrichment jobs without sales; promote a Sheets/Excel connector beta and convert trials via automated onboarding (homepage, API docs).
- First 100: Add a focused outbound motion to corporate strategy, finance, and compliance teams, offering multi-row pilots with paid onboarding, integration help (BI, CRM), and basic service guarantees; document provenance and repeatability for audits and procurement to close short enterprise contracts (Launch YC, homepage).
What is the rough total addressable market?
Top-down context:
The analyst categories most relevant to Riveter (web scraping, data enrichment, and Document AI) sum to roughly USD 18B in the mid-2020s by simple addition, acknowledging overlap across categories (Mordor, Grand View, MarketsandMarkets).
Bottom-up calculation:
Web scraping ≈ $1.0B, data enrichment ≈ $2.3B, and Document AI ≈ $14–15B (2025) per analyst reports; the overlap-naive sum is ≈ $18B (Mordor, Grand View, MarketsandMarkets).
Assumptions:
- Numbers are USD and reflect mid‑2020s analyst estimates; precise figures vary by source and year.
- Categories overlap (double-counting is likely), but they are summed here as a rough top-down anchor; removing the overlap would shrink the figure.
- Riveter competes for portions of each category where public web/document extraction and row‑level enrichment are required.
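The overlap-naive sum above is simple arithmetic. A minimal check, using the midpoint of the ~$14–15B Document AI range (the individual figures are the mid-2020s analyst estimates quoted in this section, which vary by source and year):

```python
# Overlap-naive sum of the three analyst categories (USD billions).
# 14.5 is the assumed midpoint of the ~$14-15B Document AI range.
web_scraping = 1.0
data_enrichment = 2.3
document_ai = 14.5

tam_billions = web_scraping + data_enrichment + document_ai
print(f"Overlap-naive TAM ≈ ${tam_billions:.1f}B")
```

Because the categories overlap, this figure is an upper-bound anchor rather than a clean market size.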
Who are some of their notable competitors?
- Diffbot: Provides a structured Knowledge Graph of the public web and an Extract API for normalized company, product, and article data; useful for curated firmographics and product feeds, but less focused on per-row, spreadsheet-style agent workflows (product, docs).
- Apify: A platform for building and running custom scraping/automation "Actors" with a full API and integrations (including Sheets); powerful for teams willing to build crawlers, and more developer-first than Riveter's spreadsheet-style pattern (Actors, Sheets integration).
- Browse.ai: No-code, point-and-click robots that turn websites into live datasets with syncs to Sheets, webhooks, or an API; overlaps on "turn pages into rows," but centers on GUI scrapers and monitors rather than parallel web-browsing agents that read PDFs and images (product, API).
- Import.io: Enterprise-oriented extraction with self-healing pipelines, scheduling, compliance/audit features, and exports to Sheets and BI tools; closer on reliability and auditability, but focused on managed extractors rather than per-row search agents (overview, docs).
- Zyte (formerly Scrapinghub): Offers anti-ban proxy infrastructure, managed extraction services, and structured-data APIs; strong for high-volume scraping and difficult sites, but less aimed at spreadsheet-first, agent-per-input research workflows (site, docs).