What do they actually do?
dmodel is a small research lab focused on making large models easier to inspect and steer. It publishes interpretability research and runnable demos, notably a “steering characters” example built on control vectors and a technical study of how models represent code “nullability”, with notebooks and artifacts others can reproduce (steering demo; nullability paper).
Today, teams typically read the posts, run the linked notebooks, and contact the founders for bespoke help. The company says it sells research and data rather than a hosted product; there’s no public pricing or dashboard, and deliverables take the form of research code, evaluation suites, probes, or datasets (homepage; YC listing; steering demo).
The team is roughly six people with backgrounds at places like OpenAI and Stanford, and they frame their roadmap as continuing to release interpretability work while building tools and agents to steer or align models over time (team page; homepage).
Who are their target customer(s)?
- AI research teams at startups building advanced agents: They need to identify and change the internal model mechanisms behind failures or unexpected behaviors, and often require custom experiments or datasets to test fixes (homepage; YC listing).
- ML engineers operating LLMs in production: They face intermittent, hard‑to‑reproduce failures and lack tools to trace or correct internal causes; steering and probes can offer more direct levers than prompt retries (homepage).
- AI alignment and safety researchers (academia/non‑profits): They need reproducible tests and datasets to evaluate risky concepts or hidden capabilities, and building such evaluation suites is time‑consuming; dmodel’s studies and artifacts (e.g., NullabilityEval) can be reused (nullability paper).
- Security, compliance, and audit teams at larger companies: They must demonstrate that models won’t produce harmful outputs or leak sensitive behavior but lack ways to audit internal states; bespoke probes and reports can support audits and regulatory needs (homepage).
- Product teams building consistent personas or task‑oriented agents: They need reliable role/tone/rule adherence across varied prompts, and steering vectors can be more consistent than ad‑hoc prompting (steering demo).
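The control vectors behind the steering demo are, at their simplest, directions in activation space: a vector derived from contrastive prompts and added to the residual stream at a chosen layer during inference. A minimal numpy sketch of that arithmetic (the function names and the `alpha` scale are illustrative assumptions, not dmodel’s actual API):

```python
import numpy as np

def contrastive_control_vector(acts_positive, acts_negative):
    """Derive a steering direction as the mean difference between
    activations collected on contrastive prompt sets (e.g. a target
    persona vs. neutral text). acts_*: (n_prompts, d_model)."""
    return acts_positive.mean(axis=0) - acts_negative.mean(axis=0)

def apply_control_vector(hidden_states, control_vector, alpha=4.0):
    """Add the normalized steering direction to every position's
    residual-stream activation. hidden_states: (seq_len, d_model)."""
    direction = control_vector / np.linalg.norm(control_vector)
    return hidden_states + alpha * direction
```

In a real model the addition happens inside the forward pass (typically via a hook on a transformer layer); the sketch only shows the vector arithmetic, which is why steering can be applied consistently across prompts without rewording them.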
How would they acquire their first 10, 50, and 100 customers?
- First 10: Directly recruit known teams (YC/OpenAI/Stanford networks) for fixed‑scope pilots that run their model through an existing demo or probe and deliver a short technical report to prove value fast (YC listing; steering demo).
- First 50: Convert readers of public posts into leads by releasing reproducible eval suites and offering paid half‑day workshops/office hours; follow on with paid engagements using the same probes/datasets (nullability paper; steering demo).
- First 100: Standardize repeatable work into an “audit kit” (templates, probes, scripts, deliverable checklist) and scale via partnerships with notebook providers, ML consultancies, and alignment groups, while keeping bespoke research as an upsell (homepage).
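The probes in such a kit are usually simple supervised classifiers trained on a model’s frozen activations to read a concept off its internal state (the nullability study is an example of the genre). A hedged numpy sketch of a logistic-regression probe; the function names and training setup are illustrative, not dmodel’s code:

```python
import numpy as np

def train_linear_probe(acts, labels, lr=0.1, steps=500):
    """Fit logistic-regression weights predicting a binary concept
    (e.g. 'nullable' vs. 'non-nullable') from frozen activations.
    acts: (n, d_model); labels: (n,) in {0, 1}."""
    w = np.zeros(acts.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(acts @ w + b)))  # sigmoid over logits
        grad = p - labels                          # dBCE/dlogits
        w -= lr * acts.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b

def probe_predict(acts, w, b):
    """Threshold the logit at zero to get a binary concept prediction."""
    return (acts @ w + b) > 0
```

A probe that reaches high held-out accuracy is evidence the concept is (roughly) linearly represented at that layer, which is the kind of artifact an audit report can cite.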
What is the rough total addressable market?
Top-down context:
Budgets for explainability/interpretability, MLOps/model observability, and AI governance are already in the multi‑billion‑dollar range and growing, suggesting a broad pool that could fund dmodel’s work (Explainable AI – Grand View; MLOps – Grand View; AI model risk mgmt – Grand View; AI in observability – Market.us).
Bottom-up calculation:
Starting from a de‑duplicated spend base in the low‑to‑mid billions across these categories, allocating ~1–5% to specialist research vendors and bespoke probes yields a near‑term serviceable market of roughly $100M–$600M per year for a research‑first shop (Explainable AI – Grand View; MLOps – Grand View).
Assumptions:
- Enterprise buyers allocate a small share (≈1–5%) of platform/tooling spend to specialist research, evaluation suites, and custom probes.
- Market categories overlap (explainability, observability, governance), so totals aren’t additive.
- Current go‑to‑market is research/services without a self‑serve product, which limits near‑term share.
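The bottom-up estimate above is a single multiplication; the sketch below uses placeholder inputs (assumed figures, chosen only to be consistent with the stated $100M–$600M range, not sourced data):

```python
def serviceable_market(base_spend_usd: float, specialist_share: float) -> float:
    """Annual serviceable market = de-duplicated category spend x share
    allocated to specialist research vendors and bespoke probes."""
    return base_spend_usd * specialist_share

# Placeholder assumptions, not sourced figures:
low = serviceable_market(10e9, 0.01)   # ~$10B base at a 1% allocation
high = serviceable_market(12e9, 0.05)  # ~$12B base at a 5% allocation
print(f"${low / 1e6:.0f}M-${high / 1e6:.0f}M per year")  # prints $100M-$600M per year
```

Because the categories overlap (assumption noted above), the honest move is to vary both inputs and report the resulting range rather than a point estimate.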
Who are some of their notable competitors?
- Redwood Research: Alignment research lab focused on model internals and safety. Overlaps on interpretability research but operates more as a basic‑research org than a vendor selling reproducible probes/datasets to product teams.
- Conjecture: AI safety/engineering group doing deep alignment experiments and internal tooling; public work skews long‑term alignment rather than commercial research/data engagements for product teams.
- Fiddler (Fiddler AI): Enterprise SaaS for monitoring, explainability, and audit of deployed models; competes for compliance/audit budgets with a hosted product and dashboards rather than bespoke interpretability studies.
- Truera: Model‑intelligence platform for evaluation, bias, and explainability; overlaps on governance needs but offers a packaged analytics product instead of a research‑lab service model.
- Scale AI: Data and model evaluation provider (labeling, red‑teaming, eval pipelines); customers may use Scale for evaluations instead of commissioning custom probes, though Scale focuses on scalable data ops more than deep interpretability of internals.