In 2023, I reviewed a pitch deck from an AI startup that had built a contract analysis tool. Their differentiation claim: a proprietary transformer model trained on 2 million legal contracts, achieving 94% accuracy on clause identification. Their competitors were at 89%.
Six months later, Claude and GPT-4 were both achieving 92%+ on similar tasks out of the box. The 5-point accuracy advantage had evaporated. The startup had to completely reframe its value proposition.
This story is not unusual. It is, in fact, the modal outcome for AI products that stake their defensibility on model performance. And yet the proprietary model pitch persists because it's concrete, measurable, and technically impressive. It's just not a moat.
Why Model Accuracy Is Not a Moat
A moat, in competitive strategy, is a structural advantage that's costly or impossible for competitors to replicate. The five classic moats are: switching costs, network effects, cost advantages, intangible assets (brands, patents), and efficient scale.
Model accuracy is none of these. It's a performance metric that reflects the current state of your training data, architecture choices, and compute investment. It can be matched by:
- A competitor with the same training data and more compute
- A foundation model provider who improves their base model
- A competitor who fine-tunes a foundation model on the same domain
- You yourself, six months from now - by which point a competitor has already caught up
The fundamental problem is that model performance is increasingly determined by factors that aren't proprietary: compute scale, transformer architectures, and training techniques that get published in papers and replicated within months. The ceiling for what any well-funded team can achieve keeps rising. Your 94% is someone else's 96% twelve months from now.
What Actually Creates Defensibility
Data Flywheels - But Only When They Work
I covered this in more detail in the data strategy post, but the short version is: a data flywheel that genuinely gets better with use - where user feedback is captured, converted to training signal, and reflected in model improvement on a fast loop - is a real moat. The key word is genuinely. Most claimed data flywheels are data collection processes dressed in flywheel language.
Duolingo has a real data flywheel. Every learner interaction - correct answers, mistakes, time-on-task, skip patterns - feeds back into their model. Their language learning models are meaningfully better today than they were three years ago because of that accumulated interaction data. A new entrant can build a comparable model from public data, but they can't replicate three years of feedback from half a billion users.
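To make the "captured, converted to training signal" step concrete, here's a minimal sketch of what the conversion stage of such a flywheel might look like. The event schema, field names, and weighting heuristic are all hypothetical illustrations, not Duolingo's actual pipeline:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class InteractionEvent:
    """Hypothetical schema: one record per learner interaction."""
    item_id: str
    user_answer: str
    correct_answer: str
    time_on_task_s: float
    skipped: bool

def to_training_example(event: InteractionEvent) -> Optional[dict]:
    """Convert one raw interaction into a supervised training signal.
    Skipped items carry no label, so they are dropped."""
    if event.skipped:
        return None
    return {
        "input": event.item_id,
        "label": int(event.user_answer == event.correct_answer),
        # Illustrative heuristic: weight fast answers less, since a
        # quick correct answer suggests the item was too easy to be
        # informative for this learner.
        "weight": 1.0 / max(event.time_on_task_s, 1.0),
    }

def build_training_batch(events: list) -> list:
    """The 'fast loop' stage: run on a schedule so every retraining
    cycle sees the latest user feedback."""
    return [ex for ex in map(to_training_example, events) if ex is not None]
```

The moat isn't in this code - any team can write it in a day. It's in the volume and history of events flowing through it, which a new entrant cannot backfill.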
Workflow Integration and Switching Costs
This is the most underrated moat in enterprise AI. Once your AI product is embedded in a team's daily workflow - integrated with their tools, trained on their data, configured for their specific processes - the cost of switching isn't just the cost of buying a new product. It's the cost of re-integrating, re-training, and losing the accumulated customization.
Epic's clinical AI capabilities have this moat by default. They're not necessarily the most technically sophisticated - in some cases they're mediocre. But they're embedded in the EHR workflow that hospitals have spent years configuring. Replacing Epic's AI with a best-of-breed external solution means building and maintaining integrations that introduce latency, compliance risk, and operational complexity. For most health systems, the switching cost exceeds the performance gain.
This is why enterprise AI companies should be thinking about workflow depth, not model accuracy. Every additional step your product handles in the user's workflow is another switching cost. Every integration you own is a barrier your competitor has to replicate. Accuracy is table stakes. Workflow embeddedness is the moat.
Proprietary Data That Cannot Be Purchased or Scraped
Not all proprietary data creates a moat - I made this point in the data strategy post. But certain categories of proprietary data genuinely cannot be replicated:
- Outcomes data from closed systems (health outcomes, financial returns, legal case outcomes) that are never published
- Real-time behavioral data from a platform with network effects (transaction data, search data, communication patterns)
- Expert-curated labeled data from a specialized domain where the labelers are scarce
If you have this kind of data and you're training models on it, the model performance advantage is real and sustainable - not because the model architecture is better, but because the training data is irreplaceable.
Institutional Knowledge Encoded in Prompts and Pipelines
This one is newer and underappreciated. As AI products mature, the value increasingly lives in the system design - the prompting strategies, the retrieval architectures, the fallback behaviors, the evaluation frameworks - rather than in the model itself. This institutional knowledge takes months to develop and test. It encodes hard-won lessons about failure modes, edge cases, and domain-specific quirks.
A competitor who buys the same foundation model still needs to replicate this institutional knowledge. That's months of work, not days. And by the time they've replicated it, you've moved further ahead.
Anthropic's Constitutional AI methodology, the custom system prompts that enterprise companies have built for their specific workflows, the evaluation suites that catch regressions - these are forms of institutional knowledge that create a genuine lag advantage, even if not a permanent one.
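A minimal sketch of what one of those regression-catching evaluation suites might look like. The golden cases, their assertions, and the pass-rate gate are hypothetical stand-ins - in practice each case encodes a hard-won lesson about a specific failure mode:

```python
# Hypothetical golden cases: each one exists because the system once
# failed on an input like it. The suite IS the institutional knowledge.
GOLDEN_CASES = [
    {"prompt": "Summarize the liability clause in section 4.",
     "must_contain": "liability"},
    {"prompt": "Extract the effective date: 'effective 2021-03-01'.",
     "must_contain": "2021-03-01"},
]

def run_regression_suite(model_fn, cases=GOLDEN_CASES, min_pass_rate=1.0):
    """Run every golden case through the model; block the deploy
    (ok=False) if the pass rate drops below the threshold."""
    passed = sum(1 for c in cases if c["must_contain"] in model_fn(c["prompt"]))
    rate = passed / len(cases)
    return {"pass_rate": rate, "ok": rate >= min_pass_rate}
```

A competitor buying the same foundation model starts this suite from zero cases; every case you've accumulated is a failure they haven't discovered yet.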
Domain Expertise That Shapes the Product
The deepest moat of all is domain expertise that's baked into every layer of the product: the problem framing, the failure mode handling, the evaluation criteria, the user interface, the trust calibration. This is hard to articulate and difficult to measure directly, but it shows up in reliability and user trust over time.
The clinical AI products that get used are the ones where the failure modes reflect clinical understanding - where the system knows what it doesn't know, where it defers to human judgment on specific categories of cases, where the confidence calibration matches clinical intuition. Building this requires deep domain expertise embedded in the product team, not just in the model.
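A toy sketch of what "knowing what it doesn't know" can look like in code: a routing policy with a confidence threshold plus categories that always defer to a clinician, regardless of confidence. The threshold, category names, and function shape are hypothetical; the clinical judgment about which categories belong in the always-defer set is exactly the domain expertise that can't be purchased:

```python
# Illustrative always-defer set: categories the system should never
# decide alone, no matter how confident the model is.
ALWAYS_DEFER = frozenset({"dosing", "diagnosis"})

def route_prediction(label: str, confidence: float,
                     defer_threshold: float = 0.85):
    """Return ('auto', ...) only when the category is safe to automate
    AND the model's confidence clears the threshold; otherwise defer
    to human judgment."""
    if label in ALWAYS_DEFER or confidence < defer_threshold:
        return ("defer_to_clinician", label, confidence)
    return ("auto", label, confidence)
```

The code is trivial; deciding that 0.85 is the right threshold for one category and that another category should never be automated is where years of clinical experience get encoded.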
A new entrant with a technically superior model but no clinical domain expertise will build a tool that clinicians don't trust, even if the accuracy numbers are better. Domain expertise is a moat because it takes years to accumulate and can't be purchased or scraped.
The Practical Implication
If you're building an AI product, ask yourself: in two years, after foundation models have improved another two generations, what still differentiates us? If your answer involves model accuracy, you have a problem. If your answer involves workflow integration, proprietary data, institutional knowledge, or domain expertise - you're building a moat.
The shift in framing is: stop thinking about AI as the product and start thinking about AI as the engine. The product is the workflow transformation. The product is the data flywheel. The product is the domain-specific reliability that users trust. The AI model is infrastructure - valuable, necessary, but not differentiated.
This reframing changes your investment priorities. Less time optimizing model performance. More time deepening workflow integration. More time building labeling pipelines that capture user feedback. More time encoding domain expertise into your evaluation framework.
These are less glamorous investments than training a new model. They're also much harder to replicate.