Every AI product team faces this decision. Do we build our own model? Buy a commercial solution? Use an API? Deploy an open-source model on our own infrastructure? The answers I've seen organizations give to this question are almost always influenced more by team identity than by actual strategic analysis.
ML teams want to build. That's what they were hired for. Business teams want to buy. That's faster and cheaper. Both instincts are wrong in roughly half the cases.
Here's the framework I use.
The 4-Quadrant Framework
The decision comes down to two axes: how differentiated does this capability need to be, and how sensitive is the underlying data. These two variables get you to a 2x2 that covers most real-world scenarios.
Quadrant 1: Low Differentiation, Low Data Sensitivity - Buy and Configure
If you need AI to do something that doesn't differentiate your product, and the data isn't sensitive, buy a configured solution and move on. Don't spend engineering cycles here.
Examples: customer support chatbots for standard queries, internal HR Q&A, meeting summarization, basic document search. These are infrastructure problems, not product problems.
When I was at Mamaearth working on digital infrastructure, we evaluated building a custom product recommendation engine. After scoping the project, we realized we needed about 70% of what a commercial solution like Dynamic Yield already offered. The 30% delta wasn't creating competitive advantage - we were a CPG brand, not a recommendation engine company. We bought, configured, and shipped in six weeks instead of eight months.
The test for this quadrant: would your customers notice if this capability came from a vendor versus from you? If the answer is no, you're in Quadrant 1.
Quadrant 2: High Differentiation, Low Data Sensitivity - Buy and Customize
This is the quadrant most enterprise AI teams should be in, and almost none of them are. The pattern here is: use a foundation model or commercial API as your base, then layer your proprietary logic, domain knowledge, and product experience on top.
OpenAI's API, Anthropic's Claude API, Google's Vertex AI - these give you frontier-model quality at a fraction of the cost of training your own model. Your job isn't to out-train OpenAI. It's to build the product layer that makes the model useful for your specific context.
This is how Stripe built their fraud detection narrative explanations. They didn't build their own LLM. They used an off-the-shelf model and built the product logic that understood what a Stripe transaction looked like, what counted as suspicious, and how to explain the decision to a non-technical merchant. The differentiation was in the product layer, not the model layer.
The test for this quadrant: does your competitive advantage come from understanding the model better than your competitors, or from understanding your domain better than your competitors? If it's the latter, buy and customize.
Quadrant 3: Low Differentiation, High Data Sensitivity - Open Source and Self-Host
This quadrant exists because of regulatory and privacy requirements. Your data can't leave your infrastructure, but you still need AI capabilities that aren't differentiating your product. The answer is open-source models deployed on your own infrastructure.
Healthcare is the canonical example. Clinical notes, patient records, insurance claims - this data has strict regulatory constraints (HIPAA in the US, GDPR in Europe). But the AI capabilities you need - document summarization, entity extraction, coding assistance - aren't differentiating. The value is in having them, not in having built them yourself.
Hugging Face's model hub, LLaMA variants, and Mistral models have made this quadrant dramatically more accessible. You can run a capable 7B or 13B model on standard cloud infrastructure for a fraction of what training a model from scratch would cost. The tradeoff is the operational overhead of managing your own model infrastructure, but for highly regulated industries that cost is worth paying.
The test for this quadrant: would your compliance or legal team block you from sending this data to a third-party API? If yes, you're in Quadrant 3.
Quadrant 4: High Differentiation, High Data Sensitivity - Build from Scratch
This is the quadrant you should be in only if you have a compelling answer to two questions: do we have proprietary data that no vendor can replicate, and is the model itself the product?
Google's search ranking model. Spotify's recommendation engine. Epic's clinical risk models trained on decades of patient outcomes from their EHR system. These companies built because their data was genuinely proprietary and the model was genuinely the product.
Most enterprise AI teams are not in this situation. They don't have unique data. Their model isn't the product - it's a component. Building from scratch for them is the equivalent of manufacturing your own servers because you need a web application.
The test for this quadrant: if a competitor got access to the same foundation model you're using, would they be able to replicate your AI capability? If yes, you don't have a real build case. You have a product and UX problem, which you should solve with product thinking, not model training.
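The four quadrant tests above reduce to a two-variable lookup. Here is a minimal sketch of that mapping; the enum names and label strings are my own shorthand, not standard terminology:

```python
from enum import Enum

class Strategy(Enum):
    BUY_AND_CONFIGURE = "buy a configured solution"          # Quadrant 1
    BUY_AND_CUSTOMIZE = "foundation model + product layer"   # Quadrant 2
    SELF_HOST_OPEN_SOURCE = "open-source model, self-hosted" # Quadrant 3
    BUILD_FROM_SCRATCH = "custom model on proprietary data"  # Quadrant 4

def recommend(differentiating: bool, data_sensitive: bool) -> Strategy:
    """Map the two axes onto the four quadrants described above."""
    if differentiating and data_sensitive:
        return Strategy.BUILD_FROM_SCRATCH
    if differentiating:
        return Strategy.BUY_AND_CUSTOMIZE
    if data_sensitive:
        return Strategy.SELF_HOST_OPEN_SOURCE
    return Strategy.BUY_AND_CONFIGURE
```

The point of writing it down this crudely is that the hard work is answering the two boolean questions honestly, not executing the lookup.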
The Mistakes I See Most Often
Treating All AI as Quadrant 4
This is the most expensive mistake. Teams default to building custom models because it feels more serious, more impressive, more like genuine AI research. The result is teams spending months training and fine-tuning models that are marginally better than a commercial API they could have been using on day one.
I've watched a fintech company spend seven months building a custom intent classification model for their support chatbot. When I asked what accuracy they were targeting, they said 92%. When I asked what GPT-4 got on their test set out of the box, they said they hadn't tried it. We ran it that afternoon. It got 89%. The seven-month project bought them 3 percentage points.
Ignoring Total Cost of Ownership
The build vs buy math is almost always wrong because teams calculate the cost of building (engineering time, infrastructure) but not the cost of owning. Models drift. Data distributions shift. You need to retrain, re-evaluate, re-deploy. Open-source self-hosted models in particular have significant ongoing operational overhead that commercial APIs don't.
When you're evaluating build vs buy, the real question isn't what does this cost to build. It's what does this cost to own for three years.
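The three-year framing is easy to make concrete. A sketch of the arithmetic, with entirely hypothetical numbers chosen only to show the shape of the comparison:

```python
def three_year_tco(build_cost: float, annual_ops: float,
                   annual_retraining: float, annual_people: float) -> float:
    """Total cost of ownership: upfront build plus three years of
    operations, retraining, and the staff needed to run it."""
    return build_cost + 3 * (annual_ops + annual_retraining + annual_people)

# Illustrative figures only -- plug in your own estimates.
custom = three_year_tco(build_cost=600_000, annual_ops=120_000,
                        annual_retraining=80_000, annual_people=400_000)
api = three_year_tco(build_cost=50_000, annual_ops=200_000,
                     annual_retraining=0, annual_people=150_000)
```

With these made-up inputs the custom path totals $2.4M against $1.1M for the API path, even though the API's per-call operating cost is higher. The term teams most often omit is `annual_people`.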
Not Revisiting the Decision
The build vs buy calculus changes as the market evolves. Two years ago, the case for building your own embedding model was much stronger. Today, OpenAI's text-embedding-ada-002 and text-embedding-3-large, along with a dozen competitive offerings, have made custom embedding models hard to justify for most use cases.
Your build vs buy decisions should be reviewed annually, or whenever there's a major shift in what commercial options are available. The decision you made in 2022 might be wrong in 2025.
A Practical Decision Process
When I'm scoping a new AI capability, I run through these questions in order:
- Can a commercial API do 80% of this today? If yes, start there and measure the gap.
- What's in the gap? Is it domain knowledge, data access, latency, privacy? The type of gap determines your path.
- What's the cost of closing the gap through customization vs building from scratch? Usually customization is 5-10x cheaper.
- What's the operational cost of each option over 3 years? Include maintenance, retraining, and the cost of the people needed to run it.
- What happens if this decision is wrong? Can you migrate from a commercial API to a custom model later? Usually yes. Can you migrate from a custom model to a commercial API? Usually yes, with some rework. The reversibility of the decision matters.
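The ordered questions above can be sketched as a toy decision helper. The 80% threshold comes from the first question and the 5x factor from the third; the gap labels and return strings are my own framing, not a prescription:

```python
def choose_path(api_coverage: float, gap: str,
                customization_cost: float, build_cost: float) -> str:
    """Walk the scoping questions in order. Thresholds are illustrative."""
    if gap == "privacy":
        # Compliance blocks third-party APIs regardless of coverage.
        return "self-hosted open-source model"
    if api_coverage >= 0.8:
        return "commercial API first; close the gap in the product layer"
    if customization_cost < build_cost / 5:
        # Customization is usually 5-10x cheaper than building.
        return "buy and customize"
    return "build, but re-check against commercial options annually"
```

Note that three of the four outcomes start from something you buy; the build branch is the last resort, and even it carries an annual review.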
The last point is important. Build vs buy is not a one-way door. Most of the time, you can start with a commercial API, learn what you actually need, and then invest in customization or custom training only for the parts where the commercial solution genuinely falls short.
Starting with the commercial option gives you real usage data to justify the custom build. Starting with the custom build gives you a lot of sunk engineering cost and no data about whether it's actually better for users.
The takeaway
The right answer to build vs buy is almost never always build or always buy. It's a portfolio decision: build where you have genuine proprietary advantage, buy where you don't. The discipline is being honest about which situation you're actually in.
Most teams overestimate how much their AI differentiation comes from the model and underestimate how much it comes from the product layer, the data strategy, and the domain expertise baked into the system. Get that calibration right, and the build vs buy decision usually answers itself.