AITF M1.24-Art04 · v1.0 · Reviewed 2026-04-06 · Open Access
AITF · Foundations

AI Vendor Lock-In: Causes and Mitigations


7 min read · Article 4 of 4

This article describes the principal sources of AI lock-in, the architectural and procedural mitigations that preserve future optionality, and the disciplined trade-offs an organisation must accept to avoid lock-in becoming the dominant strategic constraint.

The Sources of AI Lock-In

Embedding Format Lock-In

Embeddings produced by one provider’s model are not interchangeable with embeddings from a different provider’s model, even when both produce vectors of the same dimensionality. A vector store populated with OpenAI text-embedding-3-large embeddings cannot be meaningfully queried with a query vector produced by Cohere’s embed-english-v3. Switching providers requires re-embedding the entire corpus, which can be both slow (hours to weeks for large corpora) and expensive.
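A minimal sketch of the guard this implies, in Python: the index records which model produced its vectors, and a query built with a different model is rejected rather than answered with meaningless similarity scores. The class and field names are illustrative, not drawn from any particular vector store.

```python
from dataclasses import dataclass

@dataclass
class EmbeddingIndex:
    """A vector index tagged with the embedding model that produced it."""
    model_id: str                      # e.g. "openai/text-embedding-3-large"
    dimension: int
    vectors: dict[str, list[float]]    # document id -> embedding

def query(index: EmbeddingIndex, query_vector: list[float], query_model_id: str) -> None:
    # Vectors from different embedding models live in unrelated spaces, even
    # when dimensionality happens to match, so refuse cross-model queries.
    if query_model_id != index.model_id:
        raise ValueError(
            f"Index was built with {index.model_id}; cannot query with "
            f"{query_model_id}. Re-embed the corpus before switching providers."
        )
    if len(query_vector) != index.dimension:
        raise ValueError("Query vector dimensionality does not match the index.")
    # nearest-neighbour search over index.vectors would go here
```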

The U.S. National Institute of Standards and Technology AI RMF Generative AI Profile at https://airc.nist.gov/AI_RMF_Knowledge_Base/Playbook/GenAI_Profile discusses dependence on specific model providers as an explicit governance consideration; embeddings are the most operationally sticky form of that dependence.

Fine-Tuning Lock-In

A model fine-tuned on a specific provider’s base model cannot be transplanted to a different provider. The fine-tuned weights are bound to the provider’s tokeniser, architecture, and training framework. Switching requires re-fine-tuning on the new provider’s base model, with no guarantee of comparable results.

Prompt Pattern Lock-In

System prompts, retrieval templates, and tool definitions developed and tuned against one model often perform poorly on another. The patterns arrived at through extensive iteration are frequently specific to subtle behaviours of the original model. Migration therefore requires not just rewriting the prompts but re-evaluating downstream behaviour, which is often a months-long process.

Tooling and Platform Lock-In

ML platforms (SageMaker, Vertex AI, Azure Machine Learning) bundle data storage, model training, experiment tracking, and serving in tightly integrated ways. Migration to a different platform requires re-implementing pipelines, retraining models, and rebuilding operational tooling.

Data Format Lock-In

Some platforms store training data, model artefacts, and metadata in proprietary formats that are not portable. Migration requires export, transformation, and validation — sometimes infeasible at production scale.

Operational and Skill Lock-In

Over time, a team builds deep expertise in one vendor’s tooling, debugging patterns, and operational quirks. Switching imposes a re-learning curve that reduces delivery velocity for months.

Commercial Lock-In

Volume-discount contracts, prepaid credits, and committed-spend agreements make leaving expensive. The commercial structure can be more binding than the technical structure.

The Cost of Lock-In

Lock-in becomes a strategic problem when one of three conditions arises.

Performance gap. A competitor releases a model that materially outperforms the incumbent for the use case, and switching cost prevents capturing the value.

Pricing power. The incumbent raises prices, knowing the customer cannot easily leave.

Risk concentration. The incumbent experiences an outage, a security incident, a regulatory action, or commercial distress, and the customer has no operational alternative.

The European Union AI Act recital 105 at https://artificialintelligenceact.eu/recital/105/ acknowledges the systemic risk of dependence on a small number of general-purpose AI providers, particularly for downstream high-risk systems.

Architectural Mitigations

Several architectural patterns reduce lock-in at the cost of some short-term efficiency.

Model-Agnostic Inference Layer

Wrap every external model call in an internal abstraction (a “model gateway”) that exposes a stable interface and routes to whichever provider is current. Tools such as LiteLLM, Portkey, and OpenRouter provide reference implementations. The gateway enables switching providers without touching application code, and it enables A/B testing across providers to measure relative quality.
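The pattern can be sketched in a few lines of Python. The provider adapters below are placeholders standing in for vendor SDK calls or a library such as LiteLLM; only the shape of the abstraction matters here.

```python
from typing import Callable

# Hypothetical provider adapters; in practice each would wrap a vendor SDK
# or a gateway library behind the same signature.
def _call_provider_a(prompt: str, **kwargs) -> str:
    raise NotImplementedError("wrap provider A's SDK here")

def _call_provider_b(prompt: str, **kwargs) -> str:
    raise NotImplementedError("wrap provider B's SDK here")

_PROVIDERS: dict[str, Callable[..., str]] = {
    "provider-a": _call_provider_a,
    "provider-b": _call_provider_b,
}

class ModelGateway:
    """Stable internal interface; application code never imports a vendor SDK."""

    def __init__(self, default_provider: str = "provider-a"):
        self.default_provider = default_provider

    def complete(self, prompt: str, provider: str | None = None, **kwargs) -> str:
        # Switching providers is a configuration change, not an application change.
        handler = _PROVIDERS[provider or self.default_provider]
        return handler(prompt, **kwargs)
```

Changing the default provider, or running a silent A/B comparison, then touches configuration rather than application code.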

Multi-Provider Inference Routing

Route different request classes to different providers based on cost, performance, and capability. Even if any single provider could handle all requests, routing creates the operational muscle memory and the live evaluation data needed for fast switching.
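As an illustration, the routing policy can be as simple as a lookup from request class to provider, layered on top of the gateway sketched above; the request classes and provider assignments here are assumptions, not recommendations.

```python
# Illustrative routing policy; request classes and provider assignments are
# placeholders for the program's own cost/performance analysis.
ROUTING_POLICY = {
    "bulk-summarisation":   "provider-b",   # latency-tolerant, lowest cost per token
    "customer-facing-chat": "provider-a",   # latency- and quality-sensitive
    "internal-drafting":    "provider-b",
}

def route(request_class: str, default_provider: str = "provider-a") -> str:
    """Return the provider a given request class should be served by."""
    return ROUTING_POLICY.get(request_class, default_provider)

# Used with the gateway above:
#   gateway.complete(prompt, provider=route("bulk-summarisation"))
```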

Embedding Indirection

Store source content with provider-agnostic identifiers. Maintain embedding indexes per provider, with a re-embedding pipeline that can rebuild any index when the provider changes. The cost is double or triple storage; the benefit is the ability to switch embeddings without rebuilding the corpus identification scheme.
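A minimal sketch of the indirection, assuming an in-memory corpus and a stand-in embedding function; a real pipeline would batch provider calls and persist the indexes, but the key property is that document identifiers never change when the embedding provider does.

```python
from typing import Callable

# Source of truth: content keyed by provider-agnostic document identifiers.
DOCUMENTS: dict[str, str] = {
    "doc-0001": "Quarterly risk report ...",
    "doc-0002": "Incident postmortem ...",
}

# One vector index per embedding provider, keyed by the same identifiers.
INDEXES: dict[str, dict[str, list[float]]] = {}

def rebuild_index(provider_id: str, embed: Callable[[str], list[float]]) -> None:
    """Re-embed the whole corpus for one provider; document IDs never change."""
    INDEXES[provider_id] = {doc_id: embed(text) for doc_id, text in DOCUMENTS.items()}

# Stand-in embedding function purely for illustration.
rebuild_index("provider-a/model-x", lambda text: [float(len(text)), 0.0, 0.0])
```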

Open-Source Foundation Model Capability

Maintain at least minimal capability to deploy and operate an open-weights foundation model (Llama, Mistral, Qwen). Even if the open model is not the production choice today, having the capability constrains the closed-model providers’ pricing leverage.
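Keeping that capability warm does not require much code. A minimal sketch using the Hugging Face transformers library, assuming a machine with enough memory or GPU to hold an open-weights checkpoint; the model identifier is illustrative and any validated open-weights model would serve.

```python
# Requires: pip install transformers torch
from transformers import pipeline

# Illustrative open-weights checkpoint; substitute whichever model the
# organisation has validated for self-hosted use.
MODEL_ID = "mistralai/Mistral-7B-Instruct-v0.2"

generator = pipeline("text-generation", model=MODEL_ID)
print(generator("Summarise our vendor exit plan in one sentence.", max_new_tokens=64))
```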

Standard Format Adoption

Where standards exist, prefer them: ONNX for model interchange, MLflow flavours for experiment tracking, OpenLineage for data lineage, OpenTelemetry for observability. Standard formats may underperform proprietary alternatives marginally; the optionality they preserve usually justifies the trade.
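For model interchange specifically, the export step is usually small. A minimal sketch of exporting a PyTorch model to ONNX, assuming torch is installed; the tiny model exists only to make the example runnable.

```python
import torch
import torch.nn as nn

# Illustrative model; in practice this would be the trained production model.
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Linear(8, 2))
model.eval()

dummy_input = torch.randn(1, 16)  # example input that defines the exported graph's shapes
torch.onnx.export(model, dummy_input, "model.onnx",
                  input_names=["features"], output_names=["logits"])
# The resulting model.onnx can be served by any ONNX-compatible runtime,
# independent of the training framework or platform.
```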

The Linux Foundation AI & Data umbrella at https://lfaidata.foundation/ catalogues the open-source projects that constitute this standards layer.

Procedural Mitigations

Architecture alone is insufficient; procedural discipline is what keeps lock-in from re-accumulating around even a well-designed system.

Multi-Vendor Evaluation Cadence

At least annually, the program evaluates the current production providers against viable alternatives on a defined benchmark. The evaluation is published to the AI governance committee. The discipline forces the program to maintain familiarity with the alternative ecosystem rather than letting the incumbent’s roadmap define the world.
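The harness for such a cadence can be small. A minimal sketch assuming each provider is exposed as a callable and each benchmark item carries a use-case-specific check; both are assumptions, and a real evaluation would use richer scoring than a pass rate.

```python
from typing import Callable

# Each item pairs a prompt with a use-case-specific check; real benchmarks
# would use graded rubrics or reference answers rather than boolean checks.
BENCHMARK = [
    {"prompt": "Summarise the attached policy in three bullet points.",
     "check": lambda output: output.count("\n") >= 2},
    {"prompt": "Extract the renewal date from this contract clause: ...",
     "check": lambda output: len(output.strip()) > 0},
]

def evaluate(providers: dict[str, Callable[[str], str]]) -> dict[str, float]:
    """Return each provider's pass rate on the shared benchmark."""
    return {
        name: sum(item["check"](call(item["prompt"])) for item in BENCHMARK) / len(BENCHMARK)
        for name, call in providers.items()
    }
```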

Switching-Cost Estimation

Each major vendor relationship has a documented switching cost estimate, refreshed semi-annually. The estimate covers technical migration effort, embedding re-generation, prompt re-tuning, evaluation re-running, and commercial unwind. The number itself is less important than the visibility — leadership making investment decisions should see how much optionality each decision is consuming.
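The estimate can be kept as a simple, versioned artefact. A minimal sketch with placeholder figures; every number below is illustrative, not a benchmark, and should be replaced by the program's own estimates.

```python
# Placeholder switching-cost estimate for one vendor relationship.
SWITCHING_COST_COMPONENTS = {
    "technical_migration_effort": 120_000,
    "embedding_regeneration":      35_000,
    "prompt_retuning":             40_000,
    "evaluation_rerun":            15_000,
    "commercial_unwind":           60_000,
}

total = sum(SWITCHING_COST_COMPONENTS.values())
print(f"Estimated switching cost: {total:,}")   # refreshed semi-annually
```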

Contract Term Management

AI vendor contracts should include data and model portability clauses, exit-assistance commitments, and reasonable termination provisions. The European Union Cloud Code of Conduct at https://eucoc.cloud/en/home and adjacent industry frameworks provide contractual language that can be adapted to AI procurement.

Capability Inventory

A central register lists every vendor-supplied capability the program depends on, together with named alternative providers and an estimated switching cost. Gaps, meaning capabilities with no viable alternative, are highlighted as strategic risks.
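A minimal sketch of one register entry, with fields mirroring the description above; the structure and example entries are illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Capability:
    name: str
    current_vendor: str
    alternatives: list[str] = field(default_factory=list)
    estimated_switching_cost: int = 0   # in the organisation's reporting currency

    @property
    def is_strategic_risk(self) -> bool:
        # A capability with no viable alternative is a strategic risk.
        return not self.alternatives

REGISTER = [
    Capability("text embedding", "provider-a",
               ["provider-b", "open-weights self-host"], 35_000),
    Capability("speech-to-text", "provider-c"),   # gap: no alternative identified
]

strategic_risks = [c.name for c in REGISTER if c.is_strategic_risk]
```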

Pilot the Alternative

Periodically run a real workload — not just a benchmark — on an alternative provider. The exercise surfaces operational realities that pure evaluation misses.

The Trade-Off

Lock-in mitigation has costs. Abstraction layers add latency. Multi-provider routing adds operational complexity. Open-source self-hosting requires platform engineering investment. Standard formats may underperform proprietary ones.

Programs must choose deliberately how much optionality to buy. The right level depends on:

  • The materiality of AI to the business strategy.
  • The stability of the vendor ecosystem.
  • The pace of model capability change.
  • The regulatory environment.
  • The organisation’s risk appetite.

Highly regulated programs in financial services and healthcare typically invest heavily in optionality, accepting the operational cost. Less-regulated programs may rationally accept more lock-in for faster delivery.

Specific Recommendations by Layer

Foundation models. Maintain at least two production providers with live traffic, even if 90 percent goes to the primary. The Stanford Foundation Model Transparency Index at https://crfm.stanford.edu/fmti/ supports comparative evaluation.

Embeddings. Treat as a strategic decision; switching cost is high. Audit annually; switch only with a full re-embedding plan.

Vector stores. Prefer providers that support open API standards or self-hosted equivalents (Postgres pgvector, Qdrant, Weaviate self-hosted).

ML platforms. Build the application layer to platform-agnostic standards (containerised serving, model registry abstraction). Accept some efficiency loss.

Data storage. Use open table formats (Delta Lake, Apache Iceberg, Apache Hudi) rather than proprietary warehouse-internal formats.
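As an illustration of the last point, a minimal sketch of writing and reading a table with the delta-rs Python bindings (the deltalake package); the path and data are placeholders, and Apache Iceberg or Hudi would serve the same purpose.

```python
# Requires: pip install deltalake pandas
import pandas as pd
from deltalake import DeltaTable, write_deltalake

df = pd.DataFrame({"use_case_id": ["uc-001", "uc-002"],
                   "monthly_tokens": [1_200_000, 340_000]})

# Write to an open table format rather than a warehouse-internal one.
write_deltalake("./ai_usage_delta", df, mode="overwrite")

# Any engine that speaks Delta Lake can read it back; no single vendor owns the data.
print(DeltaTable("./ai_usage_delta").to_pandas())
```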

Looking Forward

Module 1.24 closes here. The next module turns to AI FinOps — the financial engineering discipline that connects the cost allocation work of this module to the per-decision economics that determine which AI investments survive.


© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.