AITF M1.10-Art01 v1.0 Reviewed 2026-04-06 Open Access
AITF · Foundations

The AI Supply Chain — From Foundation Models to Production Systems

8 min read · Article 1 of 15

This article opens Module 1.10 by mapping the canonical layers of the AI supply chain, naming the actors at each layer, and explaining why traditional vendor management frameworks — built for predictable, deterministic Information Technology (IT) supply chains — are insufficient for the probabilistic, opaque, fast-mutating supply chain that AI introduces.

Why the AI Supply Chain Is Different

Three properties distinguish the AI supply chain from the conventional software supply chain.

First, opacity. A relational database vendor can describe exactly what its software does. A foundation model provider often cannot: the model’s behaviour emerges from training, not from explicit programming. Even with full access to weights, the relationship between inputs and outputs is not formally specified. Stanford’s Center for Research on Foundation Models tracks this opacity through the Foundation Model Transparency Index (FMTI) at https://crfm.stanford.edu/fmti/, which rates major upstream providers across 100 indicators spanning data, labour, compute, methods, capabilities, risks, mitigations, and downstream use. In the initial 2023 ranking, most providers scored well below 50 percent; follow-up disclosure rounds have raised scores, but substantial transparency gaps remain.

Second, velocity. Traditional vendor relationships involve quarterly or annual product releases, with formal change-management notices. Foundation model providers ship new model versions, raise rate limits, deprecate endpoints, and adjust safety filters on weekly or even daily cadences. A control evaluated in March may not describe the model in production in May.

Third, transitivity of risk. When an enterprise embeds a SaaS feature that itself calls a foundation model that itself was fine-tuned on data licensed from a data broker that itself scraped a public source, the deploying enterprise still owes its customers a duty of care. The European Union (EU) AI Act formalises this transitivity in Article 25 (responsibilities along the AI value chain), Article 26 (deployer obligations), and Articles 53 to 55 (General-Purpose AI provider obligations), accessible at https://artificialintelligenceact.eu/. A deployer cannot disclaim accountability by pointing upstream.

The Canonical Layers

A defensible AI supply chain map names eight layers. Every production AI system can be decomposed into these.

Layer 1 — Source Data Producers

Owners and originators of the raw data that eventually trains models: web publishers, social platforms, sensor networks, governments, healthcare providers, employees, and customers. The provenance of training data is the foundation of every downstream legal and ethical claim. The Software Package Data Exchange (SPDX) standard at https://spdx.dev/ provides the canonical vocabulary for declaring data and software origins; it is increasingly extended to AI datasets through community work on dataset cards and data sheets.
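Where teams capture provenance programmatically rather than in documents, a Layer 1 declaration can be a small, versionable record. The sketch below is illustrative only: the field names echo the spirit of SPDX-style origin declarations but are assumptions, not the SPDX schema, and every value shown is a placeholder.

```python
from dataclasses import dataclass

@dataclass
class DataProvenance:
    """Illustrative origin declaration for one training-data source."""
    source_name: str        # who originated the data
    collection_method: str  # e.g. "licensed", "scraped", "user-contributed"
    license_terms: str      # "unknown" is an honest and common answer
    acquired_date: str      # ISO 8601 date the data was obtained

record = DataProvenance(
    source_name="Example Web Publisher",  # placeholder
    collection_method="licensed",
    license_terms="unknown",
    acquired_date="2026-01-15",           # placeholder
)
```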

Layer 2 — Data Aggregators and Brokers

Companies that license, curate, clean, label, deduplicate, and re-package source data for training use. These actors include image and video licensors, maintainers of web-crawl corpora such as Common Crawl, scientific dataset stewards, and human-feedback labelling firms. They sit between source producers and model trainers.

Layer 3 — Foundation Model Developers

Organizations that train large general-purpose models from raw data and significant compute — including but not limited to a small number of well-known providers. Under the EU AI Act, these actors are designated General-Purpose AI (GPAI) providers and bear specific obligations for documentation, copyright compliance, training-data summaries, and — for systemic-risk models — incident reporting and adversarial evaluation.

Layer 4 — Fine-Tuners and Specialised Model Providers

Organizations that take a foundation model and adapt it for a domain (legal, medical, financial, customer service) using proprietary data and Reinforcement Learning from Human Feedback (RLHF) or similar techniques. The fine-tuned model inherits the upstream model’s behaviours but adds new ones — and a new layer of accountability.

Layer 5 — Hosting and Inference Infrastructure

Cloud providers, dedicated inference platforms, and on-premises hardware vendors that serve model predictions. The Cloud Security Alliance at https://cloudsecurityalliance.org/ publishes guidance on shared-responsibility models for AI workloads, including the AI Controls Matrix, which extends classic Cloud Controls Matrix categories with AI-specific requirements.

Layer 6 — Application Builders and System Integrators

Software vendors that wrap models in user-facing applications: enterprise SaaS firms, agent builders, plug-in developers, and internal platform teams. Their work introduces orchestration logic, prompt templates, retrieval pipelines, function-calling, and guardrails that materially shape the system the user experiences.

Layer 7 — Distribution Platforms and Marketplaces

App stores, plug-in registries, model hubs (such as Hugging Face), and partner directories that catalogue and distribute AI components. The Hugging Face documentation on Safetensors at https://huggingface.co/docs/safetensors illustrates how distribution platforms now ship model weights in a format that cannot execute arbitrary code when loaded, a control that did not exist in the earlier pickle-based model-distribution era.
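Independent of the weight format, a deployer can pin every artefact it pulls from a distribution platform to a digest recorded at intake. A minimal standard-library sketch follows; the file name and expected digest are placeholders.

```python
import hashlib
from pathlib import Path

def sha256_of(path: Path) -> str:
    """Compute the SHA-256 digest of a file, streaming to bound memory use."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Placeholder values: record the digest at intake, verify before every load.
EXPECTED_DIGEST = "0" * 64
artifact = Path("model.safetensors")
if sha256_of(artifact) != EXPECTED_DIGEST:
    raise RuntimeError(f"{artifact} does not match the digest recorded at intake")
```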

Layer 8 — Deployers and End-User Operators

The organization that puts the system in front of staff or customers. Under the EU AI Act, this is the actor with the broadest set of operational obligations, even when most of the system was procured.
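Teams that keep their system inventories in code can make the eight layers a first-class vocabulary that every record references. The sketch below is a minimal illustration; the enum and its labels are this article's taxonomy, not an external standard.

```python
from enum import IntEnum

class SupplyChainLayer(IntEnum):
    """The eight canonical layers of the AI supply chain."""
    SOURCE_DATA = 1          # Source data producers
    DATA_AGGREGATION = 2     # Data aggregators and brokers
    FOUNDATION_MODEL = 3     # Foundation model developers
    FINE_TUNING = 4          # Fine-tuners and specialised model providers
    HOSTING_INFERENCE = 5    # Hosting and inference infrastructure
    APPLICATION = 6          # Application builders and system integrators
    DISTRIBUTION = 7         # Distribution platforms and marketplaces
    DEPLOYMENT = 8           # Deployers and end-user operators

# A production system is mapped by naming the responsible actor at each layer.
system_map = {layer: "unknown" for layer in SupplyChainLayer}
system_map[SupplyChainLayer.DEPLOYMENT] = "our organisation"
```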

Why Most Programs Cover Only One Layer

Most enterprise AI governance programs cover Layer 8 — their own deployments — and perhaps Layer 6 for systems they integrate. Layers 1 through 5 are typically invisible. Three reasons explain this gap.

The first reason is historical: vendor management evolved to assess single-counterparty IT contracts, not multi-tier model supply chains. The second is contractual: most upstream providers do not give downstream deployers the audit rights, documentation, or notification commitments needed to govern them properly. The third is technical: discovering which models, datasets, and components are present in a procured system is genuinely hard. The U.S. Cybersecurity and Infrastructure Security Agency (CISA) Software Bill of Materials (SBOM) programme at https://www.cisa.gov/sbom has begun extending SBOM concepts to AI through the AI Bill of Materials (AI-BOM) and Model Bill of Materials (MBOM) constructs, but adoption is uneven.
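To make the AI-BOM idea concrete, the sketch below shows what one component record might carry. The field names are illustrative assumptions and do not reproduce the schema of any published SBOM, AI-BOM, or MBOM specification.

```python
from dataclasses import dataclass, field

@dataclass
class AIBOMEntry:
    """One component in an AI bill of materials. Field names are illustrative."""
    component_type: str                # e.g. "foundation_model", "dataset", "vector_store"
    name: str
    version: str                       # "unknown" when the supplier does not disclose it
    supplier: str
    supply_chain_layer: int            # 1 to 8, per the canonical layer map
    provenance_evidence: list[str] = field(default_factory=list)  # model cards, data sheets

# A deliberately honest record: "unknown" is recorded, not hidden.
entry = AIBOMEntry(
    component_type="foundation_model",
    name="example-model",              # placeholder, not a real product
    version="unknown",
    supplier="Example Provider Ltd",   # placeholder
    supply_chain_layer=3,
    provenance_evidence=["training-data summary not disclosed by supplier"],
)
```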

What Standards and Regulators Now Expect

Three normative anchors set the floor for AI supply-chain governance.

National Institute of Standards and Technology (NIST) AI Risk Management Framework (AI RMF) GOVERN-6 at https://www.nist.gov/itl/ai-risk-management-framework requires that organizations establish “policies and procedures to address AI risks and benefits arising from third-party software and data and other supply chain issues.” This is a direct anchor for every program described in this module.

International Organization for Standardization / International Electrotechnical Commission (ISO/IEC) 42001:2023 at https://www.iso.org/standard/81230.html introduces an AI Management System standard with Annex A.10 dedicated to third-party relationships, supplier obligations, and lifecycle accountability across procured AI components.

NIST Special Publication (SP) 800-161 Revision 1 at https://csrc.nist.gov/pubs/sp/800/161/r1/final extends classical Cybersecurity Supply Chain Risk Management (C-SCRM) practices to software and increasingly to model artefacts. Combined with the Supply-chain Levels for Software Artifacts (SLSA) framework at https://slsa.dev/, these references define what defensible build, attestation, and provenance look like for AI components.

Maturity Indicators

What each maturity level looks like for AI supply chain mapping:

Foundational (1): The organization cannot name the foundation models, vector stores, or third-party AI features inside its top five business systems.

Developing (2): A spreadsheet inventory exists for known third-party AI tools; foundation models are listed by name but not by version, region, or training-data lineage.

Defined (3): A canonical AI-BOM is required for every system that scores above a defined risk threshold; layers 4 through 8 are populated; layer 1 to 3 entries are populated when contractually available.

Advanced (4): The AI-BOM is generated from build pipelines and refreshed automatically; sub-processor and model-version changes trigger re-evaluation; gaps in upstream transparency are explicitly tracked as residual risks.

Transformational (5): The organization contributes to industry AI-BOM and model-card standards; suppliers compete on transparency disclosures; the supply-chain map informs board-level risk reporting.
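The Advanced level assumes that a model-version change triggers re-evaluation automatically. A minimal sketch of that trigger follows; fetch_live_model_version is a hypothetical placeholder, since how the served version is discovered depends entirely on the provider's API.

```python
def fetch_live_model_version(endpoint: str) -> str:
    """Hypothetical placeholder: ask the serving endpoint which model version
    it is running. The real lookup depends entirely on the provider's API."""
    raise NotImplementedError

def needs_reevaluation(bom_version: str, endpoint: str) -> bool:
    """Flag a system for re-evaluation when the served model no longer
    matches the version recorded in its AI-BOM entry."""
    return fetch_live_model_version(endpoint) != bom_version
```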

Practical Application

A mid-sized financial institution introducing a generative-AI customer-service assistant should begin Module 1.10 by drawing its actual supply chain on a single page. Layer 6 is the SaaS vendor of the assistant. Layer 5 is the cloud region the vendor uses. Layer 4 is the fine-tuned model the vendor licenses. Layer 3 is the foundation model behind that fine-tune. Layer 2 is the corpus the foundation model was trained on. Layer 1 is the underlying source data, often impossible to enumerate. For each layer, three questions must be answered: who is the responsible legal entity, what contractual rights does the deploying institution have, and what evidence exists of competent governance at that layer. The answers, even when many cells say “unknown,” become the calibration baseline for everything else this module covers.
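That one-page exercise translates directly into a structure a governance team can keep under version control. A minimal sketch, using the same financial-institution example and marking every unanswered cell explicitly (the populated values are placeholders):

```python
QUESTIONS = ("responsible_entity", "contractual_rights", "governance_evidence")

# Layers 1 to 8 for the customer-service assistant; every cell starts as
# "unknown" and is overwritten only when evidence actually exists.
supply_chain_map = {layer: dict.fromkeys(QUESTIONS, "unknown") for layer in range(1, 9)}

supply_chain_map[6]["responsible_entity"] = "SaaS assistant vendor"      # placeholder
supply_chain_map[8]["responsible_entity"] = "the deploying institution"
supply_chain_map[8]["contractual_rights"] = "full (own deployment)"

unknown = sum(v == "unknown" for cells in supply_chain_map.values() for v in cells.values())
print(f"{unknown} of 24 cells remain unknown: the calibration baseline")
```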

The remaining 14 articles in Module 1.10 build on this map: vendor due diligence (Article 3), contracting patterns (Article 4), AI-BOM mechanics (Article 6), data provenance (Article 7), continuous monitoring (Article 10), incident response (Article 14), and tiered risk programs (Article 15). All of them assume that an organization knows what its supply chain actually is.