This article describes the trigger conditions for decommissioning, the procedural elements that distinguish decommissioning from sunset-by-attrition, and the long-tail obligations that survive an AI system’s removal from production.
Triggers
Decommissioning should be triggered by a defined event, not by the gradual fade of a system into disuse. Common triggers include:
- End of business use case: the underlying business need has been met, transferred to a different system, or eliminated.
- Replacement: a new model or system has been validated and is taking over the workload.
- Performance failure: the system has degraded beyond its acceptable operating range and remediation is not economic.
- Risk accumulation: regulatory, ethical, or technical risk has crossed a threshold that no longer justifies operation. The European Union AI Act Article 18 at https://artificialintelligenceact.eu/article/18/ on serious incidents and post-market monitoring contemplates explicit retirement when corrective action is not feasible.
- Vendor exit: a third-party AI dependency has been deprecated or the vendor is no longer commercially viable.
- Strategy shift: an organisational decision to consolidate, divest, or exit the relevant business line.
Each trigger should automatically open a decommissioning case in the AI governance platform, with the procedural workflow attached.
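As a sketch, the trigger-to-case handoff might look like the following. The `Trigger` enum, `open_case` function, and workflow step names are illustrative, not part of any named governance platform; the point is that every trigger maps to a case record with the full procedural sequence (Steps 1–7 below) attached from the outset.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum

class Trigger(Enum):
    END_OF_USE_CASE = "end_of_business_use_case"
    REPLACEMENT = "replacement"
    PERFORMANCE_FAILURE = "performance_failure"
    RISK_ACCUMULATION = "risk_accumulation"
    VENDOR_EXIT = "vendor_exit"
    STRATEGY_SHIFT = "strategy_shift"

# The procedural workflow attached to every case (Steps 1-7 below).
WORKFLOW_STEPS = [
    "operational_freeze",
    "obligation_closure",
    "data_export_and_archival",
    "integration_decommissioning",
    "access_revocation",
    "infrastructure_deprovisioning",
    "final_attestation",
]

@dataclass
class DecommissioningCase:
    system_id: str
    trigger: Trigger
    opened_at: str
    # Every step starts pending; the final attestation checks all are closed.
    steps: dict = field(default_factory=lambda: {s: "pending" for s in WORKFLOW_STEPS})

def open_case(system_id: str, trigger: Trigger) -> DecommissioningCase:
    """Open a decommissioning case with the full workflow attached."""
    return DecommissioningCase(
        system_id=system_id,
        trigger=trigger,
        opened_at=datetime.now(timezone.utc).isoformat(),
    )
```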
Pre-Decommissioning Decisions
Before procedural execution, several decisions must be made and documented.
Replacement strategy. If the system is being replaced, the cutover plan, the parallel-run window, the success criteria, and the rollback path must be defined. The Office of Management and Budget Memorandum M-24-10 on AI use in U.S. federal agencies at https://www.whitehouse.gov/wp-content/uploads/2024/03/M-24-10-Advancing-Governance-Innovation-and-Risk-Management-for-Agency-Use-of-Artificial-Intelligence.pdf provides language on managing replacement of safety-impacting systems.
Affected stakeholder notification. Customers, employees, partners, and downstream system owners need notice with sufficient lead time. The notice content depends on the materiality and visibility of the system but typically includes the date of withdrawal, the replacement (if any), and the channel for raising concerns.
Regulatory notification. Some sectors require advance notice to regulators. Healthcare AI systems regulated by the U.S. Food and Drug Administration may require change notifications under Software as a Medical Device (SaMD) rules; the FDA discussion paper on AI/ML-Based Software as a Medical Device at https://www.fda.gov/files/medical%20devices/published/US-FDA-Artificial-Intelligence-and-Machine-Learning-Discussion-Paper.pdf describes the surrounding expectations.
Data disposition. Decisions about the training data, evaluation data, decision logs, and any derived artefacts must be made: archive, destroy, transfer to the replacement, or retain in a defined long-term repository.
Knowledge capture. Lessons learned from the system’s operating life — what worked, what failed, what was unexpected — should be captured for the benefit of future programs.
The Procedural Sequence
A defensible decommissioning procedure follows a predictable sequence.
Step 1: Operational freeze
The system is placed in a state where no new decisions are made but recent decisions can still be inspected, reversed, or appealed. Common patterns include disabling new request acceptance, enabling read-only access, and configuring the API to return a structured retirement message rather than a 404.
Step 2: Outstanding obligation closure
Any decisions still in appeal, exception cycle, or human review must be closed under the system that produced them. Routing them to the replacement system is rarely defensible, because the replacement would apply a different rationale from the one the appellant is contesting.
Step 3: Data export and archival
All audit trails (per Module 1.21), training and evaluation datasets (per Module 1.22), model artefacts, configuration, and supporting documentation must be exported to the long-term archive with content hashing and tamper-evidence. The archive location should be documented in the model registry.
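The content hashing can be as simple as a SHA-256 manifest over the archive tree, with the manifest itself hashed so that later retrieval drills can detect both modified files and a modified manifest. A sketch, assuming a flat archive directory on local storage:

```python
import hashlib
import json
from pathlib import Path

def build_archive_manifest(archive_dir: str) -> dict:
    """Hash every file in the archive and produce a tamper-evident manifest."""
    entries = {}
    for path in sorted(Path(archive_dir).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            entries[str(path.relative_to(archive_dir))] = digest
    manifest = {"files": entries}
    # Hash a canonical serialisation of the file list so the manifest
    # itself is tamper-evident, not just the files it describes.
    canonical = json.dumps(manifest, sort_keys=True).encode()
    manifest["manifest_sha256"] = hashlib.sha256(canonical).hexdigest()
    return manifest
```

The manifest should be stored both inside the archive and alongside the model registry entry, so either copy can verify the other.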
Step 4: Integration decommissioning
Every consuming system, scheduled job, dashboard, and notification dependency must be identified (the lineage graph from the previous articles is the source) and updated to remove the dependency. Failure here produces dangling integrations that throw errors for years.
Step 5: Access revocation
Service accounts, API keys, secrets, and IAM roles that the system used must be revoked. The U.S. National Institute of Standards and Technology Special Publication 800-53 control AC-2 on Account Management at https://csrc.nist.gov/projects/risk-management/sp800-53-controls/release-search#!/control?version=5.1.1&number=AC-2 provides the surrounding framework.
Step 6: Infrastructure deprovisioning
Compute, storage, and network resources are deprovisioned. The deprovisioning should be paired with cost reconciliation to confirm that the savings actually appear in the cloud bill.
Step 7: Final attestation
The decommissioning owner attests that all steps have been completed, the archive is intact, and outstanding obligations have been closed. The attestation is filed alongside the model registry entry, which is updated to reflect the retired status.
Long-Tail Obligations
A decommissioned system is not a forgotten system. Several obligations survive retirement.
Audit response. Regulatory inquiries, customer complaints, and litigation can occur years after retirement. The archive must remain retrievable, and complete enough, that the system’s prior behaviour can be reconstructed and explained. The Federal Reserve Supervisory Letter SR 11-7 model risk management expectations at https://www.federalreserve.gov/supervisionreg/srletters/sr1107.htm survive into the post-decommissioning window.
Subject rights. Data subjects retain their rights under privacy law, including erasure and access. The decommissioned system’s records must be addressable for these requests. Pseudonymisation strategies designed during operation should anticipate this.
Insurance and warranty. Some commercial AI deployments carry warranties or insurance obligations that survive decommissioning. The attestation file is the evidence that the program met its obligations at retirement.
Knowledge preservation. The lessons-learned record should be searchable from the AI governance knowledge base described in Module 1.26. New programs are often surprised to discover that the organisation already faced and resolved a problem they thought was novel.
Special Cases
Foundation-model dependencies. When a third-party foundation model is deprecated by its provider, the consumer’s options are: migrate to the successor model, switch to an alternative provider, or decommission the dependent system. The decision should be made before the provider’s deprecation date, not after, with the procedural sequence above applied to the consumer system regardless.
Federated and client-side models. Models that have been distributed to mobile devices or edge hardware cannot be unilaterally decommissioned. The plan must include client-side update mechanisms, fallback behaviour for clients that do not update, and accommodation for the long tail of clients that may never update.
Critical-infrastructure AI. Systems classified as critical infrastructure under sectoral regulation (energy, water, finance, telecoms) may have additional decommissioning requirements imposed by the regulator. The European Union AI Act Article 17 on quality management systems at https://artificialintelligenceact.eu/article/17/ implies persistent quality records even after retirement.
Common Failure Modes
The first is zombie systems — systems thought to be decommissioned but still serving requests because a forgotten upstream system never stopped calling them. Counter by lineage-driven verification: confirm every consumer has removed the dependency before final shutdown.
The second is archival rot — long-term storage in a format or system that becomes unreadable over the retention window. Counter by periodic retrieval drills and use of widely supported open formats (Parquet for tables, ONNX for models, JSON for metadata).
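A retrieval drill can be the verification counterpart of the archival hashing in Step 3: re-read every archived file against its recorded digest and report anything missing or altered. A sketch, assuming a manifest of the shape {"files": {relative_path: sha256_hex}}:

```python
import hashlib
from pathlib import Path

def retrieval_drill(archive_dir: str, manifest: dict) -> list:
    """Re-read every archived file and report missing or altered entries."""
    problems = []
    for rel_path, expected in manifest["files"].items():
        path = Path(archive_dir) / rel_path
        if not path.exists():
            problems.append(f"missing: {rel_path}")
            continue
        actual = hashlib.sha256(path.read_bytes()).hexdigest()
        if actual != expected:
            problems.append(f"hash mismatch: {rel_path}")
    return problems
```

An empty result means the drill passed; anything else should reopen the archival step for that system.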
The third is forgotten secrets — credentials and keys that were used by the system but never rotated or revoked. Counter by integrating decommissioning with secret rotation tooling.
The fourth is legal-hold collisions — a decommissioning that removes data subject to active legal hold. Counter by integration with legal-hold systems before any data deletion step.
Looking Forward
Module 1.22 closes here, having traced the full lifecycle from data origination through transformation, reproducibility, and decommissioning. Module 1.23 turns to documentation standards — model cards, datasheets, and the published artefacts that make all of this visible to stakeholders outside the immediate development team.
© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.