AITF M1.8-Art09 v1.0 Reviewed 2026-04-06 Open Access
AITF · Foundations

Encryption in AI: At Rest, In Transit, and Confidential Computing


9 min read Article 9 of 15

This article walks through the three domains, the operational practices that distinguish a defensible encryption posture from checkbox compliance, and the emerging discipline of confidential computing that closes the in-use gap.

Encryption at rest

Encryption at rest protects training data, model artefacts, feature stores, inference logs, and any other persistent storage from disclosure to anyone who obtains the storage media or the storage account credentials. The reference baseline for AI workloads in 2026 is the same as for any sensitive workload: AES-256 or equivalent, with keys held in a Hardware Security Module (HSM) or cloud-native key management service (AWS Key Management Service, Azure Key Vault, Google Cloud Key Management Service), and per-tenant key separation where the workload is multi-tenant.

There are three AI-specific considerations.

Volume-driven key management. ML training corpora are large — terabytes to petabytes — and the encryption key chosen for the corpus typically encrypts data that will be read by many concurrent training jobs over the corpus’s lifetime. Key rotation strategies that work for transactional data (rotate the data-encryption key on a schedule, re-encrypt the data with the new key) are operationally infeasible for training corpora at scale. The pattern that scales is envelope encryption with rotated key-encryption keys: the data-encryption key is itself encrypted by a key-encryption key that rotates frequently, and the data-encryption key is rotated only on an extended schedule or in response to a specific event.
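The envelope pattern can be sketched in a few lines. This is an illustrative toy, not production key management: the key-encryption key (KEK) would normally live in an HSM or cloud KMS and never appear in application memory; here a local AES-GCM key stands in for it purely to show why KEK rotation is cheap while data re-encryption is not.

```python
# Envelope-encryption sketch using the "cryptography" package.
# Assumption: in production the KEK lives in an HSM/KMS; the local
# keys below are for illustration only.
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

def wrap_dek(kek: bytes, dek: bytes) -> bytes:
    """Encrypt (wrap) the data-encryption key under the key-encryption key."""
    nonce = os.urandom(12)
    return nonce + AESGCM(kek).encrypt(nonce, dek, b"dek-wrap")

def unwrap_dek(kek: bytes, wrapped: bytes) -> bytes:
    nonce, ciphertext = wrapped[:12], wrapped[12:]
    return AESGCM(kek).decrypt(nonce, ciphertext, b"dek-wrap")

def rotate_kek(old_kek: bytes, new_kek: bytes, wrapped: bytes) -> bytes:
    """KEK rotation re-wraps the small DEK; the corpus ciphertext is untouched."""
    return wrap_dek(new_kek, unwrap_dek(old_kek, wrapped))

kek_v1, kek_v2 = AESGCM.generate_key(256), AESGCM.generate_key(256)
dek = AESGCM.generate_key(256)                    # this key encrypts the corpus
wrapped = wrap_dek(kek_v1, dek)
rewrapped = rotate_kek(kek_v1, kek_v2, wrapped)   # no data re-encryption needed
assert unwrap_dek(kek_v2, rewrapped) == dek
```

Rotating the KEK touches only the few bytes of the wrapped DEK, which is why the pattern scales to petabyte corpora where re-encrypting the data itself would not.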

Per-tenant separation. Multi-tenant ML platforms — internal platforms that serve many business units, external platforms that serve many customer accounts — require that one tenant’s data not be decryptable by infrastructure that holds another tenant’s keys. The pattern is per-tenant data-encryption keys, each wrapped by a per-tenant or per-environment key-encryption key, with access mediated by the tenant’s identity. The pattern is the cryptographic enforcement of the multi-tenant isolation that the network and identity layers also enforce.
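A minimal sketch of per-tenant key separation, assuming the RFC 3394 key-wrap primitive from the "cryptography" package (the tenant names and in-memory KEKs are invented; in production each KEK would be a distinct KMS/HSM key bound to the tenant's identity):

```python
# Per-tenant DEK wrapping: each tenant's data key is wrapped under that
# tenant's own KEK, so possession of another tenant's KEK is useless.
import os
from cryptography.hazmat.primitives.keywrap import (
    aes_key_wrap, aes_key_unwrap, InvalidUnwrap)

# One KEK per tenant (hypothetical; per-tenant KMS keys in production).
tenant_keks = {"tenant-a": os.urandom(32), "tenant-b": os.urandom(32)}

# One DEK per tenant, persisted only in wrapped form.
wrapped_deks = {t: aes_key_wrap(kek, os.urandom(32))
                for t, kek in tenant_keks.items()}

def open_tenant_dek(tenant: str, kek: bytes) -> bytes:
    """Unwrap succeeds only with the matching tenant's KEK."""
    return aes_key_unwrap(kek, wrapped_deks[tenant])

open_tenant_dek("tenant-a", tenant_keks["tenant-a"])      # succeeds
try:
    open_tenant_dek("tenant-a", tenant_keks["tenant-b"])  # cross-tenant access
except InvalidUnwrap:
    print("tenant-b's key cannot decrypt tenant-a's data")
```

The failed unwrap is the point: isolation holds cryptographically even if the network or identity layer is misconfigured.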

Model artefact encryption. Model files — the trained weights — are the high-value asset that Article 4 of this module addresses in depth. Encryption at rest with the same discipline as training data is the baseline; integrity verification with cryptographic signatures (also Article 4) is the complement.

The NIST AI Risk Management Framework Cybersecurity profile https://www.nist.gov/itl/ai-risk-management-framework calls for encryption at rest for AI training and serving data. ISO/IEC 42001:2023 Annex A.6 https://www.iso.org/standard/81230.html requires AI Management System operators to apply cryptographic controls to AI assets that explicitly include training data, model artefacts, and inference logs. The European Union’s AI Act, Article 15 https://artificialintelligenceact.eu/article/15/, requires high-risk AI systems to be designed with cybersecurity controls that include data confidentiality.

Encryption in transit

Encryption in transit protects data on the wire between AI components. The reference baseline is Transport Layer Security (TLS) 1.3 with mutual authentication where both ends are workloads under the operator’s control, and TLS with server authentication where one end is an external client. The practice is operationally well established, and three AI-specific considerations stand out.

Mesh-managed TLS. As discussed in Article 8, a service mesh (Istio, Linkerd, AWS App Mesh) issues per-workload certificates and enforces mutual TLS by default for every workload-to-workload connection. The pattern dramatically simplifies the certificate-management burden for AI platforms with many components and is the recommended pattern for any platform with more than a handful of services.

Inference endpoint hardening. Inference endpoints exposed to external clients require TLS configuration that resists downgrade and is regularly tested. Modern TLS practice (TLS 1.3, restricted cipher suites, OCSP stapling, HSTS where applicable, certificate transparency monitoring) applies without modification. The endpoint hardening should be tested by external scanners on a regular schedule and the results fed to the threat model from Article 1.
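On the client side, downgrade resistance is a one-line policy. A minimal sketch using Python's standard ssl module (the hostname is a placeholder; the probe function is shown but not exercised here, since it requires a live endpoint):

```python
# A client-side TLS context that refuses to negotiate below TLS 1.3.
import socket
import ssl

def strict_client_context() -> ssl.SSLContext:
    ctx = ssl.create_default_context()   # verifies certificates and hostnames
    ctx.minimum_version = ssl.TLSVersion.TLSv1_3
    return ctx

def probe(host: str, port: int = 443) -> str:
    """Connect and report the negotiated TLS version (fails on downgrade)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        with strict_client_context().wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()

ctx = strict_client_context()
assert ctx.minimum_version == ssl.TLSVersion.TLSv1_3
# probe("inference.example.com")  # hypothetical endpoint; returns "TLSv1.3"
```

Server-side configuration still matters (cipher suites, HSTS, certificate monitoring belong there), but pinning the client minimum version is a cheap complement to the external scanning the article recommends.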

Encrypted intra-region traffic. Cloud platforms encrypt some traffic between availability zones and regions by default; others they do not. The operator should not assume; the operator should verify per platform and per region, and configure explicit TLS where the platform default is insufficient. The audit story for compliance frameworks (Article 15) requires evidence of in-transit encryption between every pair of components in the data path.

NIST SP 800-218A https://csrc.nist.gov/pubs/sp/800/218/a/final prescribes encryption in transit as a Secure Software Development Framework practice for AI systems. The OWASP Top 10 for Large Language Model Applications https://owasp.org/www-project-top-10-for-large-language-model-applications/ catalogs Improper Output Handling (LLM05), under which unencrypted transport of LLM responses is one specific failure mode.

Encryption in use: confidential computing

The third domain — encryption in use — is the newest and the most operationally consequential for AI workloads in 2026. The classic encryption story protects data at rest and in transit but exposes data in use: when the data is loaded into memory for processing, it is plaintext, and the operator of the compute substrate (the hyperscale cloud, the on-premises hypervisor, the container platform) has potential access. For most workloads this is acceptable because the operator is part of the trust boundary. For AI workloads handling sensitive data — health records, financial transactions, regulated personal data, third-party intellectual property — the in-use exposure is increasingly the binding constraint on what the workload is permitted to handle and where it can be deployed.

Confidential computing addresses the gap. The pattern uses hardware-based Trusted Execution Environments (TEEs) — Intel Software Guard Extensions, AMD Secure Encrypted Virtualization, ARM Confidential Compute Architecture, NVIDIA Hopper Confidential Computing — to provide encrypted memory, attested execution, and isolation from the underlying operating system and hypervisor. Code and data inside the TEE are protected from the cloud operator, the host operating system, and any other workload sharing the hardware.
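The "attested execution" half of that promise reduces to a verifiable claim about what code is running. The sketch below is a deliberate simplification: a real attestation report chains to a hardware vendor's certificate hierarchy (Intel, AMD, ARM, or NVIDIA roots), whereas here a single Ed25519 key stands in for the hardware, purely to show the shape of the check — a signed measurement compared against an expected value. All names are invented.

```python
# Simplified attestation-verification sketch (not a real TEE protocol).
import hashlib
from cryptography.hazmat.primitives.asymmetric.ed25519 import Ed25519PrivateKey

enclave_code = b"model-serving binary v1.4"          # hypothetical workload image
expected_measurement = hashlib.sha256(enclave_code).digest()

# The "hardware" measures the launched code and signs the measurement.
hw_key = Ed25519PrivateKey.generate()                # stand-in for the vendor root
report = hashlib.sha256(enclave_code).digest()       # measured at enclave launch
signature = hw_key.sign(report)

# The relying party verifies the signature, then compares measurements.
hw_key.public_key().verify(signature, report)        # raises if forged
assert report == expected_measurement, "enclave is running unexpected code"
```

Only after both checks pass should the relying party release keys or sensitive data to the enclave; this is the attestation evidence the Strategic maturity level below expects operators to retain.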

For AI workloads, confidential computing supports three patterns.

Confidential training. Training executes inside a TEE with the training data encrypted in memory. The operator of the training infrastructure cannot read the training data or the resulting model weights. The pattern enables training on sensitive data the data owner would not otherwise allow on shared infrastructure.

Confidential inference. The deployed model and the inference inputs are decrypted only inside the TEE; the cloud operator and any host-level adversary see only ciphertext. The pattern enables inference on sensitive inputs (private documents passed to a Large Language Model, regulated personal data passed to a classifier) on shared inference infrastructure.

Confidential federation. Multiple parties contribute training data or inference requests into a TEE that aggregates without exposing any party’s data to any other. The pattern enables collaborative ML across competing organizations, across jurisdictional boundaries, and across regulatory perimeters.
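The federation pattern can be illustrated with a toy simulation. This is not a TEE: a Fernet key held by a simulated "enclave" stands in for a key sealed inside real attested hardware, and the values are invented. The point the sketch makes is architectural — parties submit only ciphertext, and only the aggregate ever leaves the trusted boundary.

```python
# Toy simulation of TEE-mediated aggregation across mutually distrusting parties.
from cryptography.fernet import Fernet

enclave_key = Fernet.generate_key()        # in reality, sealed inside the TEE
enclave = Fernet(enclave_key)

# Each party submits ciphertext; no party (or the host) sees another's plaintext.
submissions = [enclave.encrypt(str(v).encode()) for v in (12, 7, 23)]

def aggregate_inside_tee(ciphertexts) -> int:
    """Decrypt and combine contributions; only the sum leaves the boundary."""
    values = [int(enclave.decrypt(ct)) for ct in ciphertexts]
    return sum(values)                     # individual values are never released

print(aggregate_inside_tee(submissions))   # 42
```

Real deployments add attestation (each party verifies the enclave before encrypting to it) and often differential-privacy noise on the released aggregate, but the trust shape is the same.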

Confidential computing is a maturing market in 2026. The Gartner AI TRiSM Hype Cycle https://www.gartner.com/en/articles/gartner-top-strategic-technology-trends-for-2024 tracks the commercial maturity of confidential AI services across the major cloud providers and notes that the technology has crossed from research to operational adoption for high-sensitivity workloads. The MITRE ATLAS knowledge base https://atlas.mitre.org/ is starting to catalog the attack techniques specific to attesting and verifying TEEs, which is the new attack surface confidential computing introduces.

Maturity Indicators

Foundational. Encryption at rest is partially configured; some training data and model artefacts are stored unencrypted. Encryption in transit is configured for external endpoints but internal traffic is unencrypted. Key management uses default cloud-platform keys without rotation. Confidential computing has not been considered.

Applied. Encryption at rest is configured for all production training data, model artefacts, and inference logs. Encryption in transit is enforced for external endpoints with modern TLS configuration. Key management uses dedicated KMS keys with documented rotation schedules. The team has assessed which workloads require confidential computing.

Advanced. Per-tenant key separation is enforced for multi-tenant workloads. Mesh-managed mutual TLS is enforced for internal traffic. Key rotation is automated. Confidential computing is deployed for the highest-sensitivity workloads. The threat model from Article 1 names cryptographic compromise as a vector and the controls map back to it.

Strategic. Cryptographic posture is a first-class governance surface. Key usage is audited and anomalous patterns trigger detection. Confidential computing is used for any workload handling regulated data or third-party IP, with attestation evidence retained. The cryptographic implementation is itself audited on a regular schedule by external specialists. Red-team exercises (Article 11) include attempts to extract keys, downgrade TLS, and break out of TEEs.

Practical Application

A team that today has partial encryption coverage should make three changes this quarter. First, audit every storage location that holds training data, model artefacts, and inference logs and confirm that each has encryption at rest configured with a KMS key the operator manages (not a default platform key). Where encryption is missing, enable it; where the key is the platform default, migrate to a customer-managed key.
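The classification step of that first audit is mechanical enough to script. The sketch below assumes the dict shape of AWS S3's get_bucket_encryption response; in practice the inputs would come from boto3 (or the equivalent Azure/GCP APIs) iterated over every bucket and volume in the inventory, and the key ARN shown is a fabricated example.

```python
# Classify an at-rest encryption configuration against the target state.
# Dict shape follows AWS's get_bucket_encryption response (one rule shown).
def classify_encryption(rule: dict) -> str:
    sse = rule.get("ApplyServerSideEncryptionByDefault", {})
    algo = sse.get("SSEAlgorithm")
    if algo is None:
        return "UNENCRYPTED"              # fix first: enable encryption
    if algo == "aws:kms" and sse.get("KMSMasterKeyID"):
        return "CUSTOMER_MANAGED_KEY"     # the target state
    return "PLATFORM_DEFAULT_KEY"         # encrypted, but migrate to a CMK

print(classify_encryption(
    {"ApplyServerSideEncryptionByDefault":
         {"SSEAlgorithm": "aws:kms",
          "KMSMasterKeyID": "arn:aws:kms:eu-west-1:111122223333:key/example"}}
))  # CUSTOMER_MANAGED_KEY
```

Emitting the three buckets (unencrypted, platform-default key, customer-managed key) per storage location gives the team both its remediation worklist and the audit evidence Article 15 asks for.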

Second, audit every internal network path between AI components and confirm that TLS is enforced. Where it is not, configure mesh-managed mutual TLS or platform-native equivalents. The audit will surface paths the team did not realize were unencrypted.

Third, identify which production workloads handle data whose sensitivity would justify confidential computing — regulated personal data, third-party IP, sensitive business data — and pilot a confidential-computing deployment for one of them. The pilot establishes the operational pattern and the evidence base for expanding the approach across the platform as the use case warrants.

These three actions close the most likely encryption gaps, create the cryptographic foundation on which advanced patterns are built, and produce the audit evidence that compliance frameworks (Article 15) require.


© FlowRidge.io — COMPEL AI Transformation Methodology. All rights reserved.