Efficiency and innovation are often touted as hallmark attributes of generative AI. But as more enterprise businesses look to integrate the technology into their workflows, confidentiality — in data processing and sharing — is of utmost importance.
The recent introduction of AI-specific policies, such as the U.S. Executive Order on Safe, Secure, and Trustworthy AI and the European Union’s AI Act, is a regulatory step forward for developers and users alike. These policies set compliance standards for AI developers to ensure that sensitive, proprietary, or confidential data is protected. They also nod to the inherent value of AI models as intellectual property, wherein training data, algorithms, model architecture, and weights should be secured against unauthorized access.
How confidential computing protects data at scale
Cloud service providers (CSPs) have long helped their customers keep sensitive code and data secure in transit on the network using TLS and HTTPS encryption, and secure at rest on disk using encryption with customer-managed keys. However, one area of data protection that was not addressed until more recently is the protection of data in use, while it sits in server memory. This changed in 2019 when Microsoft and other industry leaders founded the Confidential Computing Consortium (CCC), a project community at the Linux Foundation, to accelerate the development and adoption of confidential computing. The CCC defines confidential computing as the protection of data in use by performing computations in a hardware-based, attested Trusted Execution Environment (TEE).
As a pioneer in this space, Microsoft Azure became one of the first CSPs to introduce confidential virtual machines, which run on confidential computing-enabled CPUs. With confidential VMs, only the CPU hardware and the contents of the confidential VM are trusted; all other components of the software stack, including the hypervisor and host OS, are considered outside of this trust boundary and can be breached without exposing sensitive data in memory. And, in keeping with the CCC definition of confidential computing, Microsoft provides attestation tools that allow the user to verify the trusted state of the CPU and their VM before disk encryption keys are released and sensitive data is loaded into the VM.
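To make that flow concrete, the sketch below walks through an attestation-before-key-release sequence of the kind described above. It is a minimal illustration only: the helper functions, claims, and values are hypothetical placeholders, not the actual Azure Attestation or key-management APIs.

```python
# Conceptual sketch of the attestation-before-key-release flow described above.
# All helpers and claims are hypothetical placeholders, not real Azure APIs.
import json
from dataclasses import dataclass


@dataclass
class AttestationResult:
    verified: bool   # evidence signature checked against the hardware vendor's roots
    claims: dict     # e.g. TEE type, firmware versions, VM launch measurements


def collect_hardware_report() -> bytes:
    """Placeholder: read the signed evidence produced by the confidential CPU."""
    return b"<signed TEE evidence>"


def verify_with_attestation_service(report: bytes) -> AttestationResult:
    """Placeholder: an attestation service validates the evidence and returns claims."""
    return AttestationResult(verified=True, claims={"tee": "confidential-vm", "debug_disabled": True})


def release_disk_encryption_key(result: AttestationResult) -> bytes:
    """Placeholder: a key-management service releases the key only for healthy claims."""
    if not (result.verified and result.claims.get("debug_disabled")):
        raise PermissionError("attestation failed; key withheld")
    return b"<disk encryption key>"


if __name__ == "__main__":
    report = collect_hardware_report()
    result = verify_with_attestation_service(report)
    key = release_disk_encryption_key(result)
    print("attestation claims:", json.dumps(result.claims), "| key bytes:", len(key))
```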
The need for confidential GPUs
“We’ve worked very closely with customers to get their feedback on what types of AI models they hope to run, what security posture they are looking for, what use cases they want to enable,” said Vikas Bhatia, Head of Product for Azure Confidential Computing. “With answers including AI models such as Stable Diffusion, Zephyr, Llama2, and GPT2, it became very clear that GPU-enhanced confidential computing would be needed. Our introduction of Azure confidential VMs with NVIDIA H100 Tensor Core GPUs is our first step at addressing this market.”
“Our collaboration with NVIDIA has been a multi-year effort,” said Bhatia, “but this has been necessary to ensure that the TEE of the confidential VM can be securely extended to include the GPU and the communications channel that connects the two. Any AI applications uploaded, built, and deployed on this stack will remain protected from end to end.”
With these new GPU-enhanced confidential VMs, existing Azure customers can redeploy the CUDA models and AI/ML code they have already written into a confidential GPU environment to achieve what Bhatia calls a “unified confidentiality.” The confidential VM establishes a secure channel with the GPU, and all subsequent data transfers between the VM and the GPU are protected. Furthermore, the attestation process verifies that the VM and GPU are running a correctly configured TEE before any sensitive applications are launched.
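A minimal sketch of that workflow, under the assumptions described above, might look like the following: both the confidential VM and the GPU are attested before any model weights are loaded. The function names are illustrative stand-ins, not the actual Azure or NVIDIA attestation SDKs.

```python
# Illustrative sketch: attest both the confidential VM and the GPU before
# loading model weights. Function names are hypothetical stand-ins, not the
# Azure or NVIDIA attestation SDKs.
def attest_vm() -> bool:
    """Placeholder: verify the confidential VM's CPU-backed TEE evidence."""
    return True


def attest_gpu(gpu_index: int = 0) -> bool:
    """Placeholder: verify the GPU's confidential-computing mode and attestation report."""
    return True


def load_model_onto_gpu(path: str) -> None:
    """Placeholder: decrypt and load model weights once the TEE is verified."""
    print(f"loading {path} into protected GPU memory")


if __name__ == "__main__":
    if attest_vm() and attest_gpu():
        # Once the secure channel is established, transfers between VM and GPU
        # are protected, so weights and activations are not exposed in plaintext.
        load_model_onto_gpu("model.safetensors")
    else:
        raise RuntimeError("TEE attestation failed; refusing to load the model")
```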
The diverse applications of confidential GPUs
The effectiveness of generative AI models hinges on two factors: the quality and quantity of training data. Despite training progress made with publicly available datasets, access to proprietary data is essential to realizing the full potential of enterprise models. Through confidential GPU computing, businesses can securely authorize the use of specialized data to perform more complex and targeted tasks, such as private data analysis, joint modeling, secure voting, or multi-party computation.
Bhatia identified three major use cases for confidential GPUs:
- Confidential multi-party computation: Organizations can collaborate to train and run inferences on models without sharing proprietary data. Only the final result of a computation would be revealed to the participants.
- Confidential inferencing: Inferencing occurs when a query or input is sent to a machine learning model to obtain a prediction or response. Confidential GPUs protect client data at every stage of the inferencing process, keeping it hidden from the model developer, service operators, and the cloud provider.
- Confidential training: Model algorithms and weights won’t be visible outside of TEEs set up by AI developers. Models can be securely trained on encrypted, distributed datasets that remain confidential to each party within a hardware-enforced boundary.
Azure’s healthcare customers, for example, are interested in employing confidential inferencing to analyze medical images, like X-rays, CT scans, and MRIs, without disclosing sensitive patient data or proprietary algorithms. Advanced image processing can improve diagnosis and treatment by identifying tumors, fractures, or anomalies in scans, all without placing patient data at risk.
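A minimal, hypothetical client-side sketch of that pattern follows: the caller checks the service’s attestation claims before submitting a scan, so the image is only ever sent to a verified TEE. The endpoint, token fields, and helpers are illustrative assumptions, not a real Azure or NVIDIA API.

```python
# Hypothetical confidential-inferencing client: verify the service's TEE claims
# before uploading a scan. Endpoint, token fields, and helpers are illustrative.
from dataclasses import dataclass


@dataclass
class AttestationToken:
    issuer: str
    tee_type: str
    debug_disabled: bool


def fetch_attestation_token(endpoint: str) -> AttestationToken:
    """Placeholder: the inference service returns a signed token describing its TEE."""
    return AttestationToken(issuer="https://attest.example",
                            tee_type="confidential-gpu-vm",
                            debug_disabled=True)


def token_is_trustworthy(token: AttestationToken) -> bool:
    # A real client would also verify the token signature against trusted roots;
    # here we only inspect the claims we care about.
    return token.tee_type == "confidential-gpu-vm" and token.debug_disabled


def submit_scan(endpoint: str, image_bytes: bytes) -> str:
    """Placeholder: send the scan for inference only after attestation succeeds."""
    return "no anomaly detected"


if __name__ == "__main__":
    endpoint = "https://inference.example/xray"   # illustrative URL
    token = fetch_attestation_token(endpoint)
    if token_is_trustworthy(token):
        print(submit_scan(endpoint, b"<X-ray bytes>"))
    else:
        raise RuntimeError("service TEE not verified; scan withheld")
```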
Confidential GPUs are also valuable in scenarios where data privacy is crucial but collaborative computation is still necessary. Researchers can run simulations on sensitive data (e.g., government or scientific data) without sharing datasets or code with unauthorized parties. In the finance sector, confidential multi-party computation can be useful in fraud prevention work: financial institutions can perform analyses or computations in a protected data clean room without disclosing individual financial details.
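As a toy illustration of that clean-room idea, the sketch below has several institutions submit records to code that would run inside a TEE, with only an aggregate statistic leaving the boundary. The enclave is simulated by an ordinary function here, and the names and figures are invented for illustration.

```python
# Toy clean-room aggregation: each institution's records stay inside the
# (simulated) enclave; only the aggregate statistic leaves. Names and data
# are invented for illustration.
from statistics import mean


def clean_room_aggregate(submissions: dict[str, list[float]]) -> float:
    """Runs inside the simulated TEE: sees raw records, returns only an aggregate."""
    all_amounts = [amount for records in submissions.values() for amount in records]
    return mean(all_amounts)


if __name__ == "__main__":
    submissions = {
        "bank_a": [120.0, 75.5, 9800.0],
        "bank_b": [15.0, 43.2],
        "bank_c": [310.0, 2.5, 78.0],
    }
    # Only the shared statistic is revealed to participants, never another
    # bank's individual transaction records.
    print(f"shared feature for a joint fraud model: {clean_room_aggregate(submissions):.2f}")
```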
“Before confidential computing, companies struggled to securely implement this kind of data-sharing technology,” Bhatia said. “While in preview, clients have tested the VMs and found that the security enhancements help to address some of the challenges they’re facing with respect to compliance, governance and security.”
A new security standard for the AI era
As a leader in confidential computing, Azure offers a robust security platform that caters to the privacy needs of businesses worldwide. Azure is building toward a confidential GPU ecosystem of applications and AI models, and innovative hardware is essential to sustaining it. Bhatia’s hope is that this level of confidentiality will one day be standard across all industries. Data privacy and AI confidentiality should be a convention of everyday computing.
“Our initial offering is best suited for use with smaller language models,” Bhatia said. “And while work is underway to scale this technology to support LLMs, we know customers will benefit from the current version by discovering the possibilities this technology will bring.”
Just as the early internet once ran on unsecured HTTP sites before encryption became the norm, security standards are always evolving. With more organizations processing sensitive data for AI models, there is a growing need for confidential NVIDIA GPU-powered AI. Azure’s latest VMs are a necessary, innovative introduction to secure GPU computing, one that Azure is working to scale up to multiple GPUs.
“We want to set a new security standard with our confidential VMs,” Bhatia said. “We build from the mindset that a rising tide lifts all boats.”
Curious about Azure confidential VMs with NVIDIA H100 Tensor Core GPUs? Sign up to preview Azure’s hardware-based security enhancements and protect your GPU data-in-use.