DevZero enables customers to build secure, scalable, multi-tenant clouds quickly and easily. As part of our fully cloud-native offering, we built a platform that lets multiple isolated users share the same Kubernetes platform without compromising information security or confidentiality. We achieve this through a powerful, virtualized compute isolation primitive known as a “microVM”, which blends the snappy startup times and familiar tooling of containers with the isolation characteristics of a full virtual machine. To tie everything together, we connect end users to their isolated microVMs through an encrypted virtual private network powered by WireGuard™, so they can reach their microVMs from anywhere in the world.
We keep a keen eye on developing technologies inside the data center, including CXL, NVLink, DPUs, and GPUs. As an NVIDIA Inception member, we’ve been generously given access to some of NVIDIA’s latest and greatest platforms and have been working tirelessly to integrate our compute isolation technology with them. What we’ve been able to achieve is nothing short of amazing.
With NVIDIA BlueField-3 DPUs, we’re able to fully isolate each tenant’s network traffic from every other tenant’s. And, because the DPU runs in a secure, trusted environment outside the host, we’re able to offload much of the control plane logic to it. This significantly reduces the attack surface of any one compute node, since most of the secure management logic is isolated from the host. We’re also looking into using the DPU to offload storage from the host (using NVIDIA’s NVMe emulation), so stay tuned!
This type of accelerator card was once exclusive to the big hyperscalers and, in fact, underpins many of their features. With the release of NVIDIA BlueField-3 DPUs, however, the technology is accessible to the general public, meaning everybody now has the building blocks to create their own hyperscaler-style cloud powered by NVIDIA-accelerated computing.
What is a DPU?
A Data Processing Unit (DPU) is a specialized processor designed to offload, accelerate, and isolate data-centric tasks (networking, security, storage) from CPUs. Taking a closer look at an NVIDIA BlueField DPU: it combines a high-speed network adapter, a cluster of Arm processor cores, DDR5 memory, and an integrated PCIe switch. This architecture allows standard system software to run, with UEFI firmware booting an operating system such as Ubuntu Linux, providing a familiar development and administration interface for sysadmins and developers alike. Each BlueField-3 also has an out-of-band management controller, much like existing servers.
Complementing the BlueField DPU, NVIDIA DOCA is the software framework that unlocks its full potential. DOCA provides developers with APIs, libraries, and services to offload, program, and orchestrate data center workloads—like networking, security, and storage—directly on BlueField, unlocking purpose-built software-defined hardware accelerations for cloud-native and AI environments.
The DPU has full, direct control over the built-in network adapter and can dynamically intercept traffic sent by the host and redirect it elsewhere, all without the host being aware. This enables a broad range of network processing use cases, including encapsulation and encryption, which can be implemented completely independently of the host and are fully programmable through DOCA.
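To make the encapsulation concrete, here is a minimal sketch (not DevZero’s implementation, which happens in DPU hardware) of packing and parsing the fixed 8-byte GENEVE header defined in RFC 8926, assuming no variable-length options:

```python
import struct

# EtherType for Transparent Ethernet Bridging: the GENEVE payload is a
# full inner Ethernet frame, as when tunneling a VM's virtual NIC.
PROTO_TRANS_ETHER = 0x6558

def pack_geneve_header(vni: int, opt_len_words: int = 0,
                       protocol: int = PROTO_TRANS_ETHER) -> bytes:
    """Pack the fixed 8-byte GENEVE header (RFC 8926), options omitted."""
    assert 0 <= vni < (1 << 24), "VNI is a 24-bit field"
    ver = 0                                    # current GENEVE version is 0
    byte0 = (ver << 6) | (opt_len_words & 0x3F)
    byte1 = 0                                  # O (control) and C (critical) flags clear
    vni_and_rsvd = vni << 8                    # 24-bit VNI followed by 8 reserved bits
    return struct.pack("!BBH", byte0, byte1, protocol) + vni_and_rsvd.to_bytes(4, "big")

def unpack_vni(header: bytes) -> int:
    """Recover the 24-bit Virtual Network Identifier from a GENEVE header."""
    return int.from_bytes(header[4:7], "big")
```

The VNI is what keeps tenants apart on the wire: two microVMs on different VNIs can never see each other’s frames, no matter what addresses they configure inside the guest.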

Combining this with SR-IOV (Single Root Input/Output Virtualization), a PCIe capability for sharing one physical device, we’re able to multiplex an individual DPU into many hardware-isolated Ethernet devices known as virtual functions (VFs). Each VF passes through the DPU and has transparent GENEVE encapsulation/decapsulation applied, completely independently of the host. We then pass those hardware-isolated Ethernet devices through to individual microVMs as regular PCIe devices, with no special configuration required. The microVM uses the device as a typical Ethernet device, with neither the host nor the microVM aware of the underlying encapsulation whatsoever.
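On the host side, carving a physical function into VFs is a single sysfs write on Linux. The sketch below is purely illustrative, with a hypothetical interface name and a naive per-VF VNI plan; in a real BlueField deployment the VNI mapping lives on the DPU (programmed via DOCA), not on the host:

```python
from pathlib import Path

def sriov_numvfs_path(pf_netdev: str) -> Path:
    """sysfs knob that controls how many virtual functions a PF exposes."""
    return Path(f"/sys/class/net/{pf_netdev}/device/sriov_numvfs")

def enable_vfs(pf_netdev: str, count: int) -> None:
    """Ask the kernel to spawn `count` VFs on the given physical function."""
    sriov_numvfs_path(pf_netdev).write_text(str(count))

def vf_vni_plan(base_vni: int, num_vfs: int) -> dict[int, int]:
    """Hypothetical plan: give each VF index its own GENEVE VNI, so each
    microVM lands in its own overlay network."""
    return {vf: base_vni + vf for vf in range(num_vfs)}
```

Each resulting VF then appears to the host (and, once passed through, to the microVM) as an ordinary PCIe Ethernet device.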
GPU Multi-Tenancy
With our microVM technology and the BlueField-3 DPU, we believe we’ve solved compute and networking multi-tenancy. However, with the rise of AI, we’re seeing more and more organizations building private clouds designed for AI workloads, known as AI factories. These organizations seek the same level of isolation and multi-tenancy as a regular cloud. And since AI workloads require GPUs, the GPUs themselves must be isolated between tenants as well. Historically, mainstream cloud providers have passed through full GPUs to virtual machines.
However, this comes at a substantial premium to end users, especially if portions of the GPU go unused, which is becoming increasingly commonplace as AI models become more efficient and require less VRAM.
As NVIDIA GPU innovations continue to proliferate, one stands out in this context. With NVIDIA Multi-Instance GPU (MIG), we’re able to seamlessly divide a large GPU into up to seven smaller, hardware-isolated instances.
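For a sense of how MIG partitioning works: each instance consumes a fixed number of compute slices from the GPU’s budget of seven. The profile names and sizes below follow the A100 40GB as an illustrative assumption (they vary by GPU model and driver), and real placement rules are stricter than this simple slice count:

```python
# Compute-slice cost of common A100 40GB MIG profiles (illustrative only;
# consult the MIG user guide for the profiles your GPU actually supports).
MIG_PROFILE_SLICES = {
    "1g.5gb": 1,
    "2g.10gb": 2,
    "3g.20gb": 3,
    "4g.20gb": 4,
    "7g.40gb": 7,
}

TOTAL_SLICES = 7  # an A100 exposes seven compute slices

def partition_fits(profiles: list[str]) -> bool:
    """True if the requested MIG instances fit within one GPU's slice budget.
    (Real MIG placement also has alignment constraints this sketch ignores.)"""
    used = sum(MIG_PROFILE_SLICES[p] for p in profiles)
    return used <= TOTAL_SLICES
```

In practice, instances are created with `nvidia-smi mig` or through NVML; the sketch only captures the slice budget that makes seven 1g instances, or one 7g instance, the two extremes.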
Leveraging NVIDIA vGPU (C-Series) with NVIDIA MIG, we can pass virtual GPUs to our microVMs, which can then be used for inference workloads. Because each MIG instance is isolated within the GPU silicon itself, each tenant can fully utilize their GPU slice without degrading the experience of others. This reduces costs for end users while maintaining strong isolation between tenants, which is of utmost importance for cloud operators.

Today, our compute primitive takes advantage of NVIDIA vGPU (C-Series) with NVIDIA MIG, allowing platform operators to share a single GPU among many tenants without compromising on isolation. This is especially powerful for AI inference use cases, which may not require the full VRAM of a GPU. We firmly believe that our technology, combined with NVIDIA’s, is an important step forward for GPU cloud multi-tenancy. With NVIDIA GPUs and BlueField-3 DPUs, we’re able to build GPU clouds with isolation characteristics similar to those of mainstream cloud providers, paving the way for the next generation of AI workloads.
Conclusion
We believe it has never been easier for individuals and organizations to build their own cloud than it is today. With the rise of Kubernetes and the latest NVIDIA-accelerated computing innovations, an isolated, multi-tenant cloud is now within reach of nearly anyone. We’re beyond excited to see what people build with these clouds, and we hope to empower the next generation of AI builders and tinkerers to harness the power of their own AI-native cloud.