Blog Article:

OpenNebula and NVIDIA DSX OS: The Perfect Match for Building Open and Sovereign AI Factories

NVIDIA DSX is an AI Factory-scale platform, unifying design, simulation, operations, and ecosystem technologies to help build AI Factories optimized for tokens per watt. AI Factory operators are moving beyond GPU capacity toward production AI cloud services. Getting there requires reliable lifecycle automation, runtime consistency, GPU health visibility, multi-tenant operations, and platform services that can integrate into their existing infrastructure.

NVIDIA’s announcement of DSX OS marks an important milestone for sovereign AI infrastructure. As part of the NVIDIA DSX platform, DSX OS delivers open source, modular software for IaaS and PaaS AI Factory infrastructure. Its composable components span the AI infrastructure stack through a common architecture for lifecycle management, runtime consistency, health automation, resiliency, multi-tenant operations, and AI platform services, and they are designed to integrate into partner control planes, infrastructure platforms, and AI cloud service stacks.

DSX OS gives partners modular building blocks across the IaaS and PaaS stack, developed from NVIDIA’s own AI Factory operating experience and designed for integration into partner platforms. As referenced in the press release, OpenNebula Systems is integrating NVIDIA Infra Controller (NICo) into its infrastructure platform to automate bare-metal lifecycle management, secure tenant transitions, and operations across large-scale AI Factories.

How OpenNebula with DSX OS Amplifies AI Factory Economics

The goal is clear: help organizations deploy large-scale AI Factories faster, while improving efficiency, resilience, and infrastructure utilization. AI Factories depend on the continuous coordination of many moving parts: compute systems, networking, storage, data center facilities, power distribution, observability, runtime environments, and cloud operations. In the end, the efficiency of an AI Factory depends on how well these elements work together to produce more tokens with the energy available.
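To make the "tokens with the energy available" idea concrete, here is a minimal sketch of how an operator might compute a tokens-per-watt figure. The article does not define the metric precisely, so this assumes it means throughput in tokens per second divided by average power draw in watts (equivalently, tokens per joule). The function name and the sample numbers are hypothetical and are not taken from NVIDIA or OpenNebula tooling.

```python
def tokens_per_watt(tokens_generated: int, seconds: float, avg_power_watts: float) -> float:
    """Return throughput (tokens/s) divided by average power draw (W).

    Assumption: this is one plausible reading of "tokens per watt";
    it is numerically the same as tokens per joule.
    """
    if seconds <= 0 or avg_power_watts <= 0:
        raise ValueError("seconds and avg_power_watts must be positive")
    throughput = tokens_generated / seconds  # tokens per second
    return throughput / avg_power_watts


if __name__ == "__main__":
    # Hypothetical figures: same 10 kW power envelope, before and after
    # improving how well the infrastructure is utilized.
    before = tokens_per_watt(1_200_000, seconds=60, avg_power_watts=10_000)
    after = tokens_per_watt(1_800_000, seconds=60, avg_power_watts=10_000)
    print(before)  # 2.0
    print(after)   # 3.0
```

Under this reading, any improvement in coordination that raises token output without raising power draw shows up directly as a higher ratio.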

OpenNebula provides the cloud operating layer that brings together DSX OS components and other key infrastructure building blocks needed to deploy AI services faster and run AI Factories efficiently at scale. Through its lightweight and open architecture, OpenNebula unifies GPU resources, bare-metal infrastructure, Kubernetes, virtual machines, storage, and networking into a single platform for enterprise and service-provider AI environments. This includes support for secure multi-tenancy and governance through Virtual Data Centers, ACLs, quotas, and capacity controls, allowing operators to reduce complexity, isolate workloads, allocate resources fairly, and support demanding AI Factory environments.

OpenNebula also helps AI Factories run more efficiently by making better use of the infrastructure already in place. Its advanced DRS scheduling, intelligent workload placement, affinity and anti-affinity policies, GPU partitioning and sharing through vGPU and MIG, multi-GPU support with NVLink-aware scheduling, elastic resource allocation, and automation workflows help operators increase GPU utilization, share capacity more effectively, and reduce operational overhead. Networking performance is just as important as GPU availability, and OpenNebula can integrate with accelerated networking technologies such as NVIDIA Quantum InfiniBand and NVIDIA Spectrum-X Ethernet, as well as NVIDIA BlueField DPU-accelerated data paths, to support low-latency, high-throughput AI workloads. This fits naturally with the NVIDIA DSX goal of improving tokens per watt and lowering the cost of running AI infrastructure.
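As an illustration of how placement policies can raise GPU utilization, the following is a small, self-contained sketch of best-fit placement with an anti-affinity constraint. It is not OpenNebula's scheduler and does not use any OpenNebula API. The Host and Workload classes, the place function, and the sample hosts are all hypothetical, chosen only to show the general technique.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Host:
    name: str
    free_gpus: int
    # Anti-affinity groups that already have a member on this host.
    groups: Set[str] = field(default_factory=set)


@dataclass
class Workload:
    name: str
    gpus: int
    anti_affinity_group: Optional[str] = None


def place(workloads: List[Workload], hosts: List[Host]) -> Dict[str, Optional[str]]:
    """Greedy best-fit placement; returns workload name -> host name (or None)."""
    placement: Dict[str, Optional[str]] = {}
    # Place the largest GPU requests first to limit fragmentation.
    for w in sorted(workloads, key=lambda x: x.gpus, reverse=True):
        candidates = [
            h for h in hosts
            if h.free_gpus >= w.gpus
            and (w.anti_affinity_group is None
                 or w.anti_affinity_group not in h.groups)
        ]
        if not candidates:
            placement[w.name] = None  # no host satisfies capacity and policy
            continue
        # Best fit: choose the host left with the fewest spare GPUs.
        best = min(candidates, key=lambda h: h.free_gpus - w.gpus)
        best.free_gpus -= w.gpus
        if w.anti_affinity_group is not None:
            best.groups.add(w.anti_affinity_group)
        placement[w.name] = best.name
    return placement


if __name__ == "__main__":
    hosts = [Host("host-a", 8), Host("host-b", 4)]
    jobs = [
        Workload("train-1", 4, "replica"),
        Workload("train-2", 4, "replica"),
        Workload("infer", 2),
    ]
    print(place(jobs, hosts))
```

In this example the two "replica" workloads land on different hosts because of the anti-affinity group, while the smaller inference job fills the remaining capacity on a host that is already in use. A production scheduler would weigh many more factors, such as NVLink topology and MIG profiles, which this sketch deliberately leaves out.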

Reliability, resiliency, and scalability are critical for AI Factories running continuously across distributed environments. OpenNebula provides centralized management across clusters, racks, and data centers, with federation, hybrid cloud support, policy-driven automation, monitoring integration, high availability for control nodes, and fault tolerance for compute nodes. It also supports trusted sovereign AI environments with Secure Boot, UEFI, vTPM, and confidential VMs, while integrations such as NVIDIA Infra Controller (NICo) help automate provisioning and recovery workflows for secure multi-tenant operations.

Openness and Stability for Sovereign AI Factories

Openness is another essential part of the equation. As an open cloud platform, OpenNebula enables organizations to build sovereign AI Factories under their own control, avoiding dependency on proprietary cloud stacks and reducing the risk of vendor lock-in. This matters especially for governments, enterprises, research institutions, and service providers that need control over their infrastructure, data governance, operational policies, and long-term platform strategy.

This open approach is backed by nearly 20 years of experience operating large-scale production cloud environments across enterprise, telecom, edge, HPC, and research sectors, including systems with more than 2,000 nodes per control instance. OpenNebula brings a mature, production-proven operating model to AI Factory deployments, combining enterprise-grade support with a highly cost-effective subscription model designed to reduce total cost of ownership.

As AI infrastructure evolves toward larger, more distributed, and more automated environments, open cloud platforms will play a central role in enabling scalable and sovereign AI operations. No two AI Factories are the same. Each organization needs the freedom to adapt its architecture to its infrastructure, workloads, governance requirements, and business model. OpenNebula is designed for this reality, providing a modular and flexible cloud operating layer that integrates with the main infrastructure components required in production environments, including storage, backup, networking, monitoring, and security.

Together, NVIDIA DSX OS and OpenNebula provide a strong foundation for the next generation of AI Factories: open, efficient, resilient, and ready for sovereign deployment. The result is an infrastructure model that gives organizations the freedom to scale AI on their own terms, without sacrificing control, flexibility, or long-term choice.

Ignacio M. Llorente

Jun 1, 2026
