From Bare Metal to AI Factory: End-to-End Multi-Tenant Automation with OpenNebula and NVIDIA NCX Infra Controller

Large-scale, multi-tenant AI Factories must combine high performance with automation and strong operational governance. Modern AI infrastructures bring together infrastructure services (GPU-as-a-Service / IaaS), platform services (Kubernetes-as-a-Service), and application services (AI-as-a-Service) into a unified stack designed to support scalable, production-grade AI workloads.

Delivering and operating these environments consistently requires automated bare-metal provisioning, standardized configuration management, and robust multi-tenant orchestration to ensure efficiency, isolation, and repeatability at scale. The integration of OpenNebula with NVIDIA NCX Infra Controller addresses this need, enabling a unified path from physical GPU hardware to fully instantiated AI factory environments.

From GPU Hardware to Cloud-Orchestrated AI Platforms

NVIDIA NCX Infra Controller provides bare-metal lifecycle management for GPU servers. It automates hardware discovery, firmware validation, OS deployment, and node configuration using a Kubernetes-based control plane. NVIDIA NCX standardizes GPU nodes and prepares them as production-ready infrastructure, ensuring consistency across large clusters.

OpenNebula operates at the cloud control layer. It provides multi-tenant orchestration, governance, quota management, identity integration, and lifecycle automation. Through OpenNebula, infrastructure is abstracted into isolated environments where tenants can consume GPU resources via IaaS, deploy Kubernetes clusters, and operate AI workloads.
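
To make the tenant-isolation primitives concrete, the following OpenNebula CLI sketch creates an isolated tenant group with its own virtual data center (VDC), assigns it a host, and applies a quota. The group, VDC, and host names and the quota file are illustrative, and a live OpenNebula front-end is required to run these commands.

```shell
# Create a tenant group and a VDC to isolate its resources
onegroup create tenant-a
onevdc create vdc-tenant-a

# Bind the group to the VDC so its users only see VDC resources
onevdc addgroup vdc-tenant-a tenant-a

# Assign a GPU host from the local zone (zone 0) to this VDC
onevdc addhost vdc-tenant-a 0 gpu-node-01

# Apply resource limits to the group from a quota template file
onegroup quota tenant-a quota.txt
```

Quota templates can cap VMs, CPU, memory, and datastore usage per group, which is how per-tenant consumption is governed on shared GPU capacity.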

By integrating OpenNebula with NVIDIA NCX, the bare-metal lifecycle and the cloud orchestration layer are unified. GPU servers move from hardware provisioning to cloud-controlled resource pools without manual intervention. The result is an end-to-end automated pipeline from physical GPU hardware to fully operational AI Factory environments.

Automated AI Factory Deployment

A primary use case for this integration is to enable neocloud or AI infrastructure providers to deliver fully managed, isolated AI Factory instances on demand. The key requirements include strict separation between individual AI Factory instances, as well as isolation between tenants operating within each factory, combined with end-to-end automation and scalable lifecycle management.

The reference architecture consists of a management server—or management cluster—hosting both the NVIDIA NCX service and OpenNebula front-end instances. GPU nodes reside in a shared pool, unassigned until requested.

When a customer requests a managed AI instance, a dedicated OpenNebula front-end VM is first deployed to ensure control-plane isolation. The entire provisioning workflow is then orchestrated by OpenNebula OneForm, which automates the deployment and lifecycle management of the required compute infrastructure.

As part of this workflow, OneForm triggers the allocation and provisioning of GPU nodes from the shared pool through NVIDIA NCX, which performs operating system installation and hardware validation. Once the nodes are ready, Ansible automation—also invoked by OneForm—registers and configures the nodes within the newly created environment.
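
The registration step that the Ansible automation performs can be pictured as adding each freshly provisioned node as a hypervisor host on the tenant's dedicated front-end. This is a minimal sketch using the standard OpenNebula CLI; the hostname is hypothetical and the commands assume a reachable front-end with KVM nodes.

```shell
# Register the newly provisioned GPU node as a KVM host
# (run against the tenant's dedicated OpenNebula front-end)
onehost create gpu-node-01 --im kvm --vm kvm

# Confirm the node reaches MONITORED state before scheduling workloads
onehost list
```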

The outcome is a fully automated workflow where an isolated AI Factory is instantiated on demand. GPU capacity is allocated dynamically while maintaining strict operational separation between AI Factories and tenants.

Validation in Real-World AI Factory Environments

The integration is currently being validated with CESGA in Galicia (Spain) and CloudFerro in Poland. These are real production-scale environments where AI Factories are delivered on demand and operational consistency is critical. The validation focuses on automation reliability, performance predictability, and governance enforcement.

This approach offers predictable provisioning times, lower operational overhead, improved GPU utilization, and strong tenant isolation. It also enables dedicated AI environments to be created for each project or customer while still sharing the same pool of GPU hardware. With OpenNebula OneForm integrated with NVIDIA NCX, compute nodes can be expanded or reduced as needed to support both training and inference workloads.

By linking bare-metal lifecycle management with cloud-native orchestration, OpenNebula and NVIDIA NCX establish a unified control plane spanning hardware, virtualization, and AI platforms.

Looking Ahead

As AI infrastructure becomes central to enterprise and national strategies, the ability to deploy and operate AI Factories efficiently will define competitiveness. Automation across the full stack—from firmware provisioning to workload orchestration—is essential for scalable operations.

This integration demonstrates a practical model for transforming GPU clusters into on-demand, multi-tenant AI Factories. It provides a reproducible architecture where hardware and cloud control operate as a coordinated system, ready to support large-scale AI deployments, neocloud environments, and sovereign AI initiatives.

Meet us in person! We’ll be exhibiting at NVIDIA GTC in San Jose. Come visit our team, see live demos, and discuss how OpenNebula can power your AI Factories and neocloud platforms.



Funded by the Spanish Ministry for Digital Transformation and Civil Service through the ONEnextgen Project (UNICO IPCEI-2023-003), and co-funded by the European Union’s NextGenerationEU through the RRF.


Neal Hansen

Senior Cloud Solutions Architect at OpenNebula Systems

Mar 17, 2026
