
Run:AI releases dynamic scheduling for NVIDIA Multi-Instance GPU technology

by Fara Hain
November 9, 2021

November 9, 2021 – Tel Aviv, Israel – Run:AI, a leader in compute orchestration for AI workloads, today announced dynamic scheduling support for customers using the NVIDIA Multi-Instance GPU (MIG) technology, which is available on NVIDIA A100 Tensor Core GPUs as well as other NVIDIA Ampere architecture GPUs. Run:AI now enables the creation and management of MIG instances on the fly, which is particularly useful for smaller inference and light training workloads that don’t require an entire GPU.

With Run:AI, MIG-enabled GPUs can be configured automatically, according to demand. For example, when a user requests access to one-seventh of a GPU, one MIG partition is configured and provided to the user. This process is seamless for the researcher requesting resources.
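At the Kubernetes level, a MIG slice is requested like any other extended resource. The minimal sketch below uses the official `kubernetes` Python client to ask for one `nvidia.com/mig-1g.5gb` slice – the NVIDIA device plugin’s “mixed”-strategy name for a one-seventh slice of an A100 40GB. The pod name and container image are placeholder assumptions, not Run:AI specifics; with Run:AI’s dynamic scheduling, the researcher requests a fraction instead and the platform creates the matching slice on demand.

```python
# Minimal sketch: request one MIG slice of an A100 at the Kubernetes level,
# using the official `kubernetes` Python client. The resource name
# `nvidia.com/mig-1g.5gb` follows the NVIDIA device plugin's "mixed"
# strategy; the pod name and image are placeholders, not Run:AI specifics.
from kubernetes import client, config

config.load_kube_config()  # use the current kubeconfig context

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="mig-slice-demo"),
    spec=client.V1PodSpec(
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="inference",
                image="nvcr.io/nvidia/pytorch:21.10-py3",  # placeholder image
                command=["nvidia-smi", "-L"],  # lists the GPU instance the pod sees
                resources=client.V1ResourceRequirements(
                    limits={"nvidia.com/mig-1g.5gb": "1"}  # one-seventh of an A100 40GB
                ),
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```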

“NVIDIA MIG technology is revolutionary for running multiple simultaneous jobs like inference on one GPU,” said Omri Geller, Run:AI’s co-founder and CEO. “Now, with Run:AI’s dynamic scheduling for MIG, researchers simply request resources, and the platform automatically makes use of the right amount of compute for the job, adding flexibility and speed, and reducing idle GPU time.”

IT administrators benefit from automated configuration of MIG instances, with Run:AI provisioning the right-sized instance based on each workload’s demands. Run:AI’s management tools support MIG technology in the following ways:

  • Administrator tools make it easy to manage MIG instances, monitor their usage, simplify user billing, and manage quotas.
  • IT admins simply enable MIG – no other configuration is needed for Run:AI to dynamically and automatically create GPU instances in the most efficient way based on the demand generated by data scientists.
  • Run:AI enables users to easily request access to MIG instances using the Run:AI GUI, CLI, preferred MLOps tool, or simple APIs – see the sketch after this list.
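As a rough illustration of that request path, the sketch below submits a pod that asks the Run:AI scheduler for a fraction of a GPU rather than a pre-carved MIG profile. The annotation key `gpu-fraction` and the scheduler name `runai-scheduler` reflect Run:AI’s Kubernetes integration as commonly documented at the time, but they – along with the image – should be treated as assumptions to verify against the Run:AI documentation, not a definitive API.

```python
# Minimal sketch, assuming Run:AI's Kubernetes integration: annotate a pod
# with the fraction of a GPU it needs and hand it to the Run:AI scheduler,
# which can back the request with a dynamically created MIG slice. The
# annotation key, scheduler name, and image are assumptions to verify
# against the Run:AI documentation.
from kubernetes import client, config

config.load_kube_config()

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(
        name="fractional-gpu-demo",
        annotations={"gpu-fraction": "0.14"},  # roughly one-seventh of a GPU
    ),
    spec=client.V1PodSpec(
        scheduler_name="runai-scheduler",  # let Run:AI place the pod and slice the GPU
        restart_policy="Never",
        containers=[
            client.V1Container(
                name="job",
                image="nvcr.io/nvidia/pytorch:21.10-py3",  # placeholder image
                command=["python", "-c", "print('running on a dynamically provisioned slice')"],
            )
        ],
    ),
)

client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```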

In addition, Run:AI enables seamless elastic scalability for NVIDIA A100 and other GPUs, so workloads can scale up across multiple connected A100-based systems.

Run:AI technology was initially built for large-scale, distributed, multi-node training and has expanded to support every type of workload – from interactive jobs to inference and large-scale multi-node training. Run:AI also supports mixed clusters — NVIDIA A100 GPUs side-by-side with GPUs that do not support MIG functionality — enabling full utilization and control of all GPUs in the cluster as one pool of resources.

Run:AI is a proud member of the NVIDIA DGX-Ready Software partner program, certified to seamlessly run on NVIDIA DGX systems. Run:AI is also a Premier member of NVIDIA Inception, a program designed to nurture startups revolutionizing industries with advancements in data science, AI and high-performance computing.

Dr. Ronen Dar, Run:AI CTO and co-founder, will talk about inference workloads and dynamic MIG scheduling at NVIDIA GTC, a global AI conference taking place this week. Omri Geller, CEO and co-founder of Run:AI, will speak on a GTC panel about MLOps tools.

About Run:AI
Run:AI is a cloud-native compute management platform for the AI era. Run:AI gives data scientists access to all of the pooled compute power they need to accelerate AI development and deployment – whether on-premises or in the cloud. The platform provides IT and MLOps with real-time visibility and control over scheduling and dynamic provisioning of GPUs to deliver more than 2X gains in utilization of existing infrastructure. Built on Kubernetes, Run:AI enables seamless integration with existing IT and data science workflows. Learn more at www.run.ai.