Announcing Run:ai 2.18

by Run:ai Team, July 22, 2024

We're excited to introduce Run:ai version 2.18, packed with new features designed to optimize GPU performance, simplify model deployment, and enhance control over AI workloads. This update brings significant improvements that help AI-driven enterprises maximize efficiency, reduce complexity, and achieve better results. Read on to learn how these new capabilities can benefit your organization.

Maximizing GPU Performance and Efficiency

With the growing demand for larger AI models, efficient GPU usage is more critical than ever. The GPU memory swap capabilities introduced in 2.18 use CPU memory alongside GPU memory to provide graceful workload context switching. This in turn enables multiple workloads with large memory requirements to share a single GPU, reducing idle time and improving overall productivity. Whether it's sharing a GPU between interactive notebooks or managing concurrent workloads, GPU memory swap helps keep GPU resources in use, improving performance and cost efficiency. To learn more, please check out our blog post.
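
To make the idea concrete, here is a minimal toy sketch in Python. It is not Run:ai's implementation: it assumes a single GPU with a fixed memory capacity and models a swap as moving an idle workload's memory footprint to CPU memory when another workload needs to run. The names `ToyGpu` and `activate`, and the oldest-first eviction order, are hypothetical choices made only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGpu:
    """Toy model of one GPU whose workloads can be swapped to CPU memory."""
    capacity_gb: float
    resident: dict = field(default_factory=dict)  # workload -> GB held in GPU memory
    swapped: dict = field(default_factory=dict)   # workload -> GB held in CPU memory

    def activate(self, name: str, size_gb: float) -> list:
        """Make `name` resident on the GPU, swapping others out if needed.

        Returns the names of workloads that were moved to CPU memory.
        """
        if size_gb > self.capacity_gb:
            raise ValueError(f"{name} does not fit on this GPU at all")
        evicted = []
        if name in self.resident:
            return evicted  # already on the GPU, nothing to do
        self.swapped.pop(name, None)  # swap back in if it was in CPU memory
        # Assumption: evict the oldest resident workloads until the new one fits.
        while sum(self.resident.values()) + size_gb > self.capacity_gb:
            victim = next(iter(self.resident))
            self.swapped[victim] = self.resident.pop(victim)
            evicted.append(victim)
        self.resident[name] = size_gb
        return evicted


# Two 60 GB notebooks take turns on one 80 GB GPU.
gpu = ToyGpu(capacity_gb=80)
first = gpu.activate("notebook-a", 60)   # fits, nothing is swapped
second = gpu.activate("notebook-b", 60)  # notebook-a moves to CPU memory
third = gpu.activate("notebook-a", 60)   # notebook-b moves to CPU memory
```

In this sketch neither notebook is terminated when the other becomes active; its state simply waits in CPU memory until it is needed again.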

Enhancing Model Deployment and Scalability

Deploying and scaling models is a complex exercise. Striking the right balance across performance, cost, and security requires careful resource provisioning, detailed analysis of several performance metrics, and management of complex model scaling configurations. To simplify this process, Run:ai 2.18 introduces several new capabilities, including:

  • Resource templates to simplify the process of initiating inference workloads
  • Real-time visibility into critical performance metrics including latency, throughput and concurrency
  • Sophisticated auto-scaling rules (including scale-to-zero) that take advantage of advanced performance metrics and are maintained via simple, intuitive UIs
  • Automatic, configurable URL generation to ensure easy and secure model access
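
As an illustration of what a metric-driven auto-scaling rule with scale-to-zero can look like, here is a minimal Python sketch. It uses a common concurrency-based rule and is not Run:ai's actual logic; the function name `desired_replicas`, its parameters, and its default values are assumptions made for this example.

```python
import math


def desired_replicas(in_flight_requests: int,
                     target_per_replica: int,
                     min_replicas: int = 0,
                     max_replicas: int = 10) -> int:
    """Return how many replicas to run for the observed concurrency.

    With min_replicas=0 the deployment scales to zero when idle.
    """
    if target_per_replica <= 0:
        raise ValueError("target_per_replica must be positive")
    # Enough replicas so that each one handles at most the target concurrency.
    wanted = math.ceil(in_flight_requests / target_per_replica)
    # Clamp the result to the configured bounds.
    return max(min_replicas, min(max_replicas, wanted))


idle = desired_replicas(0, target_per_replica=4)      # 0: scaled to zero
busy = desired_replicas(9, target_per_replica=4)      # 3
capped = desired_replicas(100, target_per_replica=4)  # 10: limited by max_replicas
```

Setting `min_replicas=1` in this sketch would keep one replica warm, trading some idle GPU cost for the absence of cold starts.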

Run:ai 2.18 also introduces integration with Hugging Face, a leading, community-driven platform offering a vast library of pre-trained models. This integration automates model downloading, GPU loading, and vLLM setup, allowing users to deploy Hugging Face models with minimal effort. Users can take advantage of the latest advancements in LLMs without the hassle of complex configurations, further enhancing productivity and innovation in AI-driven projects.

To learn more please check out our blog post.

Improving Workload Management with Real-Time Notifications

We are excited to introduce Email Notifications, a feature designed to keep data scientists informed about their workloads' status changes. Managing multiple workloads from submission to completion can be challenging, with critical phases often requiring prompt action to ensure timely completion. The new Email Notifications alert data scientists immediately to changes in workloads, including status updates, suspensions, failures, and timeouts. Data scientists can customize which phases they wish to be notified about, receiving detailed context such as workload type, project, and cluster. This feature is just the beginning of enhanced support for tracking AI environment activities, with future plans to expand notifications to include alerts on additional resources, such as nodes, and add channels like Slack. To learn more please check out our blog post.
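
The per-phase customization described above can be pictured with a small Python sketch. This is a hypothetical illustration, not the Run:ai notification API: the `WorkloadEvent` fields, the phase names, and the `build_notification` function are all assumptions.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class WorkloadEvent:
    workload: str
    phase: str          # illustrative phase names, e.g. "Failed" or "Suspended"
    workload_type: str
    project: str
    cluster: str


def build_notification(event: WorkloadEvent,
                       subscribed_phases: set) -> Optional[str]:
    """Return an email body if the user subscribed to this phase, else None."""
    if event.phase not in subscribed_phases:
        return None
    return (f"Workload '{event.workload}' is now {event.phase} "
            f"(type: {event.workload_type}, project: {event.project}, "
            f"cluster: {event.cluster})")


subscriptions = {"Failed", "Suspended"}
failed = WorkloadEvent("train-llm", "Failed", "training", "nlp", "cluster-1")
running = WorkloadEvent("train-llm", "Running", "training", "nlp", "cluster-1")

failed_mail = build_notification(failed, subscriptions)    # message text
running_mail = build_notification(running, subscriptions)  # None: not subscribed
```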

New CLI for Seamless AI Workload Management

Run:ai 2.17 introduced a new Workload API, and in this release we are pleased to introduce a new Command Line Interface (CLI) that integrates seamlessly with it. The new CLI makes it easier for data scientists and researchers to manage AI workloads directly from their terminal tools while ensuring data consistency across the platform and allowing easy switching between clusters without extra configuration. The CLI is secure, fast, and lightweight, interfacing with the control plane for optimal performance. Installation and upgrades are straightforward, and new features like Quiet mode, Interactive mode, and automatic cluster configuration enhance usability. To learn more, please check out our blog post.

Run:ai version 2.18 brings valuable updates to help you manage AI workloads more effectively. For a deeper dive into these new features, we encourage you to read the release notes, check out the detailed blogs linked above, or reach out to our sales team to schedule a demo. 

Thank You!