
Managing authorization in Run:ai

by
Daniel Keinan
Alon Lavian
December 6, 2023

Using RBAC to manage permissions in Run:ai

Introduction

Managing permissions in complex organizational structures with different compute resources is challenging and time-consuming, especially across k8s clusters. It is easy to make mistakes and difficult to keep track of who has access to what. Our solution automates the management of permissions so that you can focus on other things. It also provides a centralized view of all permissions, so you can easily see who has access to what. This helps ensure that only authorized users have access to sensitive data and helps prevent security breaches.

RBAC in k8s

Kubernetes Role-Based Access Control (RBAC) is a security mechanism that allows administrators to define and manage fine-grained access permissions for users or groups within a Kubernetes cluster. Users are assigned specific roles, which determine their access rights to resources and operations within the cluster. You can learn more about RBAC in k8s in the Kubernetes documentation.

RBAC in Run:ai

Kubernetes RBAC is limited to a single cluster, but what happens in complex environments with multiple clusters? Run:ai RBAC expands the scope of k8s RBAC, making it easy for administrators to manage access policies across multiple clusters. Additionally, Run:ai RBAC allows you to manage hierarchy levels within a cluster using the department feature, giving administrators more flexibility in controlling access.

How does it work?

Expanding k8s RBAC

Run:ai RBAC expands the concepts of RBAC in k8s and models them in a k8s/cloud-native way, using a combination of Kubernetes resources and annotations to define permissions. This allows for more fine-grained control over access to resources and makes it easier to manage permissions in a distributed environment. The solution is also extensible, so it can be adapted to the specific needs of different organizations.

Bits and bytes

The three main building blocks of the RBAC solution in Run:ai are:

  • Subjects: These are the users, programmatic users (applications), or groups of users who are granted access to resources.
  • Roles: These are the sets of permissions that are assigned to subjects.
  • Scopes: These are the resources that are accessible to subjects with certain roles.

A combination of these three elements forms an access rule, which is used to manage permission policies.

A subject is an entity that receives a rule. Subjects can be:

  • Users
  • Applications
  • Groups (SSO only)

A role is a combination of entities and the actions permitted on them.

Actions:

  • C = Create
  • R = Read
  • U = Update
  • D = Delete

Entities:

  • Entities are granular parts of the platform that can be controlled separately.


The role defines which set of actions is permitted on the platform entities. Run:ai supports several predefined roles.

For example, a role might allow a user to create and read documents, but not update or delete them.
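
To make the idea concrete, here is a minimal sketch of a role as a mapping from entities to permitted CRUD actions. The `Role` class, the `document-author` role name, and the `documents` entity are hypothetical names chosen for illustration; they are not Run:ai's predefined roles or its internal data model.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, FrozenSet


class Action(Enum):
    CREATE = "C"
    READ = "R"
    UPDATE = "U"
    DELETE = "D"


@dataclass
class Role:
    name: str
    # Maps an entity name to the set of actions permitted on it.
    permissions: Dict[str, FrozenSet[Action]] = field(default_factory=dict)

    def allows(self, entity: str, action: Action) -> bool:
        # Anything not listed explicitly is denied.
        return action in self.permissions.get(entity, frozenset())


# Hypothetical role matching the example above: create and read, but no update or delete.
doc_author = Role(
    name="document-author",
    permissions={"documents": frozenset({Action.CREATE, Action.READ})},
)

print(doc_author.allows("documents", Action.READ))    # True
print(doc_author.allows("documents", Action.DELETE))  # False
```

Treating unlisted entities and actions as denied keeps the sketch in line with a least-privilege default.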

A scope is a part of an organization that can be accessed based on assigned roles. Scopes include:

  • Projects
  • Departments
  • Clusters
  • Tenant (all clusters)
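
One way to picture scopes is as a tree. The sketch below assumes that a project sits inside a department, a department inside a cluster, and a cluster inside the tenant; that nesting, the `Scope` class, and all the names used are assumptions made for illustration rather than a description of how Run:ai stores scopes.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Scope:
    kind: str                       # "tenant", "cluster", "department", or "project"
    name: str
    parent: Optional["Scope"] = None

    def is_within(self, other: "Scope") -> bool:
        """Return True if this scope is `other` or is nested anywhere beneath it."""
        node = self
        while node is not None:
            if node == other:
                return True
            node = node.parent
        return False


# Hypothetical organization layout.
tenant = Scope("tenant", "acme")
cluster = Scope("cluster", "cluster-1", parent=tenant)
dept_a = Scope("department", "Department A", parent=cluster)
dept_b = Scope("department", "Department B", parent=cluster)
my_project = Scope("project", "my-project", parent=dept_a)

print(my_project.is_within(dept_a))  # True
print(my_project.is_within(dept_b))  # False
print(my_project.is_within(tenant))  # True
```

Under this assumption, a role granted at a wider scope such as a department would naturally cover every project beneath it.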

An access rule is the assignment of a role to a subject in a scope. Access rules are expressed as follows:

<subject> is a <role> in a <scope>.


For example:
User [email protected] is a department admin in Department A.
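
The same sentence can be written down as a small record. This is only a sketch of the subject/role/scope triple; the `AccessRule` and `SubjectKind` names and the subject `alice` are hypothetical and do not reflect the Run:ai API schema.

```python
from dataclasses import dataclass
from enum import Enum


class SubjectKind(Enum):
    USER = "User"
    APPLICATION = "Application"
    GROUP = "Group"   # groups are available with SSO only


@dataclass(frozen=True)
class AccessRule:
    subject_kind: SubjectKind
    subject: str   # who receives the rule
    role: str      # which set of permissions is granted
    scope: str     # where the role applies

    def describe(self) -> str:
        # Mirrors the "<subject> is a <role> in a <scope>" form.
        return f"{self.subject_kind.value} {self.subject} is a {self.role} in {self.scope}."


rule = AccessRule(SubjectKind.USER, "alice", "department admin", "Department A")
print(rule.describe())  # User alice is a department admin in Department A.
```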


Using RBAC in Run:ai

Using RBAC in Run:ai is straightforward. Admins can apply their access policy via the UI or API by combining the above building blocks into access rules. Later on, when subjects attempt to perform actions on the platform, the actions will be allowed or denied based on the subject’s relevant rules.
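
The allow-or-deny decision can be sketched as a lookup over the subject's rules. This is a simplified model under stated assumptions, not Run:ai's actual enforcement logic: the `ROLES` table, the role names in it, and the representation of a scope as a path from the cluster down to the project are all hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Tuple

# Hypothetical role table: role name -> entity -> permitted CRUD letters.
ROLES = {
    "researcher": {"workloads": "CRUD"},
    "viewer": {"workloads": "R"},
}

ScopePath = Tuple[str, ...]  # e.g. ("cluster-1", "dept-a", "my-project"); () stands for the tenant


@dataclass(frozen=True)
class AccessRule:
    subject: str
    role: str
    scope: ScopePath


def is_allowed(rules: Iterable[AccessRule], subject: str, action: str,
               entity: str, target: ScopePath) -> bool:
    """Allow only if some rule for `subject` grants `action` on `entity`
    in a scope that contains `target`; deny otherwise."""
    for rule in rules:
        if rule.subject != subject:
            continue
        granted = ROLES.get(rule.role, {}).get(entity, "")
        # A rule's scope contains the target if it is a prefix of the target path.
        in_scope = target[: len(rule.scope)] == rule.scope
        if action in granted and in_scope:
            return True
    return False


rules = [AccessRule("myapp", "researcher", ("cluster-1", "dept-a", "my-project"))]

print(is_allowed(rules, "myapp", "C", "workloads", ("cluster-1", "dept-a", "my-project")))      # True
print(is_allowed(rules, "myapp", "C", "workloads", ("cluster-1", "dept-a", "not-my-project")))  # False
```

Because the function only ever returns True when a matching rule is found, a subject with no rules is denied everything.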


Managing Access Rules

You can manage access rules in the dedicated “Access Rules and Roles” view, where you can view, create, and delete access rules via the UI.


Alternatively, you can manage the rules of a specific subject or scope. For example, in the application view, clicking "Access rules" displays the rules related to the application.

Example - Creating an access rule

Let’s submit a workload using an application with a researcher role. We’ll:

  • Create an application
  • Create an access rule for it
  • Try to submit a job using this application


1. First, we’ll create a new application in the UI


2. We’ll create an access rule for this application. Let’s select the application and press the “Access rules” button in the top row


3. We’ll assign an L1 researcher role to the app in the scope of a project called “my-project”


4. Now let’s try to submit a workload using this application to “my-project”

curl --location 'https://alonest.runailabs.com/researcher/api/v1/workload/proxy/namespaces/runai-my-project/TrainingWorkload' \
--header 'authorization: Bearer xyz' \
--header 'content-type: application/json' \
--data '{
  "apiVersion": "run.ai/v2alpha1",
  "kind": "TrainingWorkload",
  "metadata": {
    "name": "my-workload",
    "namespace": "runai-my-project"
  },
  "spec": {
    "gpu": {
      "value": "1"
    },
    "image": {
      "value": "gcr.io/run-ai-demo/quickstart-demo"
    },
    "name": {
      "value": "my-workload"
    }
  }
}'


5. The workload is submitted successfully and is now active


6. Now let’s try to submit the workload to a project called “not-my-project”, which is outside the application’s permitted scope

curl --location 'https://alonest.runailabs.com/researcher/api/v1/workload/proxy/namespaces/runai-not-my-project/TrainingWorkload' \
--header 'authorization: Bearer xyz' \
--header 'content-type: application/json' \
--data '{
  "apiVersion": "run.ai/v2alpha1",
  "kind": "TrainingWorkload",
  "metadata": {
    "name": "my-workload",
    "namespace": "runai-not-my-project"
  },
  "spec": {
    "gpu": {
      "value": "1"
    },
    "image": {
      "value": "gcr.io/run-ai-demo/quickstart-demo"
    },
    "name": {
      "value": "my-workload"
    }
  }
}'

7. The workload submission fails because the application is not authorized to act in this scope

{
  "message": "post request failed",
  "details": "user is not authorized to perform researcher actions: [myapp is unauthorized myapp is unauthorized myapp is unauthorized]",
  "status": 403
}
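
The two requests above differ only in the project namespace, so they can be built from one helper. The sketch below uses Python's standard library to construct the same request as the curl commands without sending it; the host, path, payload, and the placeholder token `xyz` are copied from the example, while the function names are hypothetical.

```python
import json
import urllib.request

BASE_URL = "https://alonest.runailabs.com"  # host taken from the curl example above


def build_workload_request(project: str, token: str,
                           name: str = "my-workload") -> urllib.request.Request:
    """Build (but do not send) the TrainingWorkload submission for a project."""
    namespace = f"runai-{project}"
    body = {
        "apiVersion": "run.ai/v2alpha1",
        "kind": "TrainingWorkload",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "gpu": {"value": "1"},
            "image": {"value": "gcr.io/run-ai-demo/quickstart-demo"},
            "name": {"value": name},
        },
    }
    url = f"{BASE_URL}/researcher/api/v1/workload/proxy/namespaces/{namespace}/TrainingWorkload"
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"authorization": f"Bearer {token}", "content-type": "application/json"},
        method="POST",
    )


def is_authorization_failure(response_body: str) -> bool:
    """True if a JSON body shaped like the one in step 7 reports status 403."""
    return json.loads(response_body).get("status") == 403


allowed = build_workload_request("my-project", token="xyz")
denied = build_workload_request("not-my-project", token="xyz")
print(allowed.full_url)
print(denied.full_url)
```

Passing either request to `urllib.request.urlopen` would actually send it; with the access rule from step 3 in place, we would expect only the first one to succeed, and `urlopen` raises `urllib.error.HTTPError` for a 403 response.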


By implementing a fine-grained access rule, we were able to control the permissions of our application and allow only specific actions within the desired scope. By combining multiple access rules, admins can grant each subject only the privileges it needs.