Security Model

The Rook operator currently uses a highly privileged service account with permissions to create namespaces, roles, role bindings, etc. This approach would not pass a security audit, and this design explores how to improve it. Furthermore, because we use multiple service accounts and namespaces, setting policies and quotas is harder than it needs to be.

Goals

  • Reduce the number of service accounts and privileges used by Rook
  • Reduce the number of namespaces that are used by Rook
  • Only use service accounts and namespaces created by the cluster admin -- this enables them to set security policies and quotas that Rook adheres to
  • Continue to support a least-privilege model

What we do today

Today the cluster admin creates the rook system namespace, rook-operator service account and RBAC rules as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: rook-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-operator
  namespace: rook-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-operator
rules:
- apiGroups: [""]
  resources: ["namespaces", "serviceaccounts", "secrets", "pods", "services", "nodes", "nodes/proxy", "configmaps", "events", "persistentvolumes", "persistentvolumeclaims"]
  verbs: [ "get", "list", "watch", "patch", "create", "update", "delete" ]
- apiGroups: ["extensions"]
  resources: ["thirdpartyresources", "deployments", "daemonsets", "replicasets"]
  verbs: [ "get", "list", "watch", "create", "delete" ]
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: [ "get", "list", "watch", "create", "delete" ]
- apiGroups: ["rbac.authorization.k8s.io"]
  resources: ["clusterroles", "clusterrolebindings", "roles", "rolebindings"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: [ "get", "list", "watch", "delete" ]
- apiGroups: ["rook.io"]
  resources: ["*"]
  verbs: [ "*" ]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-operator
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-operator
subjects:
- kind: ServiceAccount
  name: rook-operator
  namespace: rook-system

rook-operator is a highly privileged service account with cluster-wide scope. It likely has more privileges than it currently needs; for example, the operator does not create namespaces today. Note that the names rook-system and rook-operator are not important and can be set to anything.
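As a rough illustration of the over-provisioning noted above, the core-group rule could be narrowed by dropping namespaces. This is only a sketch: it assumes the operator does not otherwise read or watch namespaces, which has not been audited here.

```yaml
# Sketch only: the core-group rule from the rook-operator ClusterRole above,
# with "namespaces" removed. Assumes the operator never needs to get, list,
# or watch namespaces; verify against the operator code before adopting.
- apiGroups: [""]
  resources: ["serviceaccounts", "secrets", "pods", "services", "nodes", "nodes/proxy", "configmaps", "events", "persistentvolumes", "persistentvolumeclaims"]
  verbs: [ "get", "list", "watch", "patch", "create", "update", "delete" ]
```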

Once the Rook operator is up and running, it automatically creates the service account for the Rook agent and the following RBAC rules:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-agent
  namespace: rook-system
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-ceph-agent
rules:
- apiGroups: [""]
  resources: ["pods", "secrets", "configmaps", "persistentvolumes", "nodes", "nodes/proxy"]
  verbs: [ "get", "list" ]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: [ "get" ]
- apiGroups: ["rook.io"]
  resources: ["volumeattachment"]
  verbs: [ "get", "list", "watch", "create", "update" ]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-ceph-agent
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-ceph-agent
subjects:
- kind: ServiceAccount
  name: rook-ceph-agent
  namespace: rook-system

When the cluster admin creates a new Rook cluster, they do so by adding a namespace and the Rook cluster spec:

apiVersion: v1
kind: Namespace
metadata:
  name: mycluster
---
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: myrookcluster
  namespace: mycluster
	...

At this point the Rook operator notices that a new Rook cluster custom resource has appeared and proceeds to create service accounts for rook-api and rook-ceph-osd. It also uses the default service account in the mycluster namespace for some pods.

The rook-api service account and RBAC rules are as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-api
  namespace: mycluster
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-api
  namespace: mycluster
rules:
- apiGroups: [""]
  resources: ["namespaces", "secrets", "pods", "services", "nodes", "configmaps", "events"]
  verbs: [ "get", "list", "watch", "create", "update" ]
- apiGroups: ["extensions"]
  resources: ["thirdpartyresources", "deployments", "daemonsets", "replicasets"]
  verbs: [ "get", "list", "create", "update" ]
- apiGroups: ["storage.k8s.io"]
  resources: ["storageclasses"]
  verbs: [ "get", "list" ]
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: [ "get", "list", "create" ]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-api
  namespace: mycluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-api
subjects:
- kind: ServiceAccount
  name: rook-api
  namespace: mycluster

The rook-ceph-osd service account and RBAC rules are as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-ceph-osd
  namespace: mycluster
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-ceph-osd
  namespace: mycluster
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-ceph-osd
  namespace: mycluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-ceph-osd
subjects:
- kind: ServiceAccount
  name: rook-ceph-osd
  namespace: mycluster

Proposed Changes

Just as today, the cluster admin is responsible for creating the rook-system namespace. I propose we have a single service account in this namespace and call it rook-system by default. The names used are inconsequential and can be changed by the cluster admin.

apiVersion: v1
kind: Namespace
metadata:
  name: rook-system
---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-system
  namespace: rook-system

The rook-system service account is responsible for launching all pods, services, daemonsets, etc. for Rook and should have enough privilege to do so and nothing more. I've not audited all the RBAC rules, but a good tool for doing so is here. For example:

kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-system
rules:
- apiGroups: [""]
  resources: ["pods", "services", "configmaps"]
  verbs: [ "get", "list", "watch", "patch", "create", "update", "delete" ]
- apiGroups: ["extensions"]
  resources: ["deployments", "daemonsets", "replicasets"]
  verbs: [ "get", "list", "watch", "patch", "create", "update", "delete" ]
- apiGroups: ["apiextensions.k8s.io"]
  resources: ["customresourcedefinitions"]
  verbs: [ "get", "list", "watch", "patch", "create", "update", "delete" ]
- apiGroups: ["rook.io"]
  resources: ["*"]
  verbs: [ "*" ]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-system
  namespace: rook-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-system
subjects:
- kind: ServiceAccount
  name: rook-system
  namespace: rook-system

Notably absent here are privileges to set other RBAC rules and to create or read cluster-wide secrets and other resources. Because the admin created the rook-system namespace and service account, they are free to set policies on them using PSP (PodSecurityPolicy) or namespace quotas.

Also note that while we use a ClusterRole for rook-system we only use a RoleBinding to grant it access to the rook-system namespace. It does not have cluster-wide privileges.
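To illustrate the quota point above, an admin who owns the rook-system namespace could attach a ResourceQuota to it. The following is a minimal sketch; the quota name and every limit value are placeholders chosen for illustration, not recommendations.

```yaml
# Sketch only: a namespace quota an admin might apply to rook-system.
# The name "rook-system-quota" and all numbers are hypothetical.
apiVersion: v1
kind: ResourceQuota
metadata:
  name: rook-system-quota
  namespace: rook-system
spec:
  hard:
    pods: "20"              # maximum number of pods in the namespace
    requests.cpu: "4"       # total CPU requests across all pods
    requests.memory: 8Gi    # total memory requests across all pods
    limits.cpu: "8"         # total CPU limits across all pods
    limits.memory: 16Gi     # total memory limits across all pods
```

Note that once a quota constrains CPU or memory, pods in that namespace need to declare matching requests and limits (or receive defaults from a LimitRange) to be admitted, so the pods Rook launches would have to specify them.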

When creating a Rook cluster, the cluster admin will continue to define the namespace and cluster custom resource as follows:

apiVersion: v1
kind: Namespace
metadata:
  name: mycluster
---
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: myrookcluster
  namespace: mycluster
	...

In addition, we will require that the cluster admin define a service account, a role, and role bindings as follows:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-cluster
  namespace: mycluster
---
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-cluster
  namespace: mycluster
rules:
- apiGroups: [""]
  resources: ["configmaps"]
  verbs: [ "get", "list", "watch", "create", "update", "delete" ]
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-cluster
  namespace: mycluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-system
subjects:
- kind: ServiceAccount
  name: rook-system
  namespace: rook-system
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-system
  namespace: mycluster
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: rook-cluster
subjects:
- kind: ServiceAccount
  name: rook-cluster
  namespace: mycluster

This grants the rook-system service account access to the new namespace and also sets up a least-privileged service account, rook-cluster, to be used by pods in this namespace that need Kubernetes API access.

With this approach, rook-system will only have access to namespaces nominated by the cluster admin. Also, Rook will no longer create any service accounts or namespaces, enabling admins to set stable policies and quotas.

Also, all Rook pods except the Rook operator pod should run using the rook-cluster service account in the namespace they're in.
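As a sketch of what running under the rook-cluster service account might look like, the pod spec the operator generates would set serviceAccountName. The pod name, container name, and image below are hypothetical placeholders; only the service account and namespace come from the manifests above.

```yaml
# Sketch only: a pod in the mycluster namespace that runs as rook-cluster.
# Pod name, container name, and image are hypothetical.
apiVersion: v1
kind: Pod
metadata:
  name: rook-example-pod
  namespace: mycluster
spec:
  serviceAccountName: rook-cluster   # admin-created service account
  containers:
  - name: example
    image: example.com/rook/placeholder:tag
```

Because the service account must live in the same namespace as the pod, this relies on the admin having created rook-cluster in every namespace that hosts a Rook cluster.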

Supporting common namespaces

Finally, we should support running multiple Rook clusters in the same namespace. While namespaces are a great organizational unit for pods etc., they are also a unit of policy and quotas. While we could force the cluster admin to manage multiple namespaces, we would be better off giving the cluster admin the option to decide how they use namespaces.

For example, it should be possible to run rook-operator, rook-agent, and multiple independent rook clusters in a single namespace. This is going to require setting a prefix for pod names and other resources that could collide.
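One possible shape for such a prefix, shown purely as an assumption, is to prepend the cluster name to each generated resource name. The naming pattern, the ConfigMap names, and the label key below are all hypothetical; the design does not fix any of them.

```yaml
# Sketch only: two clusters ("red" and "blue") sharing the myrook namespace.
# Resource names use a hypothetical "<cluster-name>-<resource>" pattern so
# that they do not collide; the label key "rook-cluster" is also hypothetical.
apiVersion: v1
kind: ConfigMap
metadata:
  name: red-rook-config
  namespace: myrook
  labels:
    rook-cluster: red
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: blue-rook-config
  namespace: myrook
  labels:
    rook-cluster: blue
```

A label per cluster, in addition to the prefix, would let selectors distinguish the resources of one cluster from another within the shared namespace.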

The following should be possible:

apiVersion: v1
kind: Namespace
metadata:
  name: myrook
---
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: red
  namespace: myrook
	...
---
apiVersion: rook.io/v1alpha1
kind: Cluster
metadata:
  name: blue
  namespace: myrook
	...
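
Following the pattern from the proposed changes, the shared namespace would presumably also need the admin-defined service account and the role binding for rook-system. The sketch below assumes that both clusters share one rook-cluster service account and that the operator's service account still lives in the rook-system namespace; neither is settled by this design.

```yaml
# Sketch only: RBAC for the shared myrook namespace, mirroring the
# per-cluster manifests shown earlier. Assumes one rook-cluster service
# account is shared by the "red" and "blue" clusters.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rook-cluster
  namespace: myrook
---
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: rook-system
  namespace: myrook
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rook-system
subjects:
- kind: ServiceAccount
  name: rook-system
  namespace: rook-system   # assumes the operator runs in rook-system
```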