Target version: 1.1
Rook was designed for storage to be consumed in the same Kubernetes cluster as the clients consuming it. However, this scenario is not always sufficient.
Another common scenario is when Ceph is running in a cluster "external" to the clients. There are a number of reasons for running this topology. The following terms are used in this design:
| Term | Definition |
| --- | --- |
| Local Cluster | The cluster where clients are running that need to connect to the Ceph storage. Must be a Kubernetes/OpenShift cluster. |
| External Cluster | The cluster where the Ceph Mons, Mgr, OSDs, and MDS are running, which might have been deployed with Rook, Ansible, or any other method. |
For clients in the local cluster to connect to the external cluster, they need network connectivity to the external cluster's daemons (mons and OSDs), along with the list of mon endpoints and a keyring with which to authenticate.
When the Rook operator is started, it is initially not aware of any clusters. When the admin deploys the operator, they will want to configure it differently depending on whether they intend to manage a local Rook cluster or an external cluster.
If external cluster management is required, the operator in the local cluster does not need the elevated privileges normally required to run the Ceph daemons, since the OSDs and other privileged daemons only run in the external cluster. For example, the following OpenShift SecurityContextConstraints (SCC) settings would not need to be enabled (a sketch follows the list):

- `allowPrivilegedContainer`
- `allowHostDirVolumePlugin`
- `allowHostPID`
- `allowHostIPC`
- `allowHostPorts`
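As an illustration only, a minimal sketch of an OpenShift `SecurityContextConstraints` object for a local cluster that only consumes external storage might set these flags as follows. The object name and the remaining SCC fields are placeholders, not prescribed by this design:

```yaml
# Sketch only: name and non-listed fields are illustrative.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: rook-ceph-external-client   # hypothetical name
allowPrivilegedContainer: false
allowHostDirVolumePlugin: false
allowHostPID: false
allowHostIPC: false
allowHostPorts: false
runAsUser:
  type: RunAsAny
seLinuxContext:
  type: RunAsAny
```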
The CSI driver is agnostic of whether Ceph is running locally or externally. The core requirement of the CSI driver is the list of mons and the keyring with which to connect. This metadata is required whether the cluster is local or external. The Rook operator will need to keep this metadata updated throughout the lifetime of the CSI driver.
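For illustration, the connection metadata that the operator maintains for ceph-csi can be thought of as a ConfigMap listing the mon addresses (plus Secrets holding the keyring, mounted by the driver). The resource name and key below follow Rook's current CSI integration but should be treated as an implementation detail that may change:

```yaml
# Illustrative sketch of the CSI connection metadata the operator keeps up to date.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-csi-config        # name is an implementation detail of the operator
  namespace: rook-ceph
data:
  # JSON list of clusters and their mon addresses consumed by the ceph-csi driver
  csi-cluster-config-json: |
    [{"clusterID":"rook-ceph","monitors":["10.0.0.1:6789","10.0.0.2:6789","10.0.0.3:6789"]}]
```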
The CSI driver will be installed and configured by the Rook operator, just as in any Rook cluster. The main advantage of this approach over a standalone ceph-csi deployment for external clusters is that the operator manages the driver's lifecycle and keeps its connection metadata up to date.
Question: How would Rook behave in the case where the admin deployed ceph-csi standalone as well as Rook? It seems reasonable not to support this, although it's not clear if there would actually be conflicts between the two.
The flex driver would also be agnostic of the cluster for the same reasons, but we won’t need to worry about the flex driver going forward.
For Rook to provide the storage to clients in the local cluster, a CephCluster CRD will be created so that the operator can provide local management of the external cluster. There are several differences in how the operator handles an external cluster compared to a local one.
Importing the connection information requires an extra manual configuration step by the cluster admin beyond what is needed for a typical Rook cluster; the remaining differences are handled automatically by the Rook operator. The extra step involves exporting metadata from the external cluster and importing it into the local cluster:
```console
kubectl create -f <config.yaml>
```
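The exact contents of `config.yaml` are produced by the export step. As a purely hypothetical sketch, it would carry at least the mon endpoints and the keyring, for example as a ConfigMap and a Secret (all names, keys, and values below are placeholders):

```yaml
# Hypothetical sketch of the exported connection metadata; the real resource
# names and keys are defined by the export tooling, not by this example.
apiVersion: v1
kind: ConfigMap
metadata:
  name: rook-ceph-mon-endpoints
  namespace: rook-ceph
data:
  data: "a=10.0.0.1:6789,b=10.0.0.2:6789,c=10.0.0.3:6789"
---
apiVersion: v1
kind: Secret
metadata:
  name: rook-ceph-mon
  namespace: rook-ceph
stringData:
  fsid: "<fsid-of-external-cluster>"
  admin-secret: "<keyring-used-to-authenticate>"
```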
The CephCluster CRD will have a new property "external" to indicate whether the cluster is external. If true, the local operator will implement the described behavior. Other CRDs such as CephBlockPool, CephFilesystem, and CephObjectStore do not need this property since they all belong to the cluster and will effectively inherit the external property.
```yaml
kind: CephCluster
spec:
  external: true
```
The mgr modules, including the dashboard, would be running in the external cluster. Any configuration that happens through the dashboard would depend on the orchestration modules in that external cluster.
With the rook-ceph cluster created, the CSI driver integration covers block (RWO) storage and no additional management is needed.
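For example, clients could then consume block storage through a StorageClass backed by the ceph-csi RBD driver, roughly as sketched below. This is abridged: the CSI secret parameters and image options are omitted, and the names and pool are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-ceph-block                     # illustrative name
provisioner: rook-ceph.rbd.csi.ceph.com     # RBD driver registered by the operator
parameters:
  clusterID: rook-ceph                      # namespace of the local CephCluster CR
  pool: replicapool                         # an RBD pool in the external cluster (see the pool CRD below)
reclaimPolicy: Delete
```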
When a pool CRD is created in the local cluster, the operator will create the pool in the external cluster. The pool settings will only be applied the first time the pool is created and should be skipped thereafter. The ownership and lifetime of the pool will belong to the external cluster. The local cluster should not apply pool settings to overwrite the settings defined in the external cluster.
If the pool CRD is deleted from the local cluster, the pool will not be deleted in the external cluster.
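For instance, a pool could be declared locally with a CephBlockPool CRD such as the following (name and settings are illustrative); per the rules above, the settings are applied only when the pool is first created and the external cluster remains the owner:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: replicapool                # illustrative name
  namespace: rook-ceph
spec:
  failureDomain: host
  replicated:
    size: 3                        # applied only if the pool does not already exist externally
```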
A shared filesystem must only be created in the external cluster. Clients in the local cluster can connect to the MDS daemons in the external cluster.
The same instance of CephFS cannot have MDS daemons in different clusters. The MDS daemons must exist in the same cluster for a given filesystem. When the CephFilesystem CRD is created in the local cluster, Rook will ignore the request and print an error to the log.
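As an illustration of how local clients might consume the externally created filesystem through the CSI driver, a StorageClass could reference it roughly as follows. This is abridged: the CSI secret parameters are omitted and the filesystem and pool names are placeholders:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: rook-cephfs                         # illustrative name
provisioner: rook-ceph.cephfs.csi.ceph.com  # CephFS driver registered by the operator
parameters:
  clusterID: rook-ceph                      # namespace of the local CephCluster CR
  fsName: myfs                              # filesystem created in the external cluster
  pool: myfs-data0                          # data pool of that filesystem
reclaimPolicy: Delete
```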
An object store can be created that will start RGW daemons in the local cluster. When the CephObjectStore CRD is created in the local cluster, the local Rook operator starts the RGW daemons locally, while the object store's pools live in the external cluster as described below.
Question: Should we generate a unique name so an object store of the same name cannot be shared with the external cluster? Or should we allow sharing of the object store between the two clusters if the CRD has the same name? If the admin wants to create independent object stores, they could simply create them with unique CRD names.
Assuming the object store can be shared with the external cluster, the external cluster owns the object store, as with pools. If the local cluster attempts to change pool settings such as replication, the changes will be ignored.
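A local object store definition might then look roughly like the following (values are illustrative): the gateway instances run in the local cluster, while the pool settings are only honored if the object store does not already exist in the external cluster:

```yaml
apiVersion: ceph.rook.io/v1
kind: CephObjectStore
metadata:
  name: my-store                   # illustrative; see the naming question above
  namespace: rook-ceph
spec:
  metadataPool:
    replicated:
      size: 3                      # ignored if the object store already exists externally
  dataPool:
    replicated:
      size: 3
  gateway:
    port: 80
    instances: 1                   # RGW daemons started in the local cluster
```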
Rook already creates and injects service monitoring configuration, consuming what the ceph-mgr prometheus exporter module generates. This enables a Kubernetes cluster to gather metrics from the external cluster and feed them into Prometheus.
The idea is to allow Rook-Ceph to connect to an external ceph-mgr prometheus module exporter.
To support this:

* Enhance the external cluster script with a `--prometheus-exporter-endpoint` flag.
* Add a new entry to the monitoring spec of the CephCluster CR:

```go
// MonitoringSpec is assumed here to be the existing monitoring section of the
// CephCluster spec; only the new field below is added by this design.
type MonitoringSpec struct {
	// ... existing fields ...

	// ExternalMgrEndpoints points to existing Ceph prometheus exporter endpoints
	ExternalMgrEndpoints []v1.EndpointAddress `json:"externalMgrEndpoints,omitempty"`
}
```
So the CephCluster CR will look like:
```yaml
monitoring:
  # requires Prometheus to be pre-installed
  enabled: true
  externalMgrEndpoints:
    - ip: "192.168.0.2"
    - ip: "192.168.0.3"
```
* Configure monitoring as part of the `configureExternalCephCluster()` method; the two resources it creates are sketched below:
  * Create a new metrics Service.
  * Create an Endpoints resource based on the IP addresses either discovered or provided by the user in the script.
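A hedged sketch of the Service and Endpoints the operator could create, assuming the default ceph-mgr prometheus module port of 9283 and the endpoint IPs from the CR above (resource names and labels are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: rook-ceph-mgr-external     # illustrative name; no selector, backed by the Endpoints below
  namespace: rook-ceph
  labels:
    app: rook-ceph-mgr             # label a ServiceMonitor could select on (assumed)
spec:
  ports:
    - name: http-metrics
      port: 9283                   # default ceph-mgr prometheus exporter port
      protocol: TCP
---
apiVersion: v1
kind: Endpoints
metadata:
  name: rook-ceph-mgr-external     # must match the Service name
  namespace: rook-ceph
subsets:
  - addresses:
      - ip: 192.168.0.2
      - ip: 192.168.0.3
    ports:
      - name: http-metrics
        port: 9283
        protocol: TCP
```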