Use Ceph exporter as the new source of metrics based on performance counters coming from every Ceph daemon.
The Ceph exporter was created to avoid performance problems when obtaining metrics at scale in clusters with a high number of daemons (mainly OSDs). The Prometheus manager module was not performant enough to collect the performance counters from all the daemons across the whole Ceph cluster. The solution for large clusters is a new daemon that collects the performance counters from all the daemons on each host and exposes them as standard Prometheus metrics at a /metrics endpoint, so that Prometheus can scrape each of those endpoints.
With the Ceph exporter, the source of metrics changes in Ceph clusters. There will be two main sources of metrics:

- The Prometheus manager module, which keeps providing cluster-level metrics.
- The Ceph exporter, which provides the metrics derived from the performance counters of every Ceph daemon.
The Ceph exporter will be deployed by Rook when the `monitoring.metricsDisabled: false`
setting is applied in the CephCluster CR and the Ceph version is later than 17.2.7.
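For illustration, here is a minimal sketch of the relevant CephCluster fields; the cluster name, namespace, and image tag are assumptions chosen for the example, not required values.

```yaml
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph        # example name
  namespace: rook-ceph   # example namespace
spec:
  cephVersion:
    # Any Ceph release later than 17.2.7 includes ceph-exporter; this tag is only an example.
    image: quay.io/ceph/ceph:v18.2.4
  dataDirHostPath: /var/lib/rook
  monitoring:
    # Lets Rook create the Prometheus ServiceMonitor resources (requires the Prometheus Operator).
    enabled: true
    # metricsDisabled: false (the default) keeps the ceph-exporter deployment enabled.
    metricsDisabled: false
```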
Ceph-exporter is a daemon that collects the perf counters of the Ceph daemons from their admin socket files and exposes them at a /metrics endpoint. So we will need the following things:

- The admin socket files of the Ceph daemons on each node, made available to the exporter under the host path `$dataDirHostPath/exporter` (sketched below).
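As a rough sketch of how that host path could be consumed, the snippet below mounts the admin-socket directory into the exporter container. The image tag, the assumption that `dataDirHostPath` is the default `/var/lib/rook`, and the ceph-exporter flags shown are assumptions for illustration, not the exact pod spec Rook generates.

```yaml
# Illustrative pod spec only, not the manifest Rook actually generates.
apiVersion: v1
kind: Pod
metadata:
  name: ceph-exporter-example
  namespace: rook-ceph
spec:
  containers:
    - name: ceph-exporter
      image: quay.io/ceph/ceph:v18.2.4   # ceph-exporter ships in the Ceph image
      command: ["ceph-exporter"]
      args:
        - "--sock-dir"                   # directory holding the daemons' .asok files
        - "/var/lib/rook/exporter"
        - "--port"                       # port exposed by the per-node Service
        - "9926"
      ports:
        - containerPort: 9926
          name: http-metrics
      volumeMounts:
        - name: exporter-sock-dir
          mountPath: /var/lib/rook/exporter
  volumes:
    - name: exporter-sock-dir
      hostPath:
        # Assuming the default dataDirHostPath of /var/lib/rook.
        path: /var/lib/rook/exporter
        type: Directory
```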
An exporter service will be created on each node where a ceph-exporter pod is running.
`ceph-exporter-service.yaml`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: ceph-exporter-<node>
  namespace: <namespace>
  labels:
    app: rook-ceph-exporter
    node: <node>
spec:
  internalTrafficPolicy: Cluster
  ports:
    - name: http-metrics
      port: 9926
      protocol: TCP
      targetPort: 9926
  selector:
    app: rook-ceph-exporter
    node: <node>
  sessionAffinity: None
  type: ClusterIP
```
A single ServiceMonitor will be created that matches all the exporter services:
```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: rook-ceph-exporter
  namespace: <namespace>
spec:
  namespaceSelector:
    matchNames:
      - <namespace>
  selector:
    matchLabels:
      app: rook-ceph-exporter
  endpoints:
    - port: http-metrics
      path: /metrics
      interval: 5s
```