To help troubleshoot your Rook clusters, this page describes what information will help solve the issues you might be seeing. If the problem is not resolved after trying the suggestions on this page, the Rook team is happy to help you troubleshoot in their Slack channel. Once you have registered for the Rook Slack, proceed to the General channel to ask for assistance.
For common issues specific to Ceph, see the Ceph Common Issues page.
Kubernetes status and logs are the main resources needed to investigate issues in any Rook cluster.
Kubernetes status is the first line of investigation when something goes wrong with the cluster. Here are a few artifacts that are helpful to gather:
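Rook pod status: kubectl get pod -n <cluster-namespace> -o wide
e.g., kubectl get pod -n rook-ceph -o wide
Logs for the operator: kubectl logs -n <cluster-namespace> -l app=<storage-backend-operator>
e.g., kubectl logs -n rook-ceph -l app=rook-ceph-operator
Logs for a specific pod: kubectl logs -n <cluster-namespace> <pod-name>, or for pods matching a label, such as a mon: kubectl logs -n <cluster-namespace> -l <label-matcher>
e.g., kubectl logs -n rook-ceph -l mon=a
Kubelet logs, run on the node itself (if the node uses systemd): journalctl -u kubelet
Logs for all containers in a pod: kubectl -n <cluster-namespace> logs <pod-name> --all-containers
Logs for a single container in a pod: kubectl -n <cluster-namespace> logs <pod-name> -c <container-name>
Logs for the previous instance of a container, such as one that has restarted: kubectl -n <cluster-namespace> logs --previous <pod-name>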
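When preparing a report for the Slack channel, it can help to capture several of the commands above in one pass. The following is a minimal sketch, not an official Rook tool: the function name collect_rook_diagnostics, the output file names, and the defaults (namespace rook-ceph, label app=rook-ceph-operator, directory rook-diagnostics) are assumptions for illustration and should be adjusted for your cluster.

```shell
#!/usr/bin/env bash
# Sketch: gather basic Rook troubleshooting artifacts into one directory.
# collect_rook_diagnostics is a hypothetical helper, not part of Rook or kubectl.

collect_rook_diagnostics() {
    local namespace="${1:-rook-ceph}"                    # assumed default namespace
    local operator_label="${2:-app=rook-ceph-operator}"  # assumed default operator label
    local outdir="${3:-rook-diagnostics}"                # hypothetical output directory

    mkdir -p "$outdir" || return 1

    # Pod status, including the node each pod is scheduled on.
    kubectl get pod -n "$namespace" -o wide > "$outdir/pods.txt" 2>&1

    # Operator logs selected by label. With a label selector, kubectl logs
    # shows only the last 10 lines by default, so request the full log.
    kubectl logs -n "$namespace" -l "$operator_label" --tail=-1 > "$outdir/operator.log" 2>&1

    # Other resources in the namespace.
    kubectl -n "$namespace" get all > "$outdir/all.txt" 2>&1

    # Print the directory so callers can locate the collected files.
    echo "$outdir"
}
```

For example, `collect_rook_diagnostics rook-ceph` writes pods.txt, operator.log, and all.txt under ./rook-diagnostics. Errors from kubectl are written into the same files, so a failed command is still visible in the collected output.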
Some pods have specialized init containers, so you may need to look at logs for different containers within the pod.
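Logs for a specific container, including an init container: kubectl -n <cluster-namespace> logs <pod-name> -c <container-name>
Other Rook artifacts in the namespace: kubectl -n <cluster-namespace> get all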
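Since the init container names are not always known in advance, one way to inspect them is to read the names from the pod spec and request logs for each container in turn. This is a sketch under assumptions: pod_container_logs is a hypothetical helper name, and it relies on kubectl's jsonpath output to list the names under .spec.initContainers and .spec.containers.

```shell
#!/usr/bin/env bash
# Sketch: print logs for every init container and regular container in a pod.
# pod_container_logs is a hypothetical helper, not part of Rook or kubectl.

pod_container_logs() {
    local namespace="$1" pod="$2" names name

    # jsonpath prints the container names separated by spaces;
    # init containers are listed first, then the regular containers.
    names="$(kubectl -n "$namespace" get pod "$pod" \
        -o jsonpath='{.spec.initContainers[*].name} {.spec.containers[*].name}')" || return 1

    for name in $names; do
        echo "==== $pod / $name ===="
        kubectl -n "$namespace" logs "$pod" -c "$name"
    done
}
```

Usage: `pod_container_logs <cluster-namespace> <pod-name>`. Each container's output is preceded by a header line, which makes it easier to see which init container failed.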