
[#restore]
# Restoring Neo4j Containers

[NOTE]
**This approach assumes you have credentials and wish to store your backups
on Google Cloud Storage, AWS S3, or Azure Blob Storage.** If this is not the case, you
will need to adjust the restore script for your desired cloud storage method, but the
approach will work for any backup location.

[NOTE]
**This approach works only for Neo4j 4.0+.** The tools and the
DBMS itself changed quite a lot between 3.5 and 4.0, and the approach
here will likely not work for older databases without substantial
modification.

## Approach
The restore container is used as an `initContainer` in the main cluster. Before a
node in the Neo4j cluster starts, the restore container copies down the backup
set and restores it into place. When the initContainer terminates, the regular
Neo4j docker instance starts and picks up where the backup left off.

This container is primarily tested against the backup .tar.gz archives produced by
the `backup` container in this same code repository, and we recommend you use that
approach. If you tar/gz your own backups using a different method, inspect the
`restore.sh` script carefully, because it makes certain assumptions about the
directory structure of archived backups in order to restore properly.
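As a quick sanity check before restoring a hand-rolled archive, you can list its contents and confirm the database directory sits at the top level. The sketch below builds a throwaway archive to illustrate the assumed layout; the file and directory names are illustrative, not the exact output of the backup container:

```shell
# Build a throwaway archive mimicking the assumed layout (one top-level
# directory per database), then list it as you would before a restore.
# Names here are illustrative placeholders.
mkdir -p /tmp/backup-demo/neo4j
touch /tmp/backup-demo/neo4j/neo4j.backup
tar -czf /tmp/neo4j-2020-06-16.tar.gz -C /tmp/backup-demo neo4j

# Inspect the layout: the database directory should be top-level,
# not nested under extra path components.
tar -tzf /tmp/neo4j-2020-06-16.tar.gz
```

If the listing shows extra leading path components, `restore.sh` will likely not find the database files where it expects them.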
To read the backup files from cloud storage, the container needs credentials or another form of authentication. The mechanisms available depend on the cloud service provider.
### Use a Service Account to access cloud storage (Google Cloud only)

**GCP**

> Workload Identity is the recommended way to access Google Cloud services from applications running within GKE due to its improved security properties and manageability.

Follow the https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity[GCP instructions] to:

- Enable Workload Identity on your GKE cluster
- Create a Google Cloud IAMServiceAccount that has read permissions for your backup location
- Bind the IAMServiceAccount to the Neo4j deployment's Kubernetes ServiceAccount*

[*] You can configure the name of the Kubernetes ServiceAccount that a Neo4j deployment uses by setting `serviceAccountName` in values.yaml. To check the name of the Kubernetes ServiceAccount that a Neo4j deployment is using, run `kubectl get pods -o=jsonpath='{.spec.serviceAccountName}{"\n"}' <your neo4j pod name>`.

If you are unable to use Workload Identity with GKE, you can create a service key secret instead, as described in the next section.
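As a sketch of the binding step, assuming Workload Identity is enabled, the Kubernetes ServiceAccount used by the Neo4j pods is annotated with the Google service account it should impersonate. The account and namespace names below are placeholders:

```yaml
# Hypothetical example: annotate the ServiceAccount referenced by
# serviceAccountName in values.yaml with the IAM service account that
# has read access to the backup bucket. All names are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: neo4j-sa        # must match serviceAccountName in values.yaml
  namespace: default
  annotations:
    iam.gke.io/gcp-service-account: backup-reader@my-project.iam.gserviceaccount.com
```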
### Create a service key secret to access cloud storage

First, create a Kubernetes secret that contains the content of your account service key. This key must have permissions to access the bucket and backup set that you're trying to restore.

**AWS**

- Create a credentials file that looks like this:

```aws-credentials
[default]
region=
aws_access_key_id=
aws_secret_access_key=
```

- Create a secret from this file:

```shell
kubectl create secret generic neo4j-aws-credentials \
    --from-file=credentials=aws-credentials
```
**GCP**

You do NOT need to follow the steps in this section if you are using Workload Identity for GCP.

- Create a credentials file that looks like this:

```gcp-credentials.json
{
    "type": "",
    "project_id": "",
    "private_key_id": "",
    "private_key": "",
    "client_email": "",
    "client_id": "",
    "auth_uri": "",
    "token_uri": "",
    "auth_provider_x509_cert_url": "",
    "client_x509_cert_url": ""
}
```

- Create a secret from this file:

```shell
kubectl create secret generic neo4j-gcp-credentials \
    --from-file=credentials=gcp-credentials.json
```
**Azure**

- Create a credentials file that looks like this:

```azure-credentials.sh
export ACCOUNT_NAME=<NAME_STORAGE_ACCOUNT>
export ACCOUNT_KEY=<STORAGE_ACCOUNT_KEY>
```

- Create a secret from this file:

```shell
kubectl create secret generic neo4j-azure-credentials \
    --from-file=credentials=azure-credentials.sh
```

If this service key secret is not in place, the auth information cannot be mounted as
a volume in the initContainer, and your pods may get stuck in the `ContainerCreating` phase.
### Configure the initContainer for Core and Read Replica Nodes

Refer to the single instance restore deploy scenario to see how the initContainers are configured.

What you will need to customize and ensure:

* Ensure you have created the appropriate secret and set its name
* Ensure that the volume mount to /auth matches the secret name you created above
* Ensure that your BUCKET and credentials are set correctly, given the way you created your secret

The example scenario above creates the initContainer just for core nodes. If you are using read replicas, it is strongly recommended that you do the same for `readReplica.initContainers`. If you restore only to core nodes and not to read replicas, the core nodes will replicate the data to the read replicas when they start. This works fine, but may result in longer startup times and much more bandwidth usage.
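For orientation, an initContainer wired up this way might look roughly like the fragment below. The image reference, secret name, mount paths, and environment variable names are assumptions for illustration; take the authoritative values from the single instance restore scenario rather than copying this verbatim:

```yaml
# Hypothetical sketch of a restore initContainer for core nodes.
# Image, secret, volume, and env names are illustrative placeholders.
initContainers:
  - name: restore-from-backup
    image: gcr.io/neo4j-helm/restore:4.4   # placeholder image reference
    volumeMounts:
      - name: datadir                      # the Neo4j data volume
        mountPath: /data
      - name: creds                        # the secret created above
        mountPath: /auth
    env:
      - name: DATABASE
        value: neo4j,system
      - name: CLOUD_PROVIDER
        value: gcp
      - name: BUCKET
        value: gs://test-neo4j
      - name: TIMESTAMP
        value: latest
```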
## Restore Environment Variables for the Init Container

- To restore, add the necessary parameters to values.yaml; the file should look like this:

```values.yaml
...
core:
  ...
  restore:
    enabled: true
    secretName: (neo4j-gcp-credentials|neo4j-aws-credentials|neo4j-azure-credentials|NULL) # required. Set NULL if using Workload Identity in GKE.
    database: neo4j,system # required
    cloudProvider: (gcp|aws|azure) # required
    bucket: (gs://|s3://)test-neo4j # required
    timestamp: "latest" # optional, default: "latest"
    forceOverwrite: true # optional, default: true
    purgeOnComplete: true # optional, default: true
readReplica:
  ...
  restore:
    enabled: true
    secretName: (neo4j-gcp-credentials|neo4j-aws-credentials|neo4j-azure-credentials|NULL) # required. Set NULL if using Workload Identity in GKE.
    database: neo4j,system # required
    cloudProvider: (gcp|aws|azure) # required
    bucket: (gs://|s3://)test-neo4j # required
    timestamp: "2020-06-16-12:32:57" # optional, default: "latest"
    forceOverwrite: true # optional, default: true
    purgeOnComplete: true # optional, default: true
...
```
- Run a standard Neo4j installation from the root of neo4j-helm (not tools/restore). This can be the same command you ran to create the cores/replicas originally:

```shell
helm install \
    neo4j neo4j/neo4j \
    -f values.yaml \
    --set acceptLicenseAgreement=yes
```
## Warnings & Indications

A common way to deploy Neo4j is to restore from the last backup when a container
initializes. This is good for a cluster, because it minimizes how much catch-up
is needed when a node is launched. Any difference between the last backup and the rest of the
cluster is provided via catch-up.

[NOTE]
For single nodes, take extreme care here.
If a node crashes, and you automatically restore from
backup and force-overwrite what was previously on the disk, you will lose any data that the
database captured between when the last backup was taken and when the crash happened. As a
result, for single-node instances of Neo4j you should either perform restores manually when you
need them, or keep a very regular backup schedule to minimize this data loss. If data
loss is under no circumstances acceptable, do not automate restores for single-node deploys.
[NOTE]
**Special notes for Azure Storage.** Parameters require a "bucket", but for Azure storage the
naming is slightly different. There is no protocol scheme as there would be for AWS or Google.
The bucket specified is the "blob container name" (not the account name) where the
files were placed by the backup, and is the same "blob container name" you used in the Backup
chapter, assuming you followed the examples there. Relative paths are respected; if you set
the bucket to `container/path/to/directory`, your backup files are expected to be stored in
`container` at the path `/path/to/directory/db/db-TIMESTAMP.tar.gz`, where "db" is the
name of the database being backed up (i.e. neo4j and system).
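To make the path mapping concrete, the small computation below shows where a restore would look for an archive given a bucket value with a relative path. The container name, path, and timestamp are made-up placeholders:

```shell
# Illustrative only: how a "bucket" value with a relative path maps to the
# archive location the restore expects. All values are placeholders.
BUCKET="mycontainer/path/to/directory"   # blob container name + relative path
DB="neo4j"                               # database being restored
TIMESTAMP="2020-06-16-12:32:57"

echo "${BUCKET}/${DB}/${DB}-${TIMESTAMP}.tar.gz"
# → mycontainer/path/to/directory/neo4j/neo4j-2020-06-16-12:32:57.tar.gz
```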
## Running the Restore

With the initContainer in place and properly configured, simply deploy a new cluster
using the regular approach. The restore happens before startup, and when the
cluster comes live, it is populated with the data.

## Limitations

- If you want usernames, passwords, and permissions to be restored, you must include
a restore of the system graph.
- The container has not yet been tested with incremental backups.