
[#restore]
# Restoring Neo4j Containers

[NOTE]
**This approach assumes you have credentials and wish to store your backups
on Google Cloud Storage, AWS S3, or Azure Blob Storage.** If this is not the case, you
will need to adjust the restore script for your desired cloud storage method, but the
approach will work for any backup location.

[NOTE]
**This approach works only for Neo4j 4.0+.** The tools and the
DBMS itself changed quite a lot between 3.5 and 4.0, and the approach
here will likely not work for older databases without substantial
modification.

## Approach
The restore container is used as an `initContainer` in the main cluster. Before a
node in the Neo4j cluster starts, the restore container copies down the backup
set and restores it into place. When the initContainer terminates, the regular
Neo4j docker instance starts and picks up where the backup left off.

This container is primarily tested against the backup .tar.gz archives produced by
the `backup` container in this same code repository, and we recommend you use that
approach. If you tar/gz your own backups using a different method, inspect the
`restore.sh` script carefully, because it makes certain assumptions about the
directory structure of archived backups in order to restore properly.
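As a quick sanity check before restoring a hand-rolled archive, you can list its contents and confirm the database directory sits at the top level. The sketch below builds a throwaway archive to illustrate the assumed layout; the file and directory names are illustrative, not the exact output of the backup container:

```shell
# Build a throwaway archive mimicking the assumed layout (one top-level
# directory per database), then list it as you would before a restore.
# Names here are illustrative placeholders.
mkdir -p /tmp/backup-demo/neo4j
touch /tmp/backup-demo/neo4j/neo4j.backup
tar -czf /tmp/neo4j-2020-06-16.tar.gz -C /tmp/backup-demo neo4j

# Inspect the layout: the database directory should be top-level,
# not nested under extra path components.
tar -tzf /tmp/neo4j-2020-06-16.tar.gz
```

If the listing shows extra leading path components, `restore.sh` will likely not find the database files where it expects them.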
To read the backup files from cloud storage, the container needs credentials or another form of authentication. The mechanisms available depend on the cloud service provider.
### Use a Service Account to access cloud storage (Google Cloud only)

**GCP**

> Workload Identity is the recommended way to access Google Cloud services from applications running within GKE due to its improved security properties and manageability.

Follow the https://cloud.google.com/kubernetes-engine/docs/how-to/workload-identity[GCP instructions] to:

- Enable Workload Identity on your GKE cluster
- Create a Google Cloud IAMServiceAccount that has read permissions for your backup location
- Bind the IAMServiceAccount to the Neo4j deployment's Kubernetes ServiceAccount*

[*] You can configure the name of the Kubernetes ServiceAccount that a Neo4j deployment uses by setting `serviceAccountName` in values.yaml. To check the name of the Kubernetes ServiceAccount that a Neo4j deployment is using, run `kubectl get pods -o=jsonpath='{.spec.serviceAccountName}{"\n"}' <your neo4j pod name>`.

If you are unable to use Workload Identity with GKE, you can create a service key secret instead, as described in the next section.
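As a sketch of the binding step, assuming Workload Identity is enabled, the Kubernetes ServiceAccount used by the Neo4j pods is annotated with the Google service account it should impersonate. The account and namespace names below are placeholders:

```yaml
# Hypothetical example: annotate the ServiceAccount referenced by
# serviceAccountName in values.yaml with the IAM service account that
# has read access to the backup bucket. All names are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: neo4j-sa        # must match serviceAccountName in values.yaml
  namespace: default
  annotations:
    iam.gke.io/gcp-service-account: backup-reader@my-project.iam.gserviceaccount.com
```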
### Create a service key secret to access cloud storage

First, create a Kubernetes secret that contains the content of your account service key. This key must have permissions to access the bucket and backup set that you're trying to restore.

**AWS**

- Create a credentials file that looks like this:

```aws-credentials
[default]
region=
aws_access_key_id=
aws_secret_access_key=
```

- Create a secret from this file:

```shell
kubectl create secret generic neo4j-aws-credentials \
    --from-file=credentials=aws-credentials
```
**GCP**

You do NOT need to follow the steps in this section if you are using Workload Identity for GCP.

- Create a credentials file that looks like this:

```gcp-credentials.json
{
    "type": "",
    "project_id": "",
    "private_key_id": "",
    "private_key": "",
    "client_email": "",
    "client_id": "",
    "auth_uri": "",
    "token_uri": "",
    "auth_provider_x509_cert_url": "",
    "client_x509_cert_url": ""
}
```

- Create a secret from this file:

```shell
kubectl create secret generic neo4j-gcp-credentials \
    --from-file=credentials=gcp-credentials.json
```
**Azure**

- Create a credentials file that looks like this:

```azure-credentials.sh
export ACCOUNT_NAME=<NAME_STORAGE_ACCOUNT>
export ACCOUNT_KEY=<STORAGE_ACCOUNT_KEY>
```

- Create a secret from this file:

```shell
kubectl create secret generic neo4j-azure-credentials \
    --from-file=credentials=azure-credentials.sh
```

If this service key secret is not in place, the auth information cannot be mounted as
a volume in the initContainer, and your pods may get stuck in the `ContainerCreating` phase.
### Configure the initContainer for Core and Read Replica Nodes

Refer to the single instance restore deploy scenario to see how the initContainers are configured.

What you will need to customize and ensure:

* Ensure you have created the appropriate secret and set its name
* Ensure that the volume mount to /auth matches the secret name you created above
* Ensure that your BUCKET and credentials are set correctly, given the way you created your secret

The example scenario above creates the initContainer just for core nodes. If you are using read replicas, it is strongly recommended that you do the same for `readReplica.initContainers`. If you restore only to core nodes and not to read replicas, the core nodes will replicate the data to the read replicas when they start. This works fine, but may result in longer startup times and much more bandwidth usage.
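For orientation, an initContainer wired up this way might look roughly like the fragment below. The image reference, secret name, mount paths, and environment variable names are assumptions for illustration; take the authoritative values from the single instance restore scenario rather than copying this verbatim:

```yaml
# Hypothetical sketch of a restore initContainer for core nodes.
# Image, secret, volume, and env names are illustrative placeholders.
initContainers:
  - name: restore-from-backup
    image: gcr.io/neo4j-helm/restore:4.4   # placeholder image reference
    volumeMounts:
      - name: datadir                      # the Neo4j data volume
        mountPath: /data
      - name: creds                        # the secret created above
        mountPath: /auth
    env:
      - name: DATABASE
        value: neo4j,system
      - name: CLOUD_PROVIDER
        value: gcp
      - name: BUCKET
        value: gs://test-neo4j
      - name: TIMESTAMP
        value: latest
```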
## Restore Environment Variables for the Init Container

- To restore, add the necessary parameters to values.yaml; the file should look like this:

```values.yaml
...
core:
  ...
  restore:
    enabled: true
    secretName: (neo4j-gcp-credentials|neo4j-aws-credentials|neo4j-azure-credentials|NULL) # required. Set NULL if using Workload Identity in GKE.
    database: neo4j,system # required
    cloudProvider: (gcp|aws|azure) # required
    bucket: (gs://|s3://)test-neo4j # required
    timestamp: "latest" # optional, default: "latest"
    forceOverwrite: true # optional, default: true
    purgeOnComplete: true # optional, default: true
readReplica:
  ...
  restore:
    enabled: true
    secretName: (neo4j-gcp-credentials|neo4j-aws-credentials|neo4j-azure-credentials|NULL) # required. Set NULL if using Workload Identity in GKE.
    database: neo4j,system # required
    cloudProvider: (gcp|aws|azure) # required
    bucket: (gs://|s3://)test-neo4j # required
    timestamp: "2020-06-16-12:32:57" # optional, default: "latest"
    forceOverwrite: true # optional, default: true
    purgeOnComplete: true # optional, default: true
...
```
- Run a standard Neo4j installation from the root of neo4j-helm (not tools/restore). This can be the same command you ran to create the cores/replicas originally:

```shell
helm install \
    neo4j neo4j/neo4j \
    -f values.yaml \
    --set acceptLicenseAgreement=yes
```
## Warnings & Indications

A common way to deploy Neo4j is to restore from the last backup when a container
initializes. This is good for a cluster, because it minimizes how much catch-up
is needed when a node is launched. Any difference between the last backup and the rest of the
cluster is provided via catch-up.

[NOTE]
For single nodes, take extreme care here.
If a node crashes, and you automatically restore from
backup and force-overwrite what was previously on the disk, you will lose any data that the
database captured between when the last backup was taken and when the crash happened. As a
result, for single-node instances of Neo4j you should either perform restores manually when you
need them, or keep a very regular backup schedule to minimize this data loss. If data
loss is under no circumstances acceptable, do not automate restores for single-node deploys.
[NOTE]
**Special notes for Azure Storage.** Parameters require a "bucket", but for Azure storage the
naming is slightly different. There is no protocol scheme as there would be for AWS or Google.
The bucket specified is the "blob container name" (not the account name) where the
files were placed by the backup, and is the same "blob container name" you used in the Backup
chapter, assuming you followed the examples there. Relative paths are respected; if you set
the bucket to `container/path/to/directory`, your backup files are expected to be stored in
`container` at the path `/path/to/directory/db/db-TIMESTAMP.tar.gz`, where "db" is the
name of the database being backed up (i.e. neo4j and system).
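To make the path mapping concrete, the small computation below shows where a restore would look for an archive given a bucket value with a relative path. The container name, path, and timestamp are made-up placeholders:

```shell
# Illustrative only: how a "bucket" value with a relative path maps to the
# archive location the restore expects. All values are placeholders.
BUCKET="mycontainer/path/to/directory"   # blob container name + relative path
DB="neo4j"                               # database being restored
TIMESTAMP="2020-06-16-12:32:57"

echo "${BUCKET}/${DB}/${DB}-${TIMESTAMP}.tar.gz"
# → mycontainer/path/to/directory/neo4j/neo4j-2020-06-16-12:32:57.tar.gz
```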
## Running the Restore

With the initContainer in place and properly configured, simply deploy a new cluster
using the regular approach. The restore happens before startup, and when the
cluster comes live, it is populated with the data.

## Limitations

- If you want usernames, passwords, and permissions to be restored, you must include
a restore of the system graph.
- The container has not yet been tested with incremental backups.