
[#externalexposure]
# External Exposure of Neo4j Clusters when using client routing
[abstract]
This chapter describes how to route traffic from the outside world or Internet to a Neo4j cluster running in Kubernetes when using client routing.

Generally, these instructions are only required for versions of Neo4j before 4.3.0. If you are using Neo4j 4.3.0 or later, see the xref::externalexposure.adoc[external exposure instructions].
## Overview / Problem

As described in the user guide, by default when you install Neo4j, each node in your cluster gets a private internal DNS address, which it advertises to its clients. This works "out of the box" without any knowledge of your local addressing or DNS situation. The downside is that external clients cannot use the bolt+routing or neo4j protocols to connect to the cluster, because they cannot route traffic to strictly cluster-internal DNS names. With the default helm install, connections from the outside fail even with proper exposure of the pods, because:

1. The client connects to Neo4j.
2. It fetches a routing table, which contains entries like `graph-neo4j-core-0.graph-neo4j.default.svc.cluster.local`.
3. External clients attempt and fail to connect to the routing table entries.
4. The overall connection fails or times out.
https://medium.com/neo4j/neo4j-considerations-in-orchestration-environments-584db747dca5[This article discusses these background issues] in depth. These instructions are intended as a quick method of exposing Neo4j clusters, but you may have to do additional work depending on your configuration.
## Solution Approach

To fix external clients, we need two things:

1. The `dbms.connector.*_address` settings inside of each Neo4j node set to the externally routable address.
2. An externally valid DNS name or IP address that clients can connect to, which routes traffic to the Kubernetes pod.

Some visual diagrams of what's going on https://docs.google.com/presentation/d/14ziuwTzB6O7cp7fq0mA1lxWwZpwnJ9G4pZiwuLxBK70/edit?usp=sharing[can be found in the architectural documentation here].

We're going to address point 1 with some special configuration of the Neo4j pods themselves. I'll explain the Neo4j config bits first, and then we'll tie them together with the external exposure. The most complex part of this is ensuring each pod has the right config.

We're going to address point 2 with Kubernetes load balancers. We will create one per pod in our Neo4j stateful set, and associate static IP addresses with those load balancers. This enables packets to flow from outside of Kubernetes to the right pod / Neo4j cluster member.
## Proper Neo4j Pod Config

In the helm chart within this repo, Neo4j core members are part of a stateful set, and get indexes. Given a deployment in a particular namespace, you end up with the following hostnames:

* `<deployment>-neo4j-core-0.<deployment>-neo4j.<namespace>.svc.cluster.local`
* `<deployment>-neo4j-core-1.<deployment>-neo4j.<namespace>.svc.cluster.local`
* `<deployment>-neo4j-core-2.<deployment>-neo4j.<namespace>.svc.cluster.local`

The helm chart in this repo can take a configurable ConfigMap for setting env vars on these pods, so we can define our own configuration and pass it to the StatefulSet on startup. The `custom-core-configmap.yaml` file in this directory is an example of that.
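These hostnames follow mechanically from the deployment name and namespace. A quick sketch that prints them, using the `graph` deployment name and `default` namespace that the rest of this guide assumes:

```shell
# Print the internal DNS names the StatefulSet produces for a 3-core cluster.
# "graph" and "default" match the example deployment used later in this guide.
DEPLOYMENT=graph
NAMESPACE=default

for idx in 0 1 2; do
  echo "${DEPLOYMENT}-neo4j-core-${idx}.${DEPLOYMENT}-neo4j.${NAMESPACE}.svc.cluster.local"
done
```

These are exactly the names that will later show up in routing tables, which is why external clients cannot reach them directly.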
### Create Static IP addresses for inbound cluster traffic

I'm using GCP, so it is done like this. Two important notes: on GCP the region must match your GKE region, and the network tier must be premium. On other clouds the conceptual step here is the same, but the details will differ: you need to allocate 3 static IP addresses, which we'll use in a later step.

```shell
# Customize these next 2 for the region of your GKE cluster,
# and your GCP project ID
REGION=us-central1
PROJECT=my-gcp-project-id

for idx in 0 1 2 ; do
   gcloud compute addresses create \
      neo4j-static-ip-$idx --project=$PROJECT \
      --network-tier=PREMIUM --region=$REGION

   echo "IP$idx:"
   gcloud compute addresses describe neo4j-static-ip-$idx \
      --region=$REGION --project=$PROJECT --format=json | jq -r '.address'
done
```
**If you are doing this with Azure**, please note that the static IP addresses must be in the same resource group as your Kubernetes cluster, and can be created with link:https://docs.microsoft.com/en-us/cli/azure/network/public-ip?view=azure-cli-latest#az-network-public-ip-create[az network public-ip create] like this (just one single sample): `az network public-ip create -g resource_group_name -n core01 --sku standard --dns-name neo4jcore01 --allocation-method Static`. The Azure SKU used must be standard, and the resource group you need can be found in the Kubernetes load balancer that link:https://docs.microsoft.com/en-us/azure/aks/kubernetes-walkthrough[following the Azure tutorial] sets up for you.
For the remainder of this tutorial, let's assume that the core IP addresses I've allocated here are as follows; I'll refer to them as these environment variables:

```shell
export IP0=35.202.123.82
export IP1=34.71.151.230
export IP2=35.232.116.39
```

We will also need 3 exposure addresses that we want to advertise to the clients. I'm going to set these to be the same as the IP addresses, but if you have mapped DNS, you could use DNS names instead. It's important for later steps that we have *both* IPs *and* addresses, because they're used differently.

```shell
export ADDR0=$IP0
export ADDR1=$IP1
export ADDR2=$IP2
```
### Per-Host Configuration

Recall that the helm chart will let us configure core nodes with a custom ConfigMap. That's good. But the problem with 1 ConfigMap for all 3 cores is that each host needs *different config* for proper exposure. So in the helm chart, we've divided the Neo4j settings into basic settings and over-rideable settings. In the custom ConfigMap example, you'll see lines like this:

```yaml
$DEPLOYMENT_neo4j_core_0_NEO4J_dbms_default__advertised__address: $ADDR0
$DEPLOYMENT_neo4j_core_1_NEO4J_dbms_default__advertised__address: $ADDR1
```

After expanding $DEPLOYMENT to be "graph", these variables have "host prefixes": `graph_neo4j_core_0_*` settings will only apply to the host `graph-neo4j-core-0`. (The dashes are changed to underscores because dashes aren't supported in env var names.) It is very important to notice that these override settings have the pod name/hostname already "baked into them", so you need to know how you're planning to deploy Neo4j before setting this up.

These "address settings" need to be changed to match the 3 static IPs that we allocated in the previous step. There are four critical env vars, all of which need to be configured, for each host:

* `NEO4J_dbms_default__advertised__address`
* `NEO4J_dbms_connector_bolt_advertised__address`
* `NEO4J_dbms_connector_http_advertised__address`
* `NEO4J_dbms_connector_https_advertised__address`

With overrides, that's 12 special overrides (4 vars each for 3 containers).
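As a sanity check, the full set of override names can be generated mechanically. This is just a sketch enumerating the 12 combinations, following the naming pattern above for a deployment named `graph`:

```shell
# Enumerate all 12 per-host override names: 4 settings x 3 core members.
DEPLOYMENT=graph
OVERRIDES=""
for idx in 0 1 2; do
  for setting in dbms_default__advertised__address \
                 dbms_connector_bolt_advertised__address \
                 dbms_connector_http_advertised__address \
                 dbms_connector_https_advertised__address; do
    OVERRIDES="$OVERRIDES ${DEPLOYMENT}_neo4j_core_${idx}_NEO4J_${setting}"
  done
done

echo "$OVERRIDES" | wc -w   # 12
```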
So using this "override approach" we can have *1 ConfigMap* that specifies all the config for 3 members of a cluster, while still allowing per-host configuration settings to differ. The override approach in question is implemented in a small amount of bash in the `core-statefulset.yaml` file. It simply reads the environment and applies default values, permitting overrides if the override matches the host where the changes are being applied.
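The exact script lives in `core-statefulset.yaml`; what follows is only a hedged sketch of the idea, assuming the host prefix is the pod hostname with dashes translated to underscores:

```shell
# Sketch of the per-host override logic (the real script in
# core-statefulset.yaml may differ in detail). Env vars named
# <host_prefix>_NEO4J_* win over the plain NEO4J_* defaults.
HOST_PREFIX=${HOST_PREFIX:-$(hostname | tr '-' '_')}

apply_overrides() {
  for setting in $(env | grep "^${HOST_PREFIX}_NEO4J_" | cut -d= -f1); do
    # Strip the host prefix to recover the real NEO4J_* setting name,
    # then export it so it overrides the default value.
    value=$(eval "printf '%s' \"\$${setting}\"")
    export "${setting#${HOST_PREFIX}_}=$value"
  done
}
```

On `graph-neo4j-core-1`, for example, `graph_neo4j_core_1_NEO4J_dbms_default__advertised__address` would become `NEO4J_dbms_default__advertised__address`, while the `graph_neo4j_core_0_*` and `graph_neo4j_core_2_*` variables are ignored.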
In the next command, we'll apply the custom ConfigMap. Here you use the IP addresses from the previous step as ADDR0, ADDR1, and ADDR2. Alternatively, if those IP addresses are associated with DNS entries, you can use those DNS names instead. We're calling them addresses because they can be any address you want to advertise, and don't have to be an IP. But these addresses must resolve to the static IPs we created in the earlier step.

```shell
export DEPLOYMENT=graph
export NAMESPACE=default
export ADDR0=35.202.123.82
export ADDR1=34.71.151.230
export ADDR2=35.232.116.39

cat tools/external-exposure-legacy/custom-core-configmap.yaml | envsubst | kubectl apply -f -
```
Once customized, we now have a ConfigMap we can point our Neo4j deployment at, to advertise properly.

### Installing the Helm Chart

From the root of this repo, navigate to stable/neo4j and issue this command to install the helm chart with a deployment name of "graph". The deployment name *must match what you did in previous steps*, because we gave pod-specific overrides in the previous step.

```shell
export DEPLOYMENT=graph
helm install $DEPLOYMENT . \
    --set core.numberOfServers=3 \
    --set readReplica.numberOfServers=0 \
    --set core.configMap=$DEPLOYMENT-neo4j-externally-addressable-config \
    --set acceptLicenseAgreement=yes \
    --set neo4jPassword=mySecretPassword
```

Note the custom ConfigMap that is passed.
## External Exposure

After a few minutes you'll have a fully-formed cluster whose pods show ready, and which you can connect to, *but* it will be advertising values that Kubernetes isn't routing yet. So what we need to do next is to create a load balancer *per Neo4j core pod*, and set the `loadBalancerIP` to be the static IP address we reserved in the earlier step and advertised with the custom ConfigMap.

A `load-balancer.yaml` file has been provided as a template; here's how to make 3 of them for the given static IP addresses:

```shell
export DEPLOYMENT=graph

# Reuse IP0, etc. from the earlier step here.
# These *must be IP addresses* and not hostnames, because we're
# assigning load balancer IP addresses to bind to.
export CORE_ADDRESSES=($IP0 $IP1 $IP2)

for x in 0 1 2 ; do
   export IDX=$x
   export IP=${CORE_ADDRESSES[$x]}
   echo $DEPLOYMENT with IDX $IDX and IP $IP
   cat tools/external-exposure-legacy/load-balancer.yaml | envsubst | kubectl apply -f -
done
```
You'll notice we're using 3 load balancers for 3 pods. In a sense it's silly to "load balance" a single pod. But without a lot of extra software and configuration, this is the best option, because LBs will support TCP connections (ingresses won't), and LBs can get their own independent IP addresses, which can be associated with DNS later on. Had we used NodePorts, we'd be at the mercy of more dynamic IP assignment, and would also have to worry about a Kubernetes cluster member itself falling over. ClusterIPs aren't suitable at all, as they don't give you external addresses.

Inside of these services, we use `externalTrafficPolicy: Local`. Because we're routing to single pods and don't need any load spreading, local is fine. link:https://kubernetes.io/docs/tasks/access-application-cluster/create-external-load-balancer/[Refer to the Kubernetes docs] for more information on this topic.

There are other, fancier options, such as the link:https://kubernetes.github.io/ingress-nginx/[nginx-ingress controller], but in this config we're shooting for something as simple as possible that you can do with existing Kubernetes primitives, without installing new packages you might not already have.
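For reference, a per-pod service of this kind looks roughly like the sketch below. This is *not* the provided `load-balancer.yaml`, just an illustration of the two fields discussed here (`loadBalancerIP` and `externalTrafficPolicy`); the service name, selector, and ports are assumptions, and `$DEPLOYMENT`, `$IDX`, and `$IP` are filled in by `envsubst`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: $DEPLOYMENT-neo4j-external-$IDX
spec:
  type: LoadBalancer
  # The static IP reserved earlier; the matching pod advertises this address.
  loadBalancerIP: $IP
  # No load spreading needed when the service fronts a single pod.
  externalTrafficPolicy: Local
  selector:
    # Assumed selector: pin the service to exactly one StatefulSet pod.
    statefulset.kubernetes.io/pod-name: $DEPLOYMENT-neo4j-core-$IDX
  ports:
  - name: bolt
    port: 7687
    targetPort: 7687
  - name: http
    port: 7474
    targetPort: 7474
  - name: https
    port: 7473
    targetPort: 7473
```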
[NOTE]
**Potential trip-up point**: on GKE, the only thing needed to associate the static IP with the load balancer is the `loadBalancerIP` field in the YAML. On other clouds, there may be additional steps to allocate the static IP to the Kubernetes cluster. Consult your cloud's documentation.
## Putting it All Together

We can verify our services are running nicely like this:

```
$ kubectl get service | grep neo4j-external
graph-neo4j-external-0   LoadBalancer   10.0.5.183   35.202.123.82   7687:30529/TCP,7474:30843/TCP,7473:30325/TCP   115s
graph-neo4j-external-1   LoadBalancer   10.0.9.182   34.71.151.230   7687:31059/TCP,7474:31288/TCP,7473:31009/TCP   115s
graph-neo4j-external-2   LoadBalancer   10.0.12.38   35.232.116.39   7687:30523/TCP,7474:30844/TCP,7473:31732/TCP   114s
```
After all of these steps, you should end up with a properly exposed cluster. We can recover our password like so, and connect to any of the 3 static IPs:

```shell
export NEO4J_PASSWORD=$(kubectl get secrets graph-neo4j-secrets -o yaml | grep password | sed 's/.*: //' | base64 -d)
cypher-shell -a neo4j://35.202.123.82:7687 -u neo4j -p "$NEO4J_PASSWORD"
```
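The `grep`/`sed` pipeline above just isolates the base64-encoded value from the secret's YAML; the final decode step can be checked standalone (the value here is a placeholder, not a real secret):

```shell
# Stand-in for the base64 string grep/sed would extract from the secret YAML.
ENCODED=$(printf 'mySecretPassword' | base64)

# Same decode step as used at the end of the pipeline above.
NEO4J_PASSWORD=$(printf '%s' "$ENCODED" | base64 -d)
echo "$NEO4J_PASSWORD"   # mySecretPassword
```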
Additionally, since we exposed port 7474, you can go to any of the static IPs on port 7474 and end up with Neo4j Browser, and be able to connect.

## Where to Go Next

* If you have static IPs, you can of course associate DNS with them, and obtain signed certificates.
* This in turn will let you expose signed-cert HTTPS using standard Neo4j techniques, and will also permit advertising DNS instead of a bare IP if you wish.
## References

* For background on general Kubernetes network exposure issues, I'd recommend this article: https://medium.com/google-cloud/kubernetes-nodeport-vs-loadbalancer-vs-ingress-when-should-i-use-what-922f010849e0[Kubernetes NodePort vs. LoadBalancer vs. Ingress? When should I use what?]