In Getting cockroachdb running on google cloud platform, I went through how to get a few docker containers running as an insecure, 3 node cockroachdb on a single VM. In real life, you'll want to run multiple secured nodes across multiple VMs with shared storage. This can be super complex and error prone if you, like me, don't really know what you're doing. Luckily, Kubernetes can help with orchestration, and Cockroach have put together a great guide and some preconfigured .yaml files to help with that. I'll walk through that in this post.

Initially we'll start with getting an insecure version running, just like we did in Getting cockroachdb running on google cloud platform, but this time using Kubernetes. The main practical difference between a secure and an insecure version of cockroach is that the secure version requires certificates to query the database from a client. A later post will highlight how to set up a secure version.

What is Kubernetes

Kubernetes helps you manage and deploy apps in containers, a process often called orchestration. It abstracts away the idea of hardware and VMs, and is open source and cloud supplier independent - so your configurations can be easily transported from Google to Amazon and other providers. I'm just going to focus on Google Cloud Platform though. You can read about Kubernetes here, although I must admit I find the documentation very heavy going. Most of the Google Cloud Platform material can be tough, but the Kubernetes docs dive right into the weeds from the outset, and in my personal view, it's hard to know where to start, what all the terms mean and how everything relates to everything else. This article is a highly simplified version to remind myself. If it informs you in some way too, then so much the better.

kubectl and gcloud

I'll be using the CLIs for Cloud Platform and Kubernetes to set all this up, so you first need to install them.
I recommend you use the cloud shell (see Your own free linux VM) as it already has everything installed. If you want to use something else, then see these links for how to install gcloud and kubectl. It's best to set up your default project and zone before doing anything else, substituting your own project and zone for the example values.
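As a rough sketch of that setup step, something like the following should do it. The project id and zone are made-up placeholders, and wrapping the commands in a function called `set_gcloud_defaults` is just my own convention so that nothing runs until you call it.

```shell
# Hypothetical placeholders: replace with your own project id and zone.
PROJECT_ID="my-project-id"
ZONE="us-central1-a"

# Make the project and zone the defaults for later gcloud commands.
set_gcloud_defaults() {
  gcloud config set project "$PROJECT_ID"
  gcloud config set compute/zone "$ZONE"
}
```

Call `set_gcloud_defaults` once in your shell session, then `gcloud config list` will show what has been set.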
Cluster and nodes

A cluster is created and managed by Kubernetes Engine, and consists of a cluster master and nodes, where a node is a physical or virtual machine. A node could be a Google Compute Engine VM (as they will be in my example). gcloud creates these for you, and the Kubernetes engine takes care of allocating your workload across these nodes. This is the idea of abstracting away the machines from the containers. You create a cluster like this
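A minimal sketch of the cluster creation, assuming a cluster named `cockroachdb` with 3 nodes (both values are my choice for illustration, as is the `create_cluster` function name):

```shell
# Assumed values: pick your own cluster name and node count.
CLUSTER_NAME="cockroachdb"
NUM_NODES=3

# Ask Kubernetes Engine to create the cluster and its Compute Engine VM nodes.
create_cluster() {
  gcloud container clusters create "$CLUSTER_NAME" --num-nodes="$NUM_NODES"
}
```

Running `create_cluster` takes a few minutes, since the VMs have to be provisioned before the cluster is reported as running.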
Credentials

Once the cluster is created, you'll want its credentials loaded into your cloud shell environment so you can access it easily from the CLI.
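Loading the credentials might look like the sketch below. It assumes the cluster is called `cockroachdb` and that your default zone is already set; `load_credentials` is a hypothetical helper name.

```shell
# Assumed cluster name: use whatever you called yours.
CLUSTER_NAME="cockroachdb"

# Fetch the cluster credentials into the local kubeconfig,
# then show which cluster kubectl is now pointing at.
load_credentials() {
  gcloud container clusters get-credentials "$CLUSTER_NAME"
  kubectl config current-context
}
```

After `load_credentials`, kubectl commands in this shell are directed at the new cluster.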
Yaml files

Now we are ready to use kubectl to configure the cluster. Kubernetes configuration can be a long, complex and tedious process and is usually done through .yaml configuration files. Luckily, there is usually a prebaked one around for you to use. Cockroach provides this one for the next step.

Pods

A container is deployed as a member of a pod. Although a pod generally contains a single container, it can contain multiple containers which share the same storage, resources and lifetime. The master allocates pods across nodes and stops and starts them as required to optimize the workload.

Stateful sets

If an application is stateful (for example if it uses common persistent storage shared between instances), Kubernetes needs to handle its pods in a special way. A database like cockroach, where the load is shared across multiple cockroach nodes (not the same concept as Kubernetes nodes), is clearly a candidate for a stateful set. Using the yaml file provided by cockroach, kubectl will create multiple pods for cockroachdb containers in a statefulset, and assign persistent storage, like so. We end up with a cluster that looks like this
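For reference, creating the statefulset from the Cockroach yaml file can be sketched as below. The URL is an assumption about where the manifest lives in the cockroachdb GitHub repository, so check the Cockroach guide for the current location; `create_statefulset` is a hypothetical helper name.

```shell
# Assumption: location of the insecure statefulset manifest in the
# cockroachdb GitHub repository. Check the Cockroach guide before using it.
STATEFULSET_YAML="https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cockroachdb-statefulset.yaml"

# Create the statefulset, its pods and their persistent storage claims.
create_statefulset() {
  kubectl create -f "$STATEFULSET_YAML"
}
```

`kubectl create -f` accepts a local file path as well as a URL, so you can download the file first if you want to read through it.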
Heading over to the cloud console, we can see that it has created 3 VM nodes in the cluster to handle the stateful set, and we can check the pods in the statefulset, like so.
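From the CLI, a check along these lines should show the same thing. It assumes the statefulset is named `cockroachdb` (which matches pod names such as cockroachdb-3); `check_pods` is a hypothetical helper name.

```shell
# Assumed statefulset name, matching pods named cockroachdb-0, cockroachdb-1, ...
STATEFULSET_NAME="cockroachdb"

# Show the statefulset summary, then every pod with the node it runs on.
check_pods() {
  kubectl get statefulset "$STATEFULSET_NAME"
  kubectl get pods -o wide
}
```

The `-o wide` output adds a NODE column, which is handy for seeing which VM each pod has been placed on.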
Scaling

Now there are 3 replicas in this statefulset, and 3 compute engine nodes. It would be easy to assume that there's a one to one relationship, but they are independent. Remember that the Kubernetes engine decides which pods to run on which nodes. You can scale a statefulset to have more pods without necessarily affecting the number of VM nodes they run on. Below I have now scaled up to 4 pods (which are load sharing cockroach nodes), still running across 3 VMs (cluster nodes).

Storage

A pod has its own storage allocated, so each of our 4 pods should have some working space attached, and the database is shared across the 3 nodes. Here it is via the cloud console. Many resources can be viewed either via the CLIs (gcloud and kubectl), or visually through the cloud console under Kubernetes Engine or Compute Engine.

Job

Next we need to initialize cockroach with a Kubernetes job. This is defined by another yaml file provided by cockroach. In Kubernetes Engine/Workload we can find the log for running that under the cluster-init job.

To test the cockroachdb, which is now all up and running, you can use the run command to make a container in a pod and execute it on a given host accessible via an endpoint
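The scaling, storage, job and test steps in this section can be sketched with kubectl as below. Several things here are assumptions rather than facts from this post: the init yaml URL, the `cockroachdb-public` service host name, the `cockroachdb/cockroach` image name and the `cockroachdb-client` pod name. The function names are hypothetical helpers too.

```shell
# Assumed names and locations: check the Cockroach guide for current values.
STATEFULSET_NAME="cockroachdb"
INIT_YAML="https://raw.githubusercontent.com/cockroachdb/cockroach/master/cloud/kubernetes/cluster-init.yaml"
PUBLIC_HOST="cockroachdb-public"

# Scaling: change the number of pods, e.g. scale_pods 4.
# The number of VM nodes in the cluster is not touched.
scale_pods() {
  kubectl scale statefulset "$STATEFULSET_NAME" --replicas="$1"
}

# Storage: list the storage claimed by each pod and the volumes backing it.
show_storage() {
  kubectl get persistentvolumeclaims
  kubectl get persistentvolumes
}

# Job: create the one-off job that initializes the cockroach cluster.
init_cluster() {
  kubectl create -f "$INIT_YAML"
}

# Job log: the same log the console shows under the cluster-init job.
show_init_log() {
  kubectl logs job/cluster-init
}

# Test: start a temporary client pod running the cockroach SQL shell
# against the cluster endpoint. The pod is removed when you exit.
run_sql_client() {
  kubectl run cockroachdb-client -it --image=cockroachdb/cockroach --rm \
    --restart=Never -- sql --insecure --host="$PUBLIC_HOST"
}
```

A typical order would be `init_cluster`, `show_init_log`, `scale_pods 4`, `show_storage` and finally `run_sql_client`. The job log may not be available until the job's pod has started, so `show_init_log` can need a second try.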
You could use exec rather than run to execute in an existing pod.
Accessing the db

To make interactive queries inside the db, just exec in a pod.

kubectl exec -it cockroachdb-3 -- ./cockroach sql --insecure

Next you'll want to create an app to talk to the database and serve its contents externally. I'll be using a server app to provide GraphQL access to an SQL database in the next post.