Configuring a Local, Scalable, High-Availability Kubernetes Postgres Service with Kubegres

In the last post we configured a high-availability vault server in a local k8s cluster. In this fourth post we are going to set up a local, k8s-managed, high-availability postgres database. Developers differ on whether to containerize their databases. The prevailing practice is to containerize the app but leave database management to cloud providers, or to run the database on a VM. This is for good reason: containerization works most seamlessly for stateless components such as web apps, because containers are by definition ephemeral. Additionally, scaling production-grade databases and syncing their data is notoriously difficult, so most developers choose to leave that chore to cloud providers who offer databases-as-a-service.

Yet managing databases in k8s is alluring because we’d like to manage our entire stack (stateless and stateful) with a unified, declarative workflow. In addition, keeping our development and production environments as similar as possible gives us more confidence when testing or debugging.

In recent years managing databases in k8s has become more realistic as production-grade operators, such as the CrunchyData postgres operator, have matured. Operators are k8s extensions that use custom resource definitions (CRDs) to manage application components. CRDs are developer-defined objects, declared in yaml just like deployments, jobs etc.

My goal for this post was to see for myself how easily I could get one of these operators up-and-running locally. I tried to set up a database with two different production-grade operators: CrunchyData’s postgres-operator and Oracle’s mysql-operator. I couldn't get either one installed successfully; I ran into various errors, and asking for help on the two projects’ GitHub pages didn't resolve them.

I was about to give up and settle on a different strategy: maintain separate configurations for development and production, using kustomize to factor out the common elements of the yaml manifests. The development environment would run one postgres instance and mount the data directory as a volume, while production would use a database-as-a-service for high availability and fault tolerance. But something still bothered me about this. I didn't want to give up on the ideal of making my dev/prod environments as similar as possible.

Then I found kubegres, a very new postgres operator. It worked out-of-the-box for me. I had one issue, which was a misunderstanding on my part, and the kubegres folks quickly clarified it via their Issues queue.

With that background out of the way, let’s discuss exactly how I got this set-up working. I borrowed heavily from kubegres' Getting Started guide, but added some extra detail and addressed a few gotchas.

Clean Up the Client Code

Before we go on, let's clean up server.js a bit. First we'll factor out the JavaScript that interacts with vault.

Create a file, vault.js, at the root of the project, then cut the vault code from server.js and paste it into vault.js:

const fs = require("fs");
const axios = require("axios").default;

function Vault() {
  const axiosInst = axios.create({ baseURL: `${process.env.VAULT_ADDR}/v1` });
  const getHealth = async () => {
    const resp = await axiosInst.get(`/sys/health?standbyok=true`);
    return resp.data;
  };

  const getAPIToken = () => {
    return fs.readFileSync(process.env.JWT_PATH, { encoding: "utf-8" });
  };
  const getVaultAuth = async (role) => {
    const resp = await axiosInst.post("/auth/kubernetes/login", {
      jwt: getAPIToken(),
      role,
    });
    return resp.data;
  };
  const getSecrets = async (vaultToken) => {
    const resp = await axiosInst("/secret/data/webapp/config", {
      headers: { "X-Vault-Token": vaultToken },
    });
    return resp.data.data.data;
  };
  return {
    getAPIToken,
    getHealth,
    getSecrets,
    getVaultAuth,
  };
}
module.exports = Vault;

The top of server.js should now look like this:

const process = require("process");
const express = require("express");
const app = express();

// Import my fns that interact with Hashicorp Vault.
const Vault = require("./vault");
...

Let's also modify our server to expose two routes: /config, which shows the config data from the ConfigMap along with the secret vault data, and /films, which serves data from the database we are about to install. We serve the config data for no real purpose other than to prove to ourselves that we haven't broken our previous code; obviously we'd never serve secret data in real life. We'll also serve JSON instead of HTML.

Before the app.get endpoint definitions, add this line, which configures express to serve JSON with 2-space indentation:

...
app.set("json spaces", 2);
...

First, the /config endpoint. Replace the old / endpoint so that the top of server.js looks like this:

const process = require("process");
const express = require("express");
const app = express();

// Import my fns that interact with Hashicorp Vault.
const Vault = require("./vault");
app.set("json spaces", 2);

// config endpoint showing configs, secrets etc.
app.get("/config", async (req, res) => {
  const vault = Vault();
  const vaultAuth = await vault.getVaultAuth("webapp");
  const secrets = await vault.getSecrets(vaultAuth.auth.client_token);
  res.json({
    MY_NON_SECRET: process.env.MY_NON_SECRET,
    MY_OTHER_NON_SECRET: process.env.MY_OTHER_NON_SECRET,
    username: secrets.username,
    password: secrets.password,
  });
});
...

Now point your browser to /config (make sure skaffold dev is running and that you've exposed the app with $ minikube service web-service).
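You should see the config values and the vault secrets rendered as JSON, roughly like this (the actual values are whatever you put in the ConfigMap and in vault in the earlier posts):

{
  "MY_NON_SECRET": "...",
  "MY_OTHER_NON_SECRET": "...",
  "username": "...",
  "password": "..."
}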

Install the kubegres Operator

I assume you're picking up where the previous post left off.

Download the operator manifest to the manifests directory. Make sure you have skaffold dev running; once the manifest is downloaded, your project should rebuild on-the-fly:

$ wget \
   https://raw.githubusercontent.com/reactive-tech/kubegres/v1.9/kubegres.yaml \
   -P manifests/

kubegres.yaml should now be in your manifests directory, and skaffold should pick up the change. Once skaffold does its thing, we should be able to see the controller that kubegres installed. Controllers are k8s objects that represent control loops: they watch the state of your cluster, then make or request changes where needed:

$ kubectl get all -n kubegres-system
NAME                                               READY   STATUS    RESTARTS   AGE
pod/kubegres-controller-manager-798885c897-pzdgh   2/2     Running   6          11d

NAME                                                  TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)    AGE
service/kubegres-controller-manager-metrics-service   ClusterIP   10.106.22.161   <none>        8443/TCP   11d

NAME                                          READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/kubegres-controller-manager   1/1     1            1           11d

NAME                                                     DESIRED   CURRENT   READY   AGE
replicaset.apps/kubegres-controller-manager-798885c897   1         1         1       11d
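
If you'd also like to confirm that the operator's custom resource definition was registered (the exact CRD name can vary between kubegres versions, so grepping is the safest bet):

$ kubectl get crds | grep kubegres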

Create a Secret Resource

Now we are going to create a secret resource that supplies the postgres superuser and replication-user passwords. In the manifests directory, create this yaml file:

manifests/postgres-secret.yaml:

apiVersion: v1
kind: Secret
metadata:
  name: postgres-secret
  namespace: default
type: Opaque
stringData:
  superUserPassword: postgresSuperUserPsw
  replicationUserPassword: postgresReplicaPsw

A few things to note: in real life we'd use strong passwords, and since we have a vault server running it would be much more secure to store these passwords there rather than in a plain-text secret. We'll fix this in the next post when we talk about injecting secrets.
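
Once skaffold applies the manifest, you can confirm the secret exists and peek at a value. kubectl stores secret data base64-encoded, so pipe it through base64 to decode it:

$ kubectl get secret postgres-secret \
   -o jsonpath='{.data.superUserPassword}' | base64 --decode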

Create postgres Pods

Now we'll create our set of postgres pods using the Kubegres CRD:

Create manifests/postgres.yaml:

apiVersion: kubegres.reactive-tech.io/v1
kind: Kubegres
metadata:
  name: postgres
  namespace: default
spec:
   replicas: 3
   image: postgres:13.2

   database:
      size: 200Mi
   customConfig: postgres-conf
   env:
      - name: POSTGRES_PASSWORD
        valueFrom:
           secretKeyRef:
              name: postgres-secret
              key: superUserPassword
      - name: POSTGRES_REPLICATION_PASSWORD
        valueFrom:
           secretKeyRef:
              name: postgres-secret
              key: replicationUserPassword

Once skaffold applies this, we'll have three postgres pods: one primary and two replicas. The pods get their passwords from env vars sourced from the secret defined in postgres-secret.yaml. Again, we'll address the security implications of this in the next post; for now we want to focus on getting the db service up-and-running.
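
To poke around at what kubegres created (these names assume metadata.name is postgres, as above), list the statefulsets, pods and services. There should be one statefulset and pod per replica, plus a service for the primary ("postgres") and one for the read-only replicas ("postgres-replica"):

$ kubectl get statefulsets,pods
$ kubectl get svc | grep postgres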

The postgres Data Directory

You may wonder where postgres' data directory lives in this configuration. After all, we haven't explicitly configured it within a manifest, and this directory is key to how postgres actually persists data. kubegres uses persistent volume claims (PVCs) for the data directory. By default it uses whatever the cluster's default storage class is. In the case of minikube the storage class is already provisioned and is located within minikube's internals:

$ kubectl get sc
NAME                 PROVISIONER                RECLAIMPOLICY   VOLUMEBINDINGMODE   ALLOWVOLUMEEXPANSION   AGE
standard (default)   k8s.io/minikube-hostpath   Delete          Immediate           false                  12d

When we configure the cluster for production, we'll have to customize this and point the PVCs at a storage class that's appropriate for our cloud provider--perhaps Google Persistent Disk. We'll customize the storageClassName property in postgres.yaml when we get to that. Documentation for how to customize this is here: https://www.kubegres.io/doc/properties-explained.html.
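
As a rough sketch of what that customization might look like in postgres.yaml (my-cloud-ssd is just a placeholder for whatever storage class your provider offers), storageClassName sits under the database block:

...
spec:
   replicas: 3
   image: postgres:13.2

   database:
      size: 200Mi
      storageClassName: my-cloud-ssd
...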

Explicitly Order Manifests in skaffold

In the next section we are going to configure the database via a k8s ConfigMap manifest. We want to ensure that our k8s objects are created in the proper order, so change the deploy.kubectl.manifests entry in skaffold.yaml to this:

...
deploy:
  kubectl:
    manifests:
    - manifests/web-configmap.yaml
    - manifests/web-deployment.yaml
    - manifests/web-service.yaml
    - manifests/print-hello.yaml
    - manifests/postgres-conf.yaml
    - manifests/postgres-secret.yaml
    - manifests/postgres.yaml
...

Create postgres User and Database

A client app should connect to its own database, not the default postgres database that exists when postgres is first installed. Additionally, it's a security problem to connect and interact with postgres as the superuser, so we need to figure out the best place to create a new, non-root user and a database. In kubegres, the way to hook into initialization is to override the primary init script. Create a new manifest, manifests/postgres-conf.yaml. In it we embed a shell script that runs in the primary postgres pod; it calls psql, logs in as the superuser, and creates a new non-superuser along with a new database:

apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-conf
  namespace: default

data:
  primary_init_script.sh: |
    #!/bin/bash
    set -e

    customDatabaseName="web"
    customUserName="web"

    psql -v ON_ERROR_STOP=1 --username "$POSTGRES_USER" --dbname "$POSTGRES_DB" <<-EOSQL
    CREATE DATABASE $customDatabaseName;
    CREATE USER $customUserName WITH PASSWORD 'akhd5';
    GRANT ALL PRIVILEGES ON DATABASE "$customDatabaseName" to $customUserName;
    EOSQL

    echo Init script is completed;

Again, in a later post we will move the database name and user name into configs, and the password into a secret in vault. Note that the init script only runs the first time the PVCs are provisioned, so when you're debugging the script you'll have to delete the PVCs to get it to run again:

$ kubectl delete pvc postgres-db-postgres-3-0 && \
   kubectl delete pvc postgres-db-postgres-2-0 && \
   kubectl delete pvc postgres-db-postgres-1-0
   ...
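
To confirm the init script actually ran once the pods are recreated, you can grep the primary pod's logs for the echo line at the end of the script (your pod name and instance index may differ):

$ kubectl logs postgres-1-0 | grep "Init script"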

Connecting the Client to the Database

Now that we have the database pods up and running, we need to connect to the database via the client.

First we install node-postgres, the standard node postgres client:

$ npm i -S pg

Now, let's say we are writing a film app. Create a film.js module next to server.js; it will act as a simple film model:

film.js:

const { Pool } = require("pg");

function getPool() {
  return new Pool({
    user: "web",
    host: "postgres",
    database: "web",
    password: "akhd5",
    port: 5432,
  });
}

function createFilmsTable(pool) {
  pool.query(
    `CREATE TABLE IF NOT EXISTS films (
    title       varchar(40) NOT NULL,
    kind        varchar(10)
);`,
    (_err, _queryRes) => {}
  );
}

function getFilms(pool) {
  return new Promise((resolve) => {
    pool.query(`SELECT title, kind FROM "films";`, (err, queryRes) => {
      resolve(queryRes.rows);
    });
  });
}
async function populateFilmsTable(pool) {
  const films = await getFilms(pool);
  if (films.length === 0) {
    pool.query(
      `INSERT INTO "films"(title, kind)
    VALUES('Superman', 'drama');`,
      (_, queryRes) => {
        return queryRes.rows;
      }
    );
  }
}

module.exports = {
  createFilmsTable,
  getFilms,
  getPool,
  populateFilmsTable,
};

Again, for now we are hard-coding the database name, password, and host in our source code. Never do this in production code! We'll move this sensitive data into vault in the next post.

Note that we use the service name, "postgres", as the host in the database connection. kubegres creates this service from the metadata.name field in postgres.yaml. Anytime we need to reach the primary database, we use the hostname "postgres".
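
If you want to sanity-check the connection details before wiring up the client, you can run psql against the primary service from inside the cluster (the pod name will vary with your instance index, and you'll be prompted for the password set in the init script):

$ kubectl exec -it postgres-1-0 -- psql -h postgres -U web -d web -c '\conninfo'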

The populateFilmsTable function acts as a database fixture. In our case we are hard-coding one row for demo purposes.

Import the functions in server.js from film.js:

...
// Import my fns that interact with postgres.
const {
  createFilmsTable,
  populateFilmsTable,
  getFilms,
  getPool,
} = require("./film");

Before the endpoints are defined, right above:

...
app.get("/config", async (req, res) => {
...

create the database pool and create the films table:

...
let pool = getPool();
createFilmsTable(pool);

app.get("/config", async (req, res) => {
...

Now, create the films endpoint. If there are no films in the database yet, we populate them:

...
app.get("/films", async (req, res, next) => {
  let films = await getFilms(pool);
  if (films.length === 0) {
    await populateFilmsTable(pool);
    films = await getFilms(pool);
  }
  res.json(films);
});

The entire server.js file should now look like this:

const process = require("process");
const express = require("express");
const app = express();
const {
  createFilmsTable,
  populateFilmsTable,
  getFilms,
  getPool,
} = require("./film");

const Vault = require("./vault");

app.set("json spaces", 2);

let pool = getPool();
createFilmsTable(pool);

app.get("/config", async (req, res) => {
  const vault = Vault();

  const vaultAuth = await vault.getVaultAuth("webapp");
  const secrets = await vault.getSecrets(vaultAuth.auth.client_token);
  res.json({
    MY_NON_SECRET: process.env.MY_NON_SECRET,
    MY_OTHER_NON_SECRET: process.env.MY_OTHER_NON_SECRET,
    username: secrets.username,
    password: secrets.password,
  });
});

app.get("/films", async (req, res, next) => {
  let films = await getFilms(pool);
  if (films.length === 0) {
    await populateFilmsTable(pool);
    films = await getFilms(pool);
  }
  res.json(films);
});

app.listen(3000, () => {
  console.log("Listening on http://localhost:3000");
});

Testing Fault-Tolerance

To confirm that replication is working and our setup is fault tolerant, let's kill some postgres pods and confirm our app still works:

$ kubectl get pods | grep postgres
postgres-1-0                         1/1     Running     0          3d2h
postgres-2-0                         1/1     Running     0          3d9h
postgres-3-0                         1/1     Running     0          2d23h

We can see our primary and replication pods. Let's kill two of the replication pods:

$ kubectl delete pod postgres-2-0 && \
   kubectl delete pod postgres-3-0
...

Then:

$ kubectl get pods | grep postgres
postgres-4-0                         1/1     Running     0          3d2h
postgres-5-0                         1/1     Running     0          12s
postgres-6-0                         0/1     Running     0          2s

We can see that replacement replica pods were just brought up. If you hit the app in the browser during the deletion process, you'll see that it never went down.

Killing the primary postgres pod is a bit more problematic, as kubegres has to promote a replica pod to primary. This can take thirty seconds or so, and in the meantime clients cannot connect to the database. This is why kubegres suggests building backoff-and-retry logic into clients that interact with the database.
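
kubegres doesn't implement those retries for you; they belong in the client. As a minimal sketch (a hypothetical helper, not something we've added to the app yet), a query wrapper could retry with an increasing delay so the app rides out a primary failover:

// Retry a query a few times, doubling the wait between attempts, to survive
// the window in which kubegres promotes a new primary.
async function queryWithRetry(pool, text, params = [], retries = 5, delayMs = 1000) {
  for (let attempt = 1; attempt <= retries; attempt++) {
    try {
      return await pool.query(text, params);
    } catch (err) {
      if (attempt === retries) throw err;
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      delayMs *= 2; // back off a bit more each time
    }
  }
}

In the next post we'll tackle the remaining loose end: moving the hard-coded database credentials and the superuser passwords out of plain text and into vault.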
