Terraforming Ghost: Persistent Volume - PART 3

Automate Ghost blog with Terraform, Docker and Cloudflare. Part 3: Data persistence with Digitalocean Block Storage volumes.

In the first two articles of this series, we built a repeatable setup for Ghost and Commento: infrastructure on DigitalOcean and Cloudflare, automated with Docker and Terraform. The result, however, was entirely ephemeral. In this article, we will add data persistence to our configuration using DigitalOcean Block Storage.

(Photo by Matteo Catanese)

Before moving further, a couple of links to the earlier chapters of this tutorial:

Also, the code is available on GitHub under the MIT License. Grab it now, use it to follow along, and then deploy your own stack.

git clone https://github.com/Vortexmind/terraforming-ghost.git

Target Architecture

To begin with, here is a diagram showing our objective. Compared to the previous diagrams, I decided to try a different tool: Excalidraw is quite fun to use, and you should give it a spin too if you haven't already!

Target Architecture for our Terraformed Ghost Part 3

This tutorial focuses on adding the part on the right: DigitalOcean Block Storage. This is a separate volume that can be attached to (and detached from) a DigitalOcean Droplet as needed, and used as persistent storage.

The idea is that we will use it to store our databases' data folders, our persistent configuration and our Let's Encrypt certificates. This decouples them from the lifecycle of the Droplet: we can create and destroy the Droplet without losing the blog data and other key configuration every time.

What we want to do is the following:

  • Create a DigitalOcean Block Storage volume. We will use the DigitalOcean API to achieve this.
  • Modify our Terraform configuration to look up our volume and attach it to the droplet we already manage.
  • Modify our Cloud-Init file to mount the volume and create (if necessary) the required folder structure.
  • Modify our Docker Compose configuration to use the newly created volume as mount points for our containers.

There is a point I wanted to clear up before starting: why do I use the DigitalOcean API (and not Terraform) to manage the storage volume? My reasoning is simple: since the volume is meant to be persistent, I don't want to include it in the Terraform configuration. In short, when I run terraform destroy I don't want to delete the volume, and when I run terraform apply I want to pick up the same data I had before.

If I need to start from scratch I can always delete and recreate the volume, but this happens outside of the Terraform lifecycle. We still want Terraform to automate the attach/detach part.

Makes sense? Let's dive into the steps!

Digitalocean Block Storage

To create the volume, we first need to set up a DigitalOcean API token. The process is explained in detail at this link.

Once you have grabbed your token, go to your terminal and set it as an environment variable so that it can be used by the simple script I created. The script expects the DO_TOKEN variable to be populated with your token.

export DO_TOKEN='... your token here ...'

Then, in the scripts folder of my GitHub repository you will find a manage_volume.sh script. This is a simple wrapper for the DigitalOcean API calls required to create or delete the volume. To see the available parameters, just run it without any arguments.

./manage_volume.sh

By default, the volume will be named ghostvol, with a size of 15 GB, in the lon1 DigitalOcean region. If you are happy with these defaults, you can simply run the following command:

./manage_volume.sh -o create

And your empty volume, pre-formatted with an ext4 filesystem, will be created in DigitalOcean.
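
Under the hood, the script simply calls the DigitalOcean API. As a rough sketch, creating the same volume manually with curl would look like this (the values mirror the script's defaults):

curl -X POST "https://api.digitalocean.com/v2/volumes" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "ghostvol", "region": "lon1", "size_gigabytes": 15, "filesystem_type": "ext4"}'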

Terraform configuration

Let's focus on the main changes introduced since our previous version (click here for the full diff):

data "digitalocean_volume" "block-volume" {
  name   = var.digitalocean_volume_name
  region = var.digitalocean_droplet_region
}

[...]

resource "digitalocean_volume_attachment" "vol-attachment" {
  droplet_id = digitalocean_droplet.web.id
  volume_id  = data.digitalocean_volume.block-volume.id
}
digitalocean.tf

Here we have declared:

  • A digitalocean_volume data source. This looks up the volume we created via the API so that Terraform can see and manage it.
  • A digitalocean_volume_attachment resource, which instructs Terraform to attach our volume to the droplet it manages.

We have also modified the droplet user_data to include an additional template variable, digitalocean_volume_name, which is passed to our Cloud-Init template.
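
In practice, the rendering looks roughly like this (a sketch; check the repository for the exact template path and variable set):

resource "digitalocean_droplet" "web" {
  # [...]
  user_data = templatefile("web-cloud-init.yaml", {
    digitalocean_volume_name = var.digitalocean_volume_name
    # ... other template variables ...
  })
}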

The rest of the Terraform changes are pretty much self-explanatory. For convenience, I have updated output.tf to include some "copy & paste" commands that are useful for managing our installation.
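
As an illustration, an output along these lines gives you a ready-made SSH command (the output name here is illustrative; see output.tf in the repository for the actual outputs):

output "ssh_command" {
  value = "ssh root@${digitalocean_droplet.web.ipv4_address}"
}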

Cloud-Init configuration & Docker Compose

Finally, we have changed our existing Cloud-Init template, web-cloud-init.yaml, which includes the Docker Compose declaration as well.

The main change in the Docker Compose configuration is the removal of the Docker volume declarations, which we replaced with simple bind mounts.

We do this because we do not want to rely on Docker to manage our volumes. As before, the volume has its own lifecycle; we just want to mount it on the host and map it into the container folders, so that the containers find the data they need directly.
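
As a sketch, a database service now binds a folder on the block storage volume straight into the container (service and image names here are illustrative; the real definitions live in web-cloud-init.yaml):

services:
  mysql:
    image: mysql
    volumes:
      # host folder on the mounted volume -> MySQL data directory in the container
      - /mnt/ghostvol/mysql_data:/var/lib/mysql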

Another key change is in the runcmd section of our Cloud-Init file:

runcmd:
[...]
- mkdir -p /mnt/${digitalocean_volume_name}
- mount -o discard,defaults,noatime /dev/disk/by-id/scsi-0DO_Volume_${digitalocean_volume_name} /mnt/${digitalocean_volume_name}
- echo '/dev/disk/by-id/scsi-0DO_Volume_${digitalocean_volume_name} /mnt/${digitalocean_volume_name} ext4 defaults,nofail,discard 0 0' | sudo tee -a /etc/fstab
- mkdir -p /mnt/${digitalocean_volume_name}/mysql_data
- mkdir -p /mnt/${digitalocean_volume_name}/postgres_data
- mkdir -p /mnt/${digitalocean_volume_name}/www_data
- mkdir -p /mnt/${digitalocean_volume_name}/certificates_data
- mkdir -p /mnt/${digitalocean_volume_name}/logs_data
web-cloud-init.yaml

When the droplet starts up, we create the mount point for our volume.

We then mount the volume and add it to /etc/fstab so the mount survives a reboot. We then create all the folders that serve as bind mounts for when we start docker-compose later in the instruction list. The -p flag makes mkdir a no-op if a folder already exists (for example, when the volume was already bootstrapped with our data).
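
Once the droplet is up, a quick check over SSH confirms everything worked (assuming the default ghostvol volume name):

df -h /mnt/ghostvol        # volume is mounted with the expected size
ls /mnt/ghostvol           # data folders created by Cloud-Init
grep ghostvol /etc/fstab   # fstab entry is in place for reboots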

Putting it all together

With the above changes implemented, we are now able to:

  • Start our stack for the first time
  • Access our Ghost blog and create the admin accounts (for Ghost and Commento)
  • Write some posts and comments, modify Ghost settings, and so on
  • Tear down the environment. Crucially, unless we decide to delete the volume, it will remain available for future use.
  • Stand up our Terraform stack again; the new Droplet will re-attach to our existing volume, so we can continue where we left off without any data loss (see the sketch below).
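
The whole cycle boils down to:

terraform destroy   # droplet and volume attachment are removed; the volume survives
terraform apply     # a new droplet is created and the existing volume is re-attached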

This also allows us, for example, to take snapshots of just the volume instead of backing up the entire droplet, which can save us some costs as well.
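As a sketch, a volume snapshot is a single API call ($VOLUME_ID is a placeholder you would need to look up first, for example via the /v2/volumes endpoint):

curl -X POST "https://api.digitalocean.com/v2/volumes/$VOLUME_ID/snapshots" \
  -H "Authorization: Bearer $DO_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"name": "ghostvol-snapshot"}'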

Going forward, I plan to review and add further capabilities to this setup, but this was an important step away from a completely ephemeral setup!

Have you tried to run this? How did it work for you? Let me know your thoughts in the comments. If you missed the earlier episodes of this tutorial, I am including the links again below so you can see the whole picture:

Until next time, and happy hacking!

If you liked this article, follow me on Twitter for more updates!
