Docker Compose for a Twelve-Factor Ruby Rails app

2016-01-20 · Computing

I’m very much a fan of The Twelve-Factor App methodology. Docker (with Docker Compose) can be used to build on these ideas, allowing greater control over an app’s environment and reinforcing the principles. Here, I present some notes on setting up Docker Compose for a (not-quite) Twelve-Factor Ruby Rails app, specifically from commits I’ve made to Tunefl, my open-source LilyPond mini-score engraving and sharing service for musicians. Note that the Docker Compose documentation has an example of containerising a Ruby Rails app; however, that example installs gems as root and executes with elevated privileges. Besides, I have something more complex in mind! :)

preliminary thoughts

I tend to be cautious about changing from approaches which work well, so have kept an interested eye on the many touted advantages of using Docker without immediately rushing to convert everything. Much like any tool, including a screwdriver, Docker can be applied in a variety of ways to solve a problem. I’m not convinced that using Docker automatically means a solution is better than one without it, or without explicit containerisation. However, there are a few specific things offered by Docker for development which became too tempting for me to resist:

  • version-locking of external service dependencies: Having a specific, repeatable recipe for library dependencies is vital for long-term stability and sanity. I’ve been increasingly trying to get into the habit of explicitly documenting all external service versions in READMEs (e.g. PostgreSQL 9.3). Docker allows these dependencies to be enforced.

  • virtualisation security: If you’re developing within a single organisation, isolated development zones might not be a concern. But if you’re working across multiple security contexts, such as for multiple clients, this becomes something you should consider seriously. If it’s possible to git pull some company’s code and start what you think is a local webserver, only to find a mistake or malicious modification causes another company’s code or keys to be sent over the network, you’ve got a problem. Virtualisation, such as using VirtualBox, protects against this. Docker makes this virtualisation less obtrusive.

  • networking security: Even if you’re running code in isolation, it’s important to remember the security of external services. If you simply access all your PostgreSQL databases under the same user, even setting a password won’t protect you against an app connecting to the wrong database. This could be something quite accidental even within the same app, such as wiping the wrong database when running an automated test suite. That alone is one reason why I prefer to access dev and test databases with different users and passwords, sometimes adding additional protection such as requiring an extra environment variable (e.g. CI=1) to be set for the test stack. Docker not only allows individual service instances to be separated, but also the entire network stack to be segmented, stopping not only one organisation’s project from accessing another’s, but also your test stack from accessing your dev stack.

  • mount restrictions: This is something I’ll confess I don’t usually bother with when developing. But Docker allows easy configuration of volumes, including mounting all of the working directory as read-only and granting specific exceptions. This means I can be sure that an app is not doing something unpleasant like writing a runtime file when I’m not expecting it (which would break behind load-balancing, not to mention fall outside my server backup strategy).

docker-compose.yml

Let’s work inwards, starting from an existing Procfile definition of internal services:

# Procfile

web:            bundle exec thin start -p $PORT
worker:         bundle exec sidekiq

Tunefl also has a .env.example, designed to be copied to .env and adjusted (and copied to .test.env, etc. for multi-stack setups). This is the app’s central settings file in environment variable form, containing external service connections such as DATABASE_URL. I don’t use config/database.yml files or similar. We can continue to use these settings with only minimal changes, also keeping it easy for those who wish to install without Docker.
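
For example, setting up a fresh checkout might look something like this (illustrative only; adjust the copied files to suit your setup):

cp .env.example .env         # settings for the dev stack
cp .env.example .test.env    # optional: separate settings for a test stack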

We’ll be using the newer networking method rather than the Docker container links method, so we need to change hostnames; e.g.:

# .env.example

DATABASE_URL=postgresql://tunefl_dev:password@tunefl_postgres_1.tunefl/tunefl_dev
REDIS_URL=redis://tunefl_redis_1.tunefl:6379/0

For the web service, we set the build context to the current directory, point to the templated .env, mount the current directory as read-only, declare port 8080 within the container (accessible outside the container on an ephemeral port), and specify the command for the web service previously defined in Procfile (PORT isn’t available here, so we specify the port explicitly; it’s only used within the container, so this doesn’t worry me too much).

# docker-compose.yml

---
web:
  build: &build "."
  env_file: &env_file ".env"
  volumes:
    - ".:/srv/tunefl/:ro"
  ports:
    - "127.0.0.1::8080"
  command: "bundle exec thin start -p 8080"

For the worker service, we use the same build context and .env, using YAML node anchors and references to keep the settings in sync (we could also use Docker Compose’s extends syntax), use a non-default Dockerfile as the worker requires additional packages to be installed, connect to the same volumes as web, and specify the command for the worker service previously defined in Procfile. It’s worth noting that although volumes are shared here, that’s only to facilitate Tunefl’s default local-storage setup (rather than AWS S3 or similar), as well as to make gem installation faster.

# docker-compose.yml (continued)

worker:
  build: *build
  env_file: *env_file
  dockerfile: "Dockerfile.worker"
  volumes_from:
    - web
  command: "bundle exec sidekiq"

For the postgres and redis services, we lock to particular versions, and use the images’ defaults.

# docker-compose.yml (continued)

postgres:
  image: "postgres:9.3.10"
redis:
  image: "redis:3.0.5"

I also used the opportunity to change the app to write logs only to STDOUT, rather than also to files in the log/ directory, though this wouldn’t be necessary for an app already following Twelve-Factor properly. Also, because of the read-only mount and some difficulties with db/schema.rb being changed from the repository version on gem installation, I disabled the rake db:schema:dump task, which rake db:migrate calls automatically (this is an old Rails 3 app, so config.active_record.dump_schema_after_migration isn’t available, and the rake db:schema:load method doesn’t work for this app).

Dockerfile

Next, we need a Dockerfile to define how to build the image. In this case, we’ll actually create two: Dockerfile containing the main definition for the web service, and Dockerfile.worker containing the definition for the worker service, which requires additional packages to be installed. Where possible, we’ll speed up the builds by considering carefully the order of the Docker layers, and using a shared volume for gem installation. For an app with fewer services or less complex package dependencies, one Dockerfile would likely be sufficient.

We use SYNC comments, which have no special meaning, to denote which sections of Dockerfile and Dockerfile.worker should be kept in sync, either for Docker layer optimisation (SYNC: Dockerfile/1), or because it’s a shared pattern (SYNC: Dockerfile/2).

Dockerfile—web service

We base the build on the official Docker ruby:2.2.3 image, install packages needed for gem installation, and create a user:

# Dockerfile

# SYNC: Dockerfile/1 {
FROM ruby:2.2.3
RUN \
    apt-get update -y && \
    apt-get install -y \
        build-essential \
        libpq-dev && \
    useradd --home-dir /srv/tunefl/ --shell /usr/sbin/nologin tunefl
# SYNC: }

Next, we create the skeleton structure within our writeable volumes, and set ownership. It’s probably not necessary to create every one of these directories, but I had a list handy from Capistrano deployments, so I used this. Note that we don’t actually chown /srv/tunefl itself, as the default permissions allow the unprivileged user read access, and that’s all we need.

# Dockerfile (continued)

RUN \
    mkdir \
        /srv/tunefl.bundle/ \
        /srv/tunefl/ \
        /srv/tunefl/public/ \
        /srv/tunefl/public/assets/ \
        /srv/tunefl/public/system/ \
        /srv/tunefl/public/uploads/ \
        /srv/tunefl/tmp/ \
        /srv/tunefl/tmp/cache/ \
        /srv/tunefl/tmp/pids/ \
        /srv/tunefl/tmp/sockets/ && \
    chown -R tunefl \
        /srv/tunefl.bundle/ \
        # not /srv/tunefl/
        /srv/tunefl/public/ \
        /srv/tunefl/tmp/

Next, we copy the library dependency definitions. These aren’t owned by the unprivileged user either, ensuring that they cannot be accidentally changed when we install the gems.

# Dockerfile (continued)

COPY [ \
    "Gemfile", \
    "Gemfile.lock", \
    "/srv/tunefl/"]

We set the working directory and become the unprivileged user, which will be used both to install the gems and to run the web service itself. We define BUNDLE_APP_CONFIG to point to a writeable volume owned by the unprivileged user. Then, we install the library dependencies.

# Dockerfile (continued)

# SYNC: Dockerfile/2 {
WORKDIR /srv/tunefl/
USER tunefl
ENV BUNDLE_APP_CONFIG /srv/tunefl.bundle/
# SYNC: }

RUN bundle install --path /srv/tunefl.bundle/

Finally, we declare the writeable volumes: /srv/tunefl.bundle to contain installed gems, and /srv/tunefl/public/ and /srv/tunefl/tmp/ to be mounted over the top of read-only /srv/tunefl. Ideally, we’d switch tmp/ to another location entirely, but this doesn’t appear to be straightforward with Rails 3.

# Dockerfile (continued)

VOLUME [ \
    "/srv/tunefl.bundle/", \
    "/srv/tunefl/public/", \
    "/srv/tunefl/tmp/"]

Dockerfile.worker—worker service

We keep the SYNC: Dockerfile/1 section in sync with that in Dockerfile, rather than modifying it to add our additional packages. This enables the Docker layer cache to be used, significantly increasing the speed of builds.

# Dockerfile.worker

# SYNC: Dockerfile/1 {
FROM ruby:2.2.3
RUN \
    apt-get update -y && \
    apt-get install -y \
        build-essential \
        libpq-dev && \
    useradd --home-dir /srv/tunefl/ --shell /usr/sbin/nologin tunefl
# SYNC: }

Next, we install the additional packages needed for the worker service. The lilypond package has lots of dependencies, and installation takes a long time. This fragment is placed so as to maximise the shared layers between Dockerfile and Dockerfile.worker. However, web does not need lilypond, so we keep that build clean, which is the whole motivation for the separate Dockerfile.worker.

# Dockerfile.worker (continued)

RUN \
    apt-get update -y && \
    apt-get install -y \
        lilypond

Finally, we keep the SYNC: Dockerfile/2 section in sync with that in Dockerfile, simply to reuse the shared pattern. We don’t benefit from the Docker layer cache across Dockerfiles here, as the previous steps have already diverged. Note that we don’t need to set up any directories or install any gems, as we’re reusing the web service volumes. This means that gem installation should still be fast, because the work has already been done in Dockerfile.

# Dockerfile.worker (continued)

# SYNC: Dockerfile/2 {
WORKDIR /srv/tunefl/
USER tunefl
ENV BUNDLE_APP_CONFIG /srv/tunefl.bundle/
# SYNC: }

usage

In this section, I note some common usage commands. I’m using Docker Compose 1.5.2, which means the --x-networking flag is required to activate the automatic handling of a segregated project-specific bridge network. If you’re using Docker Compose 1.6 onwards, you probably won’t need this flag.
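
To check which version of Docker Compose you have installed:

docker-compose --version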

To build and start all services, both internal and external:

docker-compose --x-networking up
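
To rebuild the images (for example, after changing one of the Dockerfiles):

docker-compose build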

To view all running containers and their names:

docker ps
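
To view the aggregated logs of all services (recall that the app now writes its logs only to STDOUT):

docker-compose logs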

To open a browser pointed to whichever ephemeral port Docker has connected the web service to (you can also map to host ports explicitly, if you prefer):

xdg-open "http://$(docker-compose port web 8080)" # Linux

To connect to PostgreSQL:

docker exec -it tunefl_postgres_1 psql -U postgres

To connect to Redis:

docker exec -it tunefl_redis_1 redis-cli
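
To check from inside a container that the per-project hostnames resolve (getent should be available in these Debian-based images):

docker exec tunefl_web_1 getent hosts tunefl_postgres_1.tunefl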

To migrate a database using rake db:migrate:

docker exec tunefl_web_1 bundle exec rake db:migrate
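
To open a Rails console inside the running web container:

docker exec -it tunefl_web_1 bundle exec rails console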

To monitor Resque or Sidekiq jobs from the command-line using my Sidekiq Spy:

docker exec -it tunefl_worker_1 sh \
    -c 'TERM=xterm bundle exec sidekiq-spy -h tunefl_redis_1.tunefl -n resque'
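
As a rough sketch of one way to run a second, isolated stack (for example, for tests), you could point Compose at a different project name and a hypothetical docker-compose.test.yml reading .test.env; the hostnames in .test.env would need adjusting too, and the Tunefl README describes the approach actually used:

docker-compose -f docker-compose.test.yml -p tunefl_test --x-networking up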

More usage notes, including an approach to handling multiple stacks such as for executing an automated test suite, can be found in the Tunefl README.

Code related to this post is available from the Tunefl repository.