
2015

lightweight database migrations using PostgreSQL

Update 2016-01-26: lightweight database migrations using PostgreSQL 2 details an update to the ideas outlined in this post, fixing various issues and streamlining the overall approach.


How can I run database migrations in a simple manner, with minimal dependencies, and using the database itself rather than a separate utility? To explore this, I use PostgreSQL. Not wanting to use anything particularly exclusive, I opt for PostgreSQL functions using SQL rather than PL/pgSQL. The latter would likely reduce duplication, but I don’t mind a little duplication so long as it’s centralised and gives me something in return. I’m also quite happy with using common utilities like cat and psql.

creating migrations

stage 0: initialisation

We need a way of storing metadata for migrations. We could use a dedicated table, but I opt for a more generic _meta table in case I want to store other metadata directly in the database in a similar format.

-- database/0_init.sql

CREATE TABLE IF NOT EXISTS _meta (
    key     VARCHAR     PRIMARY KEY,
    value   VARCHAR
);

INSERT INTO _meta (key, value) VALUES ('version', 0);

We could run this separately once, or just include it along with the migrations, if we’re not worried about the INSERT failure on subsequent runs. (If you’re running with psql -1, this would be a problem, but I’m specifying transactions explicitly.)

We create some temporary functions, which our migrations will be able to use:

-- database/0_init.sql

CREATE FUNCTION pg_temp.version() RETURNS integer AS $$
    SELECT value::integer FROM _meta WHERE key = 'version';
$$ LANGUAGE sql;

CREATE FUNCTION pg_temp.version(integer) RETURNS boolean AS $$
    UPDATE _meta SET value = $1 WHERE key = 'version';
    SELECT TRUE;
$$ LANGUAGE sql;

CREATE FUNCTION pg_temp.migrate(integer) RETURNS boolean AS $$
    SELECT CASE WHEN pg_temp.version() = $1 - 1 THEN NULL ELSE FALSE END;
$$ LANGUAGE sql;

The pg_temp.version() function gets the current migration version; we need to cast the type because of the generic _meta table. The pg_temp.version(integer) function sets the migration version; it returns boolean to be compatible with how we'll run the migrations, and to indicate that the migration actually executed. The pg_temp.migrate(integer) function returns whether a particular migration should be run: a migration should only be run if it is strictly the next one in the sequence. This excludes migrations which have previously been run, whilst protecting against accidentally running migrations in the wrong order.
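
As a hypothetical sanity check (not part of the migration files), assuming the database is still at version 0, the helpers behave as follows; this is the behaviour the COALESCE() in stage 2 relies on:

SELECT pg_temp.version();   -- 0
SELECT pg_temp.migrate(1);  -- NULL: version is 0, so migration 1 is next and should run
SELECT pg_temp.migrate(2);  -- FALSE: migration 2 is not next, so it must not run yet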

stage 1: migrations

We create an example migration:

-- database/1_migrate_1.sql

CREATE FUNCTION pg_temp.migrate_1() RETURNS boolean AS $$
    INSERT INTO _meta (key, value) VALUES ('tmp-1', 1);

    SELECT pg_temp.version(1);
$$ LANGUAGE sql;

This is the format each of our migrations should be in: in its own file if you prefer; numbered consecutively starting from 1; wrapped in a temporary function named with the migration number as a suffix; returning boolean, with the final statement a SELECT calling the migration version setter. The INSERT INTO statement here is merely an example of migration work.
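
For example, a hypothetical second migration would follow the same pattern; the file name, function name, and INSERT below are illustrative only:

-- database/1_migrate_2.sql (hypothetical second migration)

CREATE FUNCTION pg_temp.migrate_2() RETURNS boolean AS $$
    INSERT INTO _meta (key, value) VALUES ('tmp-2', 2);

    SELECT pg_temp.version(2);
$$ LANGUAGE sql;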

stage 2: execution

We need a way of calling the migration functions:

-- database/2_exec.sql

BEGIN;
    SELECT
        pg_temp.version() "version_f",
        COALESCE(pg_temp.migrate(1), pg_temp.migrate_1()) "migrate_1",
        pg_temp.version() "version_t"
    ;
COMMIT;

We wrap this in a transaction and chain the calls in the correct order. With PostgreSQL, even DDL statements get rolled back if the transaction fails. The two calls to pg_temp.version() report the starting and ending migration versions. The COALESCE() ensures a migration only runs if it is next in sequence and hasn't already been run, relying on lazy evaluation of its arguments.
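
If the hypothetical second migration sketched in stage 1 were added, database/2_exec.sql would gain one COALESCE() line per migration, keeping the calls in sequence:

-- database/2_exec.sql (hypothetical extension for a second migration)

BEGIN;
    SELECT
        pg_temp.version() "version_f",
        COALESCE(pg_temp.migrate(1), pg_temp.migrate_1()) "migrate_1",
        COALESCE(pg_temp.migrate(2), pg_temp.migrate_2()) "migrate_2",
        pg_temp.version() "version_t"
    ;
COMMIT;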

We make a small script to ensure we don’t forget which order to call the files in, returning the entire SQL code to be executed:

#!/bin/sh
# database.sql.sh

cat \
        database/0_init.sql \
        database/1_migrate_*.sql \
        database/2_exec.sql

This uses cat to concatenate the files in the correct order of stages: 0, 1, 2. It doesn't matter in which order the individual migrations in stage 1 are concatenated, as they only define functions; it's only in stage 2 that they're conditionally executed.

running migrations

Finally, we run the migrations against the database, using psql -x to transpose the output to list one line per migration:

database.sql.sh | psql -x
CREATE TABLE
INSERT 0 1
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
BEGIN
-[ RECORD 1 ]
version_f | 0
migrate_1 | t
version_t | 1

COMMIT

The CREATE TABLE and INSERT lines indicate that the _meta table was initialised. The CREATE FUNCTION lines indicate that our migrations have been loaded. The RECORD 1 shows that the migration version started at 0, ran migrate_1, and ended at 1.

If we run the migrations again:

database.sql.sh | psql -x
CREATE TABLE
ERROR:  duplicate key value violates unique constraint "_meta_pkey"
DETAIL:  Key (key)=(version) already exists.
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
BEGIN
-[ RECORD 1 ]
version_f | 1
migrate_1 | f
version_t | 1

COMMIT

The _meta table initialisation gives an error, but we can ignore that. The RECORD 1 shows that the migration version started at 1, didn’t run migrate_1, and ended at 1.

concluding thoughts

I like the simplicity of this approach, with SQL files able to be used without being wrapped in another language, and requiring only cat and psql as dependencies. I don't like the repetition in database/2_exec.sql, although it does have its advantages, in being able to define a custom name for each migration to display in the final report. Alternatively, a more complex function using PL/pgSQL could likely be used, wrapping the calls in a loop. The use of side-effects when calling via SELECT is rather hacky; perhaps I'd prefer it split into multiple statements. That would also make me more comfortable about evaluation-order guarantees of parameters; COALESCE provides the necessary assurances, but I'm not sure about SELECT. We could wrap a single transaction around the outside of everything using psql -1, but I'd worry that I'd forget to type that one day and run migrations outside of a transaction by mistake.
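
For illustration only, a rough sketch of that PL/pgSQL alternative might look something like the following; it is a hypothetical variant, not the approach used above, and assumes the caller passes the highest migration number:

-- hypothetical PL/pgSQL wrapper, replacing the repetition in database/2_exec.sql

CREATE FUNCTION pg_temp.migrate_all(max_version integer) RETURNS void AS $$
DECLARE
    v integer;
BEGIN
    -- start from the first pending migration; each pg_temp.migrate_N()
    -- still sets the version itself, so the loop simply calls them in order
    FOR v IN (pg_temp.version() + 1) .. max_version LOOP
        EXECUTE format('SELECT pg_temp.migrate_%s()', v);
    END LOOP;
END;
$$ LANGUAGE plpgsql;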

The code in this post is also available in my coding notebook.

bank-account-statement 1.0.0 (Ruby) released

Dear Pixelings,

bank-account-statement: Bank account statement format transformation (HTML, TXT to OFX, CSV). (MIT Licence)

https://rubygems.org/gems/bank-account-statement
https://github.com/tiredpixel/bank-account-statement-rb

I’m pleased to announce bank-account-statement 1.0.0, the first major release!

Bank Account Statement is a program for transforming the format of bank account statements. For some reason, many banks don’t offer online bank statements in a readily-consumable format (this seems to be especially true in the UK). For example, despite it being possible to view bank statements online for UK bank The Co-operative Bank, statements for personal current (checking) accounts cannot be downloaded except in HTML. Business accounts often don’t fare much better.

Bank Account Statement mitigates this problem by providing input parsers and output generators, with a simple executable. Unlike various other similar programs, I am prepared to accept pull-requests for other banks and output formats. (Please remember to sanitise test fixtures!)

Changelog

1.0.0

  • [#2] add input format HTML/CPBKGB22/Personal/CreditCard/V_2011_04_09, parsing HTML The Co-operative Bank (GB) Personal Credit Card bank account statements downloaded from 2011-04-09 onwards [tiredpixel]

  • [#2] add input format HTML/CPBKGB22/Personal/CreditCard/V_2015_05_27, parsing HTML The Co-operative Bank (GB) Personal Credit Card bank account statements downloaded from 2015-05-27 onwards [tiredpixel]

  • [#2] extend output format OFX/V_2_1_1 to support credit card statements [tiredpixel]

  • [#7] fix HTML/CPBKGB22/Personal negative balances throughout [tiredpixel]

  • [#5] add input format TXT/CPBKGB22/Business/Current/V_2015_12_06, parsing TXT The Co-operative Bank (GB) Business Current account statements downloaded from 2015-12-06 onwards [tiredpixel]

  • [#4] add output format CSV/Column_2, generating ‘2-column’ (separate withdrawals and deposits columns) CSV files [tiredpixel]

  • first major release! :D supporting: The Co-operative Bank (GB) Personal account HTML input (2 Credit Card formats, 2 Current formats, 2 Savings formats); The Co-operative Bank (GB) Business account TXT input (1 Current format); OFX (Open Financial Exchange) 2.1.1 output; CSV output

Checksums

MD5

45cc7497b8e31f18fd14ee0bb7e17787  bank-account-statement-1.0.0.gem

SHA-256

c83d8272d3dad1167194d4963e9d557cb8167079386ef86ee916ed70917b651d  bank-account-statement-1.0.0.gem

Security

In addition to the checksums above, the Git tags are signed using my OpenPGP key, and the RubyGems packages are signed, supporting:

gem install bank-account-statement -P MediumSecurity

Peace,
tiredpixel

bank-account-statement 0.1.1 (Ruby) released

Dear Pixelings,

bank-account-statement: Bank account statement format transformation (HTML to OFX). (MIT Licence)

https://rubygems.org/gems/bank-account-statement
https://github.com/tiredpixel/bank-account-statement-rb

I’m pleased to announce bank-account-statement 0.1.1, the very first release! :D

Bank Account Statement is a program for transforming the format of bank account statements. For some reason, many banks don’t offer online bank statements in a readily-consumable format (this seems to be especially true in the UK). For example, despite it being possible to view bank statements online for UK bank The Co-operative Bank, statements for personal current (checking) accounts cannot be downloaded except in HTML. Business accounts often don’t fare much better.

Bank Account Statement mitigates this problem by providing input parsers and output generators, with a simple executable. Unlike various other similar programs, I am prepared to accept tested pull-requests for other banks and output formats. (Please remember to sanitise test fixtures!)

Changelog

0.1.1

  • update packaging; new version so tags match built packages

0.1.0

  • first release! :D MIT Licence

  • support Ruby 2.2.3 only

  • bank-account-statement executable: --in, --in-format, --out, --out-format, --in-formats, --out-formats, --help, --version [tiredpixel]

  • add input format HTML/CPBKGB22/Personal/Current/V_2011_05_07, parsing HTML The Co-operative Bank (GB) Personal Current account pre-2015-03-03 statements [tiredpixel]

  • [#1] add input format HTML/CPBKGB22/Personal/Current/V_2015_03_03, parsing HTML The Co-operative Bank (GB) Personal Current account 2015-03-03 onwards statements [tiredpixel]

  • [#3] add input format HTML/CPBKGB22/Personal/Savings/V_2011_05_07, similar to HTML/CPBKGB22/Personal/Current/V_2011_05_07 [tiredpixel]

  • [#3] add input format HTML/CPBKGB22/Personal/Savings/V_2015_03_03, similar to HTML/CPBKGB22/Personal/Current/V_2015_03_03 [tiredpixel]

  • add output format OFX/V_2_1_1, generating OFX (Open Financial Exchange) 2.1.1 [tiredpixel]

Checksums

MD5

28416921aa047bea26f849c5421a7ade  bank-account-statement-0.1.1.gem

SHA-256

67dd600f569b26e49968fe1a629340ee469ed7c7224115a87e1b15ff29efbc1f  bank-account-statement-0.1.1.gem

Security

In addition to the checksums above, the Git tags are signed using my OpenPGP key, and the RubyGems packages are signed, supporting:

gem install bank-account-statement -P MediumSecurity

Peace,
tiredpixel

GPG Key Transition Statement 3BD520F6 - 2F44FAFC

/2015/11/25/gpg-key-transition-statement-3bd520f6-2f44fafc.md.asc

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I’ve set up a new GPG key and email address, and am transitioning away from the
old one.

Old key (<tp@tiredpixel.com>):

        pub   2048R/3BD520F6 2013-02-02 [expires: 2015-12-04]
              Key fingerprint = 03B4 023C A1B3 53B9 9A56  044E 5F38 11EC 3BD5 20F6
        uid                  tiredpixel <tp@tiredpixel.com>

New key (<tiredpixel@posteo.de>):

        pub   2048R/2F44FAFC 2015-09-05 [expires: 2015-12-04]
              Key fingerprint = 36D4 DEB7 320A B51A C28D  7F8E D68A 70E3 2F44 FAFC
        uid                  tiredpixel@posteo.de <tiredpixel@posteo.de>

To download new key directly:

        https://www.tiredpixel.com/files/gpg/2F44FAFC.asc

To download new key using keyserver:

        gpg --keyserver keys.gnupg.net --recv-key 2F44FAFC

To display fingerprint of new key (check matches fingerprint above):

        gpg --fingerprint 2F44FAFC

To display signatures of new key (to check old key has signed):

        gpg --check-sigs 2F44FAFC

To verify this message:

        curl -sSL https://www.tiredpixel.com/2015/11/25/gpg-key-transition-statement-3bd520f6-2f44fafc.md.asc | gpg --verify

Peace, tiredpixel
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1

iQEcBAEBAgAGBQJWVamdAAoJEOFolTkanF7VihEIALOCX8k+5RvEPBznq2ZI3ESQ
dJHxXNDJPz+JLLPKpvw4uV/3n7FxE5p2aTh/O6d0exALrCbzXSHoLw+fYYpdfZ7K
BVl2c2TvbBcyTI+j1XDeEYD+ZtHQ5X1P3XVHX2+DDX7bU8G90YNzjjeQLuv/gKyn
65N4sovlPVqN3OqA9ZHD6ztqI+DD5t6P2bALpjJ3HU9JYMYllTGzDJ1MfdwPD82O
jMzqeh93fMa3Bl9D8kaAB73ixWB+H9IMQhF6hPGHWlIghK24yOimPaUfJqyLYRHY
glwwkx5wJdUbm7biJTsXvWEUCCvbPPU12p0jZXGy/YJde6XZNioFtTY5plXDXvKJ
ARwEAQECAAYFAlZVqZ0ACgkQ1opw4y9E+vyKEQgAiydjrRjgA7yu7HGkiMGaSe6s
Cupus55x34kdTXUQjPC2lKXtzzckAEkd516/kbW+gT5qBxmwZ8SLF6IvPIZn130O
b3G/AQgELXV+dH5PsMlVgxiNwoLK6ytX1bfusCbAXCKzTCw2xxhgleuaFUm5nEck
X7wqu9OW2SXgWKGz6FtKZzMpClJ6xpDmixlch+y7Rts28Hso+HUnrEyxvyph4T1s
PiXZm5eCDYJ+vl/rjr4xsa80PIe8tIhk8XlPUTvt/phBXnmXYV2T3wzefdbHpOnG
ll3eH4TIsRHnlfDSTESvGNxc35xBJY2p0On40UPf7GMBAKu6R40Je4dHdbAJLw==
=HqFT
-----END PGP SIGNATURE-----

pikka-bird-puppet 0.1.0 Python ops monitoring tool released


Dear Pixelings,

pikka-bird-puppet: Pikka Bird ops monitoring tool Puppet module.

https://forge.puppetlabs.com/tiredpixel/pikka_bird
https://github.com/tiredpixel/pikka-bird-puppet

I’m pleased to announce pikka-bird-puppet 0.1.0—the very first release, providing support for pikka-bird-collector 0.2.0 and pikka-bird-server 0.1.0.

Changelog:

  • first release! :D

  • configure a fully-functional metrics collector using just 1 parameter; around 30 parameters supported in total [tiredpixel]

  • configure a fully-functional metrics server using just 1 parameter (actually, you could do it with 0); around 25 parameters supported in total [tiredpixel]

  • support for Pikka Bird Collector 0.2.0, including: creating system user, upgrading Pip, installing Collector package, defining service, bringing online and sending metrics to Server [tiredpixel]

  • support for Pikka Bird Server 0.1.0, including: installing package bindings dependencies, installing Server package; defining service, migrating database, bringing online and receiving metrics from Collector [tiredpixel]

  • automatic database schema migrations [tiredpixel]

  • no Collector service configs (yet), but already automatic support for: load average; CPU usage; memory usage; disk usage [tiredpixel]

Peace,
tiredpixel

pikka-bird-collector 0.2.0 Python ops monitoring tool released


Dear Pixelings,

pikka-bird-collector: Pikka Bird ops monitoring tool Collector component.

https://pypi.python.org/pypi/pikka-bird-collector
https://github.com/tiredpixel/pikka-bird-collector-py

I’m pleased to announce pikka-bird-collector 0.2.0—a release providing out-the-box support for gathering metrics from MongoDB, MySQL, PostgreSQL, RabbitMQ, and Redis.

Changelog:

  • added MongoDB collector, supporting core status and replication status

  • added MySQL collector, supporting core status, master status, slave status, slave hosts, and variables

  • added PostgreSQL collector, supporting core status, replication status, and settings

  • added RabbitMQ collector, supporting core status and cluster status

  • added Redis collector, supporting core status and cluster status

  • reduced System collector payload size, cutting it by around 26% when using binary

  • added --conf support, reading single-file collector configs in JSON and YAML formats, or multi-file conf.d/-style directories

  • made numerous refactorings and test improvements throughout, including extending documentation extensively to provide examples of collector payloads

Peace,
tiredpixel