Tasks of a schema migration tool

The last time I wrote about how database schema migration tools could do more to help us. The way I see it, any schema migration solution has to cover three tasks: creating, managing, and executing migrations.

Creating migrations. The first task is to create a migration script. If your scripts consist of DDL commands in SQL files, you’ll probably write them by hand. A linter like Squawk can help you to ensure the migrations do not cause unnecessary downtime.

Things get more interesting if you can describe your target schema in another programming language or as data. For example, when using Django’s ORM or automigrate for Clojure, the tool generates the migration automatically by comparing the target schema to the current schema. Sometimes you may need to edit the generated migration by hand, but mostly the tool does the job for you.1

Managing migrations. You need to keep track of which migrations have been applied to each of your databases (local, staging, production). When you want to migrate a database, the tool is able to give you a list of required migrations and the order in which they should be performed. The tools tackling this problem end up having essentially same features.

One problem that migration managers need to solve is how to support multiple developers creating migrations at once. If you number your migrations incrementally like Flyway does, migrations in concurrenty-developed branches may end up using the same number.

To avoid this, some tools such as dbmate and Migratus use timestamps in migration names. Ben Kuhn’s sketch of a migration tool has an alternative solution: it uses incrementing numbers, but it provides a rebase command to automatically renumber the migrations based on Git history. That’s a nice touch!

Executing migrations. At the basic level this means running the DDL commands from your SQL files. However, as we learned the last time, you need to be careful to avoid downtime and do things like set lock_timeout in PostgreSQL.

While every tool that manages migrations can also execute them, there are specialist tools such as gh-ost for MySQL which focus only on executing migrations in a robust way without touching any of the other tasks.

Again, not using DDL for migrations seems to open the door for innovation. For example, Reshape for PostgreSQL has a novel way of executing migrations: by using schemas, views, and triggers, it allows you to use both the old and the new schema at the same time. This means that your application does not have to be backwards-and-forwards compatible with database changes.

As seen, there are plenty of tools out there for dealing with these tasks. There’s no one-size-fits-all solution. If your needs are simple, a simple tool will suffice. However, if you’re looking to ensure safety and efficiency, you should look further.

  1. You could describe the target schema with a CREATE TABLE command and have a tool generate the migration by comparing that against the current schema. I’m not aware of such tool, however. ↩︎

Comments or questions? Send me an e-mail.