When you’re writing code with a modern editor or an IDE, you can count on having
a number of convenient features such as syntax highlighting, autocomplete, and
code navigation. Unless you’re writing SQL, of course!
A lot of editors and IDEs, such as IntelliJ IDEA, have really nice support for
SQL. However, it’s not so easy to benefit from it in practice. If you’re writing
Clojure and you want to use SQL to query a database, you have a few options:
Embed SQL into strings in your Clojure code.
Put SQL into .sql files and import them with HugSQL or similar.
Use a high-level query builder or an ORM such as Toucan.
Let’s look into each of them in more detail.
Embed SQL into strings. A basic query would look like this:
(jdbc/querydb["SELECT title, rating FROM movies WHERE movie_id = ?"movie-id]))
This is simple, but most likely you aren’t getting syntax highlighting or other
features for the string. For example, IntelliJ IDEA does support SQL inside Java
strings, but it does not work in Cursive.
Furthermore, if you’d need to parametrize query in a way that is not allowed by
the ? placeholder, you’ll have to resort to string templating, which is prone
to SQL injection bugs.
Put SQL into .sql files. With HugSQL, the SQL file would look something
What is great about this approach is that now you get the full editor support
for SQL. In Cursive, however, you can’t jump from Clojure code that refers to
get-movie-by-id to its definition in SQL. Jumping to definition works with
normal Clojure functions, but Cursive does not know how to deal with functions
defined by HugSQL.1
HugSQL has advanced support for parametrizing your queries using snippets and
Clojure expressions. If you use Clojure expressions, though, you now have the
opposite problem: there’s Clojure code embedded into your SQL comments and
there’s no editor support for it.
Use a low-level query builder. With HoneySQL, the query would be built
There’s a lot to like about this. Your editor’s Clojure support works with this.
You won’t get autocomplete for database identifiers, but completion for commonly
used keywords can be good enough. Queries are Clojure data, so you can use the full power of Clojure to generate them.
The main problem is that you need to learn a new, non-canonical syntax for your
SQL queries. You know how to write the query you want in SQL, but now you need
to figure out how to map it to HoneySQL. It shouldn’t be too hard, but over the
years me and my colleagues have struggled with it.
Use a high-level query builder or an ORM. With Toucan, the query would look like this:
From the editor support perspective, this is about the same as using HoneySQL.
It does save you quite a bit of boilerplate. Toucan relies on HoneySQL for
advanced paremetrization, so the syntax problem remains.
None of the approaches seems like an obvious winner. In practice, every big
project I’ve seen has used a mix of them.
Contrast this with Datomic and MongoDB: the query languages of both can be
represented cleanly enough as Clojure data and so that’s what you use. This
assortment of options does not exist for them because it’s not needed.
In his article Against SQL, Jamie Brandon argues that SQL’s drawbacks cause
“a massive drag on quality and innovation in runtime and tooling”. He does not
mention editor support, but I can’t help but think that it’s an example of the
effects of that drag.
I think jumping to definition works in CIDER and/or Calva, but have not verified it. ↩︎
At first I thought that this is the NaN != NaN feature1 of IEEE 754 floats, but
that is not the case.
clojure.core/identical? checks the reference equality of its arguments.
Its docstring says:
Tests if 2 arguments are the same object
##NaN refers to the constant Double/NaN, which is a primitive double. That is, it’s not
an object. When a primitive value is passed to a Clojure function as an
argument, it gets wrapped into an object. This is called boxing. Concretely this means calling
Double/valueOf, which converts the primitive double into a
The two snippets evaluate to different values because in the first snippet
##NaN gets boxed only once, but in the second snippet each function argument
is boxed separately. This comes down to implementation details of Clojure.
You can see the behavior in the disassembled byte code I posted on Ask Clojure.
When the reference equality of two boxed doubles is compared, they’re considered
not equal even if they wrap the same value. This explains the results we saw.
Here’s a bonus exercise: what does this evaluate to and why?
I’ve used Clojure for a decade and there are still nooks and crannies I’m not familiar with.
I guess it just takes a while to learn a programming language properly.
There are good reasons for the feature, or at least they were good back in the day. A lot of programmers dislike floats, but my hot take is that they’re actually successful solution to a complicated problem. What we should do is to start using decimal floats, which would match programmer intuitions better than binary floats. ↩︎
Creating migrations. The first task is to create a migration script. If
your scripts consist of DDL commands in SQL files, you’ll probably write them by hand.
A linter like Squawk can help you to ensure the migrations do not cause unnecessary downtime.
Things get more interesting if you can describe your target schema in another
programming language or as data. For example, when using Django’s ORM or
automigrate for Clojure, the tool generates the migration automatically
by comparing the target schema to the current schema. Sometimes you may need
to edit the generated migration by hand, but mostly the tool does the job for
Managing migrations. You need to keep track of which migrations have been
applied to each of your databases (local, staging, production). When you want to migrate a database, the tool is able
to give you a list of required migrations and the order in which they should be
performed. The tools tackling this problem end up having essentially same
One problem that migration managers need to solve is how to support multiple
developers creating migrations at once. If you number your migrations
incrementally like Flyway does, migrations in concurrenty-developed
branches may end up using the same number.
To avoid this, some tools such as dbmate and Migratus use timestamps in migration names.
Ben Kuhn’s sketch of a migration tool has an alternative solution: it
uses incrementing numbers, but it provides a rebase command to
automatically renumber the migrations based on Git history. That’s a nice touch!
Executing migrations. At the basic level this means running the DDL commands
from your SQL files. However, as we learned the last time, you need to be
careful to avoid downtime and do things like set lock_timeout in
While every tool that manages migrations can also execute them, there are
specialist tools such as gh-ost for MySQL which focus only on executing
migrations in a robust way without touching any of the other tasks.
Again, not using DDL for migrations seems to open the door for innovation.
For example, Reshape for PostgreSQL has a novel way of executing
migrations: by using schemas, views, and triggers, it allows you to use both the
old and the new schema at the same time. This means that your application does
not have to be backwards-and-forwards compatible with database changes.
As seen, there are plenty of tools out there for dealing with these tasks.
There’s no one-size-fits-all solution. If your needs are simple, a simple tool
will suffice. However, if you’re looking to ensure safety and efficiency, you
should look further.
You could describe the target schema with a CREATE TABLE command and have a tool generate the migration by comparing that against the current schema. I’m not aware of such tool, however. ↩︎