What does `identical?` do?

Dear fellow Clojure enthusiasts, do you know what the following two code snippets evaluate to? And why is that the result?

(identical? ##NaN ##NaN)
(let [x ##NaN] (identical? x x))

I didn’t know a couple of days ago, and it took me a while to understand why. Now I want to share what I learned.

Go on, make a guess and check it with a REPL. When you’re ready – or if you already saw me post about it on Twitter – scroll past the photos below for an explanation.

The photos in this post are from Lill-Skorvan island near Porkkalanniemi, Finland.

Here are the results on Clojure 1.11.1:

(identical? ##NaN ##NaN)          ; => true
(let [x ##NaN] (identical? x x))  ; => false

At first I thought this was the NaN != NaN feature¹ of IEEE 754 floats, but that is not the case.

clojure.core/identical? checks the reference equality of its arguments. Its docstring says:

Tests if 2 arguments are the same object

##NaN refers to the constant Double/NaN, which is a primitive double. That is, it’s not an object. When a primitive value is passed to a Clojure function as an argument, it gets wrapped into an object. This is called boxing. Concretely this means calling Double/valueOf, which converts the primitive double into a java.lang.Double object.
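To make the boxing step concrete, here is a minimal sketch that does it by hand with Double/valueOf. The name boxing-demo is made up for illustration, and the result shown is what I'd expect on Clojure 1.11.1 with a current OpenJDK, where Double/valueOf allocates a new object on every call.

```clojure
;; Box the same primitive NaN twice by hand, mimicking what happens when a
;; primitive double is passed to a function as an argument.
(def boxing-demo   ; hypothetical name, for illustration only
  (let [a (Double/valueOf Double/NaN)
        b (Double/valueOf Double/NaN)]
    {:class          (class a)            ; the box is a java.lang.Double
     :same-box       (identical? a a)     ; one object compared with itself
     :separate-boxes (identical? a b)}))  ; two distinct objects

boxing-demo
;; => {:class java.lang.Double, :same-box true, :separate-boxes false}
```

Both boxes wrap the same primitive value, yet they are two different objects.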

The two snippets evaluate to different values because in the first snippet ##NaN gets boxed only once, but in the second snippet each function argument is boxed separately. This comes down to implementation details of Clojure. You can see the behavior in the disassembled bytecode I posted on Ask Clojure.

Two separately boxed doubles are distinct objects, so a reference-equality check on them returns false even when they wrap the same value. A single boxed double compared with itself is, of course, identical. This explains the results we saw.
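The difference between reference equality and value equality shows up with ordinary doubles too, not just NaN. Here is a small sketch, assuming the same Clojure 1.11.1 behavior as above; equality-demo is a made-up name.

```clojure
;; identical? compares object references; = compares values.
(def equality-demo   ; hypothetical name, for illustration only
  (let [x 1.5]       ; x is a primitive double local
    {:identical (identical? x x)   ; each use of x is boxed into its own object
     :equal     (= x x)}))         ; compares the numeric values

equality-demo
;; => {:identical false, :equal true}
```

If what you care about is the value, reach for = (or == for numbers). identical? only answers whether both arguments are the very same object.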

Here’s a bonus exercise: what does this evaluate to and why?

(identical? 1.0 1.0)

I’ve used Clojure for a decade and there are still nooks and crannies I’m not familiar with. I guess it just takes a while to learn a programming language properly.

¹ There are good reasons for the feature, or at least they were good back in the day. A lot of programmers dislike floats, but my hot take is that they’re actually a successful solution to a complicated problem. What we should do is start using decimal floats, which would match programmers’ intuitions better than binary floats.