This is a weblog about computing, culture et cetera, by . Read more.

Split tokens in Clojure

On Dhole Moments, there’s a nice post about a recent password reset vulnerability. Via the post, I learned about a simple technique called split tokens for making your password reset token validation more resistant to timing attacks. I wanted to poke at it a bit and ended up creating a tiny Clojure library for generating and validating split tokens, called split-token. Check it out if you’re into generating random tokens!
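To give a rough idea of the technique, here's a minimal sketch of split tokens using only JDK classes. It is not the API of the split-token library: the names `generate` and `valid?`, the `.` separator, and the 16-byte part sizes are all assumptions made for illustration. The token is split into a selector, which is used for the database lookup, and a verifier, which is stored only as a hash and compared in constant time.

```clojure
(require '[clojure.string :as str])
(import '(java.security MessageDigest SecureRandom)
        '(java.util Base64))

(defn- random-part
  "Returns n random bytes as an unpadded URL-safe Base64 string."
  [n]
  (let [bs (byte-array n)]
    (.nextBytes (SecureRandom.) bs)
    (.encodeToString (.withoutPadding (Base64/getUrlEncoder)) bs)))

(defn- sha-256
  "Returns the SHA-256 digest of string s as a byte array."
  [^String s]
  (.digest (MessageDigest/getInstance "SHA-256") (.getBytes s "UTF-8")))

(defn generate
  "Returns the token to send to the user and the record to store.
  Only the hash of the verifier is stored, never the verifier itself."
  []
  (let [selector (random-part 16)
        verifier (random-part 16)]
    {:token  (str selector "." verifier)
     :record {:selector selector :verifier-hash (sha-256 verifier)}}))

(defn valid?
  "Looks up the record by selector, then compares verifier hashes in
  constant time. `lookup` is a function from selector to record (or nil)."
  [lookup token]
  (let [[selector verifier] (str/split token #"\." 2)
        record (lookup selector)]
    (boolean
     (and record
          verifier
          (MessageDigest/isEqual (sha-256 verifier) (:verifier-hash record))))))
```

A Clojure map works as the lookup function, so `(valid? {(:selector record) record} token)` is enough for trying it out in the REPL. The lookup by selector may still leak timing information, but the selector alone is not enough to pass validation.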

Enjoying the silence

A coffee dripper on top of a Nalgene bottle on a bench outdoors.
I brewed some coffee while on the go.

Last year, Finland closed down the week I had my winter vacation. This year, the government was debating movement restrictions. Since COVID-19 broke out in Finland, I’ve thought so many times that “surely this will be over by date X”, only to see date X come and go. I’m not going to speculate about the unprecedented restrictions we’re going to see on my winter vacation next year.

I spent a night at Liesjärvi. My mom said that it must have been nice to enjoy the silence in nature. To which I say: I don’t know about that.

I was camping next to a lake and I was alone, except for the (quiet) mouse that wanted to inspect my backpack to see if there was anything edible.

But still, somebody was camping on the other side of the lake and they chopped firewood. During the night, there were cranes calling. In the morning chickadees were singing and during the day woodpeckers pecked the wood. The lake was frozen and the ice was creaking, booming, and banging.

So much for the silence.

clojure.xml and untrusted input

Clojure’s standard library includes the namespace clojure.xml, which implements an XML parser. It’s not used much – which is great, because it’s vulnerable to XML external entity (XXE) attacks. It’s something that you want to be aware of if you’re using clojure.xml to process untrusted input.

Juha Jokimäki tweeted about this already back in 2014. However, I still see clojure.xml occasionally used, so I thought it would be a good idea to blog about it.

Note: clojure.xml is not to be confused with data.xml, which is a separate library. data.xml has disabled XXE by default.

XML external entity attacks

XML external entities allow you to refer to resources outside of the file that you’re processing. For example, you can include the content of an external file. Here’s an example from OWASP:

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file:///etc/hostname" >]>
<foo>&xxe;</foo>

Let’s try it out:

;; I saved the example above as "hostname.xml"
(require 'clojure.xml '[clojure.java.io :as io])
(with-open [input (io/input-stream "hostname.xml")]
  (clojure.xml/parse input))
;; => {:tag :foo, :attrs nil, :content ["nixos\n"]}

My laptop’s hostname is nixos, so that checks out!

If you point the file:/// reference to a directory instead of a file, you get a listing of the directory contents. In principle, you could use http:// URLs too, but that did not work on my machine.

If you use a domain name in the file:// URL, Java tries to connect to it over FTP.

<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE foo [
  <!ELEMENT foo ANY >
  <!ENTITY xxe SYSTEM "file://" >]>
<foo>&xxe;</foo>

You might be able to exfiltrate data using this mechanism. At the very least, it’s a way to call home, and if your FTP server is suitably broken, the parser seems to get stuck forever.

XML bombs

Juha Jokimäki’s example code also demonstrates a small XML bomb. An XML bomb is a short XML file that gets expanded to an extremely large one when processed.

Luckily, the JDK sets some limits on entity expansion to hinder this attack. The Wikipedia article has an example with a billion-fold expansion, but the JDK limits the number of entity expansions to 64 000 by default.

Thus, the Wikipedia example does not work, but here’s a 1.4 KB file that gets expanded to 47 megabytes:

<?xml version="1.0"?>
<!DOCTYPE lolz [
 <!ELEMENT lolz (#PCDATA)>
 <!ENTITY lol0 "…"> <!-- lol0 is a 1000-character string, elided here -->
 <!ENTITY lol1 "&lol0;&lol0;&lol0;&lol0;&lol0;">
 <!ENTITY lol2 "&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;&lol1;">
 <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
 <!ENTITY lol4 "&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;&lol3;">
 <!ENTITY lol5 "&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;&lol4;">
]>
<lolz>&lol5;</lolz>

Let’s try it:

;; Save the example above as "lol.xml"
(with-open [input (io/input-stream "lol.xml")]
  (-> (clojure.xml/parse input) (:content) (first) (count)))
;; => 50000000

(/ 50000000 1024.0 1024.0)
;; => 47.6837158203125
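To see why this file slips under the JDK limit, here's a back-of-the-envelope count. It assumes that every entity reference counts as one expansion, which is a simplification of how the JDK does its bookkeeping, so treat the numbers as an estimate. The names `reference-counts` and `total-expansions` are made up for this sketch.

```clojure
;; How many times each entity gets referenced when &lol5; is expanded:
;; lol5 is referenced once, lol2..lol5 each reference the previous entity
;; ten times, and lol1 references lol0 five times.
(def reference-counts
  {:lol5 1
   :lol4 10
   :lol3 100
   :lol2 1000
   :lol1 10000
   :lol0 (* 5 10000)})

(def total-expansions
  (reduce + (vals reference-counts)))

total-expansions
;; => 61111

(< total-expansions 64000)
;; => true

;; The 50000000-character result implies the length of lol0:
(/ 50000000 (:lol0 reference-counts))
;; => 1000
```

With these assumptions, the document stays just below the default limit of 64 000 expansions, and most of the final size comes from the length of lol0 rather than from the nesting depth.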

It’s not catastrophic: a single XML document won’t crash your server. Still, you might want to think about it if you process XML files from untrusted sources.


Juha Jokimäki shows how to create a parser that disallows the document type declarations (DTDs) required by the attacks above:

(defn startparse-sax-no-doctype [s ch]
  (..
    (doto (javax.xml.parsers.SAXParserFactory/newInstance)
      (.setFeature javax.xml.XMLConstants/FEATURE_SECURE_PROCESSING true)
      (.setFeature "http://apache.org/xml/features/disallow-doctype-decl" true))
    (newSAXParser)
    (parse s ch)))
(with-open [input (io/input-stream "hostname.xml")]
  (clojure.xml/parse input startparse-sax-no-doctype))
;; Execution error (SAXParseException) at (
;; DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
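If you'd like to try the idea without saving any files, here's a self-contained sketch that parses XML from a string. The helper names `startparse-no-doctype` and `parse-untrusted` are hypothetical, and returning nil for a rejected document is just one possible design. The sketch sets only the Xerces disallow-doctype-decl feature on the JDK's built-in SAX parser.

```clojure
(require 'clojure.xml)
(import '(java.io ByteArrayInputStream InputStream)
        '(javax.xml.parsers SAXParserFactory)
        '(org.xml.sax SAXParseException)
        '(org.xml.sax.helpers DefaultHandler))

(defn startparse-no-doctype
  "A startparse function for clojure.xml/parse that rejects any document
  with a DOCTYPE declaration."
  [^InputStream s ^DefaultHandler ch]
  (let [factory (doto (SAXParserFactory/newInstance)
                  (.setFeature
                   "http://apache.org/xml/features/disallow-doctype-decl"
                   true))]
    (.parse (.newSAXParser factory) s ch)))

(defn parse-untrusted
  "Parses an XML string. Returns nil if the parser rejects the document,
  for example because it contains a DOCTYPE declaration."
  [^String xml-string]
  (try
    (with-open [in (ByteArrayInputStream. (.getBytes xml-string "UTF-8"))]
      (clojure.xml/parse in startparse-no-doctype))
    (catch SAXParseException _
      nil)))

;; A document without a DTD parses as usual:
(parse-untrusted "<foo>bar</foo>")
;; => {:tag :foo, :attrs nil, :content ["bar"]}

;; A document with a DOCTYPE is rejected before any entity is resolved:
(parse-untrusted
 "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/hostname\">]><foo>&xxe;</foo>")
;; => nil
```

In a real application you probably want to log or rethrow the exception instead of swallowing it, since nil hides the reason the document was rejected.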

However, my recommendation is to replace clojure.xml with data.xml. It has a couple of benefits:

XXE processing is disabled by default:

;; clj -Sdeps '{:deps {org.clojure/data.xml {:mvn/version "0.2.0-alpha6"}}}'
(require 'clojure.data.xml)
(with-open [input (io/input-stream "hostname.xml")]
  (clojure.data.xml/parse input))
;; => #xml/element{:tag :foo}

XML bombs are subject to the same limits as with clojure.xml, since both libraries use the JDK’s XML parsing facilities. If you want to prevent them altogether, you can disable DTDs by setting :support-dtd false:

(with-open [input (io/input-stream "lol.xml")]
  (clojure.data.xml/parse input :support-dtd false))
;; Error printing return value (XMLStreamException) at (
;; ParseError at [row,col]:[11,13]
;; Message: The entity "lol5" was referenced, but not declared.

Update: As a follow-up, see CLJ-2611 which aims to disable XXE processing in clojure.xml.
