clojure.spec and untrusted input

If you’re going to use clojure.spec to validate or conform untrusted input, you should be careful. It’s easy to write code that looks correct but opens the door to denial-of-service (DoS) attacks. For example, if you have implemented an HTTP API in Clojure and you use spec to check the incoming requests, you should be aware of this.

I believe that this is well known among experienced practitioners. For example, Dominic Monroe recently mentioned the issue on the defn podcast (the section starts around 12:30). However, I have not seen blog posts about this before.

In clojure.spec, specs for entity maps are open. This means that they are allowed to have keys that are not included in the spec. For example:

(require '[clojure.spec.alpha :as s])

;; Let's define a spec for an empty map
(s/def ::my-map (s/keys))

;; Empty map is valid, as expected
(s/valid? ::my-map {}) ; => true

;; Extra keys are allowed as well
(s/valid? ::my-map {:example "dog"}) ; => true

If the extra keys have specs, they will be validated as well:

(s/def ::my-map (s/keys))
(s/def ::number int?)
(s/valid? ::my-map {::number 2}) ; => true
(s/valid? ::my-map {::number "two"}) ; => false

This is great for many use cases, but it’s problematic for validating untrusted input. There are two potential problems:

  1. An attacker may be able to set fields that they were not supposed to set.
  2. An attacker may be able to make the validation very slow.
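
To make the first problem concrete, here is a minimal sketch. The spec name ::profile-update and the :admin? key are made up for illustration; the point is only that a passing s/valid? check on an open map says nothing about keys the spec does not mention.

```clojure
(require '[clojure.spec.alpha :as s])

;; Hypothetical spec for a profile update that should only carry a name
(s/def ::name string?)
(s/def ::profile-update (s/keys :req-un [::name]))

;; Hypothetical attacker-supplied input with an extra key
(def input {:name "Eve" :admin? true})

;; The open map spec accepts the extra key
(s/valid? ::profile-update input) ; => true

;; Picking the expected keys explicitly drops it
(select-keys input [:name]) ; => {:name "Eve"}
```

If the validated map is later merged into a database record as-is, the extra key goes along with it, which is why selecting the expected keys explicitly is the safer habit.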

The first problem is nothing new – I’ve seen it in hand-written validation code as well. The second problem is clojure.spec-specific and I’m going to focus on it here.

clojure.spec has support for structural regular expression specs with s/cat, s/+ and others. They’re usually used for writing specs for functions and implementing parsers in macros. Unfortunately they also make spec vulnerable to regular expression denial of service (ReDoS) attacks.

We can come up with a pathological regex spec. Here’s the Clojure vector equivalent of the regular expression (0+)+1:

(s/def ::slow (s/cat :zeros (s/+ (s/+ #{0})) :one #{1}))
(time (s/valid? ::slow [0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1]))
;; "Elapsed time: 1932.420446 msecs"
;; true

That’s slow, considering the input is a vector of 18 integers. Worse, the asymptotic complexity of the algorithm seems to be O(2^n). If you add one more zero to the input, it takes twice as long to validate it.
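
One way to check the doubling claim on your own machine is to time the same kind of spec on inputs of growing length. This is a rough sketch: the names ::slow-sketch and elapsed-ms are mine, and single-shot timings are noisy, so treat the printed numbers as a trend rather than a benchmark.

```clojure
(require '[clojure.spec.alpha :as s])

;; Same shape as the spec above, registered under a separate name
(s/def ::slow-sketch (s/cat :zeros (s/+ (s/+ #{0})) :one #{1}))

(defn elapsed-ms
  "Validates a vector of n zeros followed by a single 1 and
  returns the wall-clock time in milliseconds."
  [n]
  (let [input (conj (vec (repeat n 0)) 1)
        start (System/nanoTime)]
    (s/valid? ::slow-sketch input)
    (/ (- (System/nanoTime) start) 1e6)))

;; Keep n small; each extra zero is expected to roughly double the time
(doseq [n [10 12 14 16]]
  (println n "zeros:" (elapsed-ms n) "ms"))
```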

Now, most likely you would not use regular expression specs to validate untrusted input. However, when spec validates an entity map, it also checks every extra namespaced key against its registered spec, so if you have loaded a library that defines a slow spec, an attacker may be able to craft an input that gets validated against it.
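
Here is a small sketch of that mechanism. The keyword :some.library/form stands in for a spec registered by a third-party library, and ::request is a made-up request spec; neither comes from a real codebase. The input is kept tiny so the example runs instantly.

```clojure
(require '[clojure.edn :as edn]
         '[clojure.spec.alpha :as s])

;; Stand-in for a slow regex spec registered by some library (hypothetical name)
(s/def :some.library/form (s/cat :zeros (s/+ (s/+ #{0})) :one #{1}))

;; The spec the service author wrote: it only mentions :id
(s/def ::id int?)
(s/def ::request (s/keys :req-un [::id]))

;; Untrusted EDN payload that smuggles in the library's key
(def parsed (edn/read-string "{:id 1, :some.library/form [0 0 0 1]}"))

;; The extra key is checked against the library's spec
(s/valid? ::request parsed) ; => true

;; A value that fails the library's spec makes the whole request invalid,
;; which shows the spec really was consulted
(s/valid? ::request (assoc parsed :some.library/form [0 0 0])) ; => false
```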

For example, Ghostwheel comes with some slow specs.

;; To load Ghostwheel: clj -Sdeps '{:deps {gnl/ghostwheel {:mvn/version "0.3.9"}}}'
(require '[ghostwheel.core :as g])
(time (s/valid? ::g/some-unsafe-ops (repeat 35 'let)))
;; "Elapsed time: 13896.968285 msecs"
;; true

Note: I’m not here to pick on Ghostwheel. Ghostwheel is using clojure.spec to parse macro input, which is an intended use case of clojure.spec. It was the first example of pathologically slow specs in the wild that I could find, but I’m sure there are more examples out there.

The input is a long list of let symbols. If your untrusted input comes in as JSON, it won’t get converted into symbols. However, many Clojure services use EDN or Transit, which support symbols. The input is about 300 bytes of Transit.
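
For comparison, the same input can be round-tripped through EDN with nothing but clojure.edn. This sketch only shows that the reader hands back symbols and how small the payload is; it does not touch Ghostwheel.

```clojure
(require '[clojure.edn :as edn])

;; The attack input from above, printed as EDN text
(def payload (pr-str (repeat 35 'let)))
(count payload) ; => 141 characters

;; Reading it back yields a list of symbols, not strings
(def parsed (edn/read-string payload))
(every? symbol? parsed) ; => true
(count parsed)          ; => 35
```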

Pulling off a denial-of-service attack based on this requires specific circumstances:

  • you process untrusted input encoded in EDN or in Transit, for example via an HTTP API,
  • you use entity maps (s/keys) to validate the input, and
  • you have loaded a library that defines slow specs.

That does not describe every Clojure backend service I have ever seen, but it’s not an unheard-of combination either.

As far as I can tell, Clojure itself does not come with pathologically slow specs. Still, there are some slow-ish specs available in core.specs.alpha, which always gets loaded.

(time (s/valid? (s/keys) {:clojure.core.specs.alpha/ns-clauses
                          (repeat 100000 (list :gen-class))}))
;; "Elapsed time: 4048.091583 msecs"
;; => true

;; How big is the input? About 1.24 MiB
(/ (count (pr-str (repeat 100000 (list :gen-class)))) 1024.0 1024.0)
;; => 1.2397775650024414

Workaround

One of the new features in spec 2 is the support for closed maps. That should improve the situation once it gets released. Meanwhile, you can use select-spec from spec-tools to remove the extra keys:

;; clj -Sdeps '{:deps {metosin/spec-tools {:mvn/version "0.10.5"}}}'
(require '[clojure.spec.alpha :as s] '[spec-tools.core :as st])

(s/def ::good int?)
(s/def ::good-map (s/keys :opt-un [::good]))

(st/select-spec ::good-map {:good 1, :bad 2})
;; => {:good 1}
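
If you would rather reject unexpected keys than strip them, a hand-rolled check with plain clojure.spec is also possible. This is a sketch of my own, not how spec 2 or spec-tools implement closed maps; the names allowed-keys and ::closed-good-map are made up for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::good int?)

;; Hypothetical allow-list of keys the endpoint accepts
(def allowed-keys #{:good})

;; s/and checks its specs in order and stops at the first failure,
;; so the key check runs before s/keys looks at any extra keys
(s/def ::closed-good-map
  (s/and map?
         #(every? allowed-keys (keys %))
         (s/keys :opt-un [::good])))

(s/valid? ::closed-good-map {:good 1})         ; => true
(s/valid? ::closed-good-map {:good 1, :bad 2}) ; => false
(s/valid? ::closed-good-map
          {:good 1, :clojure.core.specs.alpha/ns-clauses []}) ; => false
```

The order inside s/and matters here: with s/keys first, the extra namespaced keys would already have been validated by the time the allow-list is consulted.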
