Say you’re going to write some code to parse and validate logical boolean clauses as data. Maybe you’re writing a DSL to allow users to express some rules for a rules engine, e.g. “date is today and junk is true”. Wouldn’t it be nice to be able to validate these logical declarations, perhaps by defining a specification declaratively?!

Clojure.spec gives us the tools to define a specification, evaluate arbitrary input against it, and (by way of clojure/test.check) even generate random sample data that conforms to the specification.

Simplifying Assumptions

Let’s assume a simple case where each logical expression is just a keyword, and they can be joined by boolean operators (also represented as keywords) in infix notation. We’ll assume only and and or operators are allowed.

[:x :and :y]

We can pretend :x and :y refer to some other logical expression for now.

Of course users want to group and nest these logical expressions.

[:x :and :y [:a :or :b [:all :or :nothing]]]

And of course we’ll assume we’re able to parse user input into this vector/keyword form.

Declaring Specs

When we consider this problem, there are only two unique ingredients to our expressions:

  1. Expressions
  2. Boolean operators

Clojure.spec defines a simple language of easily-composed primitives to allow very rich specifications over data. We can start by defining a spec per ingredient, and let’s start simple with the boolean operators.

(def op-keys #{:and :or})

Sets can be used as specs, so our spec for operators is simply a set of keywords.
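To make the "sets are specs" point concrete, here is a small sketch of how the set behaves when handed to spec directly. The :xor keyword is just an arbitrary non-member chosen for illustration.

```clojure
(require '[clojure.spec.alpha :as s])

(def op-keys #{:and :or})

;; A set acts as a membership predicate when used as a spec.
(s/valid? op-keys :and)   ;; => true
(s/valid? op-keys :xor)   ;; => false

;; Conforming a member simply returns it.
(s/conform op-keys :or)   ;; => :or
```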

Those operators need operands. Here we’ll define a spec called ::expression:

(require '[clojure.spec.alpha :as s])

(s/def ::expression
  (s/and keyword?
         #(not (contains? op-keys %))))

Specs are resolved by name and they are named by namespaced keywords. The double-colon prefix is shorthand for whatever the current namespace is.

Our expression is just a keyword but it can’t be one of the operators, so we use spec’s and to combine two logical predicates.
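A quick way to convince yourself that both predicates are in play is to try one value that passes and one that fails each of them. This is only a sketch that repeats the definitions so it can be pasted into a fresh REPL.

```clojure
(require '[clojure.spec.alpha :as s])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword?
         #(not (contains? op-keys %))))

(s/valid? ::expression :x)    ;; => true
(s/valid? ::expression :and)  ;; => false, operators are excluded
(s/valid? ::expression "x")   ;; => false, not a keyword
```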

We have our two ingredients but we need a recipe for how they can be combined.

(s/def ::group
  (s/cat :head ::expression
         :tail (s/*
                 (s/cat :op     op-keys
                        :clause (s/or :expr  ::expression
                                      :group ::group)))))

This ::group spec defines our overall expression as a concatenation of elements. It must always have some head element which conforms to our ::expression spec above. Maybe just one thing must be true, and that’s fine. 🤷‍♂️ After the head there can be any number (s/* means zero or more) of additional clauses, each of which is a concatenation of an operator and… you guessed it, more expressions, which may contain more expressions, which may contain more expressions. Very expressive.

Our specification is recursive—a grouping of expressions is itself an expression. Luckily, defining a recursive spec is trivial! On the last line we simply refer to the ::group spec from within its own definition.

Trial & Error

Whenever I’m writing a spec, I’m always playing with it in the REPL as I go. Unsurprisingly I went through many (oft non-working) iterations before landing on the oh-so-elegant-and-plainly-correct spec above.

(s/conform ::group [:x])
=> {:head :x}

Clojure.spec’s conform function takes a spec (either a namespaced keyword that refers to one or an inline spec) and some data to be conformed. A very interesting aspect of clojure.spec is that the process of conforming input data to a spec can produce arbitrarily different output data. In this case, s/cat and s/or tag their output with the keys we supplied, as you can see above; we get a map back telling us where our :head is.

(s/conform ::group [:x :y])
=> :clojure.spec.alpha/invalid

When the input data does not conform, we get back a special namespaced keyword. If we simply wanted a yes or no predicate we could use s/valid?.
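For comparison, here is a sketch of the yes/no check alongside s/explain-data, which reports why an input failed. The exact contents of the problem maps are not reproduced here; the point is only that valid input yields nil and invalid input yields a non-empty list of problems.

```clojure
(require '[clojure.spec.alpha :as s])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword? #(not (contains? op-keys %))))

(s/def ::group
  (s/cat :head ::expression
         :tail (s/* (s/cat :op     op-keys
                           :clause (s/or :expr  ::expression
                                         :group ::group)))))

(s/valid? ::group [:x :and :y])  ;; => true
(s/valid? ::group [:x :y])       ;; => false

;; explain-data returns nil for valid input, otherwise a map describing the problems.
(s/explain-data ::group [:x :and :y])  ;; => nil

(def problems
  (:clojure.spec.alpha/problems (s/explain-data ::group [:x :y])))
```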

Let’s put it to some slightly tougher tests:

(s/conform ::group [:x :and :y])
=> {:head :x, :tail [{:op :and, :clause [:expr :y]}]}
(s/conform ::group [:x :and :y :or]) ;; no good trailing operator
=> :clojure.spec.alpha/invalid
(s/conform ::group [:x :and :y :or :z])
=> {:head :x, :tail [{:op :and, :clause [:expr :y]} {:op :or, :clause [:expr :z]}]}
(s/conform ::group [:x :and :xx :and :yy :and [:y :or :z :or [:foo :and :bar]]])
=>
{:head :x,
 :tail [{:op :and, :clause [:expr :xx]}
        {:op :and, :clause [:expr :yy]}
        {:op :and,
         :clause [:group
                  {:head :y,
                   :tail [{:op :or, :clause [:expr :z]}
                          {:op :or, :clause [:group {:head :foo, :tail [{:op :and, :clause [:expr :bar]}]}]}]}]}]}
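The tags are what make the conformed data easy to consume. As a rough illustration, the sketch below walks the conformed map and collects every operand keyword. The operands function is a hypothetical helper and not part of clojure.spec.

```clojure
(require '[clojure.spec.alpha :as s])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword? #(not (contains? op-keys %))))

(s/def ::group
  (s/cat :head ::expression
         :tail (s/* (s/cat :op     op-keys
                           :clause (s/or :expr  ::expression
                                         :group ::group)))))

;; Hypothetical helper: list every operand keyword in a conformed ::group.
(defn operands [{:keys [head tail]}]
  (into [head]
        (mapcat (fn [{[tag value] :clause}]
                  (case tag
                    :expr  [value]               ; a plain operand
                    :group (operands value))))   ; recurse into the nested group
        tail))

(operands (s/conform ::group [:x :and :y :or [:a :and :b]]))
;; => [:x :y :a :b]
```

Dispatching on the :expr and :group tags means the helper never has to re-inspect the raw input to work out what each clause is.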

Generation

Surely after you validate these input clauses you’re going to want to do something with them? What if you could conjure up a universe of test case inputs for that something? (Spoiler: you can.) Using test.check we can randomly generate samples of our spec:

(require '[clojure.spec.gen.alpha :as sgen])
(sgen/sample (s/gen ::group))
=> ;; abbreviated output
(:o/h?)
(:E :and :T :or (:X :or (:?* :or :f1/a :and (:+ :and :s))) :and (:z- :or (:r/D :and (:+M) :or :+W/-0)))
(:s*/PC :or :-P :or (:N-/_ :and :W/_-) :and (:!))
(:J__/e1 :and :K :and (:K) :and :X7m.W/Dt :or (:W))
Nice.
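One sanity check worth running is that every generated sample really does conform to the spec it came from. The sketch below assumes test.check is on the classpath and keeps the sample count small (5 is an arbitrary choice) so the generated data stays tiny.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.gen.alpha :as sgen])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword? #(not (contains? op-keys %))))

(s/def ::group
  (s/cat :head ::expression
         :tail (s/* (s/cat :op     op-keys
                           :clause (s/or :expr  ::expression
                                         :group ::group)))))

;; Realize a handful of samples; the second argument is the sample count.
(def samples (doall (sgen/sample (s/gen ::group) 5)))

(every? #(s/valid? ::group %) samples)
;; => true
```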

Bonus Round

What if you needed to emit some sort of string query for these expressions?

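(require '[clojure.string :as cs]
         '[clojure.walk :as walk])

(defn clause-str [clause] ;; it's too early for doc strings
  (walk/postwalk
    (fn [elem]
      (if (coll? elem)
        ;; a group: parenthesize its (already stringified) children
        (format "(%s)" (cs/join " " (map name elem)))
        elem))
    clause))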

(clause-str [[:x :or :y] :and :x])
=> "((x or y) and x)"

Here we use clojure.walk’s postwalk to perform a depth-first traversal of our data structure (a vector of keywords and/or vectors). The depth-first part is important because we want our transformed output to reflect the nested group structure of our data, which we accomplish here by simply parenthesizing the group strings. By the time a group is visited, its nested groups have already been replaced by strings. For the group elements we just take the name of each keyword (name returns a string unchanged) joined by spaces.
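To see the visiting order for yourself, one option is to record each element as postwalk hands it to the function. This is only a sketch; the visited atom is an illustrative name.

```clojure
(require '[clojure.walk :as walk])

;; Collect every element in the order postwalk visits it.
(def visited (atom []))

(walk/postwalk
  (fn [elem]
    (swap! visited conj elem)
    elem)                      ; return the element unchanged
  [:a [:b :c]])

@visited
;; => [:a :b :c [:b :c] [:a [:b :c]]]
```

Leaves come first and each collection is visited only after all of its children, which is why clause-str can assume inner groups are already strings.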

Let’s exercise clause-str with those randomly generated samples.

(map clause-str (sgen/sample (s/gen ::group)))
=>
("(e)"
 "(u)"
 "(!8 or xQ and (b and ?! and (b)))"
 "(L1)"
 "(!M)"
 "(?0V)"
 "(+J or (v? or A and (+4 or (z+ or I or (d+))) and (uhd and ! or (i or (RS1)))) or *82 or SC and (B or _ or G+! or Lj or (_ or j and (O?* or +hK and (P) or Y8 or z and (p4+) and o-r) and (z8 or y*2 and n or (G-) or TFi and (FH)) or GSc or (ZF or (!-3)) and +?)))"
 "(OrT or (h-! or (c or (- and me and G) and (-6 and (G) and ?)) and M and (b or (+*+) and (c* and (n8-) or (v_)) and pKM or a7t or (l or o and (Fz) and (e7x) or (+9Y) and Cl or Mm) and (E or (-) or (_7G) and (M6q) or Uv and +m and (a*3))) or (!+ and ! or (+D? and (e) and (l) and (D)))) or (-K or (_* and p5 and (Wv and (b8+) or k! and W or R7? and l8V or oli) and p3p or (u or (w7P) or +Vi and n and x6 and (?) and (+_))) or n and !7v and rKQ) and t!3 or (c and (x2) or (_+ and (rE and (*2X) or (V) and C or (MW)) and (vx_ and (vZ3) and W5 or (+8?) or (G6)) or (X or d3A or J*9 and zS+) or (G_l or EFr) or (B-H and (T) or (!-C))) or Ne5) or Q and u)"
 "(-1 or o- or ?xE and -X8 or gY and w or (kD* and (hxj and (? or v or (N) and (_) or (z+0) and (+) and (rhc) or ?) or t+ and *) or T4 or d*))"
 "(?J and (A) or (c2 or c or - or (U or (kT and (q!) or (t) and (xF) and g! and -60 and d3H and (R6?)) and A5) and gW8))")

Or just call it with some bespoke inputs:

(clause-str [[:x
              :or :y
              :or :z
              :or [:q :and :z]]
             :and :x
             :and :y
             :and :z
             :and [:x
                   :or :y
                   :or :z
                   :or [:q :and :z]]])
=> "((x or y or z or (q and z)) and x and y and z and (x or y or z or (q and z)))"

Callers Behave

Now with a spec in hand, we can ensure clause-str is called with valid arguments using instrument from clojure.spec.test.alpha.

(require '[clojure.spec.test.alpha :as stest])
(s/fdef clause-str                              ;; a spec for our function
        :args (s/cat :clause (s/spec ::group))) ;; it takes one argument and it better be a ::group
(stest/instrument `clause-str)                  ;; assert valid args on each call
(clause-str [:x :and :y])
=> "(x and y)"
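The s/spec wrapper in the :args spec matters: regex specs such as s/cat splice together into one flat sequence unless something starts a new nested context. The sketch below shows the difference using small throwaway specs (::pair, ::spliced and ::nested are made up for this illustration).

```clojure
(require '[clojure.spec.alpha :as s])

(s/def ::pair (s/cat :op #{:and :or} :k keyword?))

;; Referring to a regex spec directly splices it into the outer sequence.
(s/def ::spliced (s/cat :head keyword? :more ::pair))

;; Wrapping it in s/spec demands a nested sequence instead.
(s/def ::nested (s/cat :head keyword? :more (s/spec ::pair)))

(s/valid? ::spliced [:x :and :y])    ;; => true
(s/valid? ::spliced [:x [:and :y]])  ;; => false
(s/valid? ::nested  [:x [:and :y]])  ;; => true
(s/valid? ::nested  [:x :and :y])    ;; => false
```

Without the wrapper, the function's argument list itself would have to look like a group, i.e. (clause-str :x :and :y). The same principle explains why nested vectors work inside ::group: s/or is not a regex op, so each clause is checked as a single element.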

An invalid call will now throw an exception with information:

(clause-str [:x :and])
CompilerException clojure.lang.ExceptionInfo: Call to #'playground.test/clause-str did not conform to spec:
In: [0] val: () fails spec: :playground.test/group at: [:args :clause :tail :clause] predicate: (or :expr :playground.test/expression :group :playground.test/group)...
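Since the exception is an ExceptionInfo, the failure details can also be read programmatically from its ex-data. Here is a minimal sketch, assuming the ::group spec and clause-str as first defined; the failure var is just an illustrative name.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest]
         '[clojure.string :as cs]
         '[clojure.walk :as walk])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword? #(not (contains? op-keys %))))

(s/def ::group
  (s/cat :head ::expression
         :tail (s/* (s/cat :op     op-keys
                           :clause (s/or :expr  ::expression
                                         :group ::group)))))

(defn clause-str [clause]
  (walk/postwalk
    (fn [elem]
      (if (coll? elem)
        (format "(%s)" (cs/join " " (map name elem)))
        elem))
    clause))

(s/fdef clause-str
  :args (s/cat :clause (s/spec ::group)))

(stest/instrument `clause-str)

;; Capture the ex-data of the failed call instead of letting it propagate.
(def failure
  (try
    (clause-str [:x :and])   ; trailing operator, so the args do not conform
    (catch clojure.lang.ExceptionInfo e
      (ex-data e))))

(:clojure.spec.alpha/failure failure)
;; => :instrument

;; The same problem maps that s/explain-data would produce.
(:clojure.spec.alpha/problems failure)
```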

Generative/Property-based Testing

For the sake of example, let’s add an (unfortunately meaningless/tautological) return value assertion to our function spec, then check the function’s properties:

(s/fdef clause-str
        :args (s/cat :clause (s/spec ::group))
        :ret  string?)
(stest/check `clause-str)

After a minute or so you’ll realize this isn’t going to finish any time soon. The issue is that test.check is generating very complex, deeply nested samples for our recursive spec. Luckily clojure.spec offers a workaround: the dynamic var clojure.spec.alpha/*recursion-limit*, which we can rebind.

(binding [s/*recursion-limit* 1]
  (stest/check `clause-str))

Putting a tight upper bound on generative recursion allows check to run in a few seconds.
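Another knob is the number of test cases, which check accepts through its options map. The following is a sketch that assumes test.check is on the classpath; the value 50 is arbitrary, and the results and summary names are illustrative.

```clojure
(require '[clojure.spec.alpha :as s]
         '[clojure.spec.test.alpha :as stest]
         '[clojure.string :as cs]
         '[clojure.walk :as walk])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword? #(not (contains? op-keys %))))

(s/def ::group
  (s/cat :head ::expression
         :tail (s/* (s/cat :op     op-keys
                           :clause (s/or :expr  ::expression
                                         :group ::group)))))

(defn clause-str [clause]
  (walk/postwalk
    (fn [elem]
      (if (coll? elem)
        (format "(%s)" (cs/join " " (map name elem)))
        elem))
    clause))

(s/fdef clause-str
  :args (s/cat :clause (s/spec ::group))
  :ret  string?)

;; check returns a lazy sequence, so realize it while the binding is in effect.
(def results
  (binding [s/*recursion-limit* 1]
    (doall
      (stest/check `clause-str
                   {:clojure.spec.test.check/opts {:num-tests 50}}))))

;; Prints an abbreviated result per function and returns a map of counts.
(def summary (stest/summarize-results results))
```

The :check-passed entry of summary should equal the number of functions checked when everything passes.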

Property-based testing is great, but you must accurately define the properties. In this case, our properties are derived from our spec, and our spec isn’t quite right. Calling our instrumented function with hand-generated inputs reveals such a problem:

(clause-str [[:x :or :y] :and :z])
CompilerException clojure.lang.ExceptionInfo: Call to #'playground.test/clause-str did not conform to spec:
In: [0 0] val: [:x :or :y] fails spec: :playground.test/expression at: [:args :clause :head] predicate: keyword?

Our ::group spec is a concatenation expecting a :head element that conforms to ::expression. In this case, our first/:head element is actually a sub-::group of expressions, which doesn’t conform to ::expression. We can fix it by revising our ::group spec to be recursive in the :head position too:

(s/def ::group
  (s/cat :head (s/or :g ::group :e ::expression)
         :tail (s/*
                 (s/cat :op     op-keys
                        :clause (s/or :expr  ::expression
                                      :group ::group)))))

Or we can get rid of that duplicated s/or with another forward-referencing spec:

(s/def ::subgroup (s/or :g ::group :e ::expression))
(s/def ::group
  (s/cat :head ::subgroup
         :tail (s/* (s/cat :op op-keys :clause ::subgroup))))

Try generating samples of ::group before and after this change to see the difference; the latter samples are much more complex.
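Before generating anything, it is worth confirming that the revised spec accepts the input that failed earlier. A small sketch, repeating the revised definitions so it runs on its own:

```clojure
(require '[clojure.spec.alpha :as s])

(def op-keys #{:and :or})

(s/def ::expression
  (s/and keyword? #(not (contains? op-keys %))))

(s/def ::subgroup (s/or :g ::group :e ::expression))

(s/def ::group
  (s/cat :head ::subgroup
         :tail (s/* (s/cat :op op-keys :clause ::subgroup))))

(s/valid? ::group [[:x :or :y] :and :z])
;; => true

;; The head is now tagged too, so a leading group shows up under :g.
(s/conform ::group [[:x :or :y] :and :z])
;; => {:head [:g {:head [:e :x], :tail [{:op :or, :clause [:e :y]}]}],
;;     :tail [{:op :and, :clause [:e :z]}]}
```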

Full Example
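The pieces above can be assembled into a single namespace roughly as follows. This is a consolidated sketch, not the original listing: the namespace name is taken from the error messages shown earlier, the aliases are inferred from how they are used in the snippets, and the docstring is added here.

```clojure
(ns playground.test
  (:require [clojure.spec.alpha :as s]
            [clojure.spec.test.alpha :as stest]
            [clojure.string :as cs]
            [clojure.walk :as walk]))

(def op-keys #{:and :or})

;; An operand is any keyword that is not an operator.
(s/def ::expression
  (s/and keyword?
         #(not (contains? op-keys %))))

;; Either a nested group or a single operand.
(s/def ::subgroup (s/or :g ::group :e ::expression))

;; A head followed by zero or more operator/clause pairs.
(s/def ::group
  (s/cat :head ::subgroup
         :tail (s/* (s/cat :op op-keys :clause ::subgroup))))

(defn clause-str
  "Renders a group as a parenthesized infix string."
  [clause]
  (walk/postwalk
    (fn [elem]
      (if (coll? elem)
        (format "(%s)" (cs/join " " (map name elem)))
        elem))
    clause))

(s/fdef clause-str
  :args (s/cat :clause (s/spec ::group))
  :ret  string?)

;; Assert valid args on every call from here on.
(stest/instrument `clause-str)

(clause-str [[:x :or :y] :and :z])
;; => "((x or y) and z)"
```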

Many thanks to the awesome Clojure community! When I ran into issues, the maintainer of test.check chimed in instantly on Clojurians Slack with advice.