I had an itch to implement some basic CSS selectors for Hickory, a Clojure representation of HTML. Having done something similar in F# for XPath and HTML, I thought the same approach would be a good starting point.

My first step was to build a parser for the subset of CSS selector syntax I was interested in. After some entertaining trial and error in the REPL, I had a Minimum Viable Parser built with Instaparse.
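
To give a flavour of what such a grammar can look like, here is a minimal sketch. It is not the grammar from the repo: the name `css-grammar` and the rule names are made up for illustration, and it only covers tag, `#id`, `.class` and `[attr=value]` selectors separated by whitespace.

```clojure
(require '[instaparse.core :as insta])

;; Hypothetical grammar, for illustration only.
;; Angle brackets hide punctuation and whitespace from the resulting tree.
(def css-grammar
  (insta/parser
    "selector    = compound (<ws> compound)*
     compound    = tag qualifier* | qualifier+
     <qualifier> = id | class | attr
     tag         = ident
     id          = <'#'> ident
     class       = <'.'> ident
     attr        = <'['> ident <'='> ident <']'>
     <ident>     = #'[A-Za-z][A-Za-z0-9_-]*'
     ws          = #'\\s+'"))

(css-grammar "div.note a")
;; => [:selector [:compound [:tag "div"] [:class "note"]] [:compound [:tag "a"]]]
```

Instaparse returns a hiccup-style vector tree by default, so each node is just a keyword followed by its children.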

Then I started thinking about how I could translate its syntax tree into an evaluator that runs against a document tree. Hickory has its own concept of selectors, and their semantics conveniently mirror those of CSS selectors.

What surprised me most was how easily I could leverage the similarities between the Instaparse syntax tree and Hickory’s selector combinators (in the syntax->selector map below) to turn the syntax tree into a functional evaluator. I’d expected much more of a struggle to get something working, but the bulk of the code ended up being declarative grammar and plain syntax transformation:
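
As a rough illustration of that glue (a sketch, not the code from the repo), the map can pair each syntax tree node with a Hickory selector combinator, and `insta/transform` then folds the tree bottom-up into a single selector function. The node names (`:tag`, `:class`, `:compound` and so on) and the hand-written `tree` are assumptions made for this example; a real grammar may label its nodes differently.

```clojure
(require '[hickory.core :as h]
         '[hickory.select :as s]
         '[instaparse.core :as insta])

;; Hypothetical mapping from syntax tree node names to Hickory combinators.
(def syntax->selector
  {:tag      (fn [tag-name] (s/tag tag-name))
   :id       (fn [id-name] (s/id id-name))
   :class    (fn [class-name] (s/class class-name))
   :attr     (fn [attr-name value] (s/attr (keyword attr-name) #(= value %)))
   :compound s/and          ; every part must match the same element
   :selector s/descendant}) ; whitespace means "somewhere underneath"

;; A hand-written tree of the shape a parser for "div.note a" might produce.
(def tree
  [:selector
   [:compound [:tag "div"] [:class "note"]]
   [:compound [:tag "a"]]])

;; Leaves become selector functions first, then their parents combine them.
(def selector (insta/transform syntax->selector tree))

(def doc
  (-> "<div class=\"note\"><p><a href=\"/x\">hit</a></p></div><a href=\"/y\">miss</a>"
      h/parse
      h/as-hickory))

(map :content (s/select selector doc))
;; => (["hit"])
```

Because the transform map is plain data, supporting another piece of syntax is mostly a matter of adding a grammar rule and one more map entry.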

And using it is easy too:

(->> (slurp "https://clojure.org")
     (h/parse)
     (h/as-hickory)
     (s/select (parse-css-selector "a[href~=reference]")))
=>
[{:type :element,
  :attrs
  {:href "/reference/documentation", :class "w-nav-link clj-nav-link"},
  :tag :a,
  :content ["Reference‍"]}]

Nice.

I thought this was a good example of how two well-factored Clojure libraries and a little combinatory glue can accomplish a relatively complex goal concisely.

See the repo and tests for more examples.