diary at Telent Networks

I can't believe it's not buttons#

Thu, 13 Oct 2016 23:03:30 +0000

"Second will be some slightly nicer play/next/previous buttons". As predicted. I don't claim they're pretty, just that they're an advance on what went before.

Next: through a combination of dogfooding and Agile backlog reprioritisation, I as the product owner and customer representative have determined that I as the development team should next tackle "make it not behave apparently randomly when adding tracks to the playqueue after it has played what was previously in the queue".

Maybe in the process I can reduce the number of arbitrary reference cursors that are littered throughout it.


Plank#

Wed, 12 Oct 2016 23:02:46 +0000

In between my ongoing attempts to buy a house I am still playing on and off with making Sledge into a thing I might actually want to use. First on the list is showing the currently-playing track name and not just its length. Second will be some slightly nicer play/next/previous buttons. Third, maybe making it more robust against pressing "refresh". Ongoing and not yet finished, stripping out the cheesy CSS text shadows etc to make it look less like a 1990s OSF/Motif application and more like "flat UI" - a design trend of which I thoroughly approve. I will be terribly disappointed when the fashion pendulum swings back towards requiring artistic skills for a web page to look modern, because (as you can see) I have none.

(Why is this blog entry titled "Plank"? What else would you call a flat sledge?)

Mostly because I like to see my referrers#

Wed, 21 Sep 2016 21:56:44 +0000

I can think of no security justification for encrypting the pages served by this site: the data is public, viewing it is unlikely to get you arrested in most jurisdictions, it doesn't have any kind of forms or upload facilities or anything - but, I would like to see referers(sic) in my server logs. So now it's all HTTPS, using the rather fantastic Let's Encrypt service.

Let me know if you see anything broken as a result (i.e. anything broken that wasn't already broken) or can think of a principled reason by which I can justify the (admittedly rather paltry) time I spent on doing it.

Giving clojurescript the boot#

Mon, 19 Sep 2016 21:58:57 +0000

Recently I decided to unearth Sledge and fix enough of the bugs in it that I will actually want to use it day-to-day, and because changing only one thing at a time is too easy I thought I'd try updating all its dependencies and moving it from Leiningen to the new shiny Boot Clojure build tooling.

First impressions of Boot:

Point 1 is in my judgment so far a compelling reason to persevere through the pain of points 2 and 3.

I did a little gist already for making an uberjar with Boot and you should definitely pay attention to line 25. But today I want to talk about defining your own tasks, and as an example I'm going to add support for building Clojurescript.

The well-informed reader will know that there is already a Boot task to compile ClojureScript programs on Github. I chose not to use this, mostly because I was getting error messages I didn't understand and also partly because the Clojurescript Quick Start wiki page so strongly recommends understanding the fundamentals of compiling Clojurescript before plugging in a tool-based workflow.

So. Here are some things you may or may not already know about Boot if you've previously been using it cargo-cult fashion:

  1. You tell Boot what you want it to do by composing tasks into a build pipeline
  2. Each task defines a handler. A handler accepts a fileset, does something (e.g. compiles some files to create output files) and then calls the next handler with a new fileset which represents the passed fileset plus the result of whatever it did. It's a lot like Ring nested middleware. The first handler in the pipeline gets a fileset consisting only of the input files (your project source files), calls the second which calls the third etc., and once all the nested handlers have run then the returned fileset contains all the output files. (There's a bare-bones sketch of this shape just after the list.)
  3. From the end-user or even the task author's point of view, filesets are immutable. Handlers never directly change or create files in the project directory - instead they always create a new fileset, and Boot copies/moves files around in temporary directories behind your back to maintain the abstraction. As a task writer you don't have to care too much how this works: there are functions to map between fileset entries and the full pathname you should use to read the corresponding file; also to create a new temporary directory for output and then to add the files in that directory to a fileset.
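
To make that shape concrete, here is a do-nothing task written out long-hand. This is purely illustrative (the name shout-about-files is made up), not anything from Sledge:

(require '[boot.core :as core])

;; the task function returns a middleware, the middleware wraps the next
;; handler, and the handler receives a fileset and passes one along
(core/deftask shout-about-files []
  (fn [next-handler]                 ; middleware
    (fn [fileset]                    ; handler
      (println "fileset has" (count (core/input-files fileset)) "input files")
      (next-handler fileset))))      ; pass the (unchanged) fileset on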

And here are some things about the Clojurescript compiler which probably should be apparent from reading the Quick Start and the code it refers to:

  1. The compiler API lives in ns cljs.build.api and the important bits are inputs which creates a 'compilable object' from the directories/files you give it, and compile which does the compilation.
  2. Contrary to anything you might have thought by reading the doc string of inputs - it does not accept "a list", it accepts multiple arguments. If you have a list you will need to use apply here. I wasted a lot of time on this by not reading the code properly.
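
In other words (directory names purely illustrative):

(require '[cljs.build.api :as api])

;; this works: the directories go in as separate arguments ...
(api/inputs "src-cljs" "src-cljc")

;; ... so if what you have is a seq of directories, you need apply
(apply api/inputs ["src-cljs" "src-cljc"])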

So, how do we marry the two up? Look upon my task ye mighty and despair ....
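
(Strictly, look upon a lightly simplified sketch of it: the task name, option handling and compiler options below are illustrative rather than a verbatim copy of what's in the branch.)

(ns sledge.boot-build
  (:require [boot.core :as core :refer [deftask with-pre-wrap tmp-dir!]]
            [clojure.java.io :as io]
            [cljs.build.api :as cljs]))

(deftask compile-cljs
  "Compile the ClojureScript sources in the fileset."
  [o output-to PATH str "relative path of the compiled js file"]
  (let [out-dir (tmp-dir!)]                        ; boot-managed temp dir for compiler output
    (with-pre-wrap fileset
      (let [src-dirs (map #(.getPath %) (core/input-dirs fileset))
            out-file (io/file out-dir (or output-to "main.js"))]
        ;; inputs wants the source directories as separate arguments, hence apply;
        ;; everything is written under out-dir so it can be added to the fileset
        (cljs/build (apply cljs/inputs src-dirs)
                    {:output-to     (.getPath out-file)
                     :output-dir    (.getPath (io/file out-dir "out"))
                     :optimizations :whitespace}))
      (-> fileset
          (core/add-resource out-dir)              ; new fileset = old fileset + compiler output
          core/commit!))))

I've used cljs.build.api/build here, which is happy to take the compilable that inputs returns.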

A task is a function that returns a middleware, which is a function that returns a handler. This is not super-obvious from the code on display here, because we're using a small piece of handy syntax sugar called with-pre-wrap which lets us provide the body of the handler and returns a suitable middleware.

What else? Not much else. This code lives in the sledge.boot-build namespace and gets required by build.boot. We have to override and/or augment what the user passes for output-to and output-dir to make sure it ends up somewhere it'll get added to the fileset instead of writing straight into the project working directory. And I haven't decided how to do a repl yet. I will probably add that (one way or another) before merging the das-boot branch into master.
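
The build.boot end of it is just the usual wiring; the source paths, dependency version and task name here follow the sketch above rather than the actual file:

;; build.boot (sketch)
(set-env! :source-paths #{"src" "src-cljs"}
          :dependencies '[[org.clojure/clojurescript "1.9.229"]])

(require '[sledge.boot-build :refer [compile-cljs]])

(deftask dev []
  (comp (compile-cljs :output-to "js/main.js")
        (target)))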

The trouble with triples#

Sat, 26 Mar 2016 16:29:15 +0000

The other day I had occasion to write

(defn triples-to-map [triples]
  (reduce (fn [m row]
            (update-in m (butlast row)
                       (fn [old new] (if old (conj old new) [new]))
                       (last row)))
          {}
          triples))

and be surprised and delighted that it ran first time with the expected result. As witness:

foo.search=> (clojure.pprint/pprint triples_)
([:bnb:016691109 :published "2014"]
 [:bnb:016691109 :title "The Seven Streets of Liverpool"]
 [:bnb:016691109 :publisher "Orion"]
 [:bnb:016691109 :schema :shlv:Book]
 [:bnb:016691109 :author "Lee, Maureen"]
 [:bnb:016594932 :published "2013"]
 [:bnb:016594932 :title "Stephen Guy's forgotten Liverpool"]
 [:bnb:016594932 :publisher "Trinity Mirror"]
 [:bnb:016594932 :schema :shlv:Book]
 [:bnb:016594932 :author "Guy, Stephen"]
 [:bnb:016242841 :published "2012"]
  "Robbed : my Liverpool life : the Rob Jones story"]
 [:bnb:016242841 :publisher "Kids Academy Publishing"]
 [:bnb:016242841 :schema :shlv:Book]
 [:bnb:016242841 :author "Jones, Rob, 1971-"]
 [:bnb:016744037 :published "2012"]
 [:bnb:016744037 :title "Steven Gerrard : my Liverpool story"]
 [:bnb:016744037 :publisher "Headline"]
 [:bnb:016744037 :schema :shlv:Book]
 [:bnb:016744037 :author "Gerrard, Steven, 1980-"])
foo.search=> (clojure.pprint/pprint (triples-to-map triples_))
 {:published ["2014"],
  :title ["The Seven Streets of Liverpool"],
  :publisher ["Orion"],
  :schema [:shlv:Book],
  :author ["Lee, Maureen"]},
 {:published ["2013"],
  :title ["Stephen Guy's forgotten Liverpool"],
  :publisher ["Trinity Mirror"],
  :schema [:shlv:Book],
  :author ["Guy, Stephen"]},
 {:published ["2012"],
  :title ["Robbed : my Liverpool life : the Rob Jones story"],
  :publisher ["Kids Academy Publishing"],
  :schema [:shlv:Book],
  :author ["Jones, Rob, 1971-"]},
 {:published ["2012"],
  :title ["Steven Gerrard : my Liverpool story"],
  :publisher ["Headline"],
  :schema [:shlv:Book],
  :author ["Gerrard, Steven, 1980-"]}}

(Now that I've written that code down for a second time, I wonder whether using update-in is slightly overkill when I know the map will only ever be two levels deep. But that's not something I'm interested in right now.)

What I'm interested in right now is that the input list for this function is itself the output of some other code which - mostly thanks to Instaparse - was unexpectedly easy to write. I've been playing around lately with RDF and the Semantic Web, and needed a way of parsing N-Triples - which looks superficially simple enough that Awk could do it, until you start thinking about comments and strings with spaces in them and escaped special characters and ...
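
For instance, this is a perfectly legal single line, complete with a quoted string containing spaces, an escaped quote, and a trailing comment (the IRIs are illustrative):

<http://example.org/id/123> <http://example.org/title> "A \"simple\" title, with spaces" . # not so simple after all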

Anyway, Instaparse steps in to save the day again. I believe I have written previously to give my opinion that Instaparse is awesome and I will go on record to say that this fresh experience merely serves to cement my first impression.

N-Triples has a published EBNF grammar. I had to monkey with this a bit to get it into Instaparse.

Here's the final result:

ntriplesDoc ::= line*
line ::= WS* triple? EOL
triple ::= subject WS* predicate WS* object WS* '.' WS*
predicate ::= IRIREF
object ::= IRIREF | BLANK_NODE_LABEL | literal
LANGTAG ::= '@' #"[a-zA-Z]"+ ('-' #"[a-zA-Z0-9]"+)*
EOL ::= #"[\n\r]"+
WS ::= #"[ \t]" | #"#.*"
IRIREF ::= '<' IRI '>'
IRI ::= (#"[^\u0000-\u0020<>\"{}|^`\\]" | UCHAR)*
STRING_LITERAL ::= (#"[^\u0022\u005C\u000A\u000D]" | ECHAR | UCHAR)*
BLANK_NODE_LABEL ::= '_:' (PN_CHARS_U | #"[0-9]") ((PN_CHARS | '.')* PN_CHARS)?
ECHAR ::= "\\" #"[tbnrf\"\'\\]"
HEX ::= #"[0-9A-Fa-f]"
PN_CHARS_BASE ::= #"[A-Z]" | #"[a-z]" | #"[\u00C0-\u00D6]" | #"[\u00D8-\u00F6]" | #"[\u00F8-\u02FF]" | #"[\u0370-\u037D]" | #"[\u037F-\u1FFF]" | #"[\u200C-\u200D]" | #"[\u2070-\u218F]" | #"[\u2C00-\u2FEF]" | #"[\u3001-\uD7FF]" | #"[\uF900-\uFDCF]" | #"[\uFDF0-\uFFFD]" | #"[\x{10000}-\x{EFFFF}]"
PN_CHARS_U ::= PN_CHARS_BASE | ":" | "_"
PN_CHARS ::= PN_CHARS_U | "-" | #"[0-9]" | "\u00B7" | #"[\u0300-\u036F]" | #"[\u203F-\u2040]"

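A few rules that the productions above lean on (subject, literal, STRING_LITERAL_QUOTED, UCHAR) come more or less straight from the published grammar, and look something like this:

subject ::= IRIREF | BLANK_NODE_LABEL
literal ::= STRING_LITERAL_QUOTED ('^^' IRIREF | LANGTAG)?
STRING_LITERAL_QUOTED ::= '"' STRING_LITERAL '"'
UCHAR ::= '\u' HEX HEX HEX HEX | '\U' HEX HEX HEX HEX HEX HEX HEX HEX

The whole grammar string then goes through insta/parser to give the n-triple-parser used below.
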
Calling insta/parse with this grammar on a sample line gets you something looking like

    [:IRIREF "<" [:IRI  "h" "t" "t" "p" ":" "/" "/" "b" "n" "b" "."
                        "d" "a" "t" "a" "."  "b" "l" "."  "u" "k" "/" "i" "d"
                        "/" "r" "e" "s" "o" "u" "r" "c" "e" "/" "0" "1" "6" "7"
                        "0" "6" "8" "5" "5"] ">"]]
   [:WS " "]
    [:IRIREF "<" [:IRI "h" "t" "t" "p" ":" "/" "/" "l" "o" "c" "a" "l"
                       "h" "o" "s" "t" ":" "3" "0" "3" "0" "/" "p" "u"
                       "b" "l" "i" "s" "h" "e" "d"] ">"]]
   [:WS " "] [:object [:literal [:STRING_LITERAL_QUOTED
      "\"" [:STRING_LITERAL "2" "0" "1" "4"] "\""]]] [:WS " "]
  [:EOL "\n"]]]

which clearly is going to need some more attention before it's usable. We do this in two passes: first we visit the entire tree node-by-node to do things like turn literal node values into strings and IRI nodes into URI objects.

(defn visit-node [branch]
  (if (vector? branch)
    (case (first branch)
      ;; [:IRIREF "<" [:IRI "h" "t" ...] ">"] -> a URI, or a short keyword
      ;; if prefixize (defined elsewhere) recognises the prefix
      :IRIREF (let [[_< [_iri_tok & letters] _>] (rest branch)
                    iri (str/join letters)]
                (or (prefixize iri)
                    (URI. iri)))
      :STRING_LITERAL (str/join (rest branch))
      :STRING_LITERAL_QUOTED (let [[_ string _] (rest branch)] string)
      :literal (second branch)
      :WS ""
      :UCHAR (let [[_ & hexs] (rest branch)]
               (Integer/parseInt (str/join (map second hexs)) 16))
      :triple (let [m (reduce (fn [m [k v]] (assoc m k v)) {}
                              (rest branch))]
                [:triple [(:subject m) (:predicate m) (:object m)]])
      ;; anything we don't specifically handle passes through untouched
      branch)
    branch))

Then we transform the tree into a seq and filter the seq to get only the :triple nodes. Putting it all together:

(defn parse-n-triples [in-string]
  (->> in-string
       (insta/parse n-triple-parser)
       (walk/postwalk visit-node)
       (tree-seq #(and (vector? %)
                       (keyword? (first %))
                       (not (= (first %) :triple)))
                 #(rest %))
       (filter #(= (first %) :triple))
       (map second)))

I'm reasonably confident that the grammar is correct: I pushed all the official N-Triples Test Suite through it without error. My post-parsing massage passes, though, are possibly not correct and certainly not complete, which is one reason I'm just blogging about it instead of publishing it as a standalone library somewhere. Things I already know it doesn't do: blank node support, language tags, datatypes, escaped characters. Things I don't know it doesn't do: don't know. But it seems to work for my use case - of which, more later.