Information


Blog Posts


Collections



Contact


Things Ian Says

Building a Graph in Clojure, for the Kiwiland Railway

I was recently looking at a programming test, where the challenge is to represent a railway system (called “Kiwiland”) and then deduce various facts about it (for example the distance for a certain route). I wanted to develop this in Clojure, and thought it would be an interesting thing to work through on my blog.

The Problem

Various statements of this problem can be found around the web, for example in this github repo. I’ve copied the problem statement if you want to read it, but to summarise, the problem is about a railway network which looks like this:

Railway Map

This is described in terms of pairs of stations (e.g. A and B) and the distance between them (which is 5 in this case). That section is described with the notation AB5. The whole railway network is therefore described as:

Graph: AB5, BC4, CD8, DC8, DE6, AD5, CE2, EB3, AE7

We then need to provide the answers to the following:

  1. The distance of the route A-B-C.
  2. The distance of the route A-D.
  3. The distance of the route A-D-C.
  4. The distance of the route A-E-B-C-D.
  5. The distance of the route A-E-D.
  6. The number of trips starting at C and ending at C with a maximum of 3 stops.
  7. The number of trips starting at A and ending at C with exactly 4 stops. In
  8. The length of the shortest route (in terms of distance to travel) from A to C.
  9. The length of the shortest route (in terms of distance to travel) from B to B.
  10. The number of different routes from C to C with a distance of less than 30.

The problem also gives us the answers we should produce:

Output #1: 9
Output #2: 5
Output #3: 13
Output #4: 22
Output #5: NO SUCH ROUTE
Output #6: 2
Output #7: 3
Output #8: 9
Output #9: 9
Output #10: 7

Data Representation

The first thing I decided to do, was think about how I would represent the graph itself. There are two common representations for graphs — an adjacency matrix, and an adjacency list.

Adjacency Matrix

An adjacency matrix has all nodes (or railway stations) on both axes, with the edge (the distance between stations in our case) represented as a value at the intersection.

We would therefore start with a blank matrix like this:

      A B C D E  
    +----------  
  A |            
  B |            
  C |            
  D |            
  E |            

We add in our AB5 edge, by putting the 5 in the appropriate cell:

      A B C D E  
    +----------  
  A |            
  B | 5          
  C |            
  D |            
  E |            

If we do that for all edges, we end up with this matrix:

      A B C D E  
    +----------  
  A | 0 0 0 0 0  
  B | 5 0 0 0 3  
  C | 0 4 0 8 0  
  D | 5 0 8 0 0  
  E | 7 0 2 6 0  

Adjacency List

An adjacency list (as the name suggests), is a list of nodes, where we also store a list of the connections against each node. Our railway network would look like this:

A: B5, D5, E7
B: C4
C: D8, E2
D: C8, E6
E: B3

Comparison of Adjacency Matrix and Adjacency List

The adjacency matrix is a compact way of storing the data, and is particularly good for a graph with lots of connections. An adjacency list is more efficient for storing sparsely connected graphs.

In terms of operation speed, the adjacency matrix is very efficient at answering questions such as “what is the distance between A and B” or “are points C and D connected” (since these things can be looked up directly from the matrix). The adjacency list is more efficient at answering questions like “which stations can I reach from station A” (since it can be read from the list, whereas the matrix would need to be iterated over).

Given the number of connections, and that many of the questions are about finding routes, I decided to use an adjacency list.

The best way to represent this in Clojure, I felt, would be to use a hash map:

{:A {:B 5, :D 5, :E 7},
 :B {:C 4},
 :C {:D 8, :E 2},
 :D {:C 8, :E 6},
 :E {:B 3}}

But I’m getting ahead of myself. Now that I have decided on a representation, I need to create my project and start coding.

Creating my Kiwiland Clojure project

I use Leiningen to create and manage my Clojure projects, so I create my new project like this:

$ lein new kiwiland
Generating a project called kiwiland based on the 'default' template.
The default template is intended for library projects, not applications.
To see other templates (app, plugin, etc), try `lein help new`.
lein

This gives me the following directory structure:

$ tree kiwiland
kiwiland
├── CHANGELOG.md
├── LICENSE
├── README.md
├── doc
│   └── intro.md
├── project.clj
├── resources
├── src
│   └── kiwiland
│       └── core.clj
└── test
    └── kiwiland
        └── core_test.clj

6 directories, 7 files
tree

If I cd into my kiwiland directory, I can see a (failing) test file has been generated for me:

$ cat test/kiwiland/core_test.clj
(ns kiwiland.core-test
  (:require [clojure.test :refer :all]
            [kiwiland.core :refer :all]))

(deftest a-test
  (testing "FIXME, I fail."
    (is (= 0 1))))

I can then run this, with the expected failure message:

$ lein test

lein test kiwiland.core-test

lein test :only kiwiland.core-test/a-test

FAIL in (a-test) (core_test.clj:7)
FIXME, I fail.
expected: (= 0 1)
  actual: (not (= 0 1))

Ran 1 tests containing 1 assertions.
1 failures, 0 errors.
Tests failed.

A placeholder source file has also been created:

$ cat src/kiwiland/core.clj
(ns kiwiland.core)

(defn foo
  "I don't do a whole lot."
  [x]
  (println x "Hello, World!"))

When I try to run this, though, I get an error:

$ lein run
No :main namespace specified in project.clj.

So, let’s take a quick look at the project.clj file:

$ cat project.clj
(defproject kiwiland "0.1.0-SNAPSHOT"
  :description "FIXME: write description"
  :url "http://example.com/FIXME"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.8.0"]])

I’ll add in the :main namespace to this, and also tidy up the default boilerplate text. While I’m doing that, I’ll also initialise the REPL with my project too:

(defproject kiwiland "0.1.0-SNAPSHOT"
  :description "Clojure implementation of Kiwiland Rail problem"
  :url "https://github.com/ianfinch/kiwiland"
  :license {:name "Eclipse Public License"
            :url "http://www.eclipse.org/legal/epl-v10.html"}
  :dependencies [[org.clojure/clojure "1.8.0"]]
  :main kiwiland.core
  :repl-options {:init-ns kiwiland.core})

I also have to modify my core.clj file, since leiningen will now be looking for a function called -main to run by default:

$ cat src/kiwiland/core.clj
(ns kiwiland.core)

(defn -main
  "I don't do a whole lot."
  []
  (println "Hello, World!"))

Now I can run the project:

$ lein run
Hello, World!

Since we already have an input string and the expected result, provided in the problem statement itself, we can now change our test to reflect this. I want to create a (generate-answers ...) function, which should take the graph definition as an input, and return a list of answers. My -main function will then simply print out the result of that function. To reflect this, I want two tests — one for (generate-answers ...) which is straightforward, and one for -main, which will need to redirect stdout to test what println does:

(ns kiwiland.core-test
  (:require [clojure.test :refer :all]
            [kiwiland.core :refer :all]))

(deftest check-generated-answers
  (testing "Check generated answers"
    (is (= (generate-answers "AB5, BC4, CD8, DC8, DE6, AD5, CE2, EB3, AE7")
           '(9 5 13 22 "NO SUCH ROUTE" 2 3 9 9 7)))))

(deftest check-main
  (testing "The main function"
    (is (= (with-out-str (-main))
           "(9 5 13 22 NO SUCH ROUTE 2 3 9 9 7)\n"))))

The simplest solution which passes these tests, is to return the expected result as a fixed string:

(ns kiwiland.core)

(defn generate-answers
  "Generate the answers to the Kiwiland Rail test questions"
  [graph-string]
  '(9 5 13 22 "NO SUCH ROUTE" 2 3 9 9 7))


(defn -main
  "Print out the test answers"
  []
  (println (generate-answers "AB5, BC4, CD8, DC8, DE6, AD5, CE2, EB3, AE7")))

This is obviously not satisfactory as a solution, but it gets me to a point where I can park this core.clj file and start looking at the graph module itself. I’ll park it for now, start building the graph module, then come back to this later.

Now when I run the tests, I get the following:

$ lein test

lein test kiwiland.core-test

Ran 2 tests containing 2 assertions.
0 failures, 0 errors.

So, a good place to leave this, and turn my attention to the graph itself.

The Graph Module

So, now I’m going to create a graph module, I need to create a source file and a test file:

$ cat src/kiwiland/graph.clj
(ns kiwiland.graph)

$ cat test/kiwiland/graph_test.clj
(ns kiwiland.graph-test
  (:require [clojure.test :refer :all]
            [kiwiland.graph :refer :all]))

Adding nodes

I already talked earlier about how I wanted to represent the graph using an adjacency list. The first thing I need to work out, is to get from nothing to a graph with my first node in it. The idiomatic way to represent no graph in Clojure would be nil, so that can be my starting point. A node which connects to nothing would similarly be {:x nil}.

The other condition I have imposed on myself, is that I want my solution to use immutable data as much as possible. So, instead of adding in a node to a data structure, the addition of a node will create a new data structure (which incorporates that node).

The above allows me to write my first test:

(deftest create-first-node
  (testing "Creating the first node in a graph"
    (is (= (add-node nil :a)
           {:a nil}))))

Now I’ve got a statement of what creating the first node should look like, I can now think about adding a second node to that:

(deftest add-a-node-to-graph
  (testing "Adding a node to an existing graph"
    (is (= (add-node {:a nil} :b)
           {:a nil :b nil}))))

Finally, if we try to add a node which already exists, it should have no effect:

(deftest add-an-existing-node
  (testing "Trying to add an existing node to a graph"
    (is (= (add-node {:a {:b 1} :b nil} :a)
           {:a {:b 1} :b nil}))))

Note that I’ve put the graph as the first parameter to the function, to allow threading of this to build up a graph:

(-> nil
    (add-node :a)
    (add-node :b)
    (add-node :c))

The function to implement this behaviour is simple. It just checks if the node is already in the graph (using the (get ...) function). If it’s already in the graph, then just return the graph. If it’s not in the graph use (assoc ...) to add it:

(defn add-node
  "Add a node to the graph"
  [the-graph node-key]
  (cond (get the-graph node-key) the-graph
        :else (assoc the-graph node-key nil)))

And run our tests to make sure we’re good:

$ lein test

lein test kiwiland.core-test

lein test kiwiland.graph-test

Ran 5 tests containing 5 assertions.
0 failures, 0 errors.

We can explore this in the REPL too:

$ lein repl
nREPL server started on port 38147 on host 127.0.0.1 - nrepl://127.0.0.1:38147

kiwiland.core=> (require '[kiwiland.graph :as graph])
nil

kiwiland.core=> (-> nil (graph/add-node :a) (graph/add-node :b) (graph/add-node :c))
{:a nil, :b nil, :c nil}

Or more compactly, we can use (reduce ...) to achieve the same result:

kiwiland.core=> (reduce graph/add-node nil [:a :b :c])
{:a nil, :b nil, :c nil}

This gives me some reassurance that my function is idiomatic, since it works well with these standard constructs.

Adding Edges

Adding isolated nodes is a necessary part of constructing a graph, but I also need to be able to add edges. So now I’m going to think about what that will look like.

I’m going to keep the same approach of the graph being the first parameter (to allow threading for addition of both nodes and edges), then supply source node, target node and distance as the next parameters.

As with adding nodes, the first thing we need to be able to do is create a graph from nothing, by specifying the first edge:

(deftest add-an-edge-to-empty
  (testing "Adding an edge to an empty graph"
    (is (= (add-edge nil :a :b 1)
           {:a {:b 1} :b nil}))))

The result we are looking for — {:a {:b 1} :b nil} — can be reached from other starting positions too. The source (a) could already exist, the target (b) could already exist, or both a and b could already exist. So, I’ll add in tests for these too. To make the tests as comprehensive as possible, I’ll also add in an extra node (:c), to ensure that it isn’t corrupted by the addition of a new edge:

(deftest add-edge-from-existing-source
  (testing "Adding an edge from an existing source node"
    (is (= (add-edge {:a {:c 1} :c nil} :a :b 1)
           {:a {:b 1 :c 1} :b nil :c nil}))))

(deftest add-edge-to-existing-target
  (testing "Adding an edge to an existing target node"
    (is (= (add-edge {:b {:c 1} :c nil} :a :b 1)
           {:a {:b 1} :b {:c 1} :c nil}))))

(deftest add-edge-between-existing-nodes
  (testing "Adding an edge between two existing nodes"
    (is (= (add-edge {:a {:c 1} :b {:c 1} :c nil} :a :b 1)
           {:a {:b 1 :c 1} :b {:c 1} :c nil}))))

Another situation is that we could already have a graph, but we are adding in two new nodes plus an edge between them in one go:

(deftest add-disjoint-edge-into-existing
  (testing "Adding two new nodes and an edge into an existing graph"
    (is (= (add-edge {:a {:b 1} :b nil} :c :d 4)
           {:a {:b 1} :b nil :c {:d 4} :d nil}))))

Finally, I need to decide how to handle a situation where I try to add an edge which already exists, but with a new value associated with it. For now, I’ve decided that in this situation, the new value overrides the old value. This means my test is:

(deftest update-existing-edge
  (testing "Updating an edge which already exists"
    (is (= (add-edge {:a {:b 1} :b {:c 1} :c nil} :a :b 2)
           {:a {:b 2} :b {:c 1} :c nil}))))

So, let’s build up the (add-edge ...) function for all these test cases. At its most basic, it needs to construct the adjacent edge (e.g. {:b 1}), then add that into the graph against the source node (e.g. :a):

(defn add-edge
  "Add an edge to the graph"
  [the-graph source-key target-key value]
  (assoc
    the-graph
    source-key
    (assoc nil target-key value)))

The problem with this, is that if the target does not already exist in the graph, it won’t get added. However, the (add-node ...) function we wrote earlier will add a node if it doesn’t already exist. So, instead of just using the graph passed in, we can call (add-node ...) to add the target node if it doesn’t already exist:

(defn add-edge
  "Add an edge to the graph"
  [the-graph source-key target-key value]
  (assoc
    (add-node the-graph target-key)
    source-key
    (assoc nil target-key value)))

If I now run my tests, I get the following:

$ lein test

lein test kiwiland.core-test

lein test kiwiland.graph-test

lein test :only kiwiland.graph-test/add-edge-from-existing-source

FAIL in (add-edge-from-existing-source) (graph_test.clj:27)
Adding an edge from an existing source node
expected: (= (add-edge {:a {:c 1}, :c nil} :a :b 1) {:a {:b 1, :c 1}, :b nil, :c nil})
  actual: (not (= {:a {:b 1}, :c nil, :b nil} {:a {:b 1, :c 1}, :b nil, :c nil}))

lein test :only kiwiland.graph-test/add-edge-between-existing-nodes

FAIL in (add-edge-between-existing-nodes) (graph_test.clj:37)
Adding an edge between two existing nodes
expected: (= (add-edge {:a {:c 1}, :b {:c 1}, :c nil} :a :b 1) {:a {:b 1, :c 1}, :b {:c 1}, :c nil})
  actual: (not (= {:a {:b 1}, :b {:c 1}, :c nil} {:a {:b 1, :c 1}, :b {:c 1}, :c nil}))

Ran 11 tests containing 11 assertions.
2 failures, 0 errors.
Tests failed.

Taking the first failure, the test is expecting:

{:a {:b 1, :c 1}, :b nil, :c nil}

But it actually gets:

{:a {:b 1}, :c nil, :b nil}

We can see that the problem is that our function ignores any pre-existing connections from the source node. We therefore need to update this part of our function:

(assoc nil target-key value)

Instead of using nil, we need to grab the existing connections for the source node, which gives us:

(assoc (get the-graph source-key) target-key value)

If the source node does not exist, or exists and has no connections, that expression will give us nil, so our existing tests will still pass. For readability, I will change it to use threading:

(-> (get the-graph source-key)
    (assoc target-key value))

Putting this back into our function gives us this:

(defn add-edge
  "Add an edge to the graph"
  [the-graph source-key target-key value]
  (assoc
    (add-node the-graph target-key)
    source-key
    (-> (get the-graph source-key)
        (assoc target-key value))))

And, running our tests, we successfully pass them all:

$ lein test

lein test kiwiland.core-test

lein test kiwiland.graph-test

Ran 11 tests containing 11 assertions.
0 failures, 0 errors.

String Representation for Edges

If you recall the original problem statement, edges were represented using a notation like AB5, so I want another function which will add an edge in this format. It still needs to handle all the scenarios from (add-edge ...), so I’ll just replicate those tests using this new notation. If I call my function (add-edge-from-string ...), the tests will look like this:

(deftest add-an-edge-string-to-empty
  (testing "Adding an edge to an empty graph"
    (is (= (add-edge-from-string nil "AB1")
           {:a {:b 1} :b nil}))))

(deftest add-edge-string-from-existing-source
  (testing "Adding an edge from an existing source node"
    (is (= (add-edge-from-string {:a {:c 1} :c nil} "ab1")
           {:a {:b 1 :c 1} :b nil :c nil}))))

(deftest add-edge-string-to-existing-target
  (testing "Adding an edge to an existing target node"
    (is (= (add-edge-from-string {:b {:c 1} :c nil} "ab1")
           {:a {:b 1} :b {:c 1} :c nil}))))

(deftest add-edge-string-between-existing-nodes
  (testing "Adding an edge between two existing nodes"
    (is (= (add-edge-from-string {:a {:c 1} :b {:c 1} :c nil} "ab1")
           {:a {:b 1 :c 1} :b {:c 1} :c nil}))))

(deftest add-disjoint-edge-string-into-existing
  (testing "Adding two new nodes and an edge into an existing graph"
    (is (= (add-edge-from-string {:a {:b 1} :b nil} "cd4")
           {:a {:b 1} :b nil :c {:d 4} :d nil}))))

(deftest update-existing-edge-string
  (testing "Updating an edge which already exists"
    (is (= (add-edge-from-string {:a {:b 1} :b {:c 1} :c nil} "ab2")
           {:a {:b 2} :b {:c 1} :c nil}))))

I’ll also add in a couple of extra tests, where the format is not correct. In this case, I expect the edge to be ignored:

(deftest add-incorrect-edge-string
  (testing "Adding an edge from an incorrectly formed string"
    (is (= (add-edge-from-string {:a {:c 1} :c nil} "1ab")
           {:a {:c 1} :c nil}))))

(deftest add-edge-string-with-extra-digit
  (testing "Adding an edge with an extra digit"
    (is (= (add-edge-from-string {:a {:c 1} :c nil} "ab11")
           {:a {:c 1} :c nil}))))

The function to do this is pretty simple, just chop up the string into its parts, parse the integer from the third character, and call (add-edge ...) like before:

(defn add-edge-from-string
  "Add an edge to the graph, provided as string of format AB5"
  [the-graph edge-def]
  (let [character (str/split edge-def #"")
        source-name (first character)
        target-name (second character)
        distance (Integer/parseInt (last character))]
    (add-edge the-graph (keyword source-name) (keyword target-name) distance)))

If I run my tests, I get two failures:

$ lein test

lein test kiwiland.core-test

lein test kiwiland.graph-test

lein test :only kiwiland.graph-test/add-edge-string-with-extra-digit

FAIL in (add-edge-string-with-extra-digit) (graph_test.clj:87)
Adding an edge with an extra digit
expected: (= (add-edge-from-string {:a {:c 1}, :c nil} "ab11") {:a {:c 1}, :c nil})
  actual: (not (= {:a {:c 1, :b 1}, :c nil, :b nil} {:a {:c 1}, :c nil}))

lein test :only kiwiland.graph-test/add-incorrect-edge-string

ERROR in (add-incorrect-edge-string) (NumberFormatException.java:65)
Adding an edge from an incorrectly formed string
expected: (= (add-edge-from-string {:a {:c 1}, :c nil} "1ab") {:a {:c 1}, :c nil})
  actual: java.lang.NumberFormatException: For input string: "b"
 at java.lang.NumberFormatException.forInputString (NumberFormatException.java:65)
    java.lang.Integer.parseInt (Integer.java:580)
    java.lang.Integer.parseInt (Integer.java:615)
    ...

The first failure is because the function is interpreting ab11 as ab1. So I need to add a check on the string length. To do that, I’ll rename my existing function to (add-edge-from-valid-string ...), then wrap that with the check for three characters:

(defn add-edge-from-valid-string
  "Add an edge to the graph, provided as string of format AB5"
  [the-graph edge-def]
  (let [character (str/split edge-def #"")
        source-name (first character)
        target-name (second character)
        distance (Integer/parseInt (last character))]
    (add-edge the-graph (keyword source-name) (keyword target-name) distance)))

(defn add-edge-from-string
  "Add an edge to the graph, provided as string, after running validation checks"
  [the-graph edge-def]
  (cond (= (count edge-def) 3) (add-edge-from-valid-string the-graph edge-def)
        :else (do (println "Skipping" edge-def "since it has more than 3 characters")
                  the-graph)))

Let’s split that down further, for a bit better readability:

(defn add-edge-from-valid-string
  "Add an edge to the graph, provided as string of format AB5"
  [the-graph edge-def]
  (let [character (str/split edge-def #"")
        source-name (first character)
        target-name (second character)
        distance (Integer/parseInt (last character))]
    (add-edge the-graph (keyword source-name) (keyword target-name) distance)))

(defn skip-too-many-characters
  "Display an error message, and return the passed in graph"
  [the-graph edge-def]
  (println "Skipping" edge-def "since it has more than 3 characters")
  the-graph)

(defn add-edge-from-string
  "Add an edge to the graph, provided as string, after running validation checks"
  [the-graph edge-def]
  (cond (= (count edge-def) 3) (add-edge-from-valid-string the-graph edge-def)
        :else (skip-too-many-characters the-graph edge-def)))

Now we need to trap the error we get when an edge string can’t be parsed. To do, this I’ll create a new function which wraps the call to (add-edge-from-valid-string ...) in a (try ... (catch ...)):

(defn try-add-edge-from-string
  "Add an edge from a string, trapping parsing errors"
  [the-graph edge-def]
  (try (add-edge-from-valid-string the-graph edge-def)
    (catch Exception e (do (println "Skipping" edge-def "could not parse:" (.getClass e))
                           the-graph))))

Now I’ll update (add-edge-from-string ...) to use this new function instead of calling (add-edge-from-valid-string ...):

(defn add-edge-from-string
  "Add an edge to the graph, provided as string, after running validation checks"
  [the-graph edge-def]
  (cond (= (count edge-def) 3) (try-add-edge-from-string the-graph edge-def)
        :else (skip-too-many-characters the-graph edge-def)))

For completeness, here it all is in one place:

(defn add-edge-from-valid-string
  "Add an edge to the graph, provided as string of format AB5"
  [the-graph edge-def]
  (let [character (str/split edge-def #"")
        source-name (first character)
        target-name (second character)
        distance (Integer/parseInt (last character))]
    (add-edge the-graph (keyword source-name) (keyword target-name) distance)))

(defn skip-too-many-characters
  "Display an error message, and return the passed in graph"
  [the-graph edge-def]
  (println "Skipping" edge-def "since it has more than 3 characters")
  the-graph)

(defn try-add-edge-from-string
  "Add an edge from a string, trapping parsing errors"
  [the-graph edge-def]
  (try (add-edge-from-valid-string the-graph edge-def)
    (catch Exception e (do (println "Skipping" edge-def "could not parse:" (.getClass e))
                           the-graph))))

(defn add-edge-from-string
  "Add an edge to the graph, provided as string, after running validation checks"
  [the-graph edge-def]
  (cond (= (count edge-def) 3) (try-add-edge-from-string the-graph edge-def)
        :else (skip-too-many-characters the-graph edge-def)))

And here is the output from my tests:

$ lein test

lein test kiwiland.core-test

lein test kiwiland.graph-test
Skipping ab11 since it has more than 3 characters
Skipping 1ab could not parse: java.lang.NumberFormatException

Ran 19 tests containing 19 assertions.
0 failures, 0 errors.

Building a Graph from Strings

So, the final thing I need to do is provide a function which takes an array of these edge strings, and creates the graph from it. The test for this is easy to write, since I’ve described the input and output right at the start of this article.

It needs to take this as input:

AB5, BC4, CD8, DC8, DE6, AD5, CE2, EB3, AE7

And generate this:

{:A {:B 5, :D 5, :E 7},
 :B {:C 4},
 :C {:D 8, :E 2},
 :D {:C 8, :E 6},
 :E {:B 3}}

I’m going to use an array for input (rather than a comma separated string), but apart from that, the test is just stating the above:

(deftest build-complete-graph
  (testing "Building the graph from an array of edges"
    (is (= (build-graph '("AB5" "BC4" "CD8" "DC8" "DE6" "AD5" "CE2" "EB3" "AE7"))
           {:A {:B 5, :D 5, :E 7},
            :B {:C 4},
            :C {:D 8, :E 2},
            :D {:C 8, :E 6},
            :E {:B 3}}))))

And the function to achieve that, is pretty much the same as the (reduce ...) example I gave earlier:

(defn build-graph
  "Builds a graph from an array of string expressions"
  [edge-list]
  (reduce add-edge-from-string nil edge-list))

Running my tests (and including code coverage reporting this time), I get this:

$ lein cloverage
Loading namespaces:  (kiwiland.core kiwiland.graph)
Test namespaces:  (kiwiland.core-test kiwiland.graph-test)
Loaded  kiwiland.graph  .
Loaded  kiwiland.core  .
Instrumented namespaces.

Testing kiwiland.core-test

Testing kiwiland.graph-test
Skipping ab11 since it has more than 3 characters
Skipping 1ab could not parse: java.lang.NumberFormatException

Ran 20 tests containing 20 assertions.
0 failures, 0 errors.
Ran tests.
Writing HTML report to: /usr/src/lein/target/coverage/index.html

|----------------+---------+---------|
|      Namespace | % Forms | % Lines |
|----------------+---------+---------|
|  kiwiland.core |  100.00 |  100.00 |
| kiwiland.graph |  100.00 |  100.00 |
|----------------+---------+---------|
|      ALL FILES |  100.00 |  100.00 |
|----------------+---------+---------|

So I’m fairly happy with that in terms of building the graph. Now let’s move on to answering some of the questions from the exercise.

Calculating Distances

The first five questions are all about calculating distances for journeys of multiple stations, so I’m going to start with them. It seems pretty obvious that as part of that, I’m going to need to calculate the distance between two stops, so I’ll start with that. I’ll call the function (one-stop ...), and write a test to check it returns the right distance:

(deftest distance-between-two-stops
  (testing "Calculating the distance between two stops"
    (let [g (build-graph ["ab1"])]
      (is (= (one-stop g :a :b) 1)))))

If it is not possible to travel from the first stop to the second, I want it to return nil. This could be because one or both of the stops do not exist, or because there is no connection:

(deftest distance-two-stops-no-start
  (testing "The distance between two stops, when the start does not exist"
    (let [g (build-graph ["ab1"])]
      (is (= (one-stop g :x :b) nil)))))

(deftest distance-two-stops-no-finish
  (testing "The distance between two stops, when the finish does not exist"
    (let [g (build-graph ["ab1"])]
      (is (= (one-stop g :a :x) nil)))))

(deftest distance-two-stops-none-exist
  (testing "The distance between two stops, when neither exist"
    (let [g (build-graph ["ab1"])]
      (is (= (one-stop g :x :y) nil)))))

(deftest distance-two-stops-no-connection
  (testing "The distance between two stops, when there is no connection"
    (let [g (build-graph ["ab1" "bc1"])]
      (is (= (one-stop g :a :c) nil)))))

The function to achieve this is simple to write. I just need to find the start node in the adjacency list, then look in the hash map associated with that start node to find the distance:

(defn one-stop
  "Gets the distance for one stop from the starting node"
  [the-graph start-node stop-node]
  (-> the-graph
      (get start-node)
      (get stop-node)))

I get all the different ways of returning nil for free, because (get ...) will return nil if it can’t find a match — and if the start node can’t be found, the second (get ...) on the resulting nil is still valid and returns nil as a result.

Now I can find the distance for one stop, I’ll move on to look at multiple stops. Obviously, the basic requirement is that we get the right answer:

(deftest distance-for-journey
  (testing "The distance for a multi-stop journey is correctly calculated"
    (let [g (build-graph ["ab1" "bc1" "cd1" "de1"])]
      (is (= (distance g '(:a :b :c :d)) 3)))))

If the journey is not possible, I want to return nil:

(deftest distance-broken-journey
  (testing "The distance for an impossible journey is returned as nil"
    (let [g (build-graph ["ab1" "bc1" "cd1" "de1"])]
      (is (= (distance g [:a :c :b :d]) nil)))))

I’m also going to put in a test for the edge case, where we only travel one stop:

(deftest distance-one-stop-journey
  (testing "The distance for a one stop is correctly calculated"
    (let [g (build-graph ["ab1" "bc1" "cd1" "de1"])]
      (is (= (distance g [:a :b]) 1)))))

So, how do I write the function to perform this calculation? I think the basic approach will need to break the set of steps (e.g. [:a :b :c :d]) into single steps (i.e. :a to :b, :b to :c, and :c to :d), work out the distances for each step, then add them all together. After thinking about different ways of doing this (for example, using (reduce ...)), I ended up creating a list for the starts of each step, then a list of finishes for each step, then using (map ...) to calculate the distance for each step.

It’s easiest to demonstrate this in the REPL. I’ll define a simple graph, and a route across it:

kiwiland.core=> (def g (graph/build-graph ["ab1" "bc1" "cd1"]))
#'kiwiland.core/g
kiwiland.core=> (def route [:a :b :c :d])
#'kiwiland.core/route

I can get all the starts by keeping all the steps in the route, apart from the last one:

kiwiland.core=> (drop-last 1 route)
(:a :b :c)

Similarly, I can get all the finishes by keeping all the steps apart from the first one:

kiwiland.core=> (drop 1 route)
(:b :c :d)

Applying (map ...) to this, gives me a list of the step lengths:

kiwiland.core=> (map #(graph/one-stop g %1 %2) [:a :b :c] [:b :c :d])
(1 1 1)

The version with the function calls embedded looks like this:

kiwiland.core=> (map #(graph/one-stop g %1 %2) (drop-last 1 route) (drop 1 route))
(1 1 1)

Worth also noting at this point, that if one or more parts of the route are not possible, I will get one or more nils in there too:

kiwiland.core=> (def route [:a :b :c :x])
#'kiwiland.core/route
kiwiland.core=> (map #(graph/one-stop g %1 %2) (drop-last 1 route) (drop 1 route))
(1 1 nil)

To make the code more understandable, I can use a couple of intermediate variables:

(let [starts (drop-last 1 route)
      finishes (drop 1 route)
      segments (map #(one-stop g %1 %2) starts finishes)]
  segments)

For a successful journey, I need to add up all the segments of the journey, but if any segment is nil, I want to return nil for the whole function. I can use the (nil? ...) function to test whether a value is nil, and the (some ...) function returns true if one or more items in a list matches a success criteria. So, I can check whether there is a nil value in any of the segments with the expression:

(some nil? segments)

To add the values in a list, I can just use the (apply ...) function and pass in a +:

(apply + segments)

Putting that together, I have:

(cond (some nil? segments) nil
      :else (apply + segments))

So the whole function looks like this:

(defn distance
  "Calculate the distance for a multi-stop journey"
  [the-graph route]
  (let [starts (drop-last 1 route)
        finishes (drop 1 route)
        segments (map #(one-stop the-graph %1 %2) starts finishes)]
    (cond (some nil? segments) nil
          :else (apply + segments))))

And running my tests, everything seems to work:

$ lein cloverage
Loading namespaces:  (kiwiland.core kiwiland.graph)
Test namespaces:  (kiwiland.core-test kiwiland.graph-test)
Loaded  kiwiland.graph  .
Loaded  kiwiland.core  .
Instrumented namespaces.

Testing kiwiland.core-test

Testing kiwiland.graph-test
Skipping ab11 since it has more than 3 characters
Skipping 1ab could not parse: java.lang.NumberFormatException

Ran 28 tests containing 28 assertions.
0 failures, 0 errors.
Ran tests.
Writing HTML report to: /usr/src/lein/target/coverage/index.html

|----------------+---------+---------|
|      Namespace | % Forms | % Lines |
|----------------+---------+---------|
|  kiwiland.core |  100.00 |  100.00 |
| kiwiland.graph |  100.00 |  100.00 |
|----------------+---------+---------|
|      ALL FILES |  100.00 |  100.00 |
|----------------+---------+---------|

Adding Distance Answers into our app

Now I’m going to return to my src/kiwiland/core.clj file, which had this function in it:

(defn generate-answers
  "Generate the answers to the Kiwiland Rail test questions"
  [graph-string]
  '(9 5 13 22 "NO SUCH ROUTE" 2 3 9 9 7))

I’ve now got the functions I need to build up a graph from the string which is passed in, and to also generate the first five answers (the ones based on journey distance). The first thing I need to do is include my graph module:

(ns kiwiland.core
  (:require [kiwiland.graph :as graph]))

Now I can use the (build-graph ...) function to create the graph from the input string. The only additional thing I need to do, is split the supplied comma-separated string into an array. The (split ...) function is defined in clojure.string, so I need to require that too:

(ns kiwiland.core
  (:require [kiwiland.graph :as graph])
  (:require [clojure.string :as str]))

So, I can now generate the graph:

(defn generate-answers
  "Generate the answers to the Kiwiland Rail test questions"
  [graph-string]
  (let [g (-> graph-string
              (str/split #", *")
              (graph/build-graph))]
    '(9 5 13 22 "NO SUCH ROUTE" 2 3 9 9 7)))

And I can also add in the first five function calls to the result:

(defn generate-answers
  "Generate the answers to the Kiwiland Rail test questions"
  [graph-string]
  (let [g (-> graph-string
              (str/split #", *")
              (graph/build-graph))]
    (list (graph/distance g [:A :B :C])
          (graph/distance g [:A :D])
          (graph/distance g [:A :D :C])
          (graph/distance g [:A :E :B :C :D])
          (graph/distance g [:A :E :D])
          2 3 9 9 7)))

I don’t need to change the tests, since the output should be exactly the same. However, when I run the tests, I get a couple of failures. This is because for the fifth answer, my function is returning nil, but the expected output is "NO SUCH ROUTE". I don’t want to change the function in the graph module, since that will make its use too specific. So, I’ll add a function to convert the value of nil within the core file itself. I’ll just use a standard pattern of leveraging the (or ...) function:

(or (graph/distance the-graph route)
    "NO SUCH ROUTE")))

Since I’m going to need a function for this, I’ll take the opportunity to write it as a closure, so that I can pass in the graph once, then partial application will give me a function I can call without needing to supply the graph again:

(defn distance
  "Get the distance for a journey, converting a nil response to an error string."
  [the-graph]
  (fn [route]
    (or (graph/distance the-graph route)
        "NO SUCH ROUTE")))

I can then modify (generate-answers ...) to use this function:

(defn generate-answers
  "Generate the answers to the Kiwiland Rail test questions"
  [graph-string]
  (let [g (-> graph-string
              (str/split #", *")
              (graph/build-graph))
        journey-distance (distance g)]
    (list (journey-distance [:A :B :C])
          (journey-distance [:A :D])
          (journey-distance [:A :D :C])
          (journey-distance [:A :E :B :C :D])
          (journey-distance [:A :E :D])
          2 3 9 9 7)))

This now allows all my tests to pass:

$ lein cloverage
Loading namespaces:  (kiwiland.core kiwiland.graph)
Test namespaces:  (kiwiland.core-test kiwiland.graph-test)
Loaded  kiwiland.graph  .
Loaded  kiwiland.core  .
Instrumented namespaces.

Testing kiwiland.core-test

Testing kiwiland.graph-test
Skipping ab11 since it has more than 3 characters
Skipping 1ab could not parse: java.lang.NumberFormatException

Ran 28 tests containing 28 assertions.
0 failures, 0 errors.
Ran tests.
Writing HTML report to: /usr/src/lein/target/coverage/index.html

|----------------+---------+---------|
|      Namespace | % Forms | % Lines |
|----------------+---------+---------|
|  kiwiland.core |  100.00 |  100.00 |
| kiwiland.graph |  100.00 |  100.00 |
|----------------+---------+---------|
|      ALL FILES |  100.00 |  100.00 |
|----------------+---------+---------|

Summary

This article has discussed how I built a simple graph module in Clojure, in the context of solving an exercise based on Kiwiland Trains. I’ve shown how I built a graph (adding nodes and edges), and how I calculated the distance for various journeys. In the next two articles, I will look at the remaining questions from the exercise — how to find different routes within the graph, and how to find the shortest routes.

The code I have discussed in this article is available at ianfinch/kiwiland-clojure (commit #e751 …) .