Clojure for Newbs

Why Clojure?

Clojure is interesting for a couple of reasons.

Its approach to concurrency bears examination. Concurrency is hard, and most programmers (I include myself in their number) have difficulty reasoning about it. However, concurrency has become incredibly important because programs need to take advantage of multi-CPU machines.

Java's concurrency model is based on multiple threads and synchronizing access to shared resources. This is error-prone because if you ever forget to synchronize access anywhere in the code, you're doomed. It can also lead to cases where you synchronize when you don't need to.

How does Clojure help with this?

Clojure is a functional language, and the functional style minimizes the need for shared state. Clojure's data types are immutable. But Clojure recognizes the need for shared state, and supports Software Transactional Memory (STM). STM is like a little mini-database for your objects. It allows multiple threads to access and update the same state, and retries a transaction in case of conflicts.

STM in Clojure is awesome because if you attempt to change a Ref (the shared state managed by STM) without starting a transaction, it throws an exception!

In other words, once you've figured out which parts of your program actually need to be mutable, Clojure will ensure access to them is synchronized.
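
Here is a minimal sketch of what that looks like in practice. The names account and deposit! are made up for illustration; ref, dosync, alter and ref-set are the core functions involved.

```clojure
;; Hypothetical example: `account` and `deposit!` are illustrative names.
(def account (ref 100))

(defn deposit!
  "Adds amount to the balance inside a transaction."
  [amount]
  (dosync
    (alter account + amount)))

(deposit! 50)
@account ; => 150

;; Changing a Ref outside of dosync throws IllegalStateException.
(def outside-result
  (try
    (ref-set account 0)
    :no-exception
    (catch IllegalStateException e
      :threw)))

outside-result ; => :threw
```

The balance is untouched by the failed ref-set, so the mistake is caught right away instead of silently corrupting state.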

It has been suggested that STM is like garbage collection. I think in 10 years that analogy will seem prescient.

Why not Clojure?

Clojure is a Lisp. Many people will dismiss it out-of-hand because of this.

You may also prefer a multi-process, Actor-based concurrency model like Erlang's. (Clojure has Agents, which cover some of the same ground, but they are not Actors, and there is nothing like Erlang's multi-process model yet.)

Or if you're into type safety and want a "better Java", maybe Scala is a better fit.

Syntax (and syntax-like structures)

Lisp heads like to say their language(s) don't have syntax. It's all beautiful, pure s-expressions! Well, that may be true, but for the Newb, it sure looks like syntax, even when it's really a reader macro or a "special form".

Here's enough Clojure syntax to get you started.

Collections

Clojure has several built-in data structures.

Lists (the leading quote keeps the list from being evaluated as a function call):

'("a" "b" "c")

Vectors are like Arrays. They allow quick access to elements via numbered indexes:

["a" 22 "c"]

Maps:

{:a "123" :b "xyz"}

Commas are optional (Clojure treats them as whitespace), but they can increase readability:

{:a "123", :b "xyz"}

Sets:

#{:a :b :c}

These are familiar to Java programmers, but they are persistent -- they preserve previous versions of themselves when modified. This makes Clojure's immutable data structures efficient. For example, when you add a new item to a list, the old list is not copied to create the new one; the new list shares the old one's structure.
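
A small sketch of what "persistent" means at the REPL. The Var names here are made up for illustration; conj, assoc, rest and identical? are core functions.

```clojure
;; "Modifying" a collection returns a new one; the original is untouched.
(def original [1 2 3])
(def extended (conj original 4))

original ; => [1 2 3]
extended ; => [1 2 3 4]

(def m  {:a "123"})
(def m2 (assoc m :b "xyz"))

m  ; => {:a "123"}
m2 ; => {:a "123", :b "xyz"}

;; Lists add at the front, and the new list reuses the old one as its tail.
(def tail   '(2 3))
(def longer (conj tail 1))

longer                          ; => (1 2 3)
(identical? tail (rest longer)) ; => true
```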

Defining things

(def foo 1234)

Creates a Var foo with the value 1234.

(defn square [x] (* x x))

Creates a function square with one argument. Arguments are specified as a vector. defn is actually a macro, so the above could also be written as:

(def square (fn [x] (* x x)))

You will also see defn- which creates a private function, only visible within its namespace.

Weird things

Here are a few things that I saw in source code but couldn't figure out right away. Some of the "weird stuff" you'll see is actually a reader macro, so it's worth checking out the reader documentation to see what's available.

To literally make a list, use single quote: '("a" "b" "c"). This is a reader macro for (quote ("a" "b" "c")). It's necessary to distinguish between a list and a function call.

:keyword - kind of like a symbol in Ruby.

Optional arguments are indicated with &

(defn my-things [this that & others] ...)

Inside my-things the optional argument others will be a list of extra arguments passed in. For example:

(my-things "cat" "yarn")

this => "cat" that => "yarn" others => nil

(my-things "cat" "yarn" "ball")

others => ("ball")

(my-things "cat" "yarn" "ball" "tinsel")

others => ("ball" "tinsel")
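
To try this at the REPL, here is a runnable sketch. The body of my-things is an assumption: it just returns its arguments in a map so you can see what each one was bound to.

```clojure
;; Hypothetical body: return the bindings so they can be inspected.
(defn my-things
  [this that & others]
  {:this this :that that :others others})

(my-things "cat" "yarn")
;; => {:this "cat", :that "yarn", :others nil}

(my-things "cat" "yarn" "ball" "tinsel")
;; => {:this "cat", :that "yarn", :others ("ball" "tinsel")}
```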

You can also give a single function several arities, which is how you get default arguments:

(defn fx
  ([] (fx "default" 123))
  ([arg1 arg2] (... do stuff ...)))

fx can be called with no arguments:

(fx)

Or with two:

(fx "asdf" 8)
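
Here is a runnable version of the same idea. The two-argument body is an assumption standing in for "do stuff": it just joins the arguments into a string.

```clojure
;; Hypothetical body: join the two arguments so the result is visible.
(defn fx
  ([] (fx "default" 123))           ; zero-arity version supplies the defaults
  ([arg1 arg2] (str arg1 "-" arg2)))

(fx)          ; => "default-123"
(fx "asdf" 8) ; => "asdf-8"
```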

Dereferencing a Ref or Agent can be short-cut with @.

Given:

(def foo (ref {}))

(deref foo) and @foo do the same thing.

Testing

Testing with the test-is library will be built in to Clojure 1.1. If you install Clojure from source, you will get it in the clojure.test namespace. Otherwise, you will need to install the clojure-contrib libraries and use clojure.contrib.test-is.

I've found testing functional code to be very easy -- because there's almost no setup! Since most functions have no side-effects, it's easy to pass them whatever arguments they expect and compare with the expected results.

(deftest get-words-test
  (is (= '("foo" "bar" "baz") (get-words "foo BAR BaZ")))

  (testing "short words are omitted"
    (is (= '("the" "foo") (get-words "to the foo"))))

  (testing "long words are omitted"
    (is (= '("huh") (get-words "twentycharactersisreallylongforawordindeed huh")))))
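
To run a test like this yourself you need a get-words to test. The version below is a guess that satisfies the assertions above, not the real implementation: it assumes words are lower-cased, split on whitespace, and kept only if they are 3 to 20 characters long. It also assumes the built-in clojure.test namespace.

```clojure
(require '[clojure.string :as str]
         '[clojure.test :refer [deftest is testing run-tests]])

;; Hypothetical implementation; the 3 and 20 character limits are assumptions.
(defn get-words
  "Lower-cases text, splits it on whitespace, drops very short and very long words."
  [text]
  (->> (str/split (str/lower-case text) #"\s+")
       (filter #(<= 3 (count %) 20))))

(deftest get-words-test
  (is (= '("foo" "bar" "baz") (get-words "foo BAR BaZ")))
  (testing "short words are omitted"
    (is (= '("the" "foo") (get-words "to the foo")))))

;; Runs every test in the current namespace and prints a summary.
(run-tests)
```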

Gotchas

Functions must be declared before they are called, because symbols are resolved at compile time.
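
If you want to call a function that is defined further down the file, declare gives you a forward declaration. A minimal sketch, with made-up function names:

```clojure
;; Hypothetical names: `caller` and `helper` are illustrative only.
;; Without this declare, compiling `caller` fails because `helper`
;; cannot be resolved yet.
(declare helper)

(defn caller [x]
  (helper x))

(defn helper [x]
  (* 2 x))

(caller 21) ; => 42
```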

Watch out for lazy sequences. If you're working with a lazy sequence and need to get all its values, you can use dorun or doall.
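
Here is a sketch of how laziness can bite. The counter and function names are made up, and the counter uses an atom (a simple mutable reference not covered above) only to make the side effect observable.

```clojure
;; Hypothetical names; the atom just counts how often the function runs.
(def calls (atom 0))

(defn noisy-inc [x]
  (swap! calls inc) ; side effect: count the call
  (inc x))

;; map is lazy: nothing has run yet.
(def pending (map noisy-inc [1 2 3]))
(def calls-before @calls) ; => 0

;; doall forces the whole sequence and returns it.
(def realized (doall pending)) ; => (2 3 4)
(def calls-after @calls)       ; => 3

;; dorun forces the sequence for its side effects only and returns nil.
(def dorun-result (dorun (map noisy-inc [4 5]))) ; => nil
```

Use doall when you need the values and dorun when you only care about the side effects.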

Concurrency

If you're mapping a function across a list:

(map + '(1 2 3) '(1 2 3)) => (2 4 6)

You can parallelize it with pmap:

(pmap + '(1 2 3) '(1 2 3)) => (2 4 6)

Now this is multithreaded! Of course, it only makes sense when the function is computationally expensive; for something as cheap as +, the overhead of coordinating threads outweighs any gain.
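
To see the difference, you need a function that takes a while. In this sketch slow-square is a made-up stand-in that sleeps to simulate expensive work. The timings depend on your machine, so none are shown.

```clojure
;; Hypothetical function: the sleep stands in for real computation.
(defn slow-square [x]
  (Thread/sleep 100)
  (* x x))

;; Both return lazy sequences, so doall forces the work inside `time`.
(def sequential (time (doall (map slow-square [1 2 3 4]))))
(def parallel   (time (doall (pmap slow-square [1 2 3 4]))))

sequential ; => (1 4 9 16)
parallel   ; => (1 4 9 16)
```

The results are identical and in the same order; only the elapsed time differs. If you run this as a standalone script rather than at the REPL, calling (shutdown-agents) at the end lets the JVM exit promptly, since pmap uses background threads.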

Learning more

I'm keeping track of Clojure resources at Delicious: http://delicious.com/lukefrancl/clojure

Here are a few that I've found extremely helpful in getting started.

When trying to figure out what something means, check the Clojure API but don't forget the Special Forms, Reader Macros, and Java Interop pages.

Real world Clojure

If you're looking to do something real with Clojure, check out the following software: