Retroactive favorites: 2010 books

I've been keeping track of the books I read each year for a while now. In 2011, I started picking out a few favorites as well. I can't go back and add my favorites from previous years, but I can look back now and pick out the books that still stand out to me, years later. This is necessarily colored by the passage of time, but that's all part of the fun. I've decided to start with the most recent year and go backwards to the beginning. Here are my retroactive favorites from 2010.


The Happiness Hypothesis: Finding Modern Truth in Ancient Wisdom, Jonathan Haidt

For a while, I was on a positive psychology kick, and read probably half a dozen books about happiness research. This was probably my favorite. Haidt surveys ancient philosophies and religions and considers their wisdom in light of modern psychology research.


American Aurora: A Democratic-Republican Returns. The Suppressed History of our Nation's Beginnings and the Heroic Newspaper that Tried to Report It, Richard N. Rosenfeld

I have never read a history book like American Aurora before. It is composed almost entirely of period newspaper editorials, strung together with in-character, first-person observations by the editor of the newspaper. Does that make it historical fiction? It's strange to read, but it works. The book centers on the debate between the Democratic-Republicans (heirs to the Anti-Federalists) and the ruling Federalists as the Federalist government comes close to war with Revolutionary France. An important lesson I learned from this book is that American politics has been more partisan than it is now (and not just during the Civil War). Rival militias marched, newspaper offices were torched, people were assaulted.


Wolf Hall, Hilary Mantel

Mantel takes Thomas Cromwell, the villain of A Man For All Seasons, and makes him into a rich, three-dimensional character who will do anything to make sure his (rather feckless) master Henry VIII gets what he wants -- while, meanwhile, promoting his own Protestant agenda.


The Promise of Sleep: A Pioneer in Sleep Medicine Explores the Vital Connection Between Health, Happiness, and a Good Night's Sleep, William C. Dement

This is a strange book. Written by one of the scientists who founded the field of sleep medicine and did much of the pioneering research into sleep and sleep disorders, it's half fascinating insight into sleep research (for example, he tells the story of how he was gearing up for a big study of melatonin as a prescription sleep medicine...and then the FDA categorized it as a supplement, kicking off the largest uncontrolled sleep study in history) and half an old man's reminiscences. Nevertheless, I have recommended it to others several times over the years. It also contains what Dement says is a fool-proof cure for jet lag, which is hard to carry out because doctors are reluctant to prescribe sleeping pills.


No Simple Victory: World War II in Europe, 1939-1945, Norman Davies

Americans get the "good war" version of World War II, where heroic democracies fight back against evil fascists despite overwhelming odds. Davies, a historian of Poland, sets out to correct that interpretation. In his telling, the war in Europe was primarily a contest between two totalitarian states, with the Western allies relegated to ineffectual aerial bombardment. He presents a strong case -- in terms of men and materiel, the Eastern front was the axis of the conflict. This version of history is not one that we in the West get much of, and it significantly colors how the outcome of the war looks.

— October 9, 2017

Google Vizier: A service for black-box optimization

Google Vizier performs hyperparameter tuning. And makes cookies:

At the tail end of the paper we find something else very much of our times, and so very Google. If you ever find yourself in the Google cafeteria, try the cookies. The recipe for those cookies was optimised by Vizier, with trials sent to chefs to bake, and feedback on the trials obtained by Googler’s eating the results.

It has already proven to be a valuable platform for research and development, and we expect it will only grow more so as the area of black-box optimization grows in importance. Also, it designs excellent cookies, which is a very rare capability among computational systems.

— October 2, 2017

Hotfix releases with Travis CI

If your engineering organization uses a staged release process, it might be implemented like this:

Development work happens on the master branch, and periodically, releases are created and tagged. In order to reach the production environment, a release must be promoted from Dev to QA to Production. This could mean that release 0.1 is running on production while release 0.2 is being QAed, and the developers just finished 0.3 and are starting to work on new features for the next release.

With this release model, when a bug is discovered in production, a hotfix might be necessary, because otherwise you'd be promoting the bugfix along with a bunch of unreleased changes straight to production. Instead, typically, a branch is created from the tag that is running on production and the bug is fixed there. This hotfix release can go through the same promotion process. The theory is that this will be smoother because there are fewer changes.

One challenge with this approach is getting your continuous integration system to build an artifact in order to make the hotfix deployable. By default on Travis CI, only master branch builds run the deploy stage. If you don't know the name of the branch to deploy ahead of time, you need to configure the deploy stage for all branches with a condition:

deploy:
  provider: script
  script: your_deploy_script.sh
  on:
    all_branches: true
    condition: $FOO = "awesome"

In order to get hotfix deployments working, I assumed I could use git rev-parse to check the name of the branch:

deploy:
  provider: script
  script: your_deploy_script.sh
  on:
    all_branches: true
    condition: $(git rev-parse --abbrev-ref HEAD) =~ ^(master|.*hotfix.*)$

However, this doesn't work because of how Travis CI checks out your project to build. Travis CI clones your repository and then checks out the commit to build directly, leaving a detached HEAD. This makes sense -- it guarantees that Travis builds exactly what triggered it -- but it means git rev-parse --abbrev-ref HEAD returns HEAD rather than the branch name.

Fortunately, you can check the $TRAVIS_BRANCH environment variable and use that instead. But be careful, because pull request builds (which represent the merge of the branch with master) also set $TRAVIS_BRANCH -- to the target branch of the pull request (typically master), not the branch being merged.

Here's a working solution for releasing master and branches containing hotfix:

deploy:
  provider: script
  script: your_deploy_script.sh
  on:
    all_branches: true
    # Deploy when this is a push build and the branch is either master or contains "hotfix".
    # NOTE: The check for TRAVIS_PULL_REQUEST is necessary because for a pull request build
    # TRAVIS_BRANCH will be the target of the pull request (typically master).
    # See: https://docs.travis-ci.com/user/environment-variables/#Default-Environment-Variables
    #      https://docs.travis-ci.com/user/deployment/#Conditional-Releases-with-on%3A
    condition: $TRAVIS_PULL_REQUEST = "false" && $TRAVIS_BRANCH =~ ^(master|.*hotfix.*)$
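If you want to sanity-check the branch-matching logic of that condition outside of Travis, here's a quick sketch in plain Ruby (the regex is the same one as above; the `deploy?` helper is my own, not part of Travis):

```ruby
# Mirrors the deploy condition: push builds only, on master or hotfix branches.
DEPLOYABLE_BRANCH = /^(master|.*hotfix.*)$/

def deploy?(branch, pull_request)
  # Travis sets TRAVIS_PULL_REQUEST to "false" for push builds,
  # and to the PR number (as a string) for pull request builds.
  pull_request == 'false' && !!(branch =~ DEPLOYABLE_BRANCH)
end

deploy?('master', 'false')              # => true
deploy?('release-1.0-hotfix', 'false')  # => true
deploy?('feature-foo', 'false')         # => false
deploy?('master', '42')                 # PR build targeting master => false
```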

If you use this process with Travis CI, hopefully this will save you some time. Time you can use to switch to continuous deployment instead of rolling releases. Sorry, sore subject.

— August 8, 2017

Understanding orders of magnitude

Adrian Colyer has a cool way to help understand orders of magnitude: translate them into human terms. For example, to understand how long a second takes in computer time, upscale nanoseconds to seconds:

If we set up a ‘human time’ scale where one nanosecond is represented by one second, then we have:

1 nanosecond = 1 second
1 microsecond = 16.7 minutes
1 millisecond = 11.6 days
1 second = 31.7 years

That slow page render time starts looking different when you think about it taking 15 years...

He has similar analogies for latency and throughput.
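The table is easy to recompute. Here's a small Ruby sketch of the same upscaling (my own, with a hypothetical human_time helper; the thresholds for picking a unit are arbitrary):

```ruby
# Upscale computer-time durations to "human time", where 1 ns -> 1 s.
def human_time(seconds)
  scaled = seconds * 1e9  # one nanosecond becomes one second
  if scaled < 60
    format('%.1f seconds', scaled)
  elsif scaled < 3600
    format('%.1f minutes', scaled / 60)
  elsif scaled < 86_400
    format('%.1f hours', scaled / 3600)
  elsif scaled < 86_400 * 365
    format('%.1f days', scaled / 86_400)
  else
    format('%.1f years', scaled / (86_400 * 365.25))
  end
end

human_time(1e-9)  # 1 nanosecond  => "1.0 seconds"
human_time(1e-6)  # 1 microsecond => "16.7 minutes"
human_time(1e-3)  # 1 millisecond => "11.6 days"
human_time(1.0)   # 1 second     => "31.7 years"
```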

— July 10, 2017

Build a query parser

I recently published a long tutorial about how and why to build your own query parser for your application's search feature. I wrote it because I've seen the pain of not doing it.

In the first application I worked on in my career, I added full-text search using Lucene. This was before Solr or Elasticsearch, so I learned a lot building my own indexing and search directly. When we started, we used the venerable Lucene QueryParser. However, we ran into problems -- mainly around user input. I looked at how QueryParser was implemented. It used JavaCC to parse user input and build the underlying Lucene query objects. Following the built-in parser as a model, I built a simpler one that we could safely expose to users. It was my first experience building a parser for real.

I always wanted to build a query parser at my last job, but there was never time. At my current job, we also ran into similar issues. So I decided to write a tutorial.

First, I wanted to write the code. I started reading about parser generators in Ruby (my strongest language right now) and quickly settled on Parslet because of its great documentation and easy-to-use API. Writing the code was surprisingly easy. I ran into hurdles along the way, of course, but I was able to get something working really quickly. Writing tests and building a query generation harness helped me find problems I missed earlier.

Writing the tutorial took longer. I had to do a lot of research into PEG parsers to avoid writing something incorrect. I sent drafts to former coworkers and got some great feedback that I incorporated into the tutorial.

Along with the code for the parser itself and the tutorial, I also wrote a mini publishing system. The tutorial is written in Markdown; a preprocessor includes snippets of code and SVGs (for the diagrams), and the result is put into an HTML layout. The Markdown is processed by CommonMarker (one of the nicest Markdown parsers) and the source code is colorized by Rouge. Writing this little publishing system reminded me how fun it can be to write software for yourself! It's janky, but it was fun to write and it gets the job done.

— July 1, 2017

Purely algebraic abstractions

Haskell Prime is a collection of ideas for future versions of Haskell, including a proposal to generalize the numeric typeclasses by removing their semantic content, replacing Num and most of its subclasses with purely algebraic classes…

This proposal brings the straightforward clarity of Monad to arithmetic, an area where Haskell has long suffered from comprehensibility bordering on practicality.

— June 29, 2017

Exploring infinite sequences

I was recently watching John Hughes's talk Why Functional Programming Matters. One thing that jumped out at me was his definition of the Newton-Raphson square root approximation algorithm using a lazy sequence of estimates. This is an example from his influential paper of the same name.

The iterative algorithm looks like this:

def sqrt(n)
  # terminate when two successive guesses differ by less than epsilon
  # Initial guess can be anything as long as abs(x - y) > epsilon
  epsilon = 0.0001
  x = 1.0
  y = x + 2.0 * epsilon

  while (x - y).abs > epsilon
    y = x
    x = (x + n/x)/2.0
  end

  x
end

Hughes describes this program as "indivisible" -- the estimation is coupled with the iteration and the selection of the final estimate.

What impressed me about his example of doing this lazily in Haskell was how simple it was:

sqrt n = limit eps (iterate next 1.0)
  where next x = (x + n/x) / 2

I wanted to learn more, so I found his paper which has a more verbose version.

First, he defines a next function that produces an estimate:

next n x = (x + n/x)/2

Then, he defines an infinite sequence of better and better estimates. If the first estimate is 1.0 and the estimation function is called f, the sequence is:

[1.0, f 1.0, f (f 1.0), f (f (f 1.0)), ...]

This sequence can be generated using the Haskell function iterate applied to the curried function next n:

iterate (next n) 1.0

This is fun to play with. Here are the first five approximations of the square root of 1000, starting with an estimate of 1.0:

take 5 $ iterate (next 1000) 1.0
[1.0,500.5,251.249000999001,127.61455816345908,67.72532736082603]

Next, Hughes defines a limit function (called within in the original paper) that compares the difference of the first two estimates in the sequence to the tolerance, and returns the second if they're close enough; otherwise, it calls itself recursively with the rest of the sequence (which, again, is infinite). As long as the sequence converges, this will eventually return a result.

-- (x:y:ys) is destructuring.
-- `x` is the first element, `y` is the second element, and `ys` is the rest
limit epsilon (x:y:ys) =
  if abs (x - y) <= epsilon then
    y
  else
    limit epsilon (y:ys)

Putting it together, square root can be defined as:

epsilon = 0.0001
sqrt n = limit epsilon (iterate (next n) 1.0)

The really cool thing about this is that it separates the generation of the sequence from the selection of the value. The limit function can be used for other estimation problems with the same algorithm, such as calculating the derivative:

deriv f x = 
  limit epsilon (map slope (iterate (/2) 1.0)) 
  where slope h = (f (x+h) - f x) / h

-- (^2) is partial application: https://wiki.haskell.org/Section_of_an_infix_operator
-- So this says "compute the derivative of x^2 at 5"
deriv (^2) 5
10.0009765625

-- f(x) = x^2
-- f'(x) = 2x
-- f'(5) => 10
-- It works!

I have found lazy evaluation confusing, but this is a powerful example of how it can simplify code. I wanted to see how it could be applied in a language with eager evaluation, so I translated the code to Ruby.

def iterate(a, &f)
  Enumerator.new do |yielder|
    loop do
      yielder << a
      a = f.call(a)
    end
  end.lazy
end

def limit(epsilon, seq)
  # NOTE: To maintain the same behavior as the Haskell code,
  #       peek at the second estimate so it's not consumed.
  a = seq.next
  b = seq.peek
  if (a - b).abs <= epsilon
    b
  else
    limit(epsilon, seq)
  end
end

def sqrt(n)
  limit(0.0001, iterate(1.0) { |a| (a + n/a.to_f)/2.0 })
end

This is not as simple as the Haskell code!

Most of the complexity is due to the difficulty of generating an infinite sequence. Ruby is eagerly evaluated, so the iterate method uses Enumerator with a loop. But as seen in the limit method, special care is needed with the Enumerator to reproduce the behavior of the Haskell code. A method to split an infinite sequence into first and rest would make things a lot easier.
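For instance (my own sketch, not from the talk), a first/rest split can be built on top of Enumerator::Lazy's first and drop -- provided the generator keeps its state in a local variable, so that each re-enumeration restarts from the beginning. It's cleaner, but each drop re-runs the generator from the start, so the total work is quadratic:

```ruby
# iterate keeps its state in a local (x), so re-enumerating is safe.
def iterate(a, &f)
  Enumerator.new do |yielder|
    x = a
    loop do
      yielder << x
      x = f.call(x)
    end
  end.lazy
end

# A hypothetical first/rest helper (not in the stdlib).
def uncons(seq)
  [seq.first, seq.drop(1)]
end

def limit(epsilon, seq)
  a, rest = uncons(seq)
  b = rest.first
  (a - b).abs <= epsilon ? b : limit(epsilon, rest)
end

def sqrt(n)
  limit(0.0001, iterate(1.0) { |x| (x + n / x) / 2.0 })
end
```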

This was an enlightening exercise. I learned more about laziness -- and how hard it can be to shoehorn it into an eagerly evaluated language. Ruby wasn't designed around lazy sequences, so its support for programming with them is comparatively weak.

Further reading

I posted the source code from this blog post if you'd like to play around with it.

I adapted the Haskell code for this post from John Hughes's slides, original paper, and Heinrich Apfelmus's post about lazy evaluation.

For the Ruby code, I learned a lot by reading Joël Quenneville's Functional Ciphers in Ruby and Ross Kaffenberger's Infinite Sequences in Ruby (he also has several more articles about Enumerator).

Thanks to Paul Cantrell and Charles Capps for reviewing drafts of this post. Any errors are my own.

— May 24, 2017

Goodnight Dune

Goodnight Dune

— May 18, 2017

You can't average percentiles. Period.

Gil Tene on the common practice of averaging percentiles for monitoring:

At first blush, you may think that the hourly average of the per-5-second-interval 99%ile value has some meaning. you know averages can be inaccurate and misrepresenting at times, but this value could still give you some "feel" for the 99%'lie system, right?

Wrong....

If you had the Maximum value recorded for each 5 second interval over a 5000 second run, and want to deduce the Maximum value for the whole run, would an average of the 1,000 max values produce anything useful? No. You'd need to use the Maximum of the Maximums. Which will actually work, but only for the unique edge case of 100%...

The only thing you can deduce about the 99%'lie of a 5000 second run for which you have 1,000 5 second summary 99%'lie values is that it falls somewhere between the Maximum and Minimum of those 1,000 99%'lie values...

Bottom line: You can't average percentiles. It doesn't work. Period.
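A quick Ruby sketch (mine, not Tene's, using the nearest-rank percentile) shows how badly the average of per-interval p99s can miss the true p99:

```ruby
# Nearest-rank percentile over raw samples.
def percentile(values, p)
  sorted = values.sort
  sorted[(p * sorted.length).ceil - 1]
end

# Interval A has a burst of slow requests; interval B is all fast.
a = [10] * 90 + [1000] * 10   # p99 = 1000
b = [10] * 100                # p99 = 10

avg_of_p99s = (percentile(a, 0.99) + percentile(b, 0.99)) / 2.0  # => 505.0
true_p99    = percentile(a + b, 0.99)                            # => 1000
```

The average reports 505 while the true p99 over all the samples is 1000 -- consistent with Tene's point that the per-interval summaries only bound the answer.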

— May 15, 2017

Skipping Travis CI builds for snapshot release commits with sbt-release

If you use sbt-release with the MAJOR.MINOR.PATCH and MAJOR.MINOR.PATCH+1-SNAPSHOT style of release versioning, running sbt release creates two commits that differ only in the contents of the version.sbt file.

Unfortunately, if you're using a continuous integration system like Travis CI, those two commits both trigger full builds. There's really no point in that, since there's no difference in the code. Fortunately, Travis CI (and other systems) support marking a commit with [ci skip] to skip the build.

You can customize the commit message by changing the releaseCommitMessage setting. However, releaseCommitMessage is used for both the commitReleaseVersion and commitNextVersion actions, so some conditional logic is needed.

The default value looks like this:

releaseCommitMessage := s"Setting version to ${if (releaseUseGlobalVersion.value) (version in ThisBuild).value else version.value}"

Change releaseCommitMessage so that if the version ends with "-SNAPSHOT" then " [ci skip]" is appended to the message:

releaseCommitMessage := "Setting version to " +
  s"${if (releaseUseGlobalVersion.value) (version in ThisBuild).value else version.value}" +
  s"${if ((if (releaseUseGlobalVersion.value) (version in ThisBuild) else version).value.endsWith("-SNAPSHOT")) " [ci skip]" else ""}"

Now you'll end up with release commits that look like "Setting version to 1.2.3" and snapshot commits that look like "Setting version to 1.2.4-SNAPSHOT [ci skip]".

— March 18, 2017