Hotfix releases with Travis CI

If your engineering organization uses a staged release process, it might be implemented like this:

Development work happens on the master branch, and periodically, releases are created and tagged. In order to reach the production environment, a release must be promoted from Dev to QA to Production. This could mean that release 0.1 is running on production while release 0.2 is being QAed, and the developers just finished 0.3 and are starting to work on new features for the next release.

With this release model, when a bug is discovered in production, a hotfix might be necessary, because otherwise you'd be promoting the bugfix along with a bunch of unreleased changes straight to production. Instead, typically, a branch is created from the tag that is running on production and the bug is fixed there. This hotfix release can go through the same promotion process. The theory is that this will be smoother because there are fewer changes.

One challenge with this approach is getting your continuous integration system to build an artifact in order to make the hotfix deployable. By default on Travis CI, only master branch builds run the deploy stage. If you don't know the name of the branch to deploy ahead of time, you need to configure the deploy stage for all branches with a condition:

deploy:
  provider: script
  script: your_deploy_script.sh
  on:
    all_branches: true
    condition: $FOO = "awesome"

In order to get hotfix deployments working, I assumed I could use git rev-parse to check the name of the branch:

deploy:
  provider: script
  script: your_deploy_script.sh
  on:
    all_branches: true
    condition: $(git rev-parse --abbrev-ref HEAD) =~ ^(master|.*hotfix.*)$

However, this doesn't work because of how Travis CI checks out your project to build. Travis CI clones your repository and then checks out the specific commit to build, leaving HEAD detached. This makes sense -- it guarantees that Travis builds exactly the commit that triggered the build -- but it means git rev-parse --abbrev-ref HEAD just reports HEAD rather than a branch name.

Fortunately, you can check the $TRAVIS_BRANCH environment variable and use that instead. But be careful: for pull request builds (which build the merge of the branch into its target), $TRAVIS_BRANCH is set to the target branch of the pull request (typically master), not the branch being merged.

Here's a working solution for releasing master and branches containing hotfix:

deploy:
  provider: script
  script: your_deploy_script.sh
  on:
    all_branches: true
    # Deploy when this is a push build and the branch is either master or contains "hotfix".
    # NOTE: The check for TRAVIS_PULL_REQUEST is necessary because for a pull request build,
    # TRAVIS_BRANCH will be the target of the pull request (typically master).
    # See: https://docs.travis-ci.com/user/environment-variables/#Default-Environment-Variables
    #      https://docs.travis-ci.com/user/deployment/#Conditional-Releases-with-on%3A
    condition: $TRAVIS_PULL_REQUEST = "false" && $TRAVIS_BRANCH =~ ^(master|.*hotfix.*)$

If you use this process with Travis CI, hopefully this will save you some time. Time you can use to switch to continuous deployment instead of staged releases. Sorry, sore subject.

— August 8, 2017

Understanding orders of magnitude

Adrian Colyer has a cool way to help understand orders of magnitude: translate them into human terms. For example, to understand how long a second takes in computer time, upscale nanoseconds to seconds:

If we set up a ‘human time’ scale where one nanosecond is represented by one second, then we have:

1 nanosecond = 1 second
1 microsecond = 16.7 minutes
1 millisecond = 11.6 days
1 second = 31.7 years

That slow page render time starts looking different when you think about it taking 15 years...

He has similar analogies for latency and throughput.
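
To play with the scaling myself, here's a small Ruby sketch (mine, not Colyer's; the unit cutoffs are arbitrary) that converts real durations to the "human time" scale from the quote above:

def human_time(real_seconds)
  scaled = real_seconds * 10**9                    # 1 ns of machine time -> 1 s of human time
  return "#{scaled.round(1)} seconds"          if scaled < 60
  return "#{(scaled / 60).round(1)} minutes"   if scaled < 3_600
  return "#{(scaled / 86_400).round(1)} days"  if scaled < 31_557_600
  "#{(scaled / 31_557_600).round(1)} years"        # Julian year of 365.25 days
end

human_time(1e-9)  # => "1.0 seconds"
human_time(1e-6)  # => "16.7 minutes"
human_time(1e-3)  # => "11.6 days"
human_time(1.0)   # => "31.7 years"
human_time(0.5)   # a 500 ms page render => "15.8 years"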

— July 10, 2017

Build a query parser

I recently published a long tutorial about how and why to build your own query parser for your application's search feature. I wrote it because I've seen the pain of not doing it.

In the first application I worked on in my career, I added full-text search using Lucene. This was before Solr or Elasticsearch, so I learned a lot building my own indexing and search directly. When we started, we used the venerable Lucene QueryParser. However, we ran into problems -- mainly around user input. I looked at how QueryParser was implemented. It used JavaCC to parse user input and build the underlying Lucene query objects. Following the built-in parser as a model, I built a simpler one that we could safely expose to users. It was my first experience building a parser for real.

I always wanted to build a query parser at my last job, but there was never time. At my current job, we ran into similar issues, so I decided to write a tutorial.

First, I wanted to write the code. I started reading about parser generators in Ruby (my strongest language right now) and quickly settled on Parslet because of its great documentation and easy-to-use API. Writing the code was surprisingly easy. I ran into hurdles along the way, of course, but I was able to get something working really quickly. Writing tests and building a query generation harness helped me find problems I missed earlier.
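
To give a flavor of the API (a toy sketch for illustration, not the tutorial's actual grammar), a Parslet parser that splits a query into bare terms looks roughly like this:

require 'parslet'

class TermParser < Parslet::Parser
  rule(:space) { match('\s').repeat(1) }
  rule(:term)  { match('[^\s]').repeat(1).as(:term) }
  rule(:query) { (term >> space.maybe).repeat(1) }
  root(:query)
end

TermParser.new.parse('cat in the hat')
# => roughly [{:term=>"cat"@0}, {:term=>"in"@4}, {:term=>"the"@7}, {:term=>"hat"@11}]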

Writing the tutorial took longer. I had to do a lot of research into PEG parsers to avoid writing something incorrect. I sent drafts to former coworkers and got some great feedback that I incorporated into the tutorial.

Along with the code for the parser itself and the tutorial, I also wrote a mini publishing system. The tutorial is written in Markdown, with a preprocessor to include snippets of code and SVGs (for the diagrams); the result is then placed into an HTML layout. The Markdown is processed by CommonMarker (one of the nicest Markdown parsers) and the source code is colorized by Rouge. Writing this little publishing system reminded me how fun it can be to write software for yourself! It's janky, but it was fun to write and it gets the job done.
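
The pipeline is roughly this (a sketch with made-up file names and a made-up include directive, not the actual publishing code):

require 'commonmarker'
require 'rouge'

# Preprocess: splice referenced files (code snippets, SVG diagrams) into the Markdown.
markdown = File.read('tutorial.md')
               .gsub(/^!include (\S+)$/) { File.read(Regexp.last_match(1)) }

# Render the Markdown to HTML with CommonMarker, and colorize a snippet with Rouge.
body      = CommonMarker.render_html(markdown, :DEFAULT)
code_html = Rouge::Formatters::HTML.new.format(Rouge::Lexers::Ruby.new.lex(File.read('snippet.rb')))

# Drop everything into a simple HTML layout.
page = "<html><body>#{body}<pre class=\"highlight\">#{code_html}</pre></body></html>"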

— July 1, 2017

Purely algebraic abstractions

Haskell Prime is a collection of ideas for future versions of Haskell, including a proposal to generalize the numeric typeclasses by removing their semantic content, replacing Num and most of its subclasses with purely algebraic classes…

This proposal brings the straightforward clarity of Monad to arithmetic, an area where Haskell has long suffered from comprehensibility bordering on practicality.

— June 29, 2017

Exploring infinite sequences

I was recently watching John Hughes's talk Why Functional Programming Matters. One thing that jumped out at me was his definition of the Newton-Raphson square root approximation algorithm using a lazy sequence of estimates. This is an example from his influential paper of the same name.

The iterative algorithm looks like this:

def sqrt(n)
  # terminate when two successive guesses differ by less than epsilon
  # Initial guess can be anything as long as abs(x - y) > epsilon
  epsilon = 0.0001
  x = 1.0
  y = x + 2.0 * epsilon

  while (x - y).abs > epsilon
    y = x
    x = (x + n/x)/2.0
  end

  x
end

Hughes describes this program as "indivisible" -- the estimation is coupled with the iteration and the selection of the final estimate.

What impressed me about his example of doing this lazily in Haskell was how simple it was:

sqrt n = limit eps (iterate next 1.0)
  where next x = (x + n/x) / 2

I wanted to learn more, so I found his paper which has a more verbose version.

First, he defines a next function that produces an estimate:

next n x = (x + n/x)/2

Then, he defines an infinite sequence of better and better estimates. If the first estimate is 1.0 and the estimation function is called f, the sequence is:

[1.0, f 1.0, f (f 1.0), f (f (f 1.0)), ...]

This sequence can be generated using the Haskell function iterate applied to the curried function next n:

iterate (next n) 1.0

This is fun to play with. Here are the first five approximations of the square root of 1000, starting with an estimate of 1.0:

take 5 $ iterate (next 1000) 1.0
[1.0,500.5,251.249000999001,127.61455816345908,67.72532736082603]

Next, Hughes defines a limit function (called within in the original paper) that compares the difference of the first two estimates in the sequence to the tolerance and returns the second if they're close enough; otherwise, it calls itself recursively on the sequence minus its first element (which, again, is infinite). As long as the sequence converges, this will eventually return a result.

-- (x:y:ys) is a pattern match:
-- `x` is the first element, `y` is the second element, and `ys` is the rest
limit epsilon (x:y:ys) =
  if abs (x - y) <= epsilon then
    y
  else
    limit epsilon (y:ys)

Putting it together, square root can be defined as:

epsilon = 0.0001
sqrt n = limit epsilon (iterate (next n) 1.0)

The really cool thing about this is that it separates the generation of the sequence from the selection of the value. The limit function can be used for other estimation problems with the same algorithm, such as calculating the derivative:

deriv f x = 
  limit epsilon (map slope (iterate (/2) 1.0)) 
  where slope h = (f (x+h) - f x) / h

-- (^2) is an operator section: https://wiki.haskell.org/Section_of_an_infix_operator
-- So this says "compute the derivative of x^2 at 5"
deriv (^2) 5
10.0009765625

-- f(x) = x^2
-- f'(x) = 2x
-- f'(5) => 10
-- It works!

I have found lazy evaluation confusing, but this is a powerful example of how it can simplify code. I wanted to see how it could be applied in a language with eager evaluation, so I translated the code to Ruby.

# Lazily generate the infinite sequence a, f(a), f(f(a)), ...
def iterate(a, &f)
  Enumerator.new do |yielder|
    loop do
      yielder << a
      a = f.call(a)
    end
  end.lazy
end

def limit(epsilon, seq)
  # NOTE: To maintain the same behavior as the Haskell code,
  #       peek at the second estimate so it's not consumed.
  a = seq.next
  b = seq.peek
  if (a - b).abs <= epsilon
    b
  else
    limit(epsilon, seq)
  end
end

def sqrt(n)
  limit(0.0001, iterate(1.0) { |a| (a + n/a.to_f)/2.0 })
end
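
A quick sanity check (assuming the definitions above) that the translation converges to the same answer:

sqrt(1000)
# => 31.6227766... (Math.sqrt(1000) ≈ 31.6227766)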

This is not as simple as the Haskell code!

Most of the complexity comes from generating an infinite sequence. Ruby is eagerly evaluated, so the iterate method uses an Enumerator wrapping a loop. And as the limit method shows, consuming the Enumerator takes special care (peeking at the second estimate rather than consuming it) to reproduce the behavior of the Haskell code. A method to split an infinite sequence into first and rest would make things a lot easier.

This was an enlightening exercise. I learned more about laziness -- and how hard it can be to shoehorn it into an eagerly evaluated language. Ruby doesn't lend itself to lazy sequences, so its support for programming with them is comparatively weak.

Further reading

I posted the source code from this blog post if you'd like to play around with it.

I adapted the Haskell code for this post from John Hughes's slides, original paper, and Heinrich Apfelmus's post about lazy evaluation.

For the Ruby code, I learned a lot by reading Joël Quenneville's Functional Ciphers in Ruby and Ross Kaffenberger's Infinite Sequences in Ruby (he also has several more articles about Enumerator).

Thanks to Paul Cantrell and Charles Capps for reviewing drafts of this post. Any errors are my own.

— May 24, 2017

Goodnight Dune

Goodnight Dune

— May 18, 2017

You can't average percentiles. Period.

Gil Tene on the common practice of averaging percentiles for monitoring:

At first blush, you may think that the hourly average of the per-5-second-interval 99%ile value has some meaning. You know averages can be inaccurate and misrepresenting at times, but this value could still give you some "feel" for the 99%'lie system, right?

Wrong....

If you had the Maximum value recorded for each 5 second interval over a 5000 second run, and want to deduce the Maximum value for the whole run, would an average of the 1,000 max values produce anything useful? No. You'd need to use the Maximum of the Maximums. Which will actually work, but only for the unique edge case of 100%...

The only thing you can deduce about the 99%'lie of a 5000 second run for which you have 1,000 5 second summary 99%'lie values is that it falls somewhere between the Maximum and Minimum of those 1,000 99%'lie values...

Bottom line: You can't average percentiles. It doesn't work. Period.
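
For a concrete illustration (my contrived example, not Tene's), consider two 5-second intervals, one quiet and one with a latency spike. Averaging the per-interval 99th percentiles badly understates the true 99th percentile of the combined samples:

# Naive nearest-rank percentile, good enough for the illustration.
def percentile(values, p)
  sorted = values.sort
  sorted[(p / 100.0 * (sorted.length - 1)).ceil]
end

interval_a = [10] * 100               # a quiet interval: every request takes 10 ms
interval_b = [10] * 50 + [1000] * 50  # an interval where half the requests take 1000 ms

p99s = [percentile(interval_a, 99), percentile(interval_b, 99)]  # => [10, 1000]

p99s.sum / p99s.size.to_f                  # "average p99" => 505.0
percentile(interval_a + interval_b, 99)    # true p99 of all samples => 1000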

— May 15, 2017

Skipping Travis CI builds for snapshot release commits with sbt-release

If you use sbt-release with the MAJOR.MINOR.PATCH and MAJOR.MINOR.PATCH+1-SNAPSHOT style of release versioning, running sbt release creates two commits that differ only in the contents of the version.sbt file.

Unfortunately, if you're using a continuous integration system like Travis CI, those two commits both trigger full builds. There's really no point in that, since there's no difference in the code. Fortunately, Travis CI (and other systems) support marking a commit with [ci skip] to skip the build.

You can customize the commit message by changing the releaseCommitMessage setting. However, releaseCommitMessage is used for both the commitReleaseVersion and commitNextVersion actions, so some conditional logic is needed.

The default value looks like this:

releaseCommitMessage := s"Setting version to ${if (releaseUseGlobalVersion.value) (version in ThisBuild).value else version.value}"

Change releaseCommitMessage so that if the version ends with "-SNAPSHOT" then " [ci skip]" is appended to the message:

releaseCommitMessage := "Setting version to " +
  s"${if (releaseUseGlobalVersion.value) (version in ThisBuild).value else version.value}" +
  s"${if (if (releaseUseGlobalVersion.value) (version in ThisBuild) else version).value.endsWith("-SNAPSHOT")) " [ci skip]" else ""}"

Now you'll end up with release commits that look like "Setting version to 1.2.3" and snapshot commits that look like "Setting version to 1.2.4-SNAPSHOT [ci skip]".

— March 18, 2017

Acing the technical interview

No time for descriptive variables, examples, or docstrings here. In the whiteboard interview, time is of the essence. Pretend you are a Haskell programmer, as your grandmother was, before her continuation passed.

— March 17, 2017

My favorite books of 2016

Time for my annual round up of books read in the past year with a few favorites noted.

Cadillac Desert

Cadillac Desert: The American West and Its Disappearing Water, Marc Reisner

This is an eye-opening story of man's hubris in the American West. Attempting to make the desert bloom led to untold environmental destruction. As the climate and water patterns change, will we attempt to hold on to the "Cadillac desert" at huge expense, or let farmers and communities wither? The ongoing Oroville Dam crisis would fit right in as a chapter in this book.

Sapiens

Sapiens: A Brief History of Humankind, Yuval Noah Harari

Breezy and fun, but with a lot to think about. This book stuck with me.

My Ántonia

My Ántonia, Willa Cather

Based on reading this back in 2007, I consider it one of my favorite novels. So much so that we named our daughter after the title character. I was worried that I wouldn't like it on rereading, but, if anything, the nostalgia of the novel hit me harder now that I'm older.

From Bauhaus to Our House

From Bauhaus to Our House, Tom Wolfe

I am sure Tom Wolfe is an enormous dick in real life, but wow is his skewering of modernist architecture funny and satisfying.

San Francisco: La grille sur les collines

San Francisco: La grille sur les collines / San Francisco: The Grid meets the Hills, Florence Lipsky

This is a fascinating and unique book about how San Francisco's street grid is adapted to the landscape. Through a combination of greed, incompetence, hubris, and Yankee ingenuity, San Francisco's city planners overlaid the standard American gridiron street pattern onto the peninsula's topography. The result is San Francisco's wildly undulating streets, which sometimes dead end spectacularly. Despite their best efforts, sometimes the grid could not be made to fit the landscape. These deformations are a major focus of the book and it features photos and illustrations of the effects in many neighborhoods across the city:

Telegraph Hill

For more pages from the book, check out this post on Map Library.

Below is the complete list of books I read in 2016:

Cadillac Desert: The American West and Its Disappearing Water, Marc Reisner

Grounded: The Case for Abolishing the United States Air Force, Robert M. Farley

Sapiens: A Brief History of Humankind, Yuval Noah Harari

A Shortage of Engineers, Robert Grossbach

The Wise Man's Fear, Patrick Rothfuss

Snowpiercer: The Escape, Jacques Lob, Jean-Marc Rochette (illustrator), and Virginie Selavy (translator)

Hyperbole and a Half: Unfortunate Situations, Flawed Coping Mechanisms, Mayhem, and Other Things That Happened, Allie Brosh

My Ántonia, Willa Cather

The Alchemist, Paolo Bacigalupi

Straight Man, Richard Russo

Decisive: How to Make Better Choices in Life and Work, Chip Heath and Dan Heath

Alice's Adventures in Wonderland, Lewis Carroll

Through the Looking-Glass and What Alice Found There, Lewis Carroll

Liar's Poker: Rising Through the Wreckage on Wall Street, Michael Lewis

The Night Sessions, Ken MacLeod

How We Got to Now: Six Innovations That Made the Modern World, Steven Johnson

Terminal World, Alastair Reynolds

Success and Luck: Good Fortune and the Myth of Meritocracy, Robert H. Frank

Deadpool: The Complete Collection Volume 1, Daniel Way et al.

Minority Report, Philip K. Dick

From Bauhaus to Our House, Tom Wolfe

Wolf: The Lives of Jack London, James L. Haley

San Francisco: La grille sur les collines / San Francisco: The Grid Meets the Hills, Florence Lipsky

Pump Six and Other Stories, Paolo Bacigalupi

Bébé Day by Day: 100 Keys to French Parenting, Pamela Druckerman

Gulp: Adventures in the Alimentary Canal, Mary Roach

Wine Wars: The Curse of the Blue Nun, the Miracle of Two Buck Chuck, and the Revenge of the Terroirists, Mike Veseth

Disrupted: My Misadventure in the Start-up Bubble, Dan Lyons

The Swerve: How the World Became Modern, Stephen Greenblatt

You can also check out lists from previous years: 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, and 2015.

— March 9, 2017