| categories: organization, tldr, remote, culture

A good habit I picked up at the last place was keeping "snippets" - short notes about what I worked on in a particular week. Something like this:

Weekly snippets: $project cleanup; meetings & troubleshooting.

"Short" week, in that I carried 2 days' work into it. All caught up (down?) now.

  • Removed daemonization code from $project - we expect to run it under supervisor everywhere. $link

  • Updated $otherproject to depend on that, plus Puppet changes to run under supervisor, deployed to $cluster.

  • Ran and wrote up a $topic meeting. $link

  • A frustrating day investigating $protocol issues in $cluster, much excluded but nothing pinpointed. Also some troubleshooting support during an incident on Thursday night - opened $task and updated $doc for followup.

  • Wrote up a plan for $new_project.

  • Discussed $speculative_project ideas with Eman & Charlie.

  • Copious Free Time: started working on Simple ESP Security Association Manager.

I write about this here because snippets are brilliant. I love having them to look back over - to find PRs I worked on, or docs I wrote, or to remind myself when something happened and what I was thinking at the time. Without them, I'd be at a loss to explain what I did in any given month, or would have to reconstruct it from old notes and emails. Writing a retrospective of my first year at the new place - do I get to call it new after a year? - pulled this into focus.

Some of my colleagues keep daily snippets. I find that weekly is about the right granularity for me: the real nitty-gritty detail is in a notebook I maintain day to day. We've taken to just submitting them to a GitHub repo, and made a robot to notify us when something's committed so we can follow along. As a remote worker, I need to make it easy for the team to stay in touch with my work, and snippets certainly help.

lists:reverse/1 performance in Erlang

| categories: code, erlang, monitoring

A common pattern in Erlang is to do some work on a list and accumulate results in "reverse order":

do_work(List) ->
    do_work(List, []).

do_work([], Acc) ->
    Acc;
do_work([Head | Tail], Acc) ->
    Result = do_something(Head),
    do_work(Tail, [Result | Acc]).

If you want to preserve the ordering of the input list, you need to reverse the accumulator before returning it:

do_work([], Acc) ->
    lists:reverse(Acc);

A question that comes up (in this case on #erlang today) is how well lists:reverse/1 performs: when should we reach for another data structure?

Because this pattern is so common, lists:reverse/2, which lists:reverse/1 is written in terms of, is a BIF with its definition in C (note that a pile of other list functions are defined in C in the same file).
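
For intuition, the relationship between the two looks roughly like this - a sketch rather than the actual stdlib source, which also special-cases very short lists:

%% Sketch only: lists:reverse/1 expressed in terms of the lists:reverse/2
%% BIF, which returns its first argument reversed with its second appended.
reverse(List) ->
    lists:reverse(List, []).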

This means it should run pretty fast. The Erlang efficiency guide mentions it in the context of tail recursion, but urges us to measure.

I did some simple measurement on an old Xeon E5620 running Linux:


run() ->
    Steps = [10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000],
    Sequences = [{N, lists:seq(1, N)} || N <- Steps],
    Timings = [{N, avg_reverse_timing(Seq, 100)} || {N, Seq} <- Sequences],
    io:format("~p~n", [Timings]).

avg_reverse_timing(List, N) ->
    Timings = repeat_tc(lists, reverse, [List], N, []),
    lists:sum(Timings) / length(Timings).

repeat_tc(_, _, _, 0, Acc) ->
    Acc;
repeat_tc(M, F, A, N, Acc) ->
    {Timing, _} = timer:tc(M, F, A),
    repeat_tc(M, F, A, N - 1, [Timing | Acc]).

and graphed it with the Google Charts API:

[Graph: average microseconds taken over 100 lists:reverse/1 runs versus length of list.]

Some log scales here, but as you can see, our list needs to be 100,000 elements before we hit 10ms average on this system: assuming we're doing something interesting when we process each list element, lists:reverse/1 is unlikely to dominate our runtime. Besides, since we mostly work with much smaller lists in real programs, this is definitely in the realm of "don't worry".

It's not super-scientific, but the simple answer to when to reach for something other than a list appears to be "not quickly." :o) Good luck, and measure: even naive measurement beats guessing.

Getting basic Erlang VM metrics into Hosted Graphite.

| categories: code, erlang, monitoring

I've been working with Erlang for the last quarter, and since we very much eat our own monitoring dogfood at Hosted Graphite, I've been exporting application and VM metrics from my software.

Exometer is a nicely designed monitoring library for Erlang which includes easy export to Graphite, so it's what I've been using.

The question of getting started with it seems to come up a bit on #erlang, so I've packaged up a simple library application:


Rather than using exometer's static configuration options, we build everything dynamically in the module. It's a little verbose as a result, but clear and self-contained. The exports there should also provide examples for your own application metrics.
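
To give a flavour of what the dynamic setup looks like - a minimal sketch with illustrative metric names, not the library's actual code - you create metrics and subscribe a reporter to them at startup:

%% Minimal sketch (illustrative names only). Assumes the Graphite reporter
%% has already been added, e.g. via exometer_report:add_reporter/2 or the
%% application environment.
setup_metrics() ->
    %% Create a gauge and report its value to Graphite every 10 seconds.
    ok = exometer:new([myapp, queue_depth], gauge),
    ok = exometer_report:subscribe(exometer_report_graphite,
                                   [myapp, queue_depth], value, 10000).

Updating the gauge elsewhere is then just a call to exometer:update([myapp, queue_depth], Value).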

This library depends upon and uses Fred Hébert's recon for additional metrics useful in production. See also Erlang in Anger.

The exometer integration is based on Brian Troutwine's notes in Monitoring with Exometer: An Ongoing Love Story. For lots more detail on using exometer, see that article and the exometer documentation.

Brian also has a great talk online covering some of the same details: video, slides.

Reading rollup, 2014-12-26.

| categories: reading

  • Epidemic broadcast trees - Leitão et al.

    This came up in the context of Riak scalability; IIUC it's part of the 2.0 series. Basho have a report from RICON 2013 on this.

    Plumtree: "push-lazy-push multicast tree."

    Combines an eager gossip push strategy (along the spanning tree path) with a lazy gossip pull strategy (for recovery). You have a gossip overlay network (where each node has a partial view of the whole set), and a tree of nodes built on top of that. Cuts in the tree can be healed quickly by using the underlying (pull) network.

    "The links used for eager push are selected in such a way that their closure effectively builds a broadcast tree embedded in the random overlay network."

    Heh, GRAFT and PRUNE messages. :o)

    They include detailed descriptions of simulations backing up their claims.


    "There is an inherent trade-off between epidemic and deterministic tree-based broadcast primitives. Tree-based approaches have a small message complexity in steady-state but are very fragile in the presence of faults. Gossip, or epidemic, protocols have a higher message complexity but also offer much higher resilience. This paper proposes an integrated broadcast scheme that combines both approaches. We use a low cost scheme to build and maintain broadcast trees embedded on a gossip-based overlay. The protocol sends the message payload preferably via tree branches but uses the remaining links of the gossip overlay for fast recovery and expedite tree healing. Experimental evaluation presented in the paper shows that our new strategy has a low overhead and that is able to support large number of faults while maintaining a high reliability."

  • Software Testing with QuickCheck - John Hughes

    I'm working with Erlang these days, so I've been able to get an evaluation license for Erlang QuickCheck.

    This paper is a tutorial, working from simple property-based tests through to testing properties of a stateful system using abstract state machines to model its behaviour.

    It's lucidly written: in tandem with the QuickCheck library documentation, I was able to work fairly confidently through the examples and most of the exercises, despite the work requiring the occasional brain-flip.

    I'm really enjoying the property-based testing approach, and this implementation has clearly had a lot of work and thought put into it. It feels like cheating to be able to do testing that feels so (executably!) formal and complete in so little code.

  • AQD: Remembrancer

    "A Quite Disposition" is James Bridle's "weak AI for gathering information about drones". It's a fascinating project. The Remembrancer is a free newspaper built automatically from the database.

    "In this world, we must take responsibility not only for our own actions, but the actions of the vast non-human assemblages we have built around us—from corporations to complex software systems—and acknowledge the moral and physical limits of our technologies, and ourselves."

Generated from pinboard by pinroll.

Reading rollup, 2014-09-27.

| categories: reading

Some news: I will start writing code for Hosted Graphite next week.

It's a (mostly) remote position, which makes it possible for me to spend winter in France with the family. We found a lovely place to live in an old medieval town on the Vienne. Spinning up a little French fluency now; it's remarkable how fast it comes when you've got a definite goal. :o)

Hat-tip (as is often the case!) to for the first two links below.

  • Coreutils viewer


    A Linux tool to show progress for cp, rm, dd, ...

    It simply scans /proc for interesting commands, uses the fd/ and fdinfo/ directories to find open files and their seek positions, and reports status for the biggest file.

  • In Search of an Understandable Consensus Algorithm - Diego Ongaro and John Ousterhout, Stanford University

    A great read. Lucid description of the replicated state machine problem, issues with Paxos, and how Raft works. Interesting that the primary goal was understandability. Maybe that should be the case for all software and algorithms ...

  • There's Just No Getting around It: You're Building a Distributed System - Mark Cavage, Joyent

    Good worked example of basic requirements & back-of-the-envelope work for an image resizing service. Separation of services, API-focused design: familiar approach.

    "the goal is simply to highlight the realities and look at the basics required to build a distributed system... Too many applications are being built by stringing together a distributed database system with some form of application, and hoping for the best."

    "Drawing parallels to a previous generation, services are the new process; they are the correct abstraction level for any discrete functionality in a system."

Generated from pinboard by pinroll.
