Personal retrospectives

| categories: organization, tldr, remote, culture

As an expansion of my snippets habit, I've written a "yearly snippets" doc for each of the last 7 years: a personal retrospective on the year.

I've used various formats - from "incident review" style through "4Ls" - but always with the intention of sharing more broadly than just to me and my manager. My colleagues have generally appreciated the perspective, and some have taken up the practice.

One of my favourite things about doing this is that it's for me. Yes, it's useful to my manager (and a useful way of "managing up"). I use it as input to the local performance and review system. But I'm the main audience and it's my practice. I hate the kinds of performance management systems we see in tech companies. Having something that "works around" them to my benefit is very freeing.

Anyhow, Stig and I wrote up some notes and guidance about the idea at $current:


Oncall compensation structures

| categories: operations, process, work

The subject of compensation for developers oncall comes up from time to time.

It can be difficult to find public examples of compensation structures to use.

These notes are from a quick survey of existing stuff I could find via discussions in opsy chats, the Internet, and direct questions to my network.

Asking questions

First, for those on the job hunt, a list of questions to ask about oncall, gathered from the Irish Tech Community:

  • Do you compensate being oncall (i.e. value the stress) or just when you get called (bullshit) or never (warning sign)?
  • What is the response time? Is it 5 mins (no life), 15-30 mins (some life, depending on if you have kids), or an hour (you can go to the cinema with your laptop)?
  • What percentage of your time is operations, when you’re oncall?
  • How many people are in the rotation? If < 6, is there a realistic plan in place to fix that?
    • You need at least 4 people for a reasonable shift pattern, plus one for maintenance (e.g. holidays) + one for emergency (e.g. attrition).
  • Is there one person oncall in a shift or is it a primary/secondary kind of thing?

Notes from the 'net

Second, some posts that cover oncall compensation in varying detail:

Example structures

Finally, a set of example compensation structures from various companies.

A fintech company in South America:

  • If you are oncall but not working, +33% of equivalent hourly rate.
  • If you are paged and start working, +300% of your hourly rate for that period.
  • Some more extras for nights or weekends.
  • They just exported data from PagerDuty: time working was acknowledgement → resolution.
  • People would not resolve until they had finished any report or comms work that had to be done out-of-hours.
  • This apparently was just how labour laws in that country apply - works the same way for doctors.
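
As a rough sketch of how that calculation might work, here is one reading of the rules above. It assumes both percentages are multipliers of the normal hourly rate, that incidents don't overlap, and it ignores the night and weekend extras. The names `shift_pay` and `hours_worked` are made up for illustration; this is not the company's actual tooling.

```python
from datetime import datetime, timedelta

# Assumption: both percentages are read as multipliers of the normal hourly rate.
STANDBY_RATE = 0.33   # oncall but not working
WORKING_RATE = 3.00   # paged and working


def hours_worked(incidents):
    """Sum acknowledgement -> resolution time, in hours.

    `incidents` is a list of (acknowledged_at, resolved_at) datetime pairs,
    assumed not to overlap.
    """
    total = sum((resolved - acked for acked, resolved in incidents), timedelta())
    return total.total_seconds() / 3600


def shift_pay(hourly_rate, shift_hours, incidents):
    """Oncall pay for one shift, before any night or weekend extras."""
    worked = hours_worked(incidents)
    standby = shift_hours - worked
    return hourly_rate * (STANDBY_RATE * standby + WORKING_RATE * worked)


# Hypothetical 12h overnight shift with two pages.
incidents = [
    (datetime(2024, 1, 1, 22, 0), datetime(2024, 1, 1, 23, 30)),
    (datetime(2024, 1, 2, 3, 0), datetime(2024, 1, 2, 3, 30)),
]
print(hours_worked(incidents))  # 2.0
print(round(shift_pay(100, 12, incidents), 2))
```

Because the working period runs from acknowledgement to resolution, holding an incident open while finishing comms work directly increases `hours_worked`.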

A medium-sized SaaS company operating across US / EU:

  • Time off as standard if you actually get paged out of hours: ½ day per four hours or part thereof in responding.
  • Comp at 25% for oncall time regardless.
  • Comp → 100% for the time you’re responding.
  • Because of how their shift structure works, this all tends to amount to roughly a 10% lift in salary, plus time to recover.
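
A minimal sketch of those rules, to make the "or part thereof" rounding concrete. It assumes the 100% rate replaces (rather than adds to) the 25% rate while responding; the source doesn't say which. The function names are hypothetical.

```python
import math


def time_off_days(hours_responding):
    """Half a day off per four hours, or part thereof, spent responding."""
    if hours_responding <= 0:
        return 0.0
    return 0.5 * math.ceil(hours_responding / 4)


def oncall_pay(hourly_rate, oncall_hours, responding_hours):
    """25% of hourly rate while on standby, 100% while responding.

    Assumption: the 100% rate replaces the 25% rate for responding hours.
    """
    standby_hours = oncall_hours - responding_hours
    return hourly_rate * (0.25 * standby_hours + 1.0 * responding_hours)


print(time_off_days(0.5))       # 0.5
print(time_off_days(4.5))       # 1.0
print(oncall_pay(40, 100, 4))   # 1120.0
```

Note that even a short page earns the full half day, which is where the "time to recover" comes from.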

A large multinational:

  • Some teams have business-hours only shifts for internal infra APIs.
  • Other teams have customer-facing services and much stricter on-call.
  • Those latter get paid per shift, get a mifi, and get time off etc.
  • I didn’t get the exact comp structure here.

Another large multinational:

  • Three tiers of oncall, depending on pager SLA.
  • Tier 1: >= 99.9% availability SLA, 5min pager response SLA.
    • Comp paid at ⅔ for outside hours.
    • That is, outside business hours accrue hours at 2h for every 3h oncall.
  • Tier 2: >= 99.9% availability SLA, > 5min but <= 15min pager response SLA.
    • Comp paid at ⅓ for outside hours.
    • That is, outside business hours accrue hours at 1h for every 3h oncall.
  • Tier 3: everything else, not comped.
  • Mon-Fri comp paid outside 9-6 core hours. Sat & Sun all comped.
  • So if you were oncall 6am-6pm Mon-Sun, that would be:
    • 5 x 3h for Mon-Fri (6am-9am each day, outside core hours);
    • 2 x 12h for Sat-Sun.
  • So 39h compensatable, converting into pay as 13h at tier 2 or 26h at tier 1.
  • You could take this as either time in lieu (at 8h/day) or cash (pro-rated to salary).
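
The worked example above can be checked with a short sketch. It assumes whole-hour granularity and the same shift every day of the week; the names `compensatable_hours` and `accrued_hours` are hypothetical, not that company's system.

```python
from fractions import Fraction

# Fraction of each out-of-hours oncall hour that accrues as comp, per tier.
TIER_RATES = {1: Fraction(2, 3), 2: Fraction(1, 3), 3: Fraction(0)}

CORE_START, CORE_END = 9, 18   # Mon-Fri core hours, 9am-6pm
HOURS_PER_LIEU_DAY = 8         # time in lieu is taken at 8h/day


def compensatable_hours(shift_start, shift_end):
    """Count oncall hours outside core hours for a Mon-Sun week.

    Assumes the same whole-hour shift each day; weekends are fully comped.
    """
    total = 0
    for day in range(7):  # 0 = Monday
        weekend = day >= 5
        for hour in range(shift_start, shift_end):
            in_core = CORE_START <= hour < CORE_END
            if weekend or not in_core:
                total += 1
    return total


def accrued_hours(comp_hours, tier):
    """Convert compensatable hours into accrued hours for a tier."""
    return comp_hours * TIER_RATES[tier]


comp = compensatable_hours(6, 18)
print(comp)                                                 # 39
print(accrued_hours(comp, 2))                               # 13
print(accrued_hours(comp, 1))                               # 26
print(float(accrued_hours(comp, 2) / HOURS_PER_LIEU_DAY))   # 1.625
```

So a tier 2 week on that shift pattern works out at a little over a day and a half of time in lieu.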

A medium-sized SaaS multinational:

  • Shifts are either weekday or weekend.
  • Pay according to 60h week (hourly equiv. from salary) if weekday shift.
  • Pay according to 40h week + 24h if weekend shift.
  • Payout doubles if schedule includes public/bank holidays.
  • Contact there mentioned this was very similar to structure in last job, another similar-sized SaaS.

Intercom's oncall implementation:

  • Former Ruby monolith sharded out over the last few years into services. Heavy on AWS and running less software.
  • An unusual structure, but interesting: specifically because they have modified their approach to avoid having “too many people/teams oncall”.
  • Virtual team, volunteers from any team in the org.
  • 6-month rotations in that virtual team, having taken a handful of shifts.
  • Oncall went from being spread across more than 30 engineers to just 6 or 7.
  • “We put in place a level of compensation that we were happy with for taking a week’s worth of on call shifts.”
    • Not sure of precise structure, presumably a bonus per week oncall.

Criteo, a medium-sized adtech company HQ’d in France. This is from a 3-year-old Reddit thread:

  • SREs are oncall. Pager response time is 30 minutes. (!)
  • They are paid for oncall for nights/weekends etc. Exact comp unspecified.
  • If you are paged, you get comped time as well in exchange (½ day at least).
  • Internet & phone bill reimbursed for oncall engineers.
  • If you work during the night, you have to stay home until you get 11h consecutive rest (French law).

How I think about career development

| categories: career, thoughts

Over the years I’ve come up with a basic approach to my career:

  • Figure out what I want from life, and how work can support that;
  • Use role models and writing to imagine possible futures;
  • From those, map out the skills and capabilities I want to develop;
  • Lean on my current work environment - or change it - to support my development.

From time to time I talk with colleagues and friends about this. These are some notes to make sharing easier.

Assumptions

I’m writing about software engineering, systems engineering, and adjacent careers. Mainly for individual contributors, because that’s what I know best.

We are on the hook for managing our own careers: a common mistake for less experienced engineers is believing that their manager will do this. Sometimes, managers will be a help; often they won’t.

Everything is learnable. We’re not fundamentally limited by our current roles, skills, whatever.

Starting out

I think about what I want my work to enable in my life.

For some people, that’s wanting to travel, or to learn specific things, or to settle down in a beautiful place, or to found a company, or things related to family, or all of the above.

My personal list is rather general at this point, but well tested. I want to:

  • Support my family in a good life;
  • Help other human beings do things that are meaningful to them;
  • Work with teams I trust;
  • Be trusted in turn with a high degree of autonomy;
  • Learn a great deal;
  • Solve interesting problems;
  • Leave things better than I found them.

This is in rough priority order. Had I thought about this more clearly, earlier in my career, I think the order might be different. At times, particularly when changing jobs, I might have had more specific ideas. Many companies and types of work can support everything above. Some simply can't.

I like having this list because it helps orient me around the things that I need or am looking for. If I can check off everything above, or I am making positive progress, then I feel like my career is in a good place. If there are things lacking, maybe I need to take action to fix that.

It also makes the classic “where do you see yourself in five years?” feel somewhat tractable. :o)

Drawing a map

We have a starting point, but assuming that we want to meet some specific need, how do we figure out what moves to make, where to go?

I like to have a map, a way of identifying the gaps between where I am and my possible futures.

I often use role models for this - colleagues current and former, industry people I respect. What do I value about what they do and how they behave? What do I want to be when I grow up? :o)

Another useful resource is published "job ladders". These are a good way to look at the things various organizations value and how they see careers progressing. Examples:

It’s worth having a think about how much you value the skills and capabilities listed, and how different companies’ ladders differ, particularly at senior levels.

Finally, there’s an expanding literature around how senior engineers work:

There’s a lot to like, to learn, and to model in all of these.

Filling in the gaps

I tend to think in terms of skills and capabilities.

Given all of the above, which skills do I have but could develop further? Which do I lack entirely? What capabilities do I think my role models have that I don’t? Which do I value the most? Which would have helped me in my recent work? Which will enable me the most in future?

Now I have a list of specific things I want to be better at: the beginnings of a plan! I look for ways I can manoeuvre myself into work that will stretch me in those skills, build those capabilities.

Where I think it will be useful, I advertise! Let my manager, my mentors, my team know that I am trying to develop these skills. Can they help me find opportunities to improve? If I can find ways to work directly with some of the people I find inspiring, on those specific kinds of work, even better.

Note that this can form a good basis for annual or quarterly goals, so that’s another pain out of the way. ;o)

Considering context

While you work on your plans and your goals, pay attention to what’s going on around you.

Keep notes on what’s working and what’s not: in your own work, your team, your org, your production systems. This can be a useful source of ideas for specific projects or development opportunities.

If you can, find ways to make your goals work with those of the people around you. Developing your career can and should be something where everyone benefits.

Wrapping up

I have a particular perspective, and I’m certain I am missing a lot: I generally think in terms of skill acquisition and deployment, and I’m not sure how well any of this will apply to different kinds of work or different kinds of people. Take all of my advice with a grain of salt. We’re all just working this out.

To recap:

  • Think about what you want out of life, and try to make your career serve that.
  • Imagine possible futures by using role models, formal career “level” guides, and the best writing you can find on being an engineer.
  • Move towards those possibilities by mapping out a path in terms of skills and capabilities.
  • Ask for help getting what you want, and try to make it work well with what everyone around you wants.

With thanks to Tanya Reilly and Niall Richard Murphy.


Driving a "long incident" as an engineer

| categories: operations, process

Archiving a Twitter thread:

Wrote a few notes for a colleague about driving a "long incident" as an engineer.

That is, one of those "important AND urgent" things that's going to take weeks and multiple cooperating teams to get done.

Focus here is on senior IC behaviour, but cf. Managing Incidents from the SRE book.

Overcommunicate.

If you're running a meeting, have a clear agenda, a plan for what you want out of it. Take notes. If you need to switch audience, write something separately.

Maintain notes in Slack of what's going on in Zoom.

Summarize and update status frequently.

If you feel like you're communicating too much, it might be enough. :o)

Communicate clearly to your different audiences.

What do your team, or a sibling engineering team, or your support colleagues, or the execs need to know?

Think about their points of view, and frame things for them.

Engineer to engineer communications are vital. Keep managers and other parties informed, but focus on assembling and directing the engineering team who will solve the problem.

If necessary, fight for the people and resources you need to make progress.

Treat the problem with urgency, but push back on "we don't have enough time / we can't test X".

Take an engineering perspective: get as many facts on the table as you can.

Test assumptions about those facts.

Work constantly to reduce uncertainty.

Map out areas of risk: lean on your experts and help them identify questions we don't have answers for yet.

Push on getting answers to the tractable questions, given the time and resources available.

Keep an eye out for anyone spinning their wheels.

Stay calm and focused. Help everyone else to do the same. Always worth a re-read: Good Medics Don't Run.

If you've been somewhat insulated from ops, there's lots of literature out there to help reflect on incidents and long-running issues, and build up those production muscles.

I'd start probably with the Google SRE books, with the usual caveats about $megacorp vs. $tinycorp.

It comes down to people skills - hard skills, and the ones senior engineers most need to cultivate.

Since this is largely about communications, I must include my theme song: Write It Down.

I literally never tire of this video. Sorry / not sorry.


A little negativity about OKRs

| categories: process, thoughts

Archiving a Twitter thread:

Niall's thread is as thoughtful, judicious, and balanced as I'd expect. Allow me to be a bit more negative about Objectives and Key Results. :o)

First, the planning horizon. At $megacorp and anywhere that follows that playbook, OKRs are a quarterly process.

"Be careful what you measure": if your planning cadence is quarterly, there's a strong chance it'll become your delivery cadence too. That is too slow.

I love having a plan. Honestly, I enjoy the chaos as we watch it crash into reality, particularly in teams that are constrained by operational concerns (read: basically everyone with real customers).

Plans go out the window. That's fine, they're still useful.

But a quarter - for most teams and orgs I've worked with, including at $megacorp - is too big a chunk to let us respond well to inevitable change.

So OKRs become a big, expensive activity that swallows a bunch of team and management bandwidth without allowing room to react.

Second, "KRs considered harmful" (with caveats).

Making good KPIs is a known hard problem, right? As an engineering example, consider the difficulty of producing meaningful SLOs.

Coming up with a brand new set of good KPIs each quarter is high overhead and prone to failure.

"KR 1: produce the metrics we're going to use to track this objective." :o)

It can feel really hard to tie quantitative measurement of KRs - often proxy metrics - to the real outcomes we're trying to drive.

Finally though, it's not all gloom.

Objectives are a good tool for driving alignment. They can build a coherent narrative around strategy and tactics, from execs through to delivery teams.

I like to de-emphasize the measurement part though.

Where we do have KRs, ideally they're tied to longer-term KPIs that we can iterate on and come to understand better over time, on the order of months or years.

SLOs are a good example. Or from a team perspective, the DORA metrics. Or "north star" product / business metrics.

As for planning horizons: I am strongly attracted to ~6 week "bets" like Basecamp's or Intercom's.

Where I've worked like this, it's felt lighter-weight and more agile in the best senses.

All that said, whatever your particular team, org and/or company are doing to plan their time: get involved, and try to make it work as well as possible.

Channel your reservations into creativity. Use and improve the system. Try experiments. Share them.

Planning is good!

