Effective communication and surprise

| categories: talks, career, work, thoughts

A ~15 minute talk intended for software engineers, delivered at an internal engineering summit at $current, September 2023.

A waving hand emoji

Hi folks, my name is Cian Synnott. I'm a senior staff engineer with Runtime Infrastructure - that's our production compute, networking, identity & access management, and deployment systems. These days I'm focusing mainly on issues around cloud resource management.

The title of this talk is "Effective communication". I'll make a few general points, and then focus on the idea of surprise in communication.

I'll talk about ways to reduce the frequency and impact of surprise, and how to react productively to it. Finally I'll talk a little about the practice of empathy for our colleagues.

We'll have time at the end for a discussion. I'm curious to hear whether the ideas I present resonate with you.

A speech bubble emoji

Communication at work is an enormous topic. I'll touch on a couple of general points and then I'm going to focus on one interesting facet: surprise.

First up, what's "effective" communication? Not just "impactful" - it's easy to have a big impact with communication, positive or negative. "Effective" implies a goal: communication that has the desired effect.

There's a goal behind most communication at work: convince someone of something; share context so they understand something that we know; learn ourselves, or teach; make sure we have the same understanding of a problem, or shared priorities.

If you're trying to communicate something, particularly to a larger audience, and you can't express what your goal is in doing so, maybe get back to the drawing board.

Second, the myth of the lone genius programmer is persistent in our industry, but the truth is that getting large things accomplished means working as part of a team - where you are communicating constantly.

As such, communication skills are first-order skills for software engineers. We tend to valorize pure technical capability, but communication skills are easily as important.

Third, everything we communicate is to some audience of fellow humans: another engineer, our manager, our team, partners, customers, our organization. Not one person in this world sees things exactly the way I do or you do. We each have our own background, context, worries, preoccupations.

So we're crossing a chasm when we communicate: trying to get our meaning, our understanding, our intention across to the people on the other side. To be effective - to achieve our goals - we have to try to understand where our audience is at and meet them there.

An exclamation mark emoji

Let's talk about surprise. Our team, our work group, our company is a community or a social system. And that combines with our technical systems to make our system of work - you hear people talk about "socio-technical" systems.

What do we call a big surprise in our technical systems? An incident! An incident requires response - often by many people - incident management, cleanup. It can be very chaotic. Usually an incident turns out to contain a series of surprises; some of those are in our social system. A good incident review will address these directly.

Our technical systems aside, communication is full of surprises. We can experience these surprises as positive or negative. Many are inconsequential. But big, negative surprises - just like a technical incident - can create a lot of chaos and upset.

As a few examples: unexpected negative feedback; realizing that you and a colleague have a totally different conception of a problem you've worked together on for weeks; finding out your company or your customers don't value something your team has sunk a lot of time into.

Surprise in communication is the enemy of effective communication. It gets in the way of our goals, creates churn that we have to manage, and potentially derails whole efforts. So broadly speaking "don't surprise people!"

Much easier said than done. Just like in our technical systems, surprise in our system of communication is inevitable. But what we can do is:

  • limit how often we surprise people;
  • limit the impact when we do; and
  • react productively to surprise - ours or others'.

A megaphone emoji

How do we surprise people less often and less extremely? There's a framing from Elizabeth Ayer that I love and use: "radiating intent". Her analogy is a turn signal on the road: if you indicate what you intend to do, you won't surprise the people around you when you proceed.

So say you want to make a technical change. You say "I intend to do X". Repeat that in a few contexts: for example 1:1s, team meetings, customer meetings, an RFC for something bigger. You can get a few different reactions.

  • Radio silence? Sounds like assent to me, go ahead and do it. You might still surprise someone, but you created a lot of opportunity for feedback.
  • More commonly, you'll get some feedback: maybe your team says "yeah that's great, please do it"; maybe your manager says "Y is equally important and more urgent, could you take a look at that instead?" Maybe a colleague tells you that X is a terrible idea, we tried it three years ago, it'll never work. That's a good start to a conversation; you'll learn something.

Note how different this is from "asking forgiveness rather than permission". We're still not asking permission: just stating our intent. But by doing that up front we create the opportunity for feedback. And we turn a potentially big surprise - "I went ahead and did this" - into a small one - "I intend to do this".

Similarly, we can limit surprise by sharing the context we have with others. A big source of surprise in communication is unspoken assumptions, background we don't share, missing information, differing incentives.

So we can "radiate context" as well as "radiating intent": actively and readily sharing what we know as well as making connections between efforts and people.

For example, a little FYI or heads-up can go a long way - "in case you missed it, X is working on something similar from another direction, did you see their doc?".

Or realizing in a meeting with a partner team that they're unaware of the broader reasons behind how you're prioritizing, and sharing those directly.

A great note from Denise Yu on what makes a strong senior engineer: "in every conversation you're part of, create clarity and reduce chaos."

A question mark emoji

Reducing frequency and impact aside, surprise is inevitable. So what about when you've surprised someone, or you're surprised? Emotions are immediately heightened. Depending on the situation, you may experience a physical reaction: an adrenaline spike, heart racing, face flushing. And I don't know about you, but these are things I do not enjoy at work.

But: just like in technical incidents, every surprise is an opportunity to learn. And we learn most when we don't jump straight to blame and judgement. I won't call those a natural reaction - who knows - but they're certainly a common default.

Instead, we can take a breath, feel what we feel, and then set that aside and get very curious. What happened here? What does this person know that I don't? Have I misunderstood something? What's important to them in this situation?

A friend of mine calls this "leading with curiosity", which I love because of the double meaning: both the sense of starting and of leadership. I often think of it as a means of "lowering the temperature", calling back to Denise Yu's note on clarity and chaos.

I find people react well to curiosity. I mean, genuine curiosity - not necessarily 20 questions. But "Huh, what happened here?" and a real attempt to understand their perspective.

A heart emoji

Each of these ideas - radiating your intent and context; leading with curiosity - is a practical exercise of empathy: we're trying to meet our colleagues where they're at when we communicate.

That is frequently not where we'd like them to be: every person brings their own stuff with them to every conversation. So communicating effectively - meeting our goals - is always going to require navigating and dealing with surprise.

The good news is that these are skills! We can practice and get better at them. Just like blameless incident retrospectives are a trained response in an organization, we can train our personal response to surprise.

Sometimes, maybe particularly in our industry, we speak in a very "fixed" way about so-called soft skills: Cian "is" an empathetic engineer, like I was born that way. I assure you I was not.

We can learn to share our perspectives more effectively, and to understand other people’s better; to turn our rapid judgements into hypotheses fuelling our curiosity; and ultimately to achieve our goals more often when we communicate.

To summarize:

  • Don't surprise people BUT
  • Surprise is inevitable SO
  • Radiate your intent and context AND
  • Lead with curiosity AND
  • Remember these are learnable skills.

A folded hands emoji

Thank you!

We have a few minutes left, so I'm interested to hear any questions you have, or any thoughts this raised. Does this resonate? Have I over-simplified?

A few references:


A safe pair of hands

| categories: career, work, thoughts

A lot of ink has been spilled about progressing from one "level" of engineering to the next: junior to intermediate; intermediate to senior; and recently we see more about senior to staff.

There's an important common factor in all of these career steps: being seen as a safe pair of hands. This becomes central as you become more senior.

Yonatan Zunger presented a great talk at LeadDev last year that I find myself referencing a lot: Role and Influence: the IC trajectory beyond Staff.

Zunger's framing made sense of my roles over the last decade in a way that staff archetypes didn't. Rather than wondering about what archetype I fit in any particular quarter, it's much easier to think about the mix of technical, people, product, and project disciplines I'm applying.

The hidden fifth discipline is "adult supervision", and I think that's really what I'm talking about here.

A thing I love to see - and experience! - in my colleagues is where they take something on and I know it'll be done right. Not exactly like I would do it; not some kind of ideal that stands independent of our working context; but right.

The problem is solved; the crisis is handled; the relevant people are informed and involved; risks are surfaced early and when shit goes wrong - as it will! - no-one is caught out.

This is level-independent! It's perfectly possible for someone to do this in a level-appropriate way. The problems and relationships you're handling may get a lot hairier as you step up in seniority, but the basic ideas don't change much.

How trustworthy are you with your work? Do you often surprise people? Can I expect you to be accountable [1] or do I need to rely on someone else for that?

If you can answer well - no matter where you are in your career - then you're building a solid foundation for your next step. You're a safe pair of hands.


1. I've heard the word "accountable" thrown around a lot in industry, often without definition. Here's mine:

Being accountable for an effort as an engineering leader has two components: ownership, and communication.

  1. Ownership: the effort is "yours", and you act that way. There may be sub-components spread across people and teams, but overall you're the one who's on the hook. Your performance is assessed against the results of the efforts you lead. Judiciously - not every effort will succeed, and that's OK.
  2. Communication: you can tell the detailed story of why we're doing it, how it relates to other efforts, how it is progressing. You actively raise blocking issues or risks and get the necessary people together to address them. Where you can't, you escalate effectively.

Writing your job description

| categories: career, work, thoughts

I joke sometimes that I rewrite my job description every 6 or 8 months. This is approximately true: it's roughly the cadence at which my role and focus change. Writing it down is about setting expectations and aligning on what my manager, peers, and other colleagues need from me.

The format is far from fixed, though. Here are a few examples from my current job in 2022.

Early on, I sketched an "engagement model" with three modes:

  • Consult:
    • Work alongside a team to help understand problems, direct energy, guide solutions.
    • Connect teams and individuals dealing with similar problems: enable a "system of theft" of good practice across teams.
  • Embed:
    • Given a specific problem, dig in with the team to help bootstrap initial work, or redirect/turn around struggling work.
    • Focus on disambiguating problems to the point that they are a 10-20% stretch for folks on the team.
    • Back off on details but continue to offer decision support.
  • Coach:
    • Work with individuals (for example, coming out of consult or embed modes), enabling them to effectively own specific problems and grow via them.
    • Focus mostly on senior engineers and team leads, with the goal of enabling them to do the same for less experienced engineers.

The framing of "engagement model" comes mainly from my work in Site Reliability Engineering, and borrows more recently from Team Topologies. I've found this resonates with other engineers too. A colleague was struggling with the transition to "more conversation and less code" in working more broadly across teams: thinking through their different scopes and priorities in terms of an engagement model proved useful.

A little later, my colleague Drew and I expanded on the above to share a longer "your staff engineers and you" doc explaining how we intended to support our group. An excerpt:

How you can use us

The staff engineer, team lead, and senior engineer roles are all "scaling" or "multiplicative": we help everyone around us to be more effective. Differences are mainly in scope, focus, and expected impact.

The amount and type of support a TL wants from a staff engineer depends on the TL's focus: some are more interested in the management path, others in the technical path. Similarly, senior engineers want different guidance depending on their experience and current projects.

When it comes to technical direction and decisions, each individual's appetite for accountability and responsibility is different. We want you to take on as much as you are able to, and support you in growing that capacity.

Things we can help with:

  • Batting ideas around;
  • Advice and direct support in navigating cross-team or cross-org issues;
  • Partnership and review on technical approaches, RFCs, roadmaps;
  • Advocacy and signal amplification for your ideas;
  • Coaching and mentoring.

Most recently, I proposed embedding with a specific team in my area. I wrote yet another "job description" for this, outlined as:

Why?

  • For me
  • For the team

How?

  • Timeline
  • Things I expect to do
  • Things I will not do

Other engagements

Success criteria

What you put into a "job description" like this depends a lot on the audience: the first was mainly for my manager and peers; the second for my whole group; the third for a specific team.

In all cases this is about transparency and alignment. "Very senior" engineering roles are frequently confusing, not just for us but for the people we work with. Articulating what we're trying to achieve and how is both personally and organizationally useful.

Note that Tanya Reilly covers this idea towards the end of chapter 1 of The Staff Engineer's Path, and offers a lot of useful guidance in figuring out what you do here.


Oncall compensation structures

| categories: operations, process, work

The subject of compensation for developers oncall comes up from time to time.

It can be difficult to find public examples of compensation structures to use.

These notes are from a quick survey of existing stuff I could find via discussions in opsy chats, the Internet, and direct questions to my network.

Asking questions

First, for those on the job hunt, a list of questions to ask about oncall, gathered from the Irish Tech Community:

  • Do you compensate being oncall (i.e. value the stress) or just when you get called (bullshit) or never (warning sign)?
  • What is the response time? Is it 5 mins (no life), 15-30 mins (some life, depending on if you have kids), or an hour (you can go to the cinema with your laptop)?
  • What percentage of your time is operations, when you’re oncall?
  • How many people are in the rotation? If < 6, is there a realistic plan in place to fix that?
    • You need at least 4 people for a reasonable shift pattern, plus one for maintenance (e.g. holidays) + one for emergency (e.g. attrition).
  • Is there one person oncall in a shift or is it a primary/secondary kind of thing?

Notes from the 'net

Second, some posts that cover oncall compensation in various detail:

Example structures

Finally, a set of example compensation structures from various companies.

A fintech company in South America:

  • If you are oncall but not working, +33% of equivalent hourly rate.
  • Paged and start working, +300% of your hourly for that period.
  • Some more extras for nights or weekends.
  • They just exported data from PagerDuty: time working was acknowledgement → resolution.
  • People would not resolve until they had finished any report or comms work that had to be done out-of-hours.
  • This apparently was just how labour laws in that country apply - works the same way for doctors.

A medium-sized SaaS company operating across US / EU:

  • Time off as standard if you actually get paged out of hours: ½ day per four hours or part thereof in responding.
  • Comp at 25% for oncall time regardless.
  • Comp → 100% for the time you’re responding.
  • Because of how their shift structure works, this all tends to amount to roughly a 10% lift in salary, plus time to recover.
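As a rough illustration of the arithmetic in this structure, here is a minimal sketch. It assumes the 25% and 100% figures are fractions of an equivalent hourly rate, and that the 100% rate replaces (rather than adds to) the 25% rate while responding; neither reading is confirmed by my source. The function names and the example figures are hypothetical.

```python
import math


def time_off_days(response_hours):
    """Half a day per four hours, or part thereof, spent responding."""
    if response_hours <= 0:
        return 0.0
    return 0.5 * math.ceil(response_hours / 4)


def oncall_pay(hourly_rate, oncall_hours, response_hours):
    """Assumed reading: 25% of the hourly rate for oncall hours,
    rising to 100% for the hours actually spent responding."""
    standby_hours = oncall_hours - response_hours
    return hourly_rate * (0.25 * standby_hours + 1.0 * response_hours)


# Hypothetical shift: 100h oncall outside working hours, 4.5h of it responding.
print(time_off_days(4.5))          # 4.5h is two "four hours or part thereof" blocks
print(oncall_pay(40.0, 100, 4.5))  # hourly rate of 40 is made up
```

The "or part thereof" wording is what the `math.ceil` captures: a five-minute page and a four-hour incident both earn the same half day.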

A large multinational:

  • Some teams have business-hours only shifts for internal infra APIs.
  • Other teams have customer-facing services and much stricter on-call.
  • The latter get paid per shift, get a MiFi, and get time off.
  • I didn’t get the exact comp structure here.

Another large multinational:

  • Three tiers of oncall, depending on pager SLA.
  • Tier 1: >= 99.9% availability SLA, 5min pager response SLA.
    • Comp paid at ⅔ for outside hours.
    • That is, oncall time outside business hours accrues at 2h for every 3h oncall.
  • Tier 2: >= 99.9% availability SLA, > 5min but <= 15min pager response SLA.
    • Comp paid at ⅓ for outside hours.
    • That is, oncall time outside business hours accrues at 1h for every 3h oncall.
  • Tier 3: everything else, not comped.
  • Mon-Fri comp paid outside 9-6 core hours. Sat & Sun all comped.
  • So if you were oncall 6am-6pm Mon-Sun, that’d be:
    • 5 x 3h for Mon-Fri (6am-9am each day, outside core hours)
    • 2 x 12h for Sat-Sun
    • So 39h compensable, converting into pay as 13h at tier 2 or 26h at tier 1.
  • You could take this as either time in lieu (at 8h/day) or cash (pro-rated to salary).
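The worked example above can be checked with a short sketch. It assumes a shift is the same daily window repeated across the week, with 9-6 core hours on weekdays and every weekend hour counted; the function and constant names are my own, not the company's.

```python
from fractions import Fraction

CORE_START, CORE_END = 9, 18  # 9-6 core hours, Mon-Fri only

# Accrual rate per tier: tier 1 at 2/3, tier 2 at 1/3, tier 3 not comped.
TIER_RATE = {1: Fraction(2, 3), 2: Fraction(1, 3), 3: Fraction(0)}


def compensable_hours(shift_start, shift_end, weekdays=5, weekend_days=2):
    """Oncall hours outside core hours for a daily shift window
    [shift_start, shift_end), given in whole hours on a 24h clock."""
    shift_len = shift_end - shift_start
    # Portion of the weekday shift that overlaps core hours is not comped.
    core_overlap = max(0, min(shift_end, CORE_END) - max(shift_start, CORE_START))
    return weekdays * (shift_len - core_overlap) + weekend_days * shift_len


def accrued_hours(hours, tier):
    """Convert compensable oncall hours into accrued hours for a tier."""
    return hours * TIER_RATE[tier]


hours = compensable_hours(6, 18)  # oncall 6am-6pm, Mon-Sun
print(hours)                      # 39
print(accrued_hours(hours, 2))    # 13
print(accrued_hours(hours, 1))    # 26
```

Using `Fraction` rather than floats keeps the thirds exact, so 39h converts cleanly to 13h or 26h.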

A medium-sized SaaS multinational:

  • Shifts are either weekday or weekend.
  • Pay according to 60h week (hourly equiv. from salary) if weekday shift.
  • According to 40h week + 24h if weekend shift.
  • Payout doubles if schedule includes public/bank holidays.
  • Contact there mentioned this was very similar to structure in last job, another similar-sized SaaS.

Intercom's oncall implementation:

  • Former Ruby monolith sharded out over the last few years into services. Heavy on AWS and running less software.
  • An unusual structure, but interesting: specifically because they have modified their approach to avoid having “too many people/teams oncall”.
  • Virtual team, volunteers from any team in the org.
  • 6-month rotations in that virtual team, having taken a handful of shifts.
  • Oncall went from being spread across more than 30 engineers to just 6 or 7.
  • “We put in place a level of compensation that we were happy with for taking a week’s worth of on call shifts.”
    • Not sure of precise structure, presumably a bonus per week oncall.

Criteo, a medium-sized adtech company HQ’d in France. This is from a 3-year-old Reddit thread:

  • SREs are oncall. Pager response time is 30 minutes. (!)
  • They are paid for oncall for nights/weekends etc. Exact comp unspecified.
  • If you are paged, you get comped time as well in exchange (½ day at least).
  • Internet & phone bill reimbursed for oncall engineers.
  • If you work during the night, you have to stay home until you get 11h consecutive rest (French law).

A scientific approach to debugging

| categories: debugging, work

Recently, a friend got in touch to ask for some help:

When it comes to debugging an issue, I'm able to set a breakpoint and debug into the test to see the error - like a stack trace or whatever - but when it comes to fashioning a fix, the actual issue is often something else. Like the stack trace is a symptom, not a cause. How do I train my brain to figure out where the actual root causes of things are?

I was curious what my team at $currentplace thought, so I forwarded the question to them over Slack. Colleagues from across our engineering teams dropped by to help, and we came up with some notes to pass back to my friend.

However, someone ratted me out to the corporate blog crew. :o) So this post was born:

A scientific approach to debugging

It includes my favourite debugging story, about Maurice Wilkes, which I first came across via Russ Cox.

