I’ve seen my share of terrible status updates. As Founder and CEO of a SaaS platform for database performance management, VividCortex, I know the power of a great status update to build or destroy trust. Unfortunately, I see more bad status updates than good ones.
It’s a shame, really, because a bad status update can be worse than doing nothing at all. Status updates – especially in moments of potential crisis – are a key piece of your relationship with your customers. Doing a better job with status updates is a high-value activity in my book.
I prefer to buy, rather than build, solutions when possible. I therefore have lots of SaaS vendors running key parts of our business at VividCortex. From my perspective as a customer, most of their status updates are garbage. Most vendors do an awful job telling me the truth about how their systems are running. I can’t figure out the real situation and what the vendor’s doing about it. The signal I get is avoidance, which erodes my trust in them.
I haven’t always done a great job with our own status updates at VividCortex, and that’s been a learning experience for me too. Unfortunately, although there’s a lot of guidance and examples on how to conduct and write good postmortems, there doesn’t seem to be much about service status update messages. I’d like to share what I’ve learned, as both a service provider and a customer, in the hopes that it helps you avoid some of the quicksand I’ve stepped into or the bad customer experiences I’ve had.
Why good status updates are hard
Good status updates during outages or incidents require a delicate balance of several factors. Among other things, a good update should be:
- Informative and clear, yet brief
- Frequent, yet not noisy
- Honest, without needlessly airing dirty laundry
Some of these things are tough to achieve. For example, status updates are best when someone with serious language skills writes them. Good writing is actually hard, and it matters a lot for status updates! Yet at the same time, publishing status updates can’t be the sole privilege of the micromanaging perfectionist CEO (just an example, you understand). If the CEO wants to review and approve all status updates and isn’t available, customers won’t be kept up-to-date.
The most common problems that lead to bad status updates, in my experience, are:
- Poor writing skills
- Fear of consequences from being too clear
- Cultural or leadership problems (perhaps no one is responsible for status updates)
- Actual, intentional avoidance
- Failure to realize how important concise status updates are
I’ve found that some points are more important than others to get right (or avoid getting badly wrong). Here are some of the ways I’ve seen status updates go extremely sideways.
The evergreen status page is a lie
Your status page needs to actually reflect when there’s an outage. I can’t remember the number of times I’ve seen people tweeting their disgust about a company that’s lying about ongoing outages.
Recently, for example, I saw someone tweet to say they were experiencing problems using a company’s platform, and asking whether the platform was known to be in degraded status. One of that company’s other customers responded, saying sarcastically that their favorite part of every outage the company has is the status page that says everything’s working. It’s a scathing indictment of the company’s lack of willingness to acknowledge and communicate their problems.
If you’re going to lie on your status page, it’s time to seriously think about why you have it in the first place. And remember, failure to tell the truth is a lie, from the customer’s point of view. If nobody actually intends to hide an outage, but nevertheless nobody updates the status page, it’s still just as much of a lie.
By using passive voice, truth is concealed
I’ve noticed that a lot of engineers don’t know what passive voice is or how to write actively. Unfortunately, passive voice is one of the best ways to look like you’re trying to hide something. Passive voice is, at root, a refusal or inability to say who’s doing what. That’s an instant signal of avoidance to customers, intentional or not.
Consider the following status update:
[status] Monitoring: a fix has been implemented and we are monitoring the results.
The phrase “has been” is a classic pattern of passive voice. It refuses to identify a person or party as taking the action. If you use the phrase “we have…” it automatically becomes more active:
[status] Monitoring: we have implemented a fix and we are monitoring the results.
If you’re not that great of a writer, a quick path to a more active voice is to use a free online editor like Hemingway to compose your status update. It will flag problems such as passive voice. Another way is to use the “by zombies” heuristic. Try that with the headline of this section.
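If you want an automated first pass before a human applies the “by zombies” test, a rough check is easy to script. The following is a minimal Python sketch, not a description of how Hemingway works. The function name find_passive and the short list of irregular participles are hypothetical, chosen only for illustration. The pattern looks for a form of “to be” followed by something that resembles a past participle, so it will miss some passives and flag some innocent phrases.

```python
import re

# Forms of "to be" that can introduce a passive construction.
BE_FORMS = r"(?:am|is|are|was|were|be|been|being)"
# Crude participle guess: words ending in -ed or -en, plus a few
# irregular participles (an illustrative, incomplete list).
PARTICIPLE = r"(?:\w+ed|\w+en|done|made|sent|built|found|lost)"
# Allow one optional adverb in between, as in "was quickly fixed".
PASSIVE = re.compile(
    rf"\b{BE_FORMS}\s+(?:\w+ly\s+)?{PARTICIPLE}\b", re.IGNORECASE
)


def find_passive(text):
    """Return the phrases in text that look like passive voice."""
    return [match.group(0) for match in PASSIVE.finditer(text)]


print(find_passive("a fix has been implemented and we are monitoring the results."))
print(find_passive("we have implemented a fix and we are monitoring the results."))
```

The first update is flagged because of “been implemented”; the rewritten one comes back clean. Treat any hit as a prompt to reread the sentence, not as a verdict.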
Obfuscated verbiage flummoxes cogitation
Passive voice isn’t the only way to make status updates unclear. Ten-dollar junk words are another common problem. Most words ending in “ization” are so-called “nominalizations” that hide the clarity and directness of the root word by making a noun out of a verb. The following words and phrases are always replaceable by better ones, or can just be deleted without replacement: implement, utilize, employ, assist, make use of, in order to, and facilitate.
Let’s edit the earlier status update one more time to make it better:
[status] Monitoring: we have fixed the issue and we are monitoring the results.
A common problem is words that can have several meanings, especially to non-native English speakers. Consider the word “over,” for example. In everyday usage, people will say things like “over an hour,” which is very vague. Does it mean “more than an hour ago,” or does it mean “during the past hour?” If you’re a native Spanish speaker, “over” is pretty much synonymous with “on the subject of,” too.
There are many such examples. “While” can mean either “at the same time as” or “although.” Another troublesome word is “since,” which can mean “because” or “in the time period after X until now.”
I prefer to fix these issues by using the unambiguous word or phrase (because, more than, although) and, where I can’t, by providing enough context to make the meaning clear.
This might sound like nothing more than grammar nitpicking, but again, customers will read between the lines of a wordy or vague status update message. What they hear is more important than what you mean to say.
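A word list like the ones in this section is simple to check mechanically before you hit publish. Here is a minimal Python sketch under stated assumptions: the function name lint_update, the regular expressions used to catch inflected forms, and the wording of the hints are all hypothetical. Only the words themselves come from the lists above. The patterns are deliberately crude, so expect the occasional false alarm.

```python
import re

# Junk words and phrases from the list above, written as rough patterns
# that also catch common inflections (the patterns are assumptions).
JUNK_PATTERNS = {
    "implement": r"implement\w*",
    "utilize": r"utiliz\w*",
    "employ": r"employ(?:s|ed|ing)?",
    "assist": r"assist(?:s|ed|ing)?",
    "make use of": r"mak(?:e|es|ing) use of|made use of",
    "in order to": r"in order to",
    "facilitate": r"facilitat\w*",
}

# Ambiguous words from this section, with clearer alternatives to consider.
AMBIGUOUS = {
    "over": "more than / during",
    "while": "although / at the same time as",
    "since": "because / after",
}


def lint_update(text):
    """Return (word, hint) pairs for each junk or ambiguous word found."""
    findings = []
    for word, pattern in JUNK_PATTERNS.items():
        if re.search(rf"\b(?:{pattern})\b", text, re.IGNORECASE):
            findings.append((word, "replace with a plainer word or delete"))
    for word, hint in AMBIGUOUS.items():
        if re.search(rf"\b{word}\b", text, re.IGNORECASE):
            findings.append((word, f"ambiguous; consider {hint}"))
    return findings


for finding in lint_update("We have implemented a fix. It has been stable for over an hour."):
    print(finding)
```

A check like this only points at words worth a second look. Deciding what the customer will actually hear is still a job for a person.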
Infrequent updates create worried customers
Mark Imbriaco recommends status updates every 20 minutes if possible, depending on the nature and duration of the incident, and I agree. If I don’t hear anything from the vendor for a while, I start to wonder what’s going on. Surely someone knows that there are hundreds or thousands of customers hanging on every word? Surely they can take 30 seconds to just say they’re still alive and working on it?
If nothing’s changing and it’s going to be a long outage, you can signal that with a status update conveying that it’s going to take a while to know more. Then you can just follow up every couple of hours.
For example, during a scheduled maintenance period, HipChat posted updates such as “Our scheduled maintenance is ongoing. We are currently in the process of enabling connections to our newer database, and expect to re-enable logins shortly.” Customers reading this status update know exactly what to expect from the service. Another good example is the Etsy status updates, which show ongoing, reassuring communications during outages.
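The cadence described in this section can be turned into a simple reminder so that nobody has to watch the clock during an incident. This is a minimal Python sketch, assuming “every couple of hours” means two hours. The names next_update_due and is_overdue are hypothetical, and how you deliver the reminder (chat bot, pager, sticky note) is up to you.

```python
from datetime import datetime, timedelta

# Cadence from the text: about every 20 minutes during an active incident.
ACTIVE_INTERVAL = timedelta(minutes=20)
# "Every couple of hours" for a long outage; 2 hours is an assumption.
LONG_OUTAGE_INTERVAL = timedelta(hours=2)


def next_update_due(last_update, long_outage=False):
    """Return the time by which the next status update should go out."""
    interval = LONG_OUTAGE_INTERVAL if long_outage else ACTIVE_INTERVAL
    return last_update + interval


def is_overdue(last_update, now, long_outage=False):
    """Return True if customers have waited longer than the cadence allows."""
    return now > next_update_due(last_update, long_outage)


last = datetime(2016, 1, 1, 12, 0)
print(next_update_due(last))
print(is_overdue(last, now=datetime(2016, 1, 1, 12, 25)))
```

Switch to the longer interval only after you have told customers that the outage will take a while. Otherwise the silence reads as avoidance.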
Vague updates leave lingering questions
Specific updates are always better than vague ones. “We have applied a fix.” What fix, exactly? Did you redeploy, patch, fail over, what did you do? “We are monitoring the results.” Great, but what are you seeing?
I’d prefer “we are monitoring the results as replication catches up,” or “we are monitoring the results and it looks good so far.” Tell me more whenever you can.
You can improve it even further, by removing the vague pronoun “we” and instead stating specifically who took action. This helps customers make a mental connection that real people and real teams are at work. Try this:
[status] Monitoring: Our SF-based SysAdmin team has redeployed the service and we are monitoring the results as we send partial traffic to it. It looks good so far.
Non-apologies hurt credibility
You don’t have to apologize in status updates. Those kinds of things are often better to defer to a postmortem write-up. But if you’re going to say anything, say it directly and squarely. Your apology should never point fingers at another party, never say “apologize,” and never be followed by a “but” phrase. Above all, it should never make customer pain abstract or potential. Here are some really bad ways to do it:
- We apologize for… (apologize is a fancy word, and what’s this abstract “we” stuff?)
- I’m sorry that our provider is goofing up… (NO BUCK PASSING!)
- I’m sorry for any inconvenience… (oh, so the “inconvenience” is merely theoretical?)
- We know this is frustrating, but we… (but what? but nothing!)
I know sometimes it can be frustrating not to be able to fix a problem people are complaining about. I was once the target of a bunch of irate tweets because my marketing automation provider wasn’t sending out emails for ebook downloads. Everyone was feeling cheated and tricked. I felt there was little I could do because my provider was stonewalling me, refusing to acknowledge or respond. The temptation to blame them in public was strong, because I thought it might help motivate them. I gave in and pointed the finger at them, and sure enough, it immediately made the problem worse. The right thing to do was tweet back direct links to the downloads along with an apology, while applying pressure on the vendor in private.
For more on this topic, read the related post on the Basecamp blog.
Casual, overly familiar tone destroys sense of caring
A status update is the last place to crack a joke, be flippant, or dismiss the seriousness of the incident. Every incident, no matter how minor, is worth treating seriously.
One of the most cringe-inducing moments I’ve had was when I saw someone tweet the following from our Twitter account:
We’re having temporary issues with a server. No data will be lost, but customers may be affected. No worries, our brainiacs are on it.
This was the moment when I realized I’d really screwed up as CEO. I wanted to bury my face in my hands, because I knew the ugly truth:
- A subset of our customers were definitely affected. And definitely had worries.
- We knew exactly what the server issues were and we didn’t say.
- The whole tone, although meant to reassure customers that it wasn’t a major issue, instead conveyed a lack of urgency and no awareness of how much the problem mattered to them.
- Vague. Passive voice. Theoretical acknowledgement of non-problem. Flippant. Could we do worse if we tried?
I knew it was my fault that this happened. I had granted Twitter access only to one person, who didn’t have enough context to communicate what the engineers had identified as the problem. I hadn’t trained our staff. I hadn’t given people tools to communicate clearly (this was the day I resolved to finally sign up for a StatusPage account and decided that it was no longer a someday-priority). I hadn’t built a culture of transparency, honesty, and accountability to customers.
Avoiding the worst mistakes isn’t that hard
As I said in the beginning, most of the issues I’ve seen aren’t that hard to avoid, especially at a small company like VividCortex, where our headcount is currently about 25. Here’s my take on things at this point in time:
- A tool like StatusPage, wired up to a Twitter account and the company’s chat channel, is vital for both internal and external communication. We had some incidents that were much worse than they needed to be, both for us and for customers, precisely because we weren’t getting everyone on the same page (literally) internally and externally. It wasn’t hard to fix at all. We just had to do the basics.
- If you’re going to have a status page, make sure it’s updated honestly, immediately, and clearly. Put customers first: they need, above all, to know how serious things are, what’s happening NOW, that you’re working on it, and your best effort at a prognosis right now.
- Talk about service status updates with your team. Tell them your values. Tell them nobody’ll get fired for admitting what’s going on. Tell them customers come first. Tell them that their first action shouldn’t be to start investigating. It should be to say “There’s something to investigate” and then to investigate. (Call 911 before beginning to administer CPR!)
- Trust your team to do the right thing and make sure it’s done in public, in the main chat channel, so everyone learns by osmosis and understands the example they need to follow when it’s their turn. (We’re huge advocates of ChatOps at VividCortex.) Wiring up your chat bot to your StatusPage account means that everyone is empowered to send out an update at any time.
By the way, I wrote this blog post in part as an ongoing way to communicate to my own team how I want us to behave in the face of outages!
I hope my perspective has helped show the importance of avoiding some of the biggest mistakes, and grabbing some of the easiest wins, with service status update messages. Please leave your feedback in the comments or tweet to me at @xaprb!
Bonus: 50% off your StatusPage
We like to think StatusPage is an important tool for any business. You could sweat the details on hosting and providing the page yourself. Or let us do it so you can focus on what really matters to your team and customers.
That’s why we want to offer 50% off the first three months of any StatusPage plan. Simply shoot us a note at email@example.com when you’re ready to activate.
And, of course, our free trial is free forever.
This post was originally published on the StatusPage blog in 2016.