Building an effective on-call schedule
How to build an on-call schedule that balances customer experience and employee needs
An on-call schedule (or on-call shift) is a schedule that ensures the right person is always available, day or night, to quickly respond to incidents and outages.
In the medical profession, on-call doctors are expected to swoop in to deal with medical emergencies anytime during their shift. In the tech world, IT and development pros use on-call schedules to make sure someone’s always there to respond to major bugs, capacity issues, or product downtime, or to escalate the issue if it’s something they can’t fix on their own.
Historically, on-call rotations have gotten a bad rap. The lack of flexibility can be a source of anxiety, or even panic, in case of emergencies. Employees worry about work-life balance when an alert could go off in the middle of a REM cycle or family dinner. And tensions between operations engineers and developers using traditional on-call schedules have caused problems.
The good news is that these downsides are solvable problems. Companies that get on-call shifts right enjoy increased uptime, more customer satisfaction, and employees with a good work-life balance.
Common mistakes of on-call scheduling
One reason on-call makes some people nervous is that many companies just get it wrong. They don’t leave room for emergencies. They don’t prioritize work-life balance. Or their schedules simply don’t work for their teams.
Four of the most common mistakes companies make are:
1. Forcing a one-size-fits-all approach
Each organization and team is different, and your on-call schedule should reflect that truth. Companies with locations around the world will operate differently than teams in a single location. Large teams will operate differently than small teams. For on-call rotations to be effective, they need to be tailored to your organization and team.
2. Relying solely on operations engineers
If you want a recipe for burnout, this is it. Relying on any one small group or person to handle your full infrastructure needs is asking a lot. Bringing developers into the rotation spreads the load, and knowing they’ll be on-call gives developers a built-in incentive to ship stable code.
3. Not allowing for flexibility in the schedule
There are times when small schedule changes might need to be made. People may need to swap shifts. Personal emergencies might mean an employee needs to pass an issue off to a backup person. And sometimes the schedule as a whole isn’t working for the team and needs to be revisited. Having the flexibility to roll with these kinds of punches will make for a happier team and a higher chance of getting issues resolved quickly when they do arise.
4. Ignoring work-life balance
A healthy work-life balance increases attachment, loyalty, and commitment to employers. An unhealthy work-life balance has the opposite effect. Keeping teams happy, committed, and productive means taking their work-life balance into account when creating schedules.
The importance of effective on-call schedules
Downtime costs businesses $700 billion per year—and that’s just in North America.
The less effective your on-call schedule is, the more revenue you risk, which is why getting on-call right is so vital.
Of course, revenue isn’t the only thing impacted by ineffective schedules. Employee engagement, retention, and focus are also at risk here. Studies show that on-call employees with irregular schedules are more than twice as likely to experience frequent work-family conflict as employees with typical schedules. For on-call workers in the medical field, there is a strong correlation between work satisfaction and protecting sleep schedules.
An effective on-call schedule that’s tailored to your business and teams ensures customers are confident that they’ll get quick, consistent support for any potential incidents. It minimizes the risk of scheduling errors and missed issues. And it keeps employees from being overtasked, missing sleep, and sacrificing productivity and job satisfaction.
Benefits of building a sustainable rotation
A sustainable on-call schedule is one that respects employee time while protecting system uptime and functionality.
Sustainability can mean a lot of things. It may mean creating a follow-the-sun schedule where teams across time zones are only tasked with being on-call during their waking hours. It may mean meeting with teams to identify which potential issues must generate an alert even in the middle of the night—and which are less urgent and could wait until morning. It may mean tracking your alerts and schedules to make sure one person isn’t getting disproportionately saddled with urgent tasks.
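As a rough illustration of the tracking idea above, the following Python sketch flags anyone who is absorbing far more than an even share of urgent alerts. The log format, the names, and the 1.5x tolerance are all assumptions made for this example, not part of any particular tool.

```python
from collections import Counter

def alert_share(alert_log):
    """Return each responder's fraction of all urgent alerts in the log."""
    counts = Counter(responder for responder, urgent in alert_log if urgent)
    total = sum(counts.values())
    return {name: n / total for name, n in counts.items()} if total else {}

def overloaded(alert_log, team, tolerance=1.5):
    """List responders whose urgent-alert share exceeds tolerance x an even split."""
    shares = alert_share(alert_log)
    fair_share = 1 / len(team)
    return sorted(name for name in team if shares.get(name, 0) > tolerance * fair_share)

# Hypothetical log of (responder, was_urgent) pairs.
log = [("ana", True), ("ana", True), ("ana", True),
       ("ben", True), ("cy", False), ("ana", True)]
print(overloaded(log, ["ana", "ben", "cy"]))  # ['ana']
```

Here "ana" handled four of the five urgent alerts, well above the even split of one third, so she is the one flagged for a lighter load.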
Whatever sustainability looks like within your organization and team, the benefits are clear:
- Happy, well-rested engineers and developers who perform better
- Higher employee retention (and satisfaction)
- More talented people who will want to work with your team when job openings arise
- Better customer service
- A boost in your business’ bottom line
- Better team culture and support for employees
- Less burnout
- Better work-life balance and flexibility
Factors to consider when designing on-call schedules
As we said before, there is no one-size-fits-all when it comes to on-call schedules. Team size, team locations, preferred working hours, company culture, and each team member’s ability to solve key issues are all factors to consider when building yours. Here are some ways to think about the different elements that will go into your schedule:
The best schedule for a two-person team trying to launch a startup is unlikely to be the best schedule for a 50-person team managing an established product.
If you are a solo developer responsible for all issues at all hours, it’s easy to burn out and it makes sense to bring in backup, if only for on-call emergencies.
For teams of two, alternating days are often the schedule of choice. One person can be on Monday, Wednesday, and Friday and the other on Tuesday, Thursday, and Saturday, with a Sunday shift every other week. Another simple split option is having person A on Monday and Wednesday and person B on Tuesday and Thursday, with each taking a Friday through Sunday shift every other week, leaving every other weekend free for personal time. A third common option is to alternate full weeks of on-call duty.
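The first two-person option (alternating weekdays with Sundays swapped every other week) can be sketched in Python as follows. The function name and the choice to alternate Sundays by counting whole weeks are assumptions for illustration only.

```python
from datetime import date, timedelta

def two_person_on_call(day, person_a, person_b):
    """Return who is on call on `day` under the alternating-day split."""
    weekday = day.weekday()  # Monday == 0 ... Sunday == 6
    if weekday in (0, 2, 4):   # Monday, Wednesday, Friday
        return person_a
    if weekday in (1, 3, 5):   # Tuesday, Thursday, Saturday
        return person_b
    # Sunday: alternate by a running week count so the pattern
    # stays consistent across year boundaries.
    week_index = (day.toordinal() - 1) // 7
    return person_a if week_index % 2 == 0 else person_b

# Print two weeks of assignments starting on a Monday.
start = date(2024, 1, 1)
for offset in range(14):
    day = start + timedelta(days=offset)
    print(day.strftime("%a %Y-%m-%d"), two_person_on_call(day, "A", "B"))
```

Counting weeks from a fixed origin, rather than using the calendar week number, avoids the same person getting two Sundays in a row when a year has 53 weeks.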
For teams of three or more, weekly rotations tend to work well.
With on-call shifts, there’s always the possibility that your primary on-call person will sleep through or miss a notification. To mitigate this risk, you’ll probably want to have at least one on-call backup.
In a one-person team, this means bringing in a backup just for on-call emergencies. In a two-person team, it means someone is always on-call and someone is always the backup. Once you hit three people, you can either add a third layer to your backup plans or have one person who’s off the hook at any given time.
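For teams of three or more, a weekly rotation with a backup layer might look like the Python sketch below. It assumes, purely for illustration, that the backup is simply the next person in line, so anyone else on the team is off the hook that week.

```python
from datetime import date

def weekly_rotation(team, rotation_start, day):
    """Return (primary, backup) for the week containing `day`.

    The primary advances one person per week from `rotation_start`;
    the backup is the person who will be primary the following week.
    """
    if len(team) < 2:
        raise ValueError("need at least two people for a primary and a backup")
    weeks_elapsed = (day - rotation_start).days // 7
    primary = team[weeks_elapsed % len(team)]
    backup = team[(weeks_elapsed + 1) % len(team)]
    return primary, backup

# Hypothetical three-person team with a rotation starting on a Monday.
team = ["ana", "ben", "cy"]
print(weekly_rotation(team, date(2024, 1, 1), date(2024, 1, 10)))  # ('ben', 'cy')
```

Making the backup next week's primary is one design choice among several; it gives the upcoming primary a preview of any ongoing issues before their shift starts.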
A team in a single geographic location will need to plan differently than a team that’s distributed.
After all, 1:03 a.m. in San Francisco is 1:33 p.m. in India (during daylight saving time in California). If you have teams in both locations, it may make sense to assign on-call duties based on daylight hours. If, on the other hand, your whole team is in Minneapolis, someone’s going to be on the night shift each night.
Scheduling on-call rotations during daylight hours is called a follow-the-sun model. When possible, it’s a great way to ensure better work-life balance and address employee concerns about being woken up in the middle of the night.
In some cases, distributed teams may have access to different things, so even with a follow-the-sun model, you may need to have a backup person on-call overnight. In this case, your best bet is to make sure that backup person is only getting alerts in cases where the issue cannot possibly be resolved by a team in another time zone and when the issue is of high importance and can’t possibly wait until morning.
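A follow-the-sun handoff can be sketched in Python as below. The team names, the 8:00 a.m. to 8:00 p.m. local windows, and the fixed UTC offsets are assumptions for this example; a real schedule would more likely use IANA time zones (for instance through Python's `zoneinfo` module) so that daylight saving changes are handled automatically.

```python
from datetime import datetime, time, timedelta, timezone

# Hypothetical teams: (name, fixed UTC offset, local window start, local window end).
TEAMS = [
    ("san_francisco", timezone(timedelta(hours=-7)), time(8), time(20)),
    ("bengaluru", timezone(timedelta(hours=5, minutes=30)), time(8), time(20)),
]

def daylight_team(now_utc, teams=TEAMS, overnight_backup="overnight_backup"):
    """Return the first team whose local time is inside its on-call window.

    `now_utc` must be a timezone-aware datetime. If no team is in its
    window, fall back to the designated overnight backup.
    """
    for name, tz, start, end in teams:
        local_time = now_utc.astimezone(tz).time()
        if start <= local_time < end:
            return name
    return overnight_backup

# 08:03 UTC is 1:03 a.m. at UTC-7 and 1:33 p.m. at UTC+5:30.
print(daylight_team(datetime(2024, 6, 1, 8, 3, tzinfo=timezone.utc)))  # bengaluru
```

With these two windows there is a half-hour gap each day (14:30 to 15:00 UTC) that neither team covers, which is exactly the kind of slot where an overnight backup is still needed.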
A third factor that impacts on-call schedules is who owns which service and who’s capable of making fixes on it. Someone who knows a particular service inside and out is more likely to be able to fix it quickly and figure out how to avoid the same issue in the future. Splitting on-call duties so you always have a primary or backup person for each team and service can be a smart move for many companies.
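One simple way to picture service ownership is a lookup from service to its primary and backup responders, as in this Python sketch. The service names, people, and the "duty_manager" fallback are hypothetical.

```python
# Hypothetical ownership map: each service has a primary and a backup responder.
OWNERS = {
    "payments": {"primary": "ana", "backup": "ben"},
    "search": {"primary": "cy", "backup": "ana"},
}

def responders_for(service, owners=OWNERS, default="duty_manager"):
    """Return the escalation order for a service, primary first.

    Services with no recorded owner go to a default contact so that
    no alert is left without a responder.
    """
    entry = owners.get(service)
    if entry is None:
        return [default]
    return [entry["primary"], entry["backup"]]

print(responders_for("payments"))  # ['ana', 'ben']
print(responders_for("billing"))   # ['duty_manager']
```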
There’s no reason to make these schedules without consulting the team itself. A morning person may work well on a 4:00 a.m. to 4:00 p.m. schedule without disrupting sleep. A night owl might prefer 4:00 p.m. to 4:00 a.m.
Some developers may ask for one-week-on one-week-off schedules because they’re easy to keep track of and mean whole weeks of uninterrupted time on their primary projects. Other teammates may prefer shorter shifts.
It won’t always be possible to make everyone happy, but it’s never a bad idea to find out what would work best for your specific team and use that as a starting point for developing your on-call schedule.
How to construct a fair, effective schedule that won’t burn out your teams
On-call schedules may get a bad rap, but they don’t have to be a difficult, sleep-depriving, unsupported path to burnout. Here are seven strategies to keep schedules fair and effective:
1. Talk to the teams first
Understand how your team wants to work and take their opinions to heart before you craft your plan. If you know that someone’s a night owl, you can try to avoid scheduling them on a morning shift. If your whole team agrees that they’d like to do one-week shifts, why not start there?
2. Use a follow-the-sun schedule if you can
Good sleep habits improve memory, problem-solving skills, and productivity. Not to mention mental and physical health. If you can keep employees from being on-call overnight, the benefits are compelling.
3. Develop a supportive culture
Having a supportive team can make a huge difference in both employee satisfaction and on-call effectiveness. If someone has a long night dealing with a major outage, having another employee step up and offer to take the next day’s or the rest of the week’s shift can be a huge relief. If personal emergencies or big life events come up, people need to have a support system that can take over or shift on-call duties.
Foster the kind of culture where teams take care of each other and you’ll find the burden of on-call feels much lighter to everyone.
4. Don’t wake people up for minor issues
Do you need alerts to go out over every issue or are there some that have greater impact than others? Sit down with your team and figure out which issues are a priority and need to be immediately addressed and which might be able to hold off their alerts until daylight hours.
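Once the team has agreed on priorities, the page-now-or-wait decision can be expressed as a small rule. The Python sketch below assumes a single "critical" severity that always pages and a daytime window of 8:00 a.m. to 8:00 p.m.; both are placeholders for whatever your team decides.

```python
from datetime import datetime, time, timedelta

def route_alert(severity, now, day_start=time(8), day_end=time(20)):
    """Decide whether to page immediately or hold an alert until morning.

    Returns ("page", now) or ("defer", when), where `when` is the next
    time the daytime window opens.
    """
    in_daytime = day_start <= now.time() < day_end
    if severity == "critical" or in_daytime:
        return ("page", now)
    # Outside the daytime window: hold until the window next opens.
    next_start = datetime.combine(now.date(), day_start, tzinfo=now.tzinfo)
    if now.time() >= day_end:
        next_start += timedelta(days=1)  # late evening rolls over to tomorrow
    return ("defer", next_start)

print(route_alert("low", datetime(2024, 1, 1, 2, 0)))       # deferred to 08:00 the same day
print(route_alert("critical", datetime(2024, 1, 1, 2, 0)))  # paged immediately
```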
5. Understand what on-call shifts mean in your organization
For some companies, on-call may mean occasional alerts. For others, shifts might be intense and middle-of-the-night wake-ups more frequent.
Before you craft an on-call schedule, take a look at your company’s needs. A week-on week-off schedule might work well for companies with low alert intensity but might be too exhausting for companies with a lot of alerts.
6. Check in regularly
Your on-call schedule doesn’t have to be done once and then set in stone. Check in regularly. Is it working out for the team? Is it helping you prevent and resolve issues quickly? Is it truly as effective as it can be—both for customers and for employees? This process doesn’t have to be static.
7. Give employees the tools for better work-life balance
Make sure on-call employees have a mobile internet connection so that they can still leave the house and do the things they need to do while on-call. Encourage shift swapping or calling in backup if someone wants to spend an hour at yoga or attend a parent-teacher meeting. Track your alerts and prioritize tasks that help decrease them. Use tools like OpsGenie to create, manage, and track on-call responsibilities and keep everyone organized and on the same page.
On-call schedule templates
Once you understand the type of schedule that will work best for your business and team, it’s time to create an on-call schedule template. Your template will need to include:
- Users (who will be on-call?)
- Rotation types (will the schedule be weekly, daily, or custom?)
- Restrictions (will you restrict on-call shifts to certain times?)
- Start date and time for the schedule
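The four template fields listed above could be captured in a small data structure like the Python sketch below. The class and field names are hypothetical and do not correspond to any scheduling tool's actual API.

```python
from dataclasses import dataclass
from datetime import datetime, time
from typing import List, Optional, Tuple

@dataclass
class OnCallTemplate:
    users: List[str]                  # who will be on-call
    rotation_type: str                # "weekly", "daily", or "custom"
    start: datetime                   # start date and time for the schedule
    restriction: Optional[Tuple[time, time]] = None  # (from, to); None means no restriction

    def validate(self):
        """Return a list of problems; an empty list means the template is usable."""
        problems = []
        if not self.users:
            problems.append("template needs at least one user")
        if self.rotation_type not in ("weekly", "daily", "custom"):
            problems.append("unknown rotation type: " + self.rotation_type)
        return problems

# Hypothetical weekly template restricted to 8:00 a.m. through 8:00 p.m.
template = OnCallTemplate(
    users=["ana", "ben", "cy"],
    rotation_type="weekly",
    start=datetime(2024, 1, 1, 9, 0),
    restriction=(time(8), time(20)),
)
print(template.validate())  # []
```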
When choosing a tool to manage your schedules, look for one that has automated scheduling, plenty of integration options with your other tools, and on-call analytics that bring transparency to on-call policies and workload.