Things break when you move fast. Incidents are inevitable when your company quickly scales its engineering team and develops new systems. After an issue is resolved and services are restored, work with your engineering team to complete the incident postmortem template. The template helps your team discover why an incident happened and how to prevent it from recurring.
How to use the incident postmortem template
Step 1. Provide an incident postmortem summary
Kick off your postmortem analysis with a high-level summary of the incident’s duration, causes, and effects. Make sure to highlight which services and customers were affected. This helps your team understand how the incident affected the system and provides context as your team prepares for a thorough analysis.
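If your team also wants to keep the summary in a structured form, the fields named in this step could be captured roughly as follows. This is only a sketch: the `IncidentSummary` class, its field names, and the sample values are hypothetical and are not part of the template itself.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import List


@dataclass
class IncidentSummary:
    """Hypothetical record of the summary fields described in Step 1."""
    title: str
    started_at: datetime
    resolved_at: datetime
    causes: List[str]
    effects: List[str]
    affected_services: List[str] = field(default_factory=list)
    affected_customers: List[str] = field(default_factory=list)

    @property
    def duration(self) -> timedelta:
        # Duration is derived from the two timestamps rather than typed in by hand.
        return self.resolved_at - self.started_at

    def render(self) -> str:
        """Return a plain-text summary suitable for pasting into the postmortem."""
        minutes = int(self.duration.total_seconds() // 60)
        lines = [
            f"Incident: {self.title}",
            f"Duration: {minutes} minutes",
            "Causes: " + "; ".join(self.causes),
            "Effects: " + "; ".join(self.effects),
            "Affected services: " + ", ".join(self.affected_services),
            "Affected customers: " + ", ".join(self.affected_customers),
        ]
        return "\n".join(lines)


if __name__ == "__main__":
    # Illustrative values only; replace them with the details of your incident.
    summary = IncidentSummary(
        title="Checkout API outage",
        started_at=datetime(2024, 1, 1, 10, 0),
        resolved_at=datetime(2024, 1, 1, 10, 45),
        causes=["Misconfigured deploy"],
        effects=["Failed checkouts"],
        affected_services=["checkout-api"],
        affected_customers=["All web customers"],
    )
    print(summary.render())
```

Deriving the duration from the start and resolution timestamps keeps the summary consistent if either time is corrected later.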
Step 2. Conduct a blameless incident analysis
Now that you’ve provided an incident summary, you’re ready to dive into the details. Incidents are an opportunity for your engineering team to learn from past mistakes. Our template is designed for your team to identify an incident’s root cause without placing blame on any individual member. By conducting your incident analysis in a constructive and collaborative manner, your team can focus on brainstorming solutions.
Step 3. Create a postmortem plan
As your team works together to analyze the incident, use the template to note their insights and open questions. Follow through on your team’s analysis by using their recommendations to prevent the incident from happening again. Once you make an incident postmortem plan, keep track of your progress by creating and updating Jira tickets.
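For teams that prefer to script ticket creation, the sketch below turns postmortem action items into request bodies shaped for Jira's REST API issue-creation endpoint. It only builds the JSON and does not send anything. The `ActionItem` class, the `OPS` project key, the `Task` issue type, and the `postmortem` label are assumptions for illustration; check them against your own Jira project before using anything like this.

```python
import json
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class ActionItem:
    """Hypothetical follow-up item recorded during the postmortem."""
    summary: str
    details: str
    owner: str


def to_jira_issue_payload(item: ActionItem, project_key: str, incident_title: str) -> Dict:
    """Build one issue-creation body (plain-text description, REST API v2 style)."""
    description = (
        f"Postmortem follow-up for: {incident_title}\n\n"
        f"{item.details}\n\n"
        f"Owner: {item.owner}"
    )
    return {
        "fields": {
            "project": {"key": project_key},   # assumed project key
            "summary": item.summary,
            "description": description,
            "issuetype": {"name": "Task"},     # assumes the project has a Task type
            "labels": ["postmortem"],          # assumed label for tracking progress
        }
    }


def build_payloads(items: List[ActionItem], project_key: str, incident_title: str) -> List[Dict]:
    """Convert every action item into its own issue payload."""
    return [to_jira_issue_payload(i, project_key, incident_title) for i in items]


if __name__ == "__main__":
    # Illustrative action item only.
    items = [ActionItem("Add deploy config validation", "Reject invalid configs in CI.", "Platform team")]
    for payload in build_payloads(items, "OPS", "Checkout API outage"):
        print(json.dumps(payload, indent=2))
```

Keeping one ticket per action item makes it easier to see which recommendations from the analysis are still open.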
Opsgenie, powered by Atlassian, provides alerting and incident management solutions to help businesses resolve critical issues before they impact customers.