Imagine you had a product with millions of users, over five million lines of code, and more than 100 developers working on it. And then your company decided that this product had the opportunity to become an even more successful platform.
Atlassian did just that – it pivoted its flagship product, Jira, from one app with a series of add-ons to a platform with three purpose-built experiences on top. To prepare, the Jira team accepted early on that sh*t was going to break. So to deal with problems both proactively and reactively, they put together a small group of experienced developers to fight fires, remove blockers, and keep everything on track. Internally, Jira’s move from a product to a platform was known as “Jira Renaissance,” and this crack team of developers was aptly named “The A-team.”
With Jira Renaissance (aka Jira 7.0.0) signed, sealed, and delivered, I talked with two members of the A-team to learn how their team of five helped 250 people – 175 of whom were developers – coordinate Jira’s move from product to platform. We talked about what worked, what didn’t, and how their time spent on the A-team made them better development managers.
Lesson 1: development managers need to communicate
Marty Henderson, Jira’s Development Manager, and Andreas Knecht, Senior Development Team Lead for Jira, were two of the A-team’s original members. Combined, these two have 15 years of experience working on and in the Jira code base. They knew better than anyone else the potential for breaks, the source code of other products that would be affected by these changes, and that the platform code base would ultimately require over a year’s worth of work before it was ready. As worrisome as this might sound, welcoming these challenges helped the A-team form a strong strategy to combat any issue: communicate, communicate, and then communicate some more.
Since they knew stuff was going to break, the A-team took it upon themselves to be the voice of Jira Renaissance. Marty, Andreas, and the other A-team members injected themselves into all types of meetings. This included everything: stand-ups, quick meetings with teams outside of Jira, lunchtime brownbags, and even sharing timelines and regular updates at all-hands meetings.
Not only were A-team members jumping into meetings, but they also constantly wrote Confluence blogs that were shared throughout the company. A blog could be anything from a module responsibilities page for plugins that warned fellow engineers of potential risks to an announcement that we dropped the SOAP. (Yes, we finally removed the legacy SOAP API in Jira 7.0.0.)
Interestingly, blog communications across the company during this time period took on a life of their own. In total, the A-team and their engineering teams wrote and shared 97 blogs over a 200-day period. This created a culture in which developers and development managers had the full context they needed to report changes back to management during planning meetings and to make decisions.
Lesson 2: it’s *sometimes* okay to merge and break things
With long-running branches, it’s easy to see breakages as another team’s problem. Someone else must own that branch, right? But when you have clear communication about an upcoming code merge that will (most likely) break things, then the breakages become everyone’s responsibility.
That was the A-team’s opinion, at least, and to keep the Jira Renaissance project on track, teams had to merge code on time, even if the code was a little bit in the red. This “martial law” may sound controversial in an agile world, but it meant that each team had to cope with potential issues collectively and on the spot (with the A-team waiting to help if and when necessary). The A-team realized (Marty admits this was not without the occasional whinge and scream) that without adhering to a strict merging plan that fed into a deployment schedule, there would be months of residual problems to fix. Once the A-team formed, the tough decision was made to merge code into the master branch, where we build Jira for Atlassian’s dog-fooding environment, even if it was in the red.
To make things more complicated, the Jira team was shipping 6.4 every two weeks to our Cloud environment. To work against both environments, the Jira team developed bridging libraries that sat between Jira’s 6.4 code and the Jira 7 platform code. Keeping quality high meant introducing many more continuous integration plans that tested against both versions. For example, when developers committed new code, a build would run. If this build passed, the code would then be promoted to Jira’s dog-fooding environment. If a test failed, Jira’s engineers knew that they needed to jump on a fix before anything could roll out to production.
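The promotion rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Jira team’s actual CI configuration: the `BuildResult` type, the plan names, the `REQUIRED_TARGETS` set, and the `next_step` function are all hypothetical names made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BuildResult:
    plan: str     # hypothetical CI plan name
    target: str   # code line the plan tested against, e.g. "6.4" or "7.0"
    passed: bool


# Assumption: a commit must be tested against both code lines before promotion.
REQUIRED_TARGETS = {"6.4", "7.0"}


def next_step(results):
    """Decide what happens to a commit given the results of its CI plans."""
    failed = sorted(r.plan for r in results if not r.passed)
    if failed:
        # Any red plan blocks promotion until someone jumps on a fix.
        return "fix: " + ", ".join(failed)
    missing = sorted(REQUIRED_TARGETS - {r.target for r in results})
    if missing:
        # Green so far, but one of the two versions has not been tested yet.
        return "wait: no results for " + ", ".join(missing)
    return "promote to dogfooding"
```

With a green result for each version, `next_step` returns the promotion message. A single red plan returns a `fix:` message naming that plan, which mirrors the rule that nothing rolls out until the failure is fixed.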
Lesson 3: distributed collaboration needs extra love
The Jira product team consists of 15 different teams and spans three timezones that don’t have any common work hours: Gdańsk, Poland; San Francisco, California; and Sydney, Australia. Marty and Andreas are based in Sydney and agreed that they had great tactics for local communication, but that they fell short when it came to communicating with teams in San Francisco and Gdańsk.
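To see why these three offices share no working hours, here is a rough back-of-the-envelope check in Python. It assumes a 9-to-5 working day and fixed UTC offsets for a northern-summer week. Both are simplifying assumptions for illustration, since real offsets shift with daylight saving. The names `UTC_OFFSETS`, `working_hours_utc`, and `common_hours` are invented for this sketch.

```python
# Assumed UTC offsets for a northern-summer week (they shift with daylight saving).
UTC_OFFSETS = {"Sydney": 10, "Gdańsk": 2, "San Francisco": -7}


def working_hours_utc(offset, start=9, end=17):
    """Return the set of UTC hours covered by a local start-to-end working day."""
    return {(hour - offset) % 24 for hour in range(start, end)}


def common_hours(offices):
    """Return the sorted UTC hours during which every listed office is at work."""
    hour_sets = [working_hours_utc(UTC_OFFSETS[name]) for name in offices]
    return sorted(set.intersection(*hour_sets))


print(common_hours(["Sydney", "Gdańsk", "San Francisco"]))  # prints []
print(common_hours(["Sydney", "San Francisco"]))            # prints [23]
```

Under these assumptions, no hour has all three offices at work. Even pairwise the overlap is thin: Sydney and San Francisco share one hour, and Gdańsk shares none with either.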
Since there was a forcing function to merge code on a certain timeline, even if things were in the red, it was important that communications about fixes and broken builds were clear, especially across timezones since some teams were working while others were sleeping. One merge in particular, the one that caused those screams that Marty talked about, was the result of teams in Gdańsk not knowing of an upcoming merge, which resulted in breakages that they were not prepared to deal with.
The A-team quickly realized that despite writing blogs and acting as a large presence in the Sydney office, they needed to keep remote teams more in the loop. It was hard to have the same kind of voice when they couldn’t randomly stop in on a stand-up or make a quick announcement at an all-hands. As a result, both Marty and Andreas said that, in retrospect, it would have made sense to distribute the A-team members across every major engineering location to improve the experience for remote teams. These champions could be 100% focused on ensuring that 1) communication flowed from office to office and 2) each team got the same support locally if and when things did break.
Lesson 4: don’t underestimate scope creep
While the A-team was focused on keeping their project on track and putting out fires, Marty and Andreas admitted that it was hard to keep track of all of the new code and changes dropping into the main branch. A simple library bump to fix a bug could bring in unexpected changes that had fallout across the Jira team and that even the A-team was not aware of or prepared to deal with.
Marty and Andreas proposed that in the future they would not only need a clear picture of everything that was changing, but also that the execution of their project should be different. A large change like the work to build the Jira platform and get it ready for the three products built on top should not have been coupled with any other major changes. Jira Renaissance alone brought with it many third-party library upgrades, such as Google’s Guava. Marty and Andreas are currently working on strategies to make this sort of extensive change more manageable, such as using a mono-repo to consolidate code and simplify refactoring around API changes, or re-jigging the CI plans to get more visibility into where things break ahead of time.
But as with any project, dealing with scope creep is hard and inevitable. There is no silver bullet, and acknowledging that it is a large part of a development manager’s job, from the sprint level to the project level, is a good first step.