This post was featured in Dr. Dobb’s as part of a series focusing on enterprise teams making the switch to Git.
At Atlassian, we have been extremely excited about DVCS for a number of years, and we have invested heavily in it. We acquired Bitbucket – a cloud DVCS repository host. We developed Stash – a behind-the-firewall Git repository manager. We added DVCS support to Fisheye, our code browsing and search tool. And we added a range of DVCS connectors to Jira, our issue tracker.
We believe DVCS is a great leap forward in software development, and as part of this, we migrated the codebases for our own products and libraries from centralized version control systems (generally SVN) to DVCS. Some of these have been big migrations! We are now fairly experienced with migration to DVCS.
In this three-part blog series I will focus on the biggest migration Atlassian has done – migrating the 11-year-old Jira codebase from SVN to Git. What obstacles did we encounter? What lessons did we learn? And most importantly, how did we do it without sacrificing active development on Jira? We hope that sharing this experience helps anyone approaching a similar migration.
We’ll focus on Git, because Jira moved to Git, but everything in this series applies equally to Mercurial. At Atlassian, we use both.
Migrating a big codebase is not without cost. The first question you will need to answer – for yourself, your bosses, and the people who work for you – is this: what will DVCS bring us, and why is it worth the cost of migrating?
I have used SVN successfully on many projects. So has Atlassian. And I am sure many people reading this article have also used SVN successfully. Since there is always a cost to migration, you may be inclined to ask, “If Subversion has met my version control needs for many years, why should I change?” To me, that is the wrong question. The real question is, “How can DVCS make what we do today even better?”
Git is known for several things. For a developer working with code, it’s faster. It allows for advanced workflows like feature branching, forks and pull requests. In theory, these workflows are all possible with SVN, but the difficulty of merging in SVN compared to Git makes them untenable. For anyone moving from SVN, though, the main benefit of Git is that its lightweight branching and easy merging let you do your default SVN workflow better than SVN does.
What do I mean by this? Let’s talk about how we actually develop and release software. Most of us work in a world where we have at least one released version of our software in the wild, which we call a “stable” branch. We maintain and contribute bug fixes to a stable branch while developing new features on a “development” branch (which is called trunk/master/default depending on which VCS you use).
When we commit bug fixes to stable, we need to get them into master too. SVN merge is known to be a pain and works solely on revision history – not actual content. As a result, a lot of people avoid it, or they do it infrequently and not as part of their day-to-day workflow. How many projects have you worked on where the stable and development branches have started to diverge, or have diverged so significantly that the effort to bring them back together is a real project cost? I have certainly been on projects where this has happened, and when I speak to other developers, it’s a frequent occurrence with SVN. There are some strategies to deal with it. For example, with our issue tracking software, Jira, we ignored merging and required developers to make each commit separately to both the stable and development branches, relying on QA to make sure that it happened correctly.
Git allows you to remove this pain. Git makes merging so easy that merging the entire stable branch into the development branch on each commit is a reality; it’s now our default workflow. So even if you don’t want to use feature branches or forks or pull requests immediately, Git provides advantages from day one.
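As a rough illustration of that default workflow, here is a minimal Python sketch that builds the two Git commands involved in merging a stable branch into the development branch. The branch names `stable` and `master` and the helper names are assumptions made for illustration; this is not the exact script we use on Jira.

```python
import subprocess
from typing import List


def merge_stable_commands(stable: str = "stable",
                          development: str = "master") -> List[List[str]]:
    """Build the git commands that merge `stable` into `development`.

    Both branch names are illustrative defaults, not names from the Jira repo.
    """
    return [
        # Switch to the development branch first.
        ["git", "checkout", development],
        # Merge everything from stable, accepting the default merge message.
        ["git", "merge", "--no-edit", stable],
    ]


def run_commands(commands: List[List[str]], repo_dir: str = ".") -> None:
    """Run each command inside repo_dir, stopping at the first failure."""
    for command in commands:
        subprocess.run(command, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Print the commands rather than running them, so the sketch is safe
    # to execute outside a real repository.
    for command in merge_stable_commands():
        print(" ".join(command))
```

Calling `run_commands(merge_stable_commands(), repo_dir)` in a real clone would perform the merge. If the merge hits a conflict, Git exits with a non-zero status and `check=True` raises an error, leaving the conflict for a developer to resolve by hand.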
And when we were ready, we were in a position to take advantage of the advanced workflows that Git allows. Before the switch to DVCS, our major products targeted 90-day release cycles. These 90-day releases went to two platforms: downloadable products for clients to install on their own servers; and a release to our hosted cloud platform (Atlassian OnDemand) for which clients pay a monthly fee. Using branches as a core part of development workflow has allowed us to shorten this to the point where we now release our major products to the cloud every 2 weeks.
Jira is a decent-sized codebase to move – 11 years’ worth of history, 47,228 commits across approximately 21,000 files. We average about 30 different committers over a two-week period. More than that, the VCS is a real workhorse for a project like Jira. Builds, code reviews, scripts for releasing both product distributions and source… all these things have a rich tapestry of dependencies on the source code management system.
Our main goal in the migration was to minimize interruption to developers. This is about more than just the ability to commit code; it is about the infrastructure surrounding software development.
We have 3.5 years of history in Jira’s code review system.
Jira has a lot of CI. We run approximately 60 build plans over different configurations and branches.
We have some other dependencies too – Jira has a somewhat complex release process that involves pulling together code from multiple sources. We also release our source code to customers, which involves a different set of build scripts.
There is a tradeoff here between how fast you can migrate and how stably you can do it – our guiding principle was to optimize for stability over speed. If you set a deadline for your migration and it slips, what’s the worst that happens? Developers have to commit code to SVN for another week or so. Not the end of the world. It’s far worse if the migration interrupts developers’ ability to work and meet their own deadlines.
In the end, the migration took us 14 days, with a total of only two hours during which developers were unable to commit code. We were nearing the end of the development cycle for our latest release, Jira 5, and at no point were we unable to cut a release candidate.
When preparing the migration, there are a couple of things to be aware of.
First, it will take time. The actual git-svn clone, which takes all of the commits in the SVN repository and replicates them in Git, took three days for us.
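For readers who have not run such a clone before, the sketch below shows roughly what the invocation looks like, expressed as a small Python helper that assembles the command. The repository URL, target directory and authors file are hypothetical placeholders, and the choice of options is an assumption rather than the exact command we ran. `--stdlayout` tells git-svn to expect the conventional trunk/branches/tags layout, and `--authors-file` maps SVN usernames to Git author identities.

```python
from typing import List, Optional


def git_svn_clone_command(svn_url: str,
                          target_dir: str,
                          authors_file: Optional[str] = None,
                          standard_layout: bool = True) -> List[str]:
    """Assemble a `git svn clone` command line as an argument list."""
    command = ["git", "svn", "clone"]
    if standard_layout:
        # Assume the SVN repo uses the trunk/branches/tags convention.
        command.append("--stdlayout")
    if authors_file:
        # Map SVN usernames to "Name <email>" identities for Git.
        command.append(f"--authors-file={authors_file}")
    command += [svn_url, target_dir]
    return command


if __name__ == "__main__":
    # Hypothetical URL and paths, for illustration only.
    print(" ".join(git_svn_clone_command(
        "https://svn.example.com/repo",
        "repo-git",
        authors_file="authors.txt",
    )))
```

Building the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting issues. Given how long the clone can take, it is worth running it on a machine that can be left alone for days.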
Second, you should prepare and think of all the dependencies your infrastructure has on your VCS. And know that if your infrastructure is sufficiently complex (like ours), there will be things you never dreamed of and only discover when they break. So don’t beat yourself up when you encounter a dragon. Just slay it, and continue on your quest.
A migration like this is not something you can do overnight, or even over a weekend. It needs to be managed for a sustained period of time.
Migrating stably is daunting, but it is not brain surgery; there is a process Atlassian has employed to make it manageable. In part 2 of Atlassian’s Switch to Git series, we walk through, step by step, the technical details of migrating from Subversion to Git.