With the proliferation of virtualization and high-availability architecture, teams are chasing 99.999% uptime like knights of old hunted unicorns. Many site reliability engineers find more comfort in the Boy Scouts’ motto, “Be prepared.” Your company’s Git server is mission critical to the daily operations of engineering and everyone they support. How do you create business continuity in the face of unpredictable circumstances?
Distributed version control systems like Git offer some built-in insulation against outages. If an upstream host becomes unavailable, developers can continue working and committing locally, but they can’t push their work or fetch the latest changes. Realistically, this offsets some of the productivity cost of an outage but doesn’t eliminate it: your team keeps work moving forward with local commits, yet loses the ability to build and deploy software via CI/CD tools.
The larger the organization, the more teams are dependent on the outcomes of a robust disaster recovery process. Bitbucket Data Center’s disaster recovery functionality amplifies the benefits of Git for modern software teams. Smart mirrors, data integrity health checks, and consistent improvements to Bitbucket Data Center mitigate the cost of downtime and help you recover quickly.
Geographic distribution plays many parts in software development, from team location to physical infrastructure. Bitbucket Data Center’s smart mirrors reduce latency when cloning and fetching repositories, and they offload high request volume from CI/CD tools. In the event of an outage, smart mirrors offer an extra (hidden) benefit.
Git alone lets developers keep working and committing locally. Smart mirrors go further: teams can continue to clone and fetch repositories for a period of time, even if the primary instance is down. Developers who logged in via a mirror before the outage can continue until the user cache expires. With the primary instance down, git push is still unavailable, so commits should be made locally until it returns.
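The fallback described above can be sketched in a few lines of Python. This is a minimal illustration, not an Atlassian-provided tool: it assumes the working copy already has two remotes configured, here given the hypothetical names `origin` (the primary instance) and `mirror` (a smart mirror, added beforehand with `git remote add`), and it simply tries `git fetch` against each in turn.

```python
import subprocess
from typing import Callable, Optional, Sequence


def fetch_with_fallback(
    remotes: Sequence[str],
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Optional[str]:
    """Try `git fetch <remote>` for each remote, in order.

    Returns the name of the first remote that fetched successfully,
    or None if every remote failed (for example, during a full outage).
    The remote names are supplied by the caller; nothing here is
    specific to Bitbucket.
    """
    for remote in remotes:
        # check=False: a non-zero exit code means "try the next remote"
        # rather than raising an exception.
        result = run(
            ["git", "fetch", remote],
            capture_output=True,
            text=True,
            check=False,
        )
        if result.returncode == 0:
            return remote
    return None
```

Called as `fetch_with_fallback(["origin", "mirror"])` from inside a working copy, the function returns `"mirror"` when only the mirror is reachable and `None` when neither remote responds, in which case the only option is to keep committing locally. The `run` parameter exists only so the behavior can be exercised without a live server.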
Smart mirrors offer mitigation not traditionally available in critical business software, reducing the productivity loss associated with an outage.
Integrity health check
In a heavily trafficked Bitbucket instance, data is constantly in flux as pull requests are opened, Jira tickets are closed, and code is deployed. Application linking between Bitbucket, Jira, and third-party apps offers countless benefits but exposes those connected systems to the effects of an outage.
Identifying data anomalies early can save your bacon. Inconsistent data following disaster recovery creates a disconnect in information and communication. Pull requests could re-open, triggering a change in Jira issue status if you’re using workflow triggers. Suddenly, you have a queue of tickets from users wondering, “What just happened?”
Verifying application installation and data integrity through build acceptance tests is a cornerstone of disaster recovery preparation. When running Bitbucket Data Center in recovery mode, the integrity check corrects errors in pull requests, Jira ticket status, and more if the file server is out of sync. If it encounters an inconsistency, the integrity check writes a message to the application log for admins, as well as within Bitbucket’s activity tab so developers are notified.
For businesses in highly regulated industries (e.g., finance, insurance), ensuring data quality and retention of test data are not only best practices but requirements for external compliance.
“Git” moving on your disaster recovery plan
A defined, actionable disaster recovery plan is a must when working with a mission-critical system like Bitbucket. If disruptive events occur, you have to protect code, access logs, and pull request data, all while ensuring operational continuity. In short, the show must go on.
Until a disaster happens, disaster plans are mostly theoretical. Don’t wait for a real one to find out whether your plan works. Emergency service folk like firefighters and medics practice rescue skills regularly, so why shouldn’t your team?
With practice, your team gains familiarity and comfort with a high-pressure situation, leading to a quicker (happier) resolution. Find a cadence that works, like every 4-6 months, and have your team go through a Bitbucket failover process.