Atlassian Cloud architecture and operational practices
Learn more about the Atlassian Cloud architecture and the operational practices we use
Atlassian cloud products and data are hosted on industry-leading cloud provider Amazon Web Services (AWS). Our products run on a platform as a service (PaaS) environment that is split into two main sets of infrastructure that we refer to as micros and non-micros. Jira, Confluence, Statuspage, and Access run on the micros platform, while Bitbucket, Opsgenie, and Trello run on the non-micros platform.
Bitbucket Cloud’s services and features are provided by a set of services running in the NTT Communications (NTT) data center in Ashburn, Virginia, with backups being stored in both the NTT data center in Santa Clara, California as well as AWS. Bitbucket Cloud’s customer data is stored in PostgreSQL and NetApp filers.
When you sign up, your data is located in the region closest to where the majority of your users are located. However, we know that some of you may require that your data stay in a particular location, so we do offer data residency. Currently, we offer data residency in the US and EU regions, and plan to add support for Australia, the UK, Canada, and Japan. For information and to sign up for updates, see our data residency page.
Atlassian uses AWS’ highly available data center facilities in multiple regions worldwide. Each AWS region is a separate geographical location with multiple, isolated locations known as Availability Zones (AZs). For example, US-West (the west coast of the United States) is a region with two AZs - us-west-1a (located in northern California) and us-west-1b (located in Oregon). While they are both in the same region, they are geographically isolated.
Each AZ is designed to be isolated from failures in the other zones and to provide inexpensive, low-latency network connectivity to other AZs in the same region. This multi-zone high availability is the first line of defense for geographic and environmental risks and means that services running in multi-AZ deployments should be able to withstand AZ failure.
Jira and Confluence use the multi-AZ deployment mode for Amazon RDS (Amazon Relational Database Service). In a multi-AZ deployment, Amazon RDS provisions and maintains a synchronous standby replica in a different AZ within the same region to provide redundancy and failover capability. The AZ failover is automated and typically takes 60-120 seconds, so that database operations can resume as quickly as possible without administrative intervention. Opsgenie, Statuspage, Trello, and Jira Align use similar deployment strategies, with small variations in replication and failover timing.
We operate a comprehensive backup program at Atlassian. This includes our internal systems, where our backup measures are designed in line with system recovery requirements. With respect to our cloud products, and specifically referring to you and your application data, we also have extensive backup measures in place. We use the snapshot feature of Amazon RDS (Amazon Relational Database Service) to create automated daily backups of each RDS instance.
Amazon RDS snapshots are retained for 30 days with support for point-in-time recovery and are encrypted using AES-256 encryption. Backup data is not stored offsite but is replicated to multiple data centers within a particular AWS region. We also perform quarterly testing of our backups.
We don’t use these backups to revert customer-initiated destructive changes, such as fields overwritten using scripts, or deleted issues, projects, or sites. To avoid data loss, we recommend making regular backups. Learn more about creating backups in the support documentation for your product.
For Bitbucket, all primary database servers reside in NTT data centers, with replication nodes and backups being stored in both NTT data centers as well as AWS. The production data is constantly replicated between the NTT Ashburn, VA and Santa Clara, CA data centers via mirroring technology. Bitbucket production data is replicated every 2 hours from its primary site to its secondary site, with a replication lag of about 10-20 minutes on average and, at most, four hours.
Data center security
AWS maintains multiple certifications for the protection of their data centers. These certifications address physical and environmental security, system availability, network and IP backbone access, customer provisioning and problem management. Access to the data centers is limited to authorized personnel only, as verified by biometric identity verification measures. Physical security measures include: on-premises security guards, closed circuit video monitoring, man traps, and additional intrusion protection measures.
NTT data center security is also multi-layered, and includes full-time video surveillance, biometric entry card readers, and man traps. On-site security personnel are present 24/7, and clients have access to on-site monitoring as well as cloud computing services, professional services, remote hands, and NTT’s content delivery network.
Cloud platform architecture
On top of our cloud infrastructure, we’ve built a multi-tenanted microservice architecture with a shared platform that supports our products. In a multi-tenant architecture, a single service serves multiple organizations, sharing the relational databases (RDS) and EC2 instances required to run our cloud products. Each shard contains the data for multiple tenants, but each tenant’s data is isolated and inaccessible to other tenants. It is important to note that we do not offer a single-tenant architecture.
Our microservices are built with least privilege in mind and designed to minimize the scope of any zero-day exploitation and to reduce the likelihood of lateral movement within our cloud environment. Each microservice has its own data storage that can only be accessed with the authentication protocol for that specific service, which means that no other service has read or write access to that data store.
We’ve focused on isolating microservices and data, rather than providing dedicated per-tenant infrastructure, because each service is then limited to the narrow slice of data it needs, even though that slice spans many customers. Because the logic has been decoupled and data authentication and authorization occur at the application layer, every request passes an additional security check as it is sent to these services. Thus, if a microservice is compromised, the attacker gains only the limited data access that particular service requires.
As we scale up our services, we monitor for any anomalous behavior that may occur and use our detection processes to remediate it.
While our customers share a common cloud-based infrastructure when using our cloud products, we have measures in place to ensure they are logically separated so that the actions of one customer cannot compromise the data or service of other customers.
Atlassian’s approach to achieving this varies across our applications. In the case of Jira and Confluence Cloud, we use a concept we refer to as the “tenant context” to achieve logical isolation of our customers. This is implemented both in the application code, and managed by something we have built called the tenant context service (TCS). This concept ensures that:
- Each customer’s data is kept logically segregated from other tenants when at-rest
- Any requests that are processed by Jira or Confluence have a tenant-specific view so other tenants are not impacted
In broad terms, the TCS works by storing a context for individual customer tenants. The context for each tenant is associated with a unique ID stored centrally by the TCS, and includes a range of metadata associated with that tenant, such as which databases the tenant is in, what licenses the tenant has, what features they can access, and a range of other configuration information. When a customer accesses Jira or Confluence cloud, the TCS uses the tenant ID to collate that metadata, which is then linked with any operations the tenant undertakes in the application throughout their session.
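In broad terms, a tenant-context lookup can be sketched as a keyed metadata store. The structure below is a minimal illustration under our own assumptions - the field names and tenant ID format are made up, and Atlassian's actual TCS implementation is not public:

```python
# Hypothetical tenant-context store: a unique tenant ID maps to the
# metadata the platform attaches to every operation in a session.
TENANT_CONTEXTS = {
    "tenant-6f2a": {
        "database": "shard-us-east-17",
        "licenses": ["jira-software", "confluence"],
        "features": {"advanced-roadmaps": True},
        "region": "us-east-1",
    },
}

def resolve_tenant(tenant_id: str) -> dict:
    """Collate the metadata for a tenant, as the TCS does when a
    customer accesses Jira or Confluence Cloud."""
    context = TENANT_CONTEXTS.get(tenant_id)
    if context is None:
        raise KeyError(f"unknown tenant: {tenant_id}")
    return context
```

In the real system this lookup happens once per session and the resulting context travels with every subsequent operation, rather than being re-fetched per call.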
Your data is also safeguarded through something that we call an edge - virtual walls that we build around our software. When a request comes in, it is sent to the nearest edge. Through a series of validations, the request is either allowed or denied.
- The request lands on the Atlassian edge closest to the user. The edge verifies the user’s session and identity through your identity system.
- The edge determines where your product data is located, based on information in the TCS.
- The edge forwards the request to the target region, where it lands on a compute node.
- The node uses the tenant configuration system to determine information, such as the license and database location, and calls out to various other data stores and services (e.g. the Media platform that hosts images and attachments) to retrieve the information required to service the request.
- The node responds to the original user request with the information assembled from its calls to other services.
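The steps above can be sketched as a toy edge handler. Everything here is illustrative - the session check, region table, and forwarding function are stand-ins for the real identity system, TCS, and compute nodes:

```python
# Hypothetical data: which region holds each tenant's product data.
TENANT_REGION = {"tenant-6f2a": "us-east-1"}

def forward(region: str, request: dict) -> str:
    # Steps 4-5: a compute node in the target region assembles data
    # from other services and responds (simplified to one string).
    return f"served {request['path']} from {region}"

def handle_at_edge(request: dict, valid_sessions: set) -> str:
    # Step 1: verify the user's session and identity.
    if request["session"] not in valid_sessions:
        return "denied"
    # Step 2: determine where the tenant's data lives (TCS lookup).
    region = TENANT_REGION.get(request["tenant_id"])
    if region is None:
        return "denied"
    # Step 3: forward the request to the target region.
    return forward(region, request)
```

The point of the sketch is the ordering: identity is checked at the edge before any routing decision, so an unauthenticated request never reaches a compute node.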
Because our cloud products leverage a multi-tenant architecture, we can layer additional security controls into the decoupled application logic. A per-tenant monolithic application wouldn’t typically introduce further authorization checks or rate limiting, for example, on a high volume of queries or exports. The impact of a single zero-day is dramatically reduced as the scope of each service is narrowed.
In addition, we’ve built additional preventative controls into our products that are fully hosted on our Atlassian platform. The primary preventative controls include:
- Service authentication and authorization
- Tenant context service
- Key management
- Data encryption
Service authentication and authorization
Our platform uses a least privilege model for accessing data. This means that all data is restricted to only the service responsible for saving, processing, or retrieving it. For example, the media service, which gives you a consistent file upload and download experience across our cloud products, has dedicated storage provisioned that no other service at Atlassian can access. Any service that requires access to media content needs to interact with the media service API. As a result, strong authentication and authorization at the service layer also enforces strong separation of duties and least privilege access to data.
We use JSON web tokens (JWTs) to ensure signing authority outside of the application, so our identity systems and tenant context are the source of truth. Tokens can’t be used for anything other than what they are authorized for. When you or someone on your team makes a call to a microservice or shard, the tokens are passed to your identity system and validated against it. This process ensures that the token is current and signed before sharing the appropriate data. When combined with the authorization and authentication required to access these microservices, if a service is compromised, it’s limited in scope.
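The core of the check - verify the token's signature before trusting any claim inside it - can be shown with a self-contained HS256 example. This is a generic JWT sketch using only the standard library, not Atlassian's token format or key handling:

```python
import base64
import hashlib
import hmac
import json

def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()

def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))

def sign_hs256(claims: dict, secret: bytes) -> str:
    """Issue a JWT: header.payload.signature, HMAC-SHA256 signed."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url_encode(sig)}"

def verify_hs256(token: str, secret: bytes) -> dict:
    """Return the claims only if the signature checks out."""
    header, payload, sig = token.split(".")
    expected = hmac.new(secret, f"{header}.{payload}".encode(),
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("invalid signature")
    return json.loads(_b64url_decode(payload))
```

A tampered payload or a token signed with a different key fails verification, which is what lets a microservice reject calls that don't come from the trusted identity system.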
However, we know that sometimes identity systems can be compromised. To mitigate this risk, we use two mechanisms. First, TCS and the identity proxies are highly replicated. We have a TCS sidecar for almost every microservice and we use proxy sidecars that defer to the identity authority, so there are thousands of these services running at all times. If there is anomalous behavior in one or more, we can pick up on that quickly and remediate the issue.
In addition, we don’t wait for someone to find a vulnerability in our products or platform. We’re actively identifying these scenarios so there is minimal impact to you and we run a number of security programs to identify, detect, and respond to security threats.
Tenant context service
We ensure that requests to any microservices contain metadata about the customer - or tenant - that is requesting access. This is provided by the tenant context service, which is populated directly from our provisioning systems. When a request is started, the context is read and internalized in the running service code, which is used to authorize the user. All service access, and thus data access, in Jira and Confluence requires this tenant context, or the request will be rejected.
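The "no tenant context, no access" rule can be sketched as a guard applied to every service handler. The decorator and request shape below are hypothetical; only the rejection behavior reflects the description above:

```python
import functools

class TenantContextMissing(Exception):
    """Raised when a request arrives without tenant metadata."""

def requires_tenant_context(handler):
    """Reject any service call whose request lacks a tenant context."""
    @functools.wraps(handler)
    def wrapper(request: dict):
        context = request.get("tenant_context")
        if not context or "tenant_id" not in context:
            raise TenantContextMissing("request rejected: no tenant context")
        return handler(request)
    return wrapper

@requires_tenant_context
def list_issues(request: dict) -> str:
    # The handler can safely assume a tenant context is present.
    return f"issues for {request['tenant_context']['tenant_id']}"
```

Placing the check in one shared guard, rather than in each handler, means a handler cannot accidentally skip it.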
Service authentication and authorization is applied through Atlassian service authentication protocol (ASAP). An explicit allowlist determines which services may communicate, and authorization details specify which commands and paths are available. This limits potential lateral movement of a compromised service.
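An explicit allowlist of (caller, callee) pairs with permitted commands can be sketched in a few lines. The service names and paths are invented for illustration; ASAP's actual policy format is not public:

```python
# Hypothetical allowlist: which caller may invoke which commands on
# which callee. Anything not listed is denied by default.
ALLOWLIST = {
    ("jira", "media"): {"GET /file"},
    ("confluence", "media"): {"GET /file", "PUT /file"},
}

def is_call_allowed(caller: str, callee: str, command: str) -> bool:
    """Default-deny check: only explicitly listed calls go through."""
    return command in ALLOWLIST.get((caller, callee), set())
```

Because the default is denial, a compromised service can only reach the handful of commands its entry grants, which is what limits lateral movement.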
Service authentication and authorization, as well as egress, are controlled by a set of dedicated proxies. This removes the ability for application code vulnerabilities to impact these controls. Remote code execution would require compromising the underlying host and bypassing the Docker container boundaries - not just the ability to modify application logic - and our host-level intrusion detection flags such discrepancies.
These proxies constrain egress behavior based on the service’s intended behavior. Services that do not need to emit webhooks or communicate with other microservices are prohibited from doing so.
Customer data in our Atlassian cloud products is encrypted in transit over public networks using TLS 1.2+ with perfect forward secrecy (PFS) to protect it from unauthorized disclosure or modification. Our implementation of TLS enforces the use of strong ciphers and key-lengths where supported by the browser.
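A client enforcing the same "TLS 1.2 or newer" floor can be expressed with Python's standard `ssl` module. This is a generic illustration of the requirement, not Atlassian's server configuration:

```python
import ssl

def make_client_context() -> ssl.SSLContext:
    """Build a client TLS context that refuses anything below TLS 1.2.

    create_default_context() already enables certificate verification
    and a modern cipher selection; we pin the protocol floor explicitly.
    """
    ctx = ssl.create_default_context()
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    return ctx
```

Perfect forward secrecy comes from the key-exchange ciphers the peers negotiate (ephemeral Diffie-Hellman variants), which modern default contexts prefer.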
Data drives on servers holding customer data and attachments in Jira Software Cloud, Jira Service Management Cloud, Jira Work Management, Bitbucket Cloud, Confluence Cloud, Statuspage, Opsgenie, and Trello use full disk, industry-standard AES-256 encryption at rest.
PII transmitted using a data-transmission network is subject to appropriate controls designed to ensure that data reaches its intended destination. Atlassian's internal Cryptography & Encryption Policy sets out the general principles for Atlassian's implementation of encryption and cryptography mechanisms to mitigate the risks involved in storing PII and transmitting it over networks. The type of encryption algorithm used to encrypt PII must take into account the classification level of the PII in accordance with Atlassian's internal Data Security & Information Lifecycle Management standard. To learn more about how we collect, share, and use customer data, refer to our privacy page.
To keep up to date on additional data encryption capabilities, see our cloud roadmap.
At Atlassian, we use the AWS Key Management Service (KMS) for key management. To further ensure the privacy of your data, KMS is the originator and secret store for these keys. The encryption, decryption, and key management process is inspected and verified internally by AWS on a regular basis as part of their existing internal validation processes. An owner is assigned for each key and is responsible for ensuring the appropriate level of security controls is enforced on keys. Atlassian-managed keys are rotated upon relevant changes of roles or employment status.
We also leverage envelope encryption. AWS holds the master key, which we can never see, and any key encryption or decryption request requires the right AWS roles and permissions. When we use envelope encryption to build or generate keys for individual customers, we have different data keys for different types of data across our data stores. Additionally, we have an encryption approach at the internal application layer that provides backup data keys in other AWS regions. Keys are automatically rotated annually, and the same data key isn’t used for more than 100,000 data elements.
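The "no data key used for more than 100,000 elements" rule can be sketched as a usage counter that triggers rotation. The key generation here is a local stand-in; in the real system, data keys are generated and envelope-wrapped via AWS KMS:

```python
import secrets

MAX_USES = 100_000  # rotation threshold stated above

class DataKeyPool:
    """Hand out the active data key, rotating after MAX_USES elements."""

    def __init__(self):
        self._key = secrets.token_bytes(32)  # stand-in for a KMS data key
        self._uses = 0

    def current_key(self) -> bytes:
        if self._uses >= MAX_USES:
            # In practice: request a fresh data key from KMS and
            # store it wrapped under the master key.
            self._key = secrets.token_bytes(32)
            self._uses = 0
        self._uses += 1
        return self._key
```

Capping per-key usage limits how much ciphertext any single data key ever protects, so a leaked key exposes at most one bounded slice of data.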
Soon, we will offer bring your own key (BYOK) encryption, giving you the ability to encrypt your cloud product data with self-managed keys in AWS KMS. With BYOK, you will have complete control over the management of your keys and will be able to grant or revoke access at any time, both for your own end-users and Atlassian systems.
AWS KMS can be integrated with AWS CloudTrail in your AWS account in order to provide you with logs of all key usage. This solution enables encryption of your data at different layers throughout the applications, such as databases, file storage, as well as our internal caches and event queuing. Through the whole process, there will be no product usability impact.