3 Lessons from Building Software Development Tools for the Google Apps Marketplace (Part I)

By now you have probably heard about the Google Apps Marketplace and Jira Studio integration. If not, the short version is that Google has created a place where Google Apps users can sign up for third-party hosted SaaS offerings and have them fully integrated with their Google Apps domain. Google was kind enough to ask Atlassian whether we would be interested in integrating our hosted agile development suite, Jira Studio, for the initial launch, and we thought it would be an awesome opportunity. You can now purchase Jira Studio through your Google Apps account, adding integrated bug tracking and agile project management, a wiki, a source code repository and review system, and (optionally) a continuous integration server, all configured out of the box and authenticating with the same user accounts as your Google Apps.

However, we decided that stopping at authentication integration, while cool, wasn’t enough. Google provides a rich set of APIs for accessing your Google Apps data, and we wanted to use them to provide a better experience for anyone who purchased Jira Studio. While a team in Sydney worked on the authentication integration, a second team was created to work on the features we dubbed “Cool Shit”. Some of the more obvious integration points were Google Docs macros for use in the wiki, attaching Google Docs to Jira issues, and so on. But we wanted something that was really cool and would stand out.

The one Google App we hadn’t yet found a way to integrate was GTalk. Someone suggested implementing a chat client, similar to the one in Gmail, that would appear on every page in Studio. We quickly realized it didn’t need to be limited to chat: we could create tabs that surface useful information from many other applications within Jira Studio and Google Apps in one convenient place. And so the Jira Studio Activity Bar was born.

[Screenshot: the Activity Bar]

With Jira Studio’s Activity Bar, you can now:

  • see all the recent activity going on in Studio – who’s working on what issue, who committed what and how long ago
  • see issues assigned to you
  • see what reviews you need to do
  • see recent builds
  • see favorite wiki pages
  • see unread mail in your Gmail inbox
  • see recent Google Docs
  • see any upcoming events on your Google Calendar
  • chat with anyone with a Jabber account

All without leaving the Jira Studio environment.

[Screenshot: the buddy list]
That’s what the developers here in the US have been working on since a little before Christmas. Now that we’ve covered what we built, I’d like to talk a little about how we built it, from the planning phase all the way through deployment.

Initial planning

Our initial estimate for implementing this feature was “a lot of work”. I’m pretty sure we underestimated. The question that jumped out at us immediately, and continued to bother us throughout development, was: how can we scale this? Instant messaging typically relies on a persistent connection to the chat server, so the server can “push” messages to clients as they’re sent; that’s what makes it “instant”. To do the same with a browser, we’d need to use Comet techniques to maintain a connection between the browser and the server. In Java apps, that typically means a thread blocked waiting for something to do, and it was immediately obvious that this just wouldn’t scale very well. For a Studio instance with 30-50 users, we could expect at least that many threads to be tied up at all times.
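
To make the problem concrete, here is a deliberately naive sketch (not Studio’s actual code) of a blocking long-poll servlet. Every connected browser parks one container thread in doGet for the length of the poll:

    import java.io.IOException;
    import java.util.concurrent.BlockingQueue;
    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.TimeUnit;

    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    public class BlockingPollServlet extends HttpServlet {
        // Messages waiting to be delivered to polling clients.
        private final BlockingQueue<String> messages = new LinkedBlockingQueue<String>();

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp)
                throws IOException {
            try {
                // Parks the container's request thread for up to 30 seconds.
                // With 30-50 connected users, that's 30-50 idle threads.
                String message = messages.poll(30, TimeUnit.SECONDS);
                resp.setContentType("application/json");
                resp.getWriter().write(message != null ? message : "{}");
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }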

How do we scale?

So we started looking at how to do non-blocking IO in Java apps. Fortunately, most Java app servers have some form of non-blocking, asynchronous IO capability these days. Unfortunately, they each have a different API for it, since Servlet 3.0 implementations aren’t yet widespread. To make matters worse, we weren’t sure exactly which container we were going to wind up deploying to.
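
For context, the Servlet 3.0 API that later standardized asynchronous request handling looks roughly like the sketch below; at the time, each container (Jetty continuations, Tomcat’s CometProcessor, and so on) had its own incompatible flavor of the same idea. This is an illustration, not Studio’s code:

    import java.io.IOException;
    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    import javax.servlet.AsyncContext;
    import javax.servlet.annotation.WebServlet;
    import javax.servlet.http.HttpServlet;
    import javax.servlet.http.HttpServletRequest;
    import javax.servlet.http.HttpServletResponse;

    @WebServlet(urlPatterns = "/events", asyncSupported = true)
    public class AsyncPollServlet extends HttpServlet {
        private final Queue<AsyncContext> waiting = new ConcurrentLinkedQueue<AsyncContext>();

        @Override
        protected void doGet(HttpServletRequest req, HttpServletResponse resp) {
            AsyncContext ctx = req.startAsync(); // releases the container thread
            ctx.setTimeout(30000);
            waiting.add(ctx); // parked without a thread until an event arrives
        }

        // Called from whatever thread receives a chat event.
        public void push(String json) throws IOException {
            AsyncContext ctx;
            while ((ctx = waiting.poll()) != null) {
                ctx.getResponse().setContentType("application/json");
                ctx.getResponse().getWriter().write(json);
                ctx.complete(); // finishes the suspended request
            }
        }
    }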

You see, Jira Studio runs on virtual machines on Contegix hardware, which keeps all customer data separate and secure. Each Studio instance is allocated a specific amount of memory so that Contegix can properly balance instances for maximum utilization of their hardware. Even before all this Google Apps integration, Studio was already creeping closer and closer to that memory limit.

[Screenshot: the recent activity tab]

How will it be deployed?

At first, we had hoped to develop the Activity Bar as a plugin to one of our apps. Because of the way servlets are wrapped in our plugin system, this proved impossible: there is no way to access the container-specific methods for doing asynchronous IO. Our next thought was to deploy it as a separate webapp, perhaps on a minimal web server like Jetty or Grizzly. There was some debate about whether the app could meet the limited memory footprint even on the most minimal server we could find, since the JVM by itself would consume 40-50 MB. The final plan was to deploy the Activity Bar webapp in one of the existing containers alongside Jira, Confluence, Crowd or Bamboo. Even that decision was destined to change.

So, amidst all this uncertainty, which ironically couldn’t be cleared up until we had something to deploy, what was a developer to do? The same thing we always do: abstract, abstract, abstract. Atmosphere was built to solve exactly this problem. By using its APIs, you can run on any of the myriad containers with asynchronous IO support without worrying about the details. Even better, it integrates well with Jersey, making it easier still to use. And to top it all off, the developers are quick to patch and release bug fixes as they are found. We certainly couldn’t have finished this project as quickly as we did without Atmosphere, so I want to give a big thanks to Jean-Francois for all his help.
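
To give a flavor of what that abstraction looks like, here is a minimal pub/sub resource in the style of Atmosphere’s Jersey samples from that era. Treat it as a sketch: annotation attributes and constructor signatures shifted between early Atmosphere releases, and this isn’t Studio’s production code:

    import javax.ws.rs.FormParam;
    import javax.ws.rs.GET;
    import javax.ws.rs.POST;
    import javax.ws.rs.Path;
    import javax.ws.rs.PathParam;

    import org.atmosphere.annotation.Broadcast;
    import org.atmosphere.annotation.Suspend;
    import org.atmosphere.cpr.Broadcaster;
    import org.atmosphere.jersey.Broadcastable;

    @Path("/pubsub/{topic}")
    public class PubSubResource {
        // Atmosphere's Jersey module injects a Broadcaster named by the path segment.
        private @PathParam("topic") Broadcaster topic;

        // Suspends the GET until something is broadcast; resuming on broadcast
        // gives long-polling semantics (one message per request).
        @GET
        @Suspend(resumeOnBroadcast = true)
        public Broadcastable subscribe() {
            return new Broadcastable(topic);
        }

        // Delivers the message to every connection suspended on this topic,
        // no matter which container's async IO is underneath.
        @POST
        @Broadcast
        public Broadcastable publish(@FormParam("message") String message) {
            return new Broadcastable(message, topic);
        }
    }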

Early days

And so we set out to develop the chat backend as a separate web application, to be run on an as-yet-undetermined container, using Atmosphere to abstract away the details of asynchronous IO. In pretty short order we had GTalk logins, buddy-list fetching, message sending and receiving, and presence updates working.
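
GTalk is just XMPP under the hood. As a hedged illustration (the library choice here is ours for this post, not a dump of Studio’s source, and the credentials are placeholders), logging in and grabbing the buddy list with Smack, a popular Java XMPP library of the time, looks roughly like this:

    import org.jivesoftware.smack.ConnectionConfiguration;
    import org.jivesoftware.smack.Roster;
    import org.jivesoftware.smack.XMPPConnection;
    import org.jivesoftware.smack.XMPPException;

    public class GTalkSession {
        public static void main(String[] args) throws XMPPException {
            // GTalk speaks standard XMPP on talk.google.com:5222.
            ConnectionConfiguration config =
                    new ConnectionConfiguration("talk.google.com", 5222, "gmail.com");
            XMPPConnection connection = new XMPPConnection(config);
            connection.connect();
            connection.login("user@gmail.com", "password"); // placeholders

            // The buddy list is the XMPP "roster"; presence updates arrive
            // over the same connection once we're logged in.
            Roster roster = connection.getRoster();
            System.out.println("Buddies: " + roster.getEntryCount());

            connection.disconnect();
        }
    }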

We naively did this using the HTTP streaming flavor of Comet. It worked well for a while, until we started running more tests and adding error handling. At that point, two nasty problems reared their ugly heads: 1) we couldn’t really do any error handling if the connection failed, and 2) if the initial connection succeeded but later timed out, there wasn’t much we could do to detect it. Both stemmed from the fact that the connection with the browser was maintained in an iframe. When the server needs to send a message to the browser, it simply spits out a <script> tag containing a call to a JavaScript function to handle it. There is no way to determine the HTTP response status code of an iframe, so when a connection problem occurred, there was little we could do to handle it. For the second problem, we might have added an onload event handler that removed the iframe and created a new one to resume receiving notifications. Combined, though, the two problems made us realize that HTTP streaming just isn’t a great solution, so we set out to implement long polling instead.
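
For the record, the streaming transport we just abandoned works like this: the page opens a hidden iframe pointed at a never-finishing response, and the server dribbles script tags down it. A minimal sketch of the server side (parent.handleEvent is a made-up callback name for illustration):

    import java.io.IOException;
    import java.io.PrintWriter;

    import javax.servlet.http.HttpServletResponse;

    public class IframeStreamer {
        // Writes one event down the still-open response. The hidden iframe
        // executes the <script> tag as soon as it arrives, handing the payload
        // to a function on the parent page.
        void pushEvent(HttpServletResponse resp, String json) throws IOException {
            PrintWriter out = resp.getWriter();
            out.print("<script>parent.handleEvent(" + json + ");</script>");
            out.flush(); // flush immediately; the response never completes
        }
    }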

Looking back, while HTTP streaming wasn’t the best solution for us, it wasn’t a bad way to start. We avoided issues, like storing notifications between polling requests, that we would otherwise have had to deal with, and we were able to get UI development going against real, live data while the migration to long polling was underway. It also gave us confidence that, yes, this really would work and we could pull it off in the short time frame we’d been given. And because it left us with a good foundation to build on, the migration to long polling wasn’t all that difficult or disruptive.

Not too surprisingly, the application tabs were the easiest bits to get working. For data from Jira, Confluence, Crucible and Bamboo, we just had to request feeds for the data we wanted, then parse and display them. Since all Jira Studio apps are served from http://yourcompany.jira.com, we didn’t have to worry about same origin policy issues, and SSO took care of our authentication needs. The Google Apps data was harder to come by, because there we did need to worry about the same origin policy and authentication.
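
Because the Atlassian apps expose this activity as Atom feeds, “parse and display” really is about that simple. A sketch using only the JDK (the feed URL is illustrative, and the SSO session handling is omitted):

    import java.net.URL;

    import javax.xml.parsers.DocumentBuilderFactory;

    import org.w3c.dom.Document;
    import org.w3c.dom.NodeList;

    public class FeedReader {
        public static void main(String[] args) throws Exception {
            // Illustrative URL; a real request would carry the user's SSO session.
            URL feedUrl = new URL("https://yourcompany.jira.com/activity");

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(feedUrl.openStream());

            // Print every Atom <title> in the feed, entries included.
            NodeList titles = doc.getElementsByTagNameNS(
                    "http://www.w3.org/2005/Atom", "title");
            for (int i = 0; i < titles.getLength(); i++) {
                System.out.println(titles.item(i).getTextContent());
            }
        }
    }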

[Screenshot: the unread Gmail tab]

As luck would have it, the other features being built as part of our integration work took care of all of that very nicely. To let a user see a list of Google Docs in Confluence, or pick a document to attach to an issue, the “Cool Shit” team had developed plugins for the Atlassian applications that act as proxies to the actual GData feeds. Authentication is handled with OAuth, which is set up automatically between your Jira Studio instance and Google when you sign up. With those feeds available to the Activity Bar, it was a simple matter of fetching and parsing them to display your Google Docs, upcoming Google Calendar events, and unread messages in your inbox.
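
As a rough illustration of what such a proxy does on each request, here is a sketch using the gdata-java-client with two-legged OAuth. The consumer key, secret, requestor id and feed URL are placeholders, and the real plugins surely differ in the details:

    import java.net.URL;

    import com.google.gdata.client.authn.oauth.GoogleOAuthParameters;
    import com.google.gdata.client.authn.oauth.OAuthHmacSha1Signer;
    import com.google.gdata.client.docs.DocsService;
    import com.google.gdata.data.docs.DocumentListEntry;
    import com.google.gdata.data.docs.DocumentListFeed;

    public class DocsFeedProxy {
        public static void main(String[] args) throws Exception {
            // The key/secret pair is provisioned between Google and the Studio
            // instance at signup; both values here are placeholders.
            GoogleOAuthParameters oauth = new GoogleOAuthParameters();
            oauth.setOAuthConsumerKey("yourcompany.jira.com");
            oauth.setOAuthConsumerSecret("consumer-secret");

            DocsService service = new DocsService("studio-activity-bar");
            service.setOAuthCredentials(oauth, new OAuthHmacSha1Signer());

            // Two-legged OAuth identifies the end user via xoauth_requestor_id.
            URL feedUrl = new URL("https://docs.google.com/feeds/default/private/full"
                    + "?xoauth_requestor_id=user@yourcompany.com");
            DocumentListFeed feed = service.getFeed(feedUrl, DocumentListFeed.class);
            for (DocumentListEntry entry : feed.getEntries()) {
                System.out.println(entry.getTitle().getPlainText());
            }
        }
    }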

Lessons learned

  1. Use abstraction to delay deployment decisions as long as possible. This is a good general rule, but it came in especially handy in our case, even though Studio is a hosted product and we should, in theory, have complete control over the deployment environment. The truth is, even when you host a service yourself, many factors come into play during the initial deployment and as you try to lower costs. Using abstraction to keep your options open is very important!
  2. HTTP streaming isn’t the greatest approach to Comet, but it is really easy to get going quickly. Even though we eventually switched to long polling, that change wound up being fairly easy to accommodate on the front-end. Getting something working early let the front-end developers get cracking right away, and that was hugely important.
  3. Watch out for the same origin policy when mashing data together! Fortunately, Google’s use of open standards like OAuth made it easy to create a proxy for retrieving the data.

Now it was just a matter of rolling it out, right? Right? Not quite. Stay tuned for part 2 of this blog post.
