When I first started working at Atlassian, my team, Jira Studio, was working on load testing, with the goal of identifying and fixing performance issues. We used JMeter to simulate a high load, and used its graphing capabilities to report on the results. This worked really well.
Having achieved our initial goals, we then wanted to set up a more permanent load test environment, running nightly load tests against a load test server. We got this set up, but found that it wasn't that useful. Each night we were generating reports, but we never checked them. And even if we had, the reports themselves didn't show whether performance was getting better or worse: each report covered a single build, with no graphs of data across builds.
I had a look around at a few tools that could aggregate JMeter results from multiple builds, but I wasn't that impressed by what I saw. There was Chronos, but its reports were high level, and it had no way of notifying you if performance had degraded between builds. This is how I came to write the Bamboo JMeter Aggregator plugin: it started as a ShipIt project, became a 20% project, and is now an important part of performance testing at Atlassian.
Why a Bamboo plugin?
Bamboo is the platform we run our performance tests from. Out of the box, it stores build results, including JUnit test results, and allows reporting on these over time. This is exactly what I wanted to do, but JMeter test data is very different from JUnit test data: a lot of processing is required to make it useful, and the types of graphs I wanted to see from JMeter test results weren't quite what Bamboo provided out of the box. Nevertheless, it made sense that JMeter build results should be stored and reported on by Bamboo. Bamboo provides a rich plugin API, allowing seamless integration of new features as if they were part of the main product. More details can be found in the Bamboo Plugin Guide.
If you’re unfamiliar with JMeter, I suggest you read George Barnett’s blog post on Performance Testing With JMeter. For now, I’ll summarise the concepts of load testing and JMeter.
A sampler is something that takes samples; for web application performance testing, that usually means making HTTP requests and timing how long they take. An example sampler might be one that issues a search request on an application. Assertions can also be run against the returned data to ensure the sample is valid.
A sampler will output many samples, each containing information such as how long the sample took, whether its assertions passed or failed, and what the HTTP status code was.
The first thing the plugin does is to aggregate build results. JMeter outputs a log file that contains details of every sample it takes, including what time the sample was made, how long it took to return, what the return code was, and whether any assertions on the sample passed or failed. A sample in JMeter may be an HTTP request, an SQL query, a JMS message, or many other things. In our case it’s an HTTP request.
The JMeter Aggregator plugin takes these log files when the build completes and calculates a number of different metrics on them, including mean, median, maximum and minimum sample times, throughput, percentage of successful samples, total number of samples, and so on. It calculates these metrics per sampler, as well as across all samplers. It then stores this aggregated data for future use.
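To make the aggregation step concrete, here is a minimal sketch of the kind of per-sampler roll-up described above. It assumes a JMeter log written in CSV format with `timeStamp`, `elapsed`, `label` and `success` columns (JMeter's default CSV field names); the plugin's actual implementation and output format will differ.

```python
import csv
import statistics
from collections import defaultdict

def aggregate(jtl_path):
    """Aggregate a JMeter CSV log into per-sampler metrics.

    Assumes columns named timeStamp, elapsed, label and success,
    as in JMeter's default CSV output. Illustrative only.
    """
    times = defaultdict(list)      # sampler label -> elapsed times (ms)
    successes = defaultdict(int)   # sampler label -> successful sample count
    with open(jtl_path, newline="") as f:
        for row in csv.DictReader(f):
            label = row["label"]
            times[label].append(int(row["elapsed"]))
            if row["success"] == "true":
                successes[label] += 1
    metrics = {}
    for label, elapsed in times.items():
        metrics[label] = {
            "samples": len(elapsed),
            "mean": statistics.mean(elapsed),
            "median": statistics.median(elapsed),
            "min": min(elapsed),
            "max": max(elapsed),
            "success_pct": 100.0 * successes[label] / len(elapsed),
        }
    return metrics
```

Computing the metrics per label as well as over all rows combined gives both the per-sampler and across-all-samplers views mentioned above.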
Having aggregated the build data, the next thing the plugin does is validate it. One of the difficulties of load tests is that their success, or lack thereof, is very subjective. Judgments on whether performance is good or bad are most often made by looking at graphs of builds over time. Rarely are the raw numbers themselves used, as some requests are inherently slower or faster than others. While it may be perfectly acceptable for a search request on an application to take an average of 2 seconds, it might not be acceptable for the login to take 2 seconds.
Setting up rules for failing builds based on these raw numbers is not only tedious, it will likely lead to false failures. This is where the JMeter Aggregator plugin can help. Rather than only letting you set up assertions on the current value of the mean, median, and so on, it also lets you configure assertions on the change in value. You can configure blanket assertions that say, for example, fail the build if the mean time of any sampler increases by more than 20%. You might single out other requests: viewing a simple content page, say, should always have a stable response time, so if it deviates by more than 300ms, the build should fail. And of course, you can still configure simple assertions on the current value, such as failing the build if any sample takes over 10 seconds.
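The logic behind a change-based assertion can be sketched as a comparison between the previous build's aggregated metrics and the current build's. The function and threshold names below are illustrative, not the plugin's actual configuration format.

```python
def check_assertions(previous, current, max_increase_pct=20.0, max_sample_ms=10_000):
    """Return a list of failure messages comparing two builds' metrics.

    `previous` and `current` map sampler label -> {"mean": ..., "max": ...},
    i.e. the kind of per-sampler aggregated data described above.
    Names and thresholds are illustrative assumptions.
    """
    failures = []
    for label, stats in current.items():
        # Change-based assertion: mean must not grow more than N% since last build.
        if label in previous and previous[label]["mean"] > 0:
            change = 100.0 * (stats["mean"] - previous[label]["mean"]) / previous[label]["mean"]
            if change > max_increase_pct:
                failures.append(
                    f"{label}: mean time up {change:.1f}% (limit {max_increase_pct}%)"
                )
        # Absolute assertion: no single sample may exceed a hard ceiling.
        if stats["max"] > max_sample_ms:
            failures.append(
                f"{label}: slowest sample {stats['max']}ms exceeds {max_sample_ms}ms"
            )
    return failures
```

Comparing against the previous build rather than a fixed number is what keeps the assertions meaningful across samplers with very different baseline response times.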
Running assertions on builds gives developers quick feedback on whether their changes have introduced significant performance problems. This helps identify features with performance problems early, so they can be rethought or redesigned if necessary before too much effort is put into them.
Reporting is by far the most valuable feature of load tests. Performance problems can show themselves gradually over time, eluding assertions. This is where it is essential to be able to visualise performance data. The Bamboo JMeter Aggregator plugin gives you power over exactly what data is plotted in the reports it produces. To show this power, I will set up a little scenario. Let's say I have a website that allows creating and editing pages, commenting on pages, and searching pages, and also provides an RPC interface for searching content. I have written a load test in JMeter to test each of these functions under a normal load. After making a number of changes, the load test build fails, saying that the search function's performance has degraded. So I have a look at the JMeter Aggregator's report to see what happened:
Clearly there is a big problem here, and it’s happening on every request. I’ve graphed the mean, median, 90% line and largest sample time all on the same graph, and they’ve all jumped, so I know that it’s not just a few requests taking forever that are dragging the mean up. At this point, I might assume that this problem is to do with the search code itself, maybe I’m doing something very expensive here. But before I jump to conclusions, let’s see if the RPC search is also affected:
The RPC search actually decreased in time! So, the search code itself must be fine, maybe it’s something to do with the web interface? To check, I’ll graph other web interface functions against the search function:
It's not big, but they all seem to have degraded in performance a little. My conclusion is that something in the web interface is likely slowing all my page views down, and that it affects the search page much more than any other page.
Something that JMeter doesn't measure, but that is also very important to know during load testing, is machine load. The JMeter Aggregator plugin will also aggregate arbitrary samples from a CSV file. For example, while the load tests are running, you may run a background script that takes a reading of CPU usage, memory usage, IO activity and so on every 10 seconds, and outputs it to a CSV file, with one column per measurement and one row per reading.
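A minimal sketch of such a background sampler follows. The column names and the choice of metric are illustrative assumptions, not the plugin's required format; check the plugin documentation for the exact CSV layout it expects.

```python
import csv
import os
import time

def sample_machine_load(path, readings=3, interval=10):
    """Append system load readings to a CSV file, one row per interval.

    Illustrative only: column names are made up, and os.getloadavg()
    is Unix-only; substitute whatever measurements you need.
    """
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["load1", "load5", "load15"])  # hypothetical column names
        for _ in range(readings):
            writer.writerow("%.2f" % v for v in os.getloadavg())
            f.flush()  # keep the file readable while the load test runs
            time.sleep(interval)
```

Run with a long interval in the background for the duration of the load test, then hand the resulting file to the plugin along with the JMeter logs.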
The JMeter Aggregator plugin can take and aggregate this data as if each column of the CSV file were an HTTP sampler.
The Bamboo JMeter Aggregator plugin is an ideal tool for getting the most value out of the data produced by your performance builds. It aggregates the data into meaningful metrics, gives developers fast feedback on performance problems based on changes in metric values, and allows powerful reports to be generated for analysing where problems lie. For downloads, installation instructions and documentation, please visit the Bamboo JMeter Aggregator Plugin's page in the Bamboo plugin library.