How to Report Your Performance Test Results Like a Pro
Let's start with a key question: What is a performance test?
Performance tests try to reduce the risks of downtime or outages on multi-user systems by conducting experiments that use load to reveal limitations and errors in the system. Testing usually involves assessing the performance and capacity of systems that were expensive and time-consuming to build.
Very few software projects -- in my experience, none -- are delivered early, so significant time pressure is usually in play. The findings of a performance test inform tactical and strategic decisions with even more at stake: the wrong decision to go live with a website or application could damage the company's financial results, brand, or even viability.
Performance testing is critical
In a short period of time, we need to gather information that helps stakeholders make decisions affecting the business. As performance testers, we have a responsibility to report reliable information about the systems we test.
All of the steps in the overall performance testing process matter to successful projects and good decisions. These steps include (but aren’t limited to):
Conducting discovery
Modeling outcomes
Developing scripts
Executing tests
Interpreting these results and reporting them properly is where the value of an experienced performance engineer is proven.
Data needs analysis to become intelligence
Most load testing tools have some graphing capability, but don't mistake graphs for reports, and don't send a canned report without properly analyzing the results. Graphs are just a tool; used well, they help stakeholders visualize and consume actionable information.
The performance tester should form hypotheses, draw tentative conclusions, determine what information is needed to confirm or disprove those conclusions, and prepare key visualizations that give insight into system performance and bottlenecks and support the narrative of the report.
This requires an understanding of the following:
Architecture
Hard and soft resources
Garbage collection (GC) algorithms
Database performance
Message bus characteristics
Other components of complex systems
Understanding that a system slows down after a certain load is surpassed is valuable information. Understanding the limiting resource (i.e., the reason the system slows down) is actionable intelligence. Developing the ability to recognize these patterns can take years, and it is an ongoing, ever-changing process.
Socio-political considerations
A large part of reporting is knowing how to talk to stakeholders. Consider the following:
Who needs to know these results?
What do they need to know?
How do they want to be told?
How can we form and share the narrative so that everyone on the team can make good decisions that will help us all succeed?
It is our job to guide stakeholders by 1) revealing information, 2) identifying actionable items, and 3) turning our findings into a solid plan.
How to grow the skills
The good news is that you don’t have to do this all by yourself. The subject matter experts you are working with -- developers, operations, database administrators, help desk staff, business stakeholders, and your other teammates -- all have information that can help you unlock the full value of a performance test.
This is a complex process that can be difficult to teach. To address this challenge, my former consulting partner and mentor, Dan Downing, came up with a six-step process called CAVIAR, which stands for:
Collecting
Aggregating
Visualizing
Interpreting
Assessing
Reporting
Let’s review each step.
1. Collecting: Gathering measurements from tests and weighing their validity.
Are there errors? What kind, and when did they occur? What are the patterns? Can you get error logs from the application? One important component of collecting is granularity. Measurements taken every few seconds help you spot trends and transient conditions.
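As a minimal sketch of what fine-grained collection can look like, the snippet below rolls raw samples up into five-second intervals so short-lived spikes and error bursts stay visible. The input format and column names (timestamp, elapsed_ms, success) are assumptions about a generic load-tool export, not any specific product's schema.

```python
# Minimal sketch: bucket raw load-test samples into 5-second intervals so
# transient spikes and error bursts remain visible in the collected data.
# Assumed columns: timestamp (ISO datetime), elapsed_ms (float),
# success (1 for a passed request, 0 for a failed one).
import pandas as pd

def collect_intervals(raw_csv: str, interval: str = "5s") -> pd.DataFrame:
    samples = pd.read_csv(raw_csv, parse_dates=["timestamp"])
    buckets = samples.set_index("timestamp").resample(interval)
    return pd.DataFrame({
        "requests": buckets["elapsed_ms"].count(),                        # throughput per interval
        "avg_ms": buckets["elapsed_ms"].mean(),                           # average response time
        "errors": buckets["success"].count() - buckets["success"].sum(),  # failed requests
    })

if __name__ == "__main__":
    # Example usage with a hypothetical export file.
    print(collect_intervals("raw_results.csv").head())
```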
2. Aggregating: Summarizing measurements at various levels of granularity to provide both tree and forest views, and comparing them at consistent granularities.
Another component of proper reporting is summarizing results with meaningful statistics -- ranges, variance, percentiles -- and views of the data distribution such as scatter plots. Reporting is more accurate when multiple metrics are used to "triangulate," or confirm, the hypotheses.
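As a sketch of this kind of summary, the code below groups raw samples by transaction and computes the distribution statistics mentioned above. The column names (transaction, elapsed_ms) are assumptions; substitute whatever your tool exports.

```python
# Minimal sketch: per-transaction summary statistics (count, min, avg, max,
# standard deviation, 90th and 95th percentiles) from raw samples.
# Assumed columns: transaction (str), elapsed_ms (float).
import pandas as pd

def aggregate_by_transaction(samples: pd.DataFrame) -> pd.DataFrame:
    grouped = samples.groupby("transaction")["elapsed_ms"]
    return grouped.agg(
        count="count",
        min="min",
        avg="mean",
        max="max",
        std="std",
        p90=lambda s: s.quantile(0.90),
        p95=lambda s: s.quantile(0.95),
    ).round(1)
```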
3. Visualizing: Graphing key indicators to help understand what occurred during the test.
Here are some key comparisons to start with (a sample plot for one of these views is sketched after the list):
Errors vs. load (“results valid?”)
Bandwidth and throughput vs. load (“system bottleneck?”)
Response time vs. load (“how does the system scale?”)
  Business process end-to-end
  Page level (min-avg-max-SD-90th percentile)
System resources (“how’s the infrastructure capacity?”)
  Server CPU vs. load
  Java virtual machine (JVM) heap memory/GC
  Database lock contention, input/output (I/O) latency
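As one concrete example, here is a minimal sketch of the "response time vs. load" view, plotted with matplotlib from an already-aggregated summary. The column names (vusers, p90_ms) are assumptions; adapt them to your own data.

```python
# Minimal sketch: plot 90th-percentile response time against concurrent
# virtual users -- the "response time vs. load" comparison from the list above.
# Assumed columns: vusers (int), p90_ms (float).
import matplotlib.pyplot as plt
import pandas as pd

def plot_response_vs_load(summary: pd.DataFrame, out_png: str = "rt_vs_load.png") -> None:
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.plot(summary["vusers"], summary["p90_ms"], marker="o")
    ax.set_xlabel("Concurrent virtual users")
    ax.set_ylabel("90th percentile response time (ms)")
    ax.set_title("Response time vs. load")
    ax.grid(True, linestyle=":")
    fig.tight_layout()
    fig.savefig(out_png)  # save the figure for inclusion in the report
```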
4. Interpreting: Drawing conclusions from observations and hypotheses.
Interpreting data requires you to evaluate your data and test your hypotheses:
Make objective, quantitative observations from graphs and data -- what can you observe from this data?
Compare your observations -- where are the consistencies and inconsistencies?
Develop hypotheses based on your observations
Test your hypotheses
Turn validated hypotheses into conclusions: “From observations A and B, corroborated by C, I conclude that…”
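To make "test your hypotheses" concrete, here is a minimal sketch of one way to corroborate a hypothesis such as "response time degrades because server CPU saturates": check how strongly the two interval series move together. The inputs and their granularity are assumptions, and correlation is only supporting evidence, not proof of cause.

```python
# Minimal sketch: corroborate a hypothesis by checking whether CPU utilization
# and response time rise and fall together across the same test intervals.
# Both inputs are assumed to be time-indexed pandas Series at the same granularity.
import pandas as pd

def corroborate(cpu_pct: pd.Series, response_ms: pd.Series) -> float:
    # Align the two series on their shared timestamps, drop gaps, and
    # compute the Pearson correlation coefficient (-1.0 to 1.0).
    aligned = pd.concat({"cpu": cpu_pct, "rt": response_ms}, axis=1).dropna()
    return aligned["cpu"].corr(aligned["rt"])
```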
5. Assessing: Checking whether objectives were met and deciding what action to take.
Determine remediation options at the appropriate level (business, middleware, application, infrastructure, network) and retest. At this stage, you should generate recommendations that are specific and actionable at a business or technical level. It never hurts to involve more people to make sure your findings add up. The benefits, costs, and risks of your recommendations should be as transparent as possible.
Remember that a tester illuminates and describes the situation, but the final outcome is up to the judgment of your stakeholders, not you. If you provide good information and well-supported recommendations, you’ve done your job.
6. Reporting: Aggregating and presenting your recommendations, risks, costs, and limitations.
Note the “-ing” here -- reporting is an ongoing process. Whether the report is written, presented as slides, or delivered verbally, it should follow one of the formats below:
Short elevator summary
Three-paragraph email
Narrative
These are the report formats that people prefer to consume, so it's worth spending time getting them right. Writing the report yourself avoids the errors that often creep in when a third party interprets your work.
Good reporting conveys recommendations in stakeholders’ terms. You should identify the audience(s) for the report, and appeal to them in their language. What points do you need to include and what information is required to support them?
Let’s review some best practices for clear reporting.
How to write a test report
A written report is still usually the key deliverable, even if most people won’t read it (and fewer will review the whole report). One way to construct the written report follows:
1. Executive summary (two to three pages maximum)
Address the primary audience, usually executive sponsors and the business
Keep language simple, and avoid acronyms and jargon -- if you need to use complicated terms, define and explain them
Include only pertinent details -- don’t include information that isn’t relevant to the audience
Correlate recommendations to business objectives
Summarize objectives, approach, target load, and acceptance criteria
Cite factual observations
Draw conclusions based on observations
Make actionable recommendations
2. Supporting detail
Detail all of the information necessary for repeating your tests
Include rich technical detail like observations and annotated graphs
Incorporate feedback from technical teams, quoted accurately
Describe test parameters (date/time executed, business processes, load ramp, think times, hardware configuration, software versions/builds, etc.) -- one way to capture these in a repeatable form is sketched after this list
Consider sections for errors, throughput, scalability, and capacity
Insert annotated graphs, observations, conclusions, and recommendations in each section where possible
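As a minimal sketch of capturing those test parameters in a repeatable form, the structure below records them as plain data that can be dropped into the supporting-detail section. Every field name and value here is an illustrative placeholder, not real project data.

```python
# Minimal sketch: record the parameters needed to repeat a test run as
# structured data. All names and values are illustrative placeholders.
test_parameters = {
    "executed_at": "2024-01-01T21:00:00Z",                        # date/time of the run
    "business_processes": ["login", "search", "checkout"],        # scripted workflows
    "load_ramp": {"start_vusers": 0, "peak_vusers": 500, "ramp_minutes": 30},
    "think_time_seconds": {"min": 5, "max": 15},                  # per-step pacing
    "hardware": {"app_servers": 4, "cpus_per_server": 8, "ram_gb": 32},
    "software_build": "release-2.4.1",                            # version under test
}

if __name__ == "__main__":
    import json
    # Emit the parameters as JSON for the report's supporting detail.
    print(json.dumps(test_parameters, indent=2))
```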
3. Associated documents (if appropriate)
Include a full set of graphs, workflow details, scripts, and test assets at the end of the report to document the process
Consider the audience, which figures support the findings, and the observations they can make from the graphs and tables
Remember: Data + Analysis = Intelligence
4. Present the results
The best presentations usually consist of about 5-10 slides with visual aids and little text. Be as clear as possible in your explanation of the recommendations, describing the risks and benefits of each solution.
Key takeaways
This methodology isn’t appropriate for every context. Your project may be small, or you may have a charter to run a single test and report to only a technical audience. There are other reasons to decide to do things differently in your project, and that’s OK. Keep in mind that your expertise as a performance tester is what turns the numbers into actionable information and valuable insights.