Everything you need to know about load testing
Load testing is a dress rehearsal for your site or app. It gives crucial insight into how your system performs under high traffic and heavy usage—and reveals what you need to do to optimize performance. Discover everything you need to know about load testing, including what it is, why it’s important, when and how you should run load tests, and what to do when a load test fails.
Table of contents
- What is load testing?
- Why is load testing important?
- When should you load test?
- Performance testing, load testing & stress testing
- How to do load testing
- What to do if you fail your load test
- Summary: Load test to impress
Load testing involves testing software’s performance with a specified load to understand how it behaves under expected or heavy usage. It’s a type of performance test that investigates the speed, scalability, and reliability of a system.
If you know your system works, load testing asks the question: “does it work at scale?”
Load tests are run on almost all websites and apps you use. They’re used to test:
- Ecommerce stores that get a lot of orders.
- Ticketing sites that handle large ticket sales.
- Government sites that deal with public service interactions.
- Video games with online multiplayer gameplay.
- Banks that handle financial transactions.
Load tests are important for any system that has fluctuating and unpredictable levels of load (traffic, requests, actions). They let businesses and organizations simulate different levels of load in a controlled environment, so they don’t experience crashes, errors, or slowdowns during popular events like Black Friday, tax time, or a major concert onsale.
Load testing isn’t unique to software. It’s needed for many products. Furniture companies load test their tables and chairs to make sure they can bear heavy weight. Batteries are load tested to ensure they can maintain sufficient voltage. Cranes are load tested to make sure they’re safe and reliable while hoisting massive weight.
Just as these physical load tests ensure your chair doesn’t collapse when you sit on it and cranes don’t break while lifting materials, software load tests ensure your site or app doesn’t break when people use it.
A software load test lets you understand how your site or app functions under load. It reveals performance bottlenecks and provides the information you need to optimize your systems and handle major traffic with confidence.
Load testing lets you understand the limitations of your system so you can improve its speed and reliability. It gives insight into performance issues and bottlenecks in a test environment, so your users don’t have to suffer through them in real life.
Load testing is like a dress rehearsal. If you’re expecting heavy traffic for a major sale or product drop, or a high-profile government registration, or ticketing event, you need to know your site and servers can handle it.
92% of IT professionals say that performance and/or load tests are important or very important. And 63% of enterprise IT leaders execute performance tests on all new releases.
If you fail to load test, there’s a good chance your site or system will fail too. And crashes and outages pose major threats to:
- Your bottom line: 91% of enterprises report downtime costs exceeding $300,000 per hour.
- Your customer loyalty: 1 in 3 customers will leave a brand they love after one bad experience.
- Your brand reputation: 32% of IT professionals say their brand reputation was damaged by outages.
- Your job: 53% of IT professionals say their company will experience an outage so bad someone loses their job.
RELATED: The Cost of Downtime: Outages, Brownouts & Your Bottom Line
Load testing is an essential tool to prepare for and mitigate these risks of downtime. It helps you avoid making mistakes like the one Coinbase made when they spent $10 million on a Super Bowl ad only to send customers to a site that crashed under the resulting traffic.
Functionality is the most important element of your product or service—if something doesn’t work then it doesn’t matter how scalable or reliable it is.
But after functionality, reliability is the most crucial component of great user experience.
And load testing—and performance testing generally—is essential for ensuring reliability (as well as functionality at scale).
Load testing for SLAs
Load testing is often also important to fulfill the requirements of Service Level Agreements (SLAs) between service providers and clients. If you’ve guaranteed a certain level of performance in your SLAs, load testing helps you ensure you deliver on that guarantee (or let you check your service provider delivers on their guarantee).
“Why is load testing important?” summary: Load testing is crucial to understanding how your site or app will perform while in use. Load tests reveal the bottlenecks and performance issues you need to fix to prevent costly website crashes, errors, and system slowdowns. Load testing is also important for meeting service level agreements, and for delivering a fast and reliable service to the public.
Load tests should be run regularly as a proactive measure, but they’re particularly important ahead of high-traffic events or after changes are made to your application architecture, system dependencies, or code algorithms.
57% of organizations run performance and/or load tests at every sprint, and 95% commit to testing annually. Just 5% of IT professionals “never” run load or performance tests.
Companies and organizations typically run load tests ahead of events and activities like:
- The launch of a new product or service
- A major flash sale like Black Friday
- Ticket onsales for a hyped concert or event
- High-profile public sector registrations
- A large marketing campaign or PR appearance
If you’re expecting traffic to rise significantly above normal levels for any reason, you should load test to make sure this traffic doesn’t bring your service down.
You should also run load tests when any major changes are made to the application architecture, system dependencies, or code algorithms. Small changes can have a massive impact on performance—and load tests let you measure and understand this impact.
Even if your site or application doesn’t have scheduled high traffic events like a ticket onsale or major sale, load tests are crucial to ensuring you’re prepared for the unexpected.
Unexpected traffic spikes are, by nature, unexpected. But we’ve seen a few common causes come up again and again:
- PR appearances or features in major news outlets.
- A celebrity shouting out your brand or wearing your items.
- A viral social media post or making it to the front page of Reddit (called the “Reddit hug of death” or the “Slashdot effect”).
When France’s 3rd largest online marketplace, Rakuten, for example, appeared on the national news, their traffic spiked 819% in just 2 minutes.
“It wasn’t even an ad or offer or anything like that. It just said who our spokesperson worked for and mentioned our brand, and immediately we saw a spike,” Thibaud Simond, Rakuten France’s Infrastructure manager told us.
RELATED: A TV spot spiked Rakuten France’s traffic 819% in 2 minutes. Here’s how it went.
The load testing process needs to start early. It requires a lot of planning. And you don’t know what you’re going to find, which means you don’t know how long it will take to implement the necessary changes.
38% of IT professionals say that time is the biggest impediment to running load tests.
This is because load testing isn’t a one and done process—it’s iterative. Your first load test might expose a bottleneck. Then after you fix it, the second load test exposes another. Then after you fix it, your third load test—you get the point.
It often takes many iterations to get to the desired capacity or throughput, so the more time you have, the better.
Starting early is also important because many large retailers and organizations put a code freeze in place ahead of big sales days like Black Friday. If you leave things too late, you might not be able to act on the findings of your test.
Reliability, availability, and performance under load are essential to us at Queue-it, so we perform load tests every time:
- Changes are made to design, architecture, or algorithms.
- We set up new infrastructure such as a new region.
- We have a client holding an event where traffic is expected to be significantly greater than what we typically see.
We’re also working on developing scheduled automated load testing that generates reports so we can monitor and track performance over time.
“When should you load test?” summary: There are three times you should run load tests: (1) ahead of a high traffic event; (2) if changes are made to your application; (3) as part of regular performance testing maintenance. If you’re running a load test for a scheduled event, plan the test well in advance—the process is iterative and can take many rounds of testing and optimization.
The software testing ecosystem can be difficult to navigate. There are dozens of types of performance tests that serve similar, but distinct purposes.
Here’s a simple breakdown of performance testing, load testing, and stress testing.
RELATED: Load Testing vs. Stress Testing: Key Differences, Definitions & Examples
Performance testing is the umbrella term for all non-functional tests on the performance of applications. These are tests that look at speed, scalability, and reliability.
When you’re testing a site, app, or piece of software, there are functional and non-functional tests.
In very simple terms, functional tests ask: “does it work?”
Non-functional tests, on the other hand, ask: “How well does it work? How fast and reliable is it? Does it work at scale?”
Load testing is a type of performance testing that’s essential for determining how your system performs when experiencing different levels of load.
Load tests typically focus on expected load. This means if you’re expecting 10,000 concurrent users, your test will populate your site with 10,000 users—or maybe 11,000 to give some breathing room—to see how the site manages it.
While load testing is technically a non-functional test, it can reveal both functional (e.g., errors or crashes) and non-functional problems (e.g., slow response times or cart timeouts). This is because some bugs or errors that impact function only appear under load.
Stress testing is a type of performance testing that focuses not just on the expected load, but on extreme load. It’s about taking your site or app to the breaking point and observing recovery.
This means that even if your site works fine at the expected 10,000 concurrent users, you’ll keep bumping that number up.
With a stress test, you’re looking to determine where things slow down, produce errors, or crash. It’s about determining the level of load that causes failure, and then investigating how well and fast your system can recover (its recoverability). Stress tests can also be used to reveal denials of service, data corruption, and security issues.
To decide between a load test and a stress test you need to consider your objectives:
- Are you preparing for an event with predictable traffic levels? Then load test.
- Are you looking to determine the upper limits of your system and its recoverability? Then stress test.
Both types of tests are important. But load testing serves a more fundamental purpose than stress testing: it helps you stay online during expected traffic, while stress testing focuses on what happens when your system fails because of high traffic.
"Performance vs. load vs. stress testing" summary: Performance tests are a range of tests that investigate the speed and reliability of systems. Load testing is among the most common performance tests, as it lets you test how your system performs under an expected load—to simulate real-world conditions during events like sales or marketing campaigns. Stress testing is a more advanced type of performance test that uncovers the load at which your system breaks and its ability to recover from this load.
Before you even start setting up your load tests, it’s crucial you establish your objectives. You need to be realistic. No website or application can handle unlimited traffic. No site or app can maintain lightning-fast speeds while at or close to capacity. No site or app can autoscale instantly or infinitely.
Your objectives will help determine the metrics you’ll want to track in your load test. Common metrics you might track include:
- Response time: how long it takes for the application to respond to requests.
- Resource utilization: the level of CPU and memory usage.
- Error responses: the number of requests that produce 5xx status codes like 500, 503, etc.
- Throughput: the number of transactions per second.
- Workload: the number of concurrent tasks or users.
Say you decide these are the 5 metrics you want to track. You should set your goals for each metric and how they work together. Ask yourself:
- How many concurrent users should we be able to handle?
- How low should latency be at this number of users?
- How many transactions per second should we be able to handle?
- How many (if any) errors are acceptable?
- How much CPU and memory are we comfortable using?
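To illustrate how these metrics and goals fit together, here is a minimal Python sketch that summarizes a batch of request samples and checks them against targets. The `Sample` record, the `summarize` and `check_goals` helpers, and the goal names are all hypothetical and not taken from any particular load testing tool. Resource utilization is left out because it is normally read from server monitoring rather than from request samples.

```python
from dataclasses import dataclass
from statistics import quantiles


@dataclass
class Sample:
    """One request from a load test run (hypothetical record format)."""
    latency_ms: float
    status: int


def summarize(samples, duration_s, concurrent_users):
    """Reduce raw samples to the metrics discussed above."""
    latencies = [s.latency_ms for s in samples]
    errors = sum(1 for s in samples if 500 <= s.status <= 599)
    return {
        # quantiles(n=20) returns 19 cut points; index 18 is the 95th percentile
        "p95_latency_ms": quantiles(latencies, n=20)[18],
        "error_rate": errors / len(samples),
        "throughput_rps": len(samples) / duration_s,
        "concurrent_users": concurrent_users,
    }


def check_goals(metrics, goals):
    """Return the names of the goals that were missed (empty list means pass)."""
    missed = []
    if metrics["p95_latency_ms"] > goals["max_p95_latency_ms"]:
        missed.append("latency")
    if metrics["error_rate"] > goals["max_error_rate"]:
        missed.append("errors")
    if metrics["throughput_rps"] < goals["min_throughput_rps"]:
        missed.append("throughput")
    return missed
```

Using a percentile instead of an average for response time is a design choice in this sketch: it keeps a handful of very slow requests from hiding behind many fast ones.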
For some organizations we talk to, these questions can be a bit elaborate and go beyond their needs. For a standard small-to-medium sized ecommerce brand, you might only really need two questions answered:
- Are we processing orders at the rate we want to?
- Are web pages loading fast enough to not impact the customer experience?
You want to be generous with your estimates and prepare for a worst-case scenario. But you also need to be realistic. Being able to handle massive amounts of users at lightning-fast speeds takes a lot of work, costs a lot of money, and is sometimes impossible.
Another way you can simplify evaluating your load tests is by creating a benchmark around availability. The industry standard is “four nines” (meaning the system availability cannot be lower than 99.99%), but this could be anything from two nines (99%) to six nines (99.9999%), depending on your goals.
Evaluating load tests in this way lets you summarize the system behavior into one number and verify it against your benchmark. It turns your load test into a simpler pass/fail process that looks at availability.
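The pass/fail check described above could look something like the following Python sketch. It assumes availability is measured as the share of requests that succeed during the test, which is only one possible definition, and the function names are made up for illustration.

```python
def availability(successful_requests, total_requests):
    """Fraction of requests served successfully during the test."""
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    return successful_requests / total_requests


def nines_target(nines):
    """Availability target for a number of nines, e.g. 4 gives roughly 0.9999."""
    return 1 - 10 ** -nines


def passes_benchmark(successful_requests, total_requests, nines=4):
    """Pass/fail: did the run meet the availability benchmark?"""
    # Compare failure counts as integers to avoid float rounding at the boundary.
    allowed_failures = total_requests // (10 ** nines)
    failures = total_requests - successful_requests
    return failures <= allowed_failures
```

For example, a run of 1,000,000 requests passes a four nines benchmark with up to 100 failed requests.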
It’s important to note that increasing your capacity may start off cheaper and easier. But as the number of concurrent users you want to handle goes up, so will the costs.
Fixing a bottleneck early in the process might just be adjusting an algorithm or adding a bigger database server. But as your traffic gets larger, so do the challenges. You may need to change your architecture, replace or change your data models, or even change core business logics and processes.
The exponential costs of scaling
Being clear about your objectives is important in determining which type of performance tests you should run:
- If you want to see how your system scales & recovers during sudden fluctuations in traffic, you need a spike test.
- If you’re looking to establish the breaking point of your system, you need a stress test.
- If you want to handle large amounts of users for an extended period, you need a soak test.
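One way to picture the difference between these test types is as the shape of the load over time. The Python sketch below generates a target number of concurrent users for each minute of a test. The function names and parameters are hypothetical, and real tools offer their own ways to configure load shapes.

```python
def spike_profile(base_users, peak_users, minutes, spike_start, spike_length):
    """Baseline load with a sudden jump to peak_users, then back down."""
    return [
        peak_users if spike_start <= m < spike_start + spike_length else base_users
        for m in range(minutes)
    ]


def stress_profile(start_users, step_users, minutes):
    """Keep stepping the load up each minute to look for the breaking point."""
    return [start_users + step_users * m for m in range(minutes)]


def soak_profile(users, minutes):
    """Hold a steady load for an extended period."""
    return [users] * minutes
```

A spike profile built with `spike_profile(100, 1000, 6, 2, 2)` holds 100 users, jumps to 1,000 for two minutes, then drops back to 100.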
The second crucial step before your load tests is that you establish clearly what users are doing on your system.
There’s a big difference between thousands of users sitting on your homepage and thousands of users browsing your site and purchasing items. And the bottlenecks you’re looking to identify in your load tests are often caused by the latter.
Common site bottlenecks include:
- Payment gateways
- Database locks & queries (e.g. logins, updating inventory databases)
- Building cart objects (add to cart or checkout processes, tax calculation services, address autofill)
- Third-party service providers (authentication services, fraud services)
- Site plugins
- Transactions (e.g. updating inventory databases while processing orders and payments).
- Dynamic content (e.g., recommendation engines, dynamic search & filters)
- CPU usage
When Canadian Health Center NLCHI was facing massive traffic for vaccinations, for example, they thought scaling their servers on AWS would help. But, as Andrew Roberts, the Director of Solution Development and Support told us, “this just brought the bottleneck from the frontend to the backend.”
Similarly, British ecommerce brand LeMieux load tested their site ahead of their biggest Black Friday sale yet. They determined exactly what they could handle, then set up a virtual waiting room and set the outflow from waiting room to site at that rate.
But on the morning of their Black Friday sale, their site experienced significant slowdowns. They realized they’d not load tested their site search and filter features, focusing on traffic volume rather than behavior.
Because they had a virtual waiting room in place, they were able to lower the outflow quickly and resolve the issue. But their story provides a valuable lesson for those running tests: focus on user behavior.
"We believed previous problems were caused by volume of people using the site. But it’s so important to understand how they interact with the site and therefore the amount of queries that go back to the servers."
Jodie Bratchell, Ecommerce & Digital Platforms Administrator, LeMieux
RELATED: LeMieux’s Best Black Friday Yet: “Queue-it saved our website”
A flow-based approach to load testing
It’s important you take a realistic, flow-based approach to load testing that replicates real user behavior.
Even well-built load tests can sometimes suffer from their own perfection. This is because load tests typically follow very uniform patterns.
Let’s say you’re running a product launch and you’re expecting about 1,000 users. You run your load test with 100 users per minute for 10 minutes and confirm your site works fine with 1,000 concurrent users.
But what happens if all 1,000 users show up exactly when the product launches?
Your pages slow down, your cart objects break, and payments fail.
This happens because all 1,000 visitors move through the journey at the same time, which risks overwhelming your bottlenecks.
Although you load tested for 1,000 concurrent users, your tests only had 100 users hitting your bottlenecks each minute. Real users don’t follow these uniform patterns.
Making assumptions based on concurrent users or overall capacity is a common load testing mistake. You need to focus on throughput and activity, rather than total capacity.
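The arithmetic behind the product launch example can be sketched in a few lines of Python. The per-minute arrival lists are illustrative, and the assumption that every launch-day user arrives within the first minute is a simplification.

```python
def peak_arrivals_per_minute(arrivals_by_minute):
    """Highest number of new users starting the journey in any one minute."""
    return max(arrivals_by_minute)


# Uniform test from the example above: 100 new users per minute for 10 minutes.
uniform_test = [100] * 10

# Launch-day pattern (assumed): all 1,000 users arrive in the first minute.
launch_burst = [1000] + [0] * 9

# Both patterns involve the same total number of users...
assert sum(uniform_test) == sum(launch_burst) == 1000

# ...but the burst sends ten times as many users into the journey at once.
burst_factor = (
    peak_arrivals_per_minute(launch_burst) / peak_arrivals_per_minute(uniform_test)
)
```

Here `burst_factor` comes out to 10, even though both tests report 1,000 users in total.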
RELATED: How to Avoid the Website Capacity Mistake Everyone Makes
For optimal performance, your site needs to have an average inflow that’s equal to the average outflow.
To take a flow-based approach to load testing, consider and replicate:
- The typical sequence of pages (e.g., home page, sale page, product page, shopping cart, etc.).
- The estimated think time for each action (e.g., how long it takes customers to go from a search page to a product page to their cart).
- The most popular workflows (e.g., whether most people use dynamic search or product filtering).
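As a rough sketch of what a flow-based model looks like, the Python below describes a user journey as a sequence of pages with think times and counts how many views a given page receives per minute. The journey, the think times, and the helper names are invented for illustration, and server response time is ignored to keep the example short.

```python
from collections import Counter

# Hypothetical journey: (page, think time in seconds before the next page).
JOURNEY = [
    ("home", 10),
    ("sale", 20),
    ("product", 30),
    ("cart", 15),
    ("checkout", 0),
]


def page_hits(arrival_times_s, journey=JOURNEY):
    """Yield (timestamp_s, page) for every page view of every user."""
    for start in arrival_times_s:
        t = start
        for page, think_time_s in journey:
            yield t, page
            t += think_time_s


def hits_per_minute(arrival_times_s, page, journey=JOURNEY):
    """Count how many views `page` receives in each minute of the test."""
    counts = Counter()
    for t, hit_page in page_hits(arrival_times_s, journey):
        if hit_page == page:
            counts[int(t // 60)] += 1
    return dict(counts)
```

Feeding in different arrival patterns shows when each step, such as checkout, actually gets hit, which is often later and more concentrated than the arrival pattern alone suggests.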
This flow-based approach also becomes important during the validation of your results. You can use Little’s Law, from queuing theory, to check that the measured throughput is accurate.
As Hassan and Jiang write in A Survey on Load Testing of Large-Scale Software Systems: “If there is a big difference between the calculated and measured throughput, there could be failure in the transactions or load variations (e.g., during warm up or cool down) or load generation errors (e.g., load generation machines cannot keep up with the specified loads).”
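A minimal Python sketch of this validation step is shown below. It uses the common form of Little's Law for closed systems, N = X × (R + Z), where N is the number of concurrent users, X the throughput, R the average response time, and Z the average think time. The function names are hypothetical.

```python
def expected_throughput(concurrent_users, avg_response_s, avg_think_s=0.0):
    """Little's Law rearranged: X = N / (R + Z), in requests per second."""
    return concurrent_users / (avg_response_s + avg_think_s)


def throughput_gap(measured_rps, concurrent_users, avg_response_s, avg_think_s=0.0):
    """Relative difference between measured and calculated throughput."""
    expected = expected_throughput(concurrent_users, avg_response_s, avg_think_s)
    return abs(measured_rps - expected) / expected
```

With 1,000 concurrent users, a 0.5 second response time, and a 4.5 second think time, the calculated throughput is 200 requests per second. If the test only measured 150, the gap is 25%, which would be worth investigating along the lines the quote describes.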
While you can run load tests using your own infrastructure, many companies use a third-party performance testing service.
There are many load testing tools you can use. Some are open-source tools you can use to write and execute tests (e.g. Gatling & JMeter). Others are SaaS tools that offer cloud-based testing and provide more support and features, but come with a cost (e.g. LoadNinja & BlazeMeter).
Even if you use an open-source tool like JMeter to write your test, to run cloud-based load testing you’ll need to deploy your tests to multiple servers, execute these tests simultaneously, and visualize and store the results of these tests. This is possible on your own, but it’s a lot easier using a SaaS product.
To discover how to run less painful, almost free cloud-based load tests, check out Queue-it’s Director of Product Martin Larsen’s guide to getting started with Gatling and RedLine13.
- JMeter: a Java application designed specifically for load testing that lets you test web applications and response times. JMeter is one of the most popular load testing tools.
- Gatling: A powerful load testing tool designed for continuous load testing that integrates with your development pipeline. Gatling load tests are written in Scala code, executed from the console, and results are generated in HTML. Gatling is also available as a SaaS product through Gatling Enterprise.
- The Grinder: A Java load testing framework that makes it easy to run distributed tests using many load generator machines. The Grinder works on any system with a Java API.
- Locust: A Python-based, distributed load testing tool that lets you “swarm” your system with millions of concurrent users. Locust is user-friendly, summarizing the results of your load tests into easy-to-understand dashboards and test reports.
- k6 Cloud: k6 Cloud is the SaaS version of the open-source load testing tool k6. k6 Cloud lets you script by recording user journeys on a browser, test from 21 different geo-locations, and run tests in the cloud with up to 1 million concurrent virtual users. It’s targeted at developers and offers performance monitoring and user-friendly UI.
- LoadNinja: LoadNinja lets you load test using real browsers. Browser-based load testing brings the test closer to replicating real high traffic situations. LoadNinja is simple to use, letting you run scriptless tests based on browser recordings. It also contains a range of tools to help you analyze your performance results.
- BlazeMeter: BlazeMeter is one of the few load testing services built specifically for Apache JMeter. It lets you set up tests quickly, create tests with up to 1 million concurrent users, simulate mobile testing from real devices, and run tests from multiple geo-locations. It offers tests of up to 50 concurrent users for free, but costs quickly rise if you want to test with larger load.
- LoadRunner: LoadRunner is a sophisticated load testing tool that lets you detect performance issues in web applications, ERP software, and legacy system applications. It’s specially designed to help detect bottlenecks before the application implementation phase, letting you evaluate performance before it goes live. It has a patented auto-correlation engine that lets you accurately detect system, end-user, and code-level bottlenecks.
Once you’ve determined your objectives, understand user behavior and the desired throughput, and have chosen your tool, you should be ready to start your load tests.
The steps of a load test are typically as follows:
- Create a dedicated test environment that’s identical to the production environment.
- Determine load testing transactions for an application.
- Execute & monitor your load test.
- Analyze the results & determine the actions needed.
- Optimize performance & remove bottlenecks.
- Rinse & repeat until your load tests meet your objectives.
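To make the "execute & monitor" step more concrete, here is a toy load generator in Python using only the standard library. It is a sketch, not a replacement for tools like JMeter, Gatling, or Locust: `fake_request` is a stand-in for a real HTTP call, and the report fields are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fake_request():
    """Stand-in for a real HTTP call (hypothetical); returns (ok, latency_s)."""
    start = time.perf_counter()
    time.sleep(0.01)  # pretend the server took about 10 ms
    return True, time.perf_counter() - start


def run_load_test(request_fn, concurrent_users, requests_per_user):
    """Run request_fn from many threads at once and collect the results."""
    def user_session(_user_id):
        return [request_fn() for _ in range(requests_per_user)]

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=concurrent_users) as pool:
        sessions = list(pool.map(user_session, range(concurrent_users)))
    elapsed_s = time.perf_counter() - started

    results = [result for session in sessions for result in session]
    failures = sum(1 for ok, _ in results if not ok)
    return {
        "requests": len(results),
        "failures": failures,
        "avg_latency_s": sum(latency for _, latency in results) / len(results),
        "throughput_rps": len(results) / elapsed_s,
    }
```

Calling `run_load_test(fake_request, concurrent_users=5, requests_per_user=3)` issues 15 requests from five simulated users and returns a small report you could compare against your objectives.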
"How to do load testing" summary: The first step of running a load test is to answer the questions "why are we testing?" & "what do we want to achieve?". Once you've set your objectives & benchmarks, the second step is understanding how users are interacting with your site & the expected traffic levels. Third, you need to choose the right load testing tool(s) for the job. Finally, you can run your load tests, evaluate the results & get to work improving performance.
If you fail your load test, the first thing you need to do is analyze the results, identify the bottleneck(s) limiting performance, and remove them. This could involve activities like:
- Fixing bad code
- Toggling performance-intensive features
- Contacting third-party service providers about improving throughput
- Optimizing your CDN or setting up autoscaling
RELATED: Optimize Your Website Performance with These 11 Expert Tips
But even after standard optimization efforts like these, many companies and organizations continue to face problems with handling traffic—either in subsequent load tests, or in real-world high traffic scenarios.
These could be due to internal issues like limited resources or an inability to change business procedures, or external issues like the unpredictability of real-world user behavior or bot attacks.
- Improvements to the system are extremely expensive and difficult to implement.
- There isn’t enough time to implement the necessary improvements in time for the high traffic event.
- Improvements to the system are not practical to implement for infrequent high traffic events.
When Rapha partnered with a hyped streetwear brand for a product drop, for example, they quickly realized their site simply wasn't built for the kind of traffic the drop would attract. Tristan Watson, Rapha's Engineering Manager, told us:
“Our partner only releases using this product drop method. Their whole technology stack is built around selling out - fast. In our first meeting, they disclosed they had seen 100,000 requests every few minutes. Unlike the partner, we work in a more traditional sales model, meaning our infrastructure isn’t designed to deal with the compressed traffic spikes you get during a ‘hyped’ drop.”
- The bottlenecks are due to third-party service providers (payment providers, SaaS solutions, bot or fraud protection) that can’t increase throughput (or can for an exorbitant price).
- There are unpredictable issues that occur during high traffic events (like bot attacks or huge traffic spikes) that aren’t accounted for in load testing.
- The high traffic events involve limited inventory (such as concert tickets or a limited-edition product drop), causing issues around fair allocation of goods and/or overselling.
For DeinDeal, flash sales drove massive spikes in traffic that they simply couldn't predict. Alexandre Branquart, DeinDeal's CTO, told us:
"Not all components of a technical stack can scale automatically, sometimes the tech part of some components cannot react as fast as the traffic is coming. We have campaigns that start at a precise hour…and in less than 10 seconds, you have all the traffic coming at the same time. Driving this kind of autoscaling is not trivial."
If you’re facing internal or external challenges to handling high traffic events, there is another solution. It’s fast and easy to implement, doesn’t require significant optimization efforts, delivers a fair and reliable customer experience, and protects your site or app under any level of load.
It’s called a virtual waiting room.
A virtual waiting room gives you control over what other performance optimization efforts can’t: the traffic.
In high-demand situations, websites or apps with a waiting room redirect online visitors to a customizable waiting room using an HTTP 302 redirect. These visitors are then throttled back to your website or app in a controlled, first-come-first-served order.
If you’re running a sale or registration that starts at a specific time, you can create a scheduled waiting room that holds early visitors on a countdown page and then randomize them just like a raffle, giving everyone an equal chance. Visitors arriving afterwards are added to the end of the queue on a first-in-first-out basis.
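The ordering logic of a scheduled waiting room can be sketched in a few lines of Python. This is only an illustration of the idea described above, not Queue-it's actual implementation, and the `build_queue` function and its parameters are hypothetical.

```python
import random


def build_queue(early_visitors, later_visitors, seed=None):
    """Order visitors for a scheduled waiting room.

    early_visitors: IDs of visitors who arrived before the start time.
    later_visitors: IDs of visitors who arrived after it, in arrival order.
    """
    rng = random.Random(seed)
    randomized = list(early_visitors)
    rng.shuffle(randomized)  # raffle: arrival order before the start is ignored
    return randomized + list(later_visitors)  # later arrivals join the end, FIFO
```

Everyone who arrived early gets a random place at the front of the queue, while anyone arriving after the start keeps their arrival order behind them.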
RELATED: Developers’ Guide to How Queue-it Works
You can control traffic flow down to the minute with a virtual waiting room. Set the traffic outflow from the waiting room to your site to exactly match what your load tests reveal you can handle—whether it’s a hundred, a thousand, or ten thousand users per minute.
And if unexpected bottlenecks appear in the system, you can reduce traffic flow on the fly.
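To show how an outflow rate translates into waiting time, the Python sketch below releases visitors from a first-in-first-out queue at a fixed rate per minute. The function names are hypothetical, and the model ignores visitors who leave the queue early.

```python
import math


def estimated_wait_minutes(position_in_queue, outflow_per_minute):
    """Minute by the end of which the visitor at this 1-based position is let in."""
    if outflow_per_minute <= 0:
        raise ValueError("outflow_per_minute must be positive")
    return math.ceil(position_in_queue / outflow_per_minute)


def release_schedule(queue_length, outflow_per_minute):
    """Number of visitors released in each minute until the queue is empty."""
    if outflow_per_minute <= 0:
        raise ValueError("outflow_per_minute must be positive")
    schedule = []
    remaining = queue_length
    while remaining > 0:
        released = min(outflow_per_minute, remaining)
        schedule.append(released)
        remaining -= released
    return schedule
```

With 2,500 visitors queued and an outflow of 1,000 per minute, the queue drains in three minutes. Lowering the outflow on the fly simply stretches the schedule without sending more traffic to the site than it can handle.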
Because the waiting room page is super lightweight, holding users is less resource-intensive than on typical sites or apps.
There’s no updating of inventory databases, or building of cart objects, or bottlenecks related to third-party plugins and payment gateways. This means Queue-it can handle more traffic than even the biggest ecommerce sites.
On an average day, Queue-it processes 200 million visitors. It runs on robust and highly scalable cloud-hosted infrastructure that handles some of the world’s biggest online events.
RELATED: 1 Million in Queue: How Ingresso.com Delivers on Massive Demand for Rock in Rio
Essentially, it can handle the massive traffic peaks high-visibility events attract, so you and your engineering team don’t have to.
One of our most recent load tests simulated an inflow of 50,000 new users every minute for 60 minutes, reaching a total of 3,000,000 (3 million) concurrent users in the waiting room without any issues.
RELATED: Discover How Load Testing Works with a Virtual Waiting Room [Whitepaper]
- Load testing is the process of testing software performance under a specified load.
- Load testing is important because it helps businesses and organizations understand how their site or app will perform in real-world high-traffic scenarios. It gives the necessary insights to optimize performance and avoid crashes, errors, and slowdowns.
- You should load test well in advance of major events, as well as whenever there are changes made to application architecture, system dependencies, or code algorithms. Enterprise-level businesses should also run load tests as part of their regular proactive monitoring and testing.
- Load testing and stress testing are both different types of performance tests. While load testing looks at how your site or app performs under expected load, stress testing takes your system to the limit to determine its breaking point and recoverability.
- The load testing process involves:
  - Setting your objectives and benchmarks
  - Understanding the user journey
  - Choosing your load testing tools
  - Running your load test
  - Using the results to identify and remove bottlenecks
  - Repeating until performance meets objectives
- If your load tests fail and you can’t get performance to where you need it to be, you can use a virtual waiting room to get control over the flow of traffic and prevent errors, slowdowns, or crashes caused by bottlenecks.