With more than 45,000 members, New York Road Runners (NYRR) is one of the world's leading running organizations. NYRR runs a year-round calendar of more than fifty races, including the famed ING New York City Marathon. Web Performance has provided NYRR's load-testing services for the last three race seasons, each time ensuring the websites and servers designed to handle New York City Marathon traffic are in the same peak condition as the runners themselves.
To help site visitors follow the runners' progress during a race, Web Performance set up the NYRR application to interface with Google Maps. This up-to-the-minute functionality, combined with an enormous amount of traffic over a very short duration, means the New York City Marathon site falls into one of the worst-case scenarios for server load and website performance.
With a goal of 30,000 concurrent users and an average page duration of less than two seconds, the requirements for meeting this load were significant. NYRR's initial estimate called for nine web servers and three database servers, with a single load balancer distributing load across the web servers. After conducting a thorough initial assessment, Web Performance raised that estimate to a more robust and more realistic twelve web servers and four database servers.
The site's lone function is monitoring runners' progress throughout the race. Web Performance began by developing a test case that involved logging onto the site, selecting three runners by bib number, retrieving each runner's current location, then loading the map to show the runners' locations.
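As a rough illustration, the test case described above can be written down as an ordered list of steps. This is only a sketch: the `Step` type, the `build_test_case` helper, and every URL path below are hypothetical names chosen for the example, not the actual NYRR application or the Web Performance tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    path: str  # hypothetical path, not the real NYRR URL scheme


def build_test_case(bib_numbers):
    """Return the ordered steps for one virtual user tracking three runners."""
    if len(bib_numbers) != 3:
        raise ValueError("the described test case selects exactly three runners")

    # 1. Log onto the site.
    steps = [Step("log in", "/login")]

    # 2. Select each runner by bib number.
    for bib in bib_numbers:
        steps.append(Step(f"select runner {bib}", f"/runners/select?bib={bib}"))

    # 3. Retrieve each runner's current location.
    for bib in bib_numbers:
        steps.append(Step(f"get location {bib}", f"/runners/{bib}/location"))

    # 4. Load the map that shows the runners' locations.
    steps.append(Step("load map", "/map"))
    return steps


if __name__ == "__main__":
    for step in build_test_case([101, 202, 303]):
        print(step.name, step.path)
```

Each virtual user in a load test would replay a sequence like this one, so the number of requests per test case (eight in this sketch) directly affects how much traffic a given user count generates.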
The test peak load was set at 30,000 concurrent users, in line with our initial peak load estimates. However, in the first test the site became sluggish at 2,000 users and started dropping connections at 6,000 users. These symptoms pointed to a hardware configuration problem: the load balancer couldn't keep up with the rate of new connections and was either delaying connections or dropping them altogether. Web Performance identified the cause by analyzing server metrics and ruling out the web servers as the source of the problem.
"Web Performance's service was outstanding," says Diego Marin of NYRR. "Their engineers went beyond stress testing to help me pinpoint potential bottlenecks within my environment."
Web Performance also found that some requests were returning 503 Service Unavailable errors, which were resolved by adding three more web servers and an additional database server.
Once those issues were solved, the test case was reconfigured to loop between runner data and map views.
The final successful test lasted 30 minutes and completed approximately 191,000 test cases. During that time, the site sustained peak load for nearly 15 minutes with 30,000 concurrent users at an average page duration of less than four seconds. As a final step, we performed capacity tests on a single web server and a single database server, as well as some exploratory testing on an Amazon Elastic Load Balancer configuration.
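To put those figures in perspective, the average completion rate over the whole run can be worked out from the two reported numbers. The sketch below uses only the approximate 191,000 test cases and the 30-minute duration; the function name is made up for the example, and the result is an average across the entire test, including any ramp-up, not the rate at peak load.

```python
def average_rate_per_second(completed, duration_minutes):
    """Average completions per second over the whole test run."""
    return completed / (duration_minutes * 60)


if __name__ == "__main__":
    completed_test_cases = 191_000  # approximate figure from the final test
    duration_minutes = 30

    rate = average_rate_per_second(completed_test_cases, duration_minutes)
    print(f"{rate:.1f} test cases per second on average")
    print(f"{rate * 60:.0f} test cases per minute on average")
```

That works out to roughly 106 completed test cases per second averaged over the 30 minutes.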
From start to finish, the entire project ran less than two weeks and was completed in plenty of time to successfully track the race season, which began on April 23rd.
"When it came to brainstorming, they were there," says Marin. "When I had to stress test any time of day, they were there as well. They were invaluable."