Category Archives: Testing

Database Migration

MalcolmInTheMiddleGif

(Always One More Thing…)

Who Are We?

The Ad Management team here at Bazaarvoice grew out of an incubator team. The goal of our incubator is to iterate on ideas quickly, producing prototypes and “proof of concept” projects to be built upon if they validate a customer need. The project of interest here generates reports based on aggregations of event data gathered from several other teams at the company. As our project gained traction, it grew in size and scope, eventually leading us to revisit some of the design decisions made in the prototyping phase. Specifically, we found that the original database system we chose, EmoDB, no longer met our needs as our requirements evolved.

Why Migrate?

When this project started, it was a prototype designed to get things rolling as quickly and easily as possible. The initial team chose EmoDB because they were familiar with the in-house technology from their other projects and it fit our initial needs. As the project gained traction and we had more data to operate against, we encountered scalability issues, initially resolved with caching and some refactoring. We found that we were querying EmoDB as if it were a typical relational database, which is not the use case it is designed for. (EmoDB is an eventually consistent JSON blob store with a change-notification databus that spans multiple AWS Availability Zones and Regions. EmoDB powers many of our solutions at Bazaarvoice and is now open source, available at https://github.com/bazaarvoice/emodb.)

We chose to switch to MySQL to leverage the relational data model for rolling up aggregations of the data we collect and calculate. We had previously run into problems retrieving whole documents to perform aggregations on our data, which led us to conclude that a technology optimized for relational models would suit the project much better.

How to Migrate?

Since our project already had trained users by the time we wanted to migrate database systems, we needed to design the migration with a no-downtime approach, “seamlessly” changing out the back-end implementation for our users. We also made the transition configurable, so that rather than making one large switch to master from the new system, we could choose which services were ready to be cut over to the new data back end.
The following image is the design document that describes how we planned our migration. On the left is our original code base, named “legacy”. On the right is the proposed design for the new service stack. In the middle is the “Service Facade”, where we intended to run quality assurance against live data, comparing the legacy technology stack with the new one.

Copy of ad-management migration to MySql based stack - Page 1

How to Maintain Data Consistency?

Depending on the size of the data being diffed and migrated between databases, running the necessary migrations can be expensive. Our solution was a combination of writing specific tasks that backfilled data and directly migrating data sets to the new data source. This allowed us to smoke test that our services were working without spending large amounts of time or money finding bugs along the way. As our confidence in our custom tooling and services grew, we backfilled and migrated larger chunks of data, until we had migrated everything necessary to master from our new service.
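
To make the incremental approach concrete, here is a minimal sketch of what one of those configurable backfill tasks could look like in Scala. Everything here (LegacyEmoStore, MySqlReportStore, BackfillTask, the document shape) is hypothetical and simplified, not our production code.

import scala.concurrent.{ExecutionContext, Future}

// Hypothetical interfaces over the two data stores.
trait LegacyEmoStore {
  def fetchDocuments(dateRange: (String, String)): Future[Seq[Map[String, Any]]]
}

trait MySqlReportStore {
  def upsertRows(rows: Seq[Map[String, Any]]): Future[Int]
}

// A backfill task that copies one bounded slice of data at a time,
// so small ranges can be smoke tested before migrating larger chunks.
class BackfillTask(emo: LegacyEmoStore, mysql: MySqlReportStore)
                  (implicit ec: ExecutionContext) {

  def run(dateRange: (String, String)): Future[Int] =
    for {
      documents <- emo.fetchDocuments(dateRange)   // read the legacy JSON documents for the slice
      written   <- mysql.upsertRows(documents)     // write them as relational rows
    } yield written
}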

What is a Service Facade?

The service facade layer is responsible for executing each operation against both the legacy and new stacks. This is where we placed the diffing logic that compares the results returned from Emo and MySQL for the same operation. The facade then returns data from whichever stack is configured as the source of record, which meant that certain areas of the application could source from MySQL while other areas, where we weren’t yet confident, continued to source from Emo. For example, our CampaignRoiReportBuilderServiceFacade, written in Scala, looked something like this:


class CampaignRoiReportBuilderServiceFacade @Inject()(
  private val campaignRoiReportBuilderServiceLegacy: CampaignRoiReportBuilderServiceLegacy,
  private val campaignRoiReportService: CampaignRoiReportService,
  private val campaignConfig: CampaignConfiguration,
  private val facadeDiffTool: FacadeDiffTool) {
  ...
  def buildReport(...): Future[Option[CampaignRoiReport]] = {
    val roiReportFromEmoDbFuture: Future[Option[CampaignRoiReport]] = campaignRoiReportBuilderServiceLegacy.buildReport(...)
    val roiReportFromMySqlFuture: Future[Option[CampaignRoiReport]] = campaignRoiReportService.buildReport(...)

    // Extract data from the Scala futures
    for {
      roiReportFromEmoDbMaybe <- roiReportFromEmoDbFuture
      roiReportFromMySqlMaybe <- roiReportFromMySqlFuture
    } {
      // Pattern matching to extract values from the Scala Options
      (roiReportFromEmoDbMaybe, roiReportFromMySqlMaybe) match {
        case (None, None) => // this is an impossible case, but listed to avoid a compilation warning
        case (Some(_), None) => LOG.warn("/*Report missing/mismatched data*/")
        case (None, Some(_)) => LOG.warn("/*Report missing/mismatched data*/")
        case (Some(roiReportFromEmo), Some(roiReportFromMySql)) =>
          val mismatches = facadeDiffTool.campaignROIReportLegacyDiff(roiReportFromEmo, roiReportFromMySql)
          if (mismatches.nonEmpty)
            LOG.warn("/*Report missing/mismatched data*/")
      }
    }

    // This is how we configure what source we return to the resource.
    campaignConfig.masteringFrom match {
      case EmoDb =>
        roiReportFromMySqlFuture.onFailure{
          case e:Throwable => LOG.warn("Failed to build an ROI report on the MySql side", e)
        }
        roiReportFromEmoDbFuture
      case MySql =>
        roiReportFromEmoDbFuture.onFailure{
          case e:Throwable => LOG.warn("Failed to build an ROI report on the EmoDb side", e)
        }
        roiReportFromMySqlFuture
    }
  }
}

The original resource classes are modified to call the new facade layer, but no other functionality changes. If constructed properly, the facade layer behaves exactly like the original service, because it mimics the public functions available in the original service class. Each of these duplicated functions calls the corresponding method on the legacy service as well as on the new service. With the responses from both, the facade layer can assess the differences between the two service stacks. To be notified of differences during API usage, we logged them to our log management and monitoring system.
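
As a rough illustration of that wiring, a resource might simply swap its injected dependency from the legacy service to the facade shown above. The resource name, endpoint method, and campaignId parameter below are hypothetical, since the real buildReport arguments are elided.

import javax.inject.Inject
import scala.concurrent.Future

// Hypothetical resource: it used to depend on CampaignRoiReportBuilderServiceLegacy directly,
// and now depends on the facade, which exposes the same public method signature.
class CampaignRoiReportResource @Inject()(
  private val reportBuilder: CampaignRoiReportBuilderServiceFacade) {

  // The handler body is unchanged; only the injected dependency was swapped.
  // The campaignId parameter is illustrative – the real buildReport arguments are elided above.
  def getReport(campaignId: String): Future[Option[CampaignRoiReport]] =
    reportBuilder.buildReport(campaignId)
}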

How Did We Capture Mismatches?

Logging was a large concern of ours. We knew that there would be many differences per call while we were debugging our new service stack; as an example, one call reported 2000+ differences. We wanted to compose all the differences for a call into a single, meaningful log entry. For this, we wrote custom diff tooling that returns the differences in the data as sets of MismatchedField instances.


case class MismatchedField[T](name: String,
                              legacyValue: T,
                              newValue: T)


This generic class holds the value returned from the legacy service (legacyValue) and the value from the new stack’s service (newValue), along with a meaningful tag identifying where the mismatch came from (name). We then compose all mismatches for a given call into a single log entry through our custom diff tool. Every function within the diff tool returns a Set[MismatchedField[Any]], and we combine those sets into a single set of differences, so one log call can write out every difference in one entry.
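
As a simplified sketch of that composition, assuming the MismatchedField case class above: each field-level diff returns a set, the sets are unioned, and a single log call reports them all. The field names and the println stand-in for LOG.warn are illustrative, not our actual diff tool.

// Simplified sketch (not our actual diff tool): each field-level diff returns a
// Set[MismatchedField[Any]], and the sets are unioned so one log call covers the whole report.
object DiffCompositionSketch {

  private def diffField[T](name: String, legacyValue: T, newValue: T): Set[MismatchedField[Any]] =
    if (legacyValue == newValue) Set.empty[MismatchedField[Any]]
    else Set(MismatchedField[Any](name, legacyValue, newValue))

  // Illustrative report fields only; the real diff covers the full report.
  def campaignROIReportLegacyDiff(legacyImpressions: Long, newImpressions: Long,
                                  legacyRevenue: BigDecimal, newRevenue: BigDecimal): Set[MismatchedField[Any]] =
    diffField("impressions", legacyImpressions, newImpressions) ++
      diffField("revenue", legacyRevenue, newRevenue)

  def logMismatches(mismatches: Set[MismatchedField[Any]]): Unit =
    if (mismatches.nonEmpty)
      println(s"Report mismatches: ${mismatches.mkString(", ")}") // one log entry for all differences
}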

An Interesting Finding:

One of the most interesting findings from this migration wasn’t a bug in the new service stack built for the new database, but rather the bugs we found in our original database stack. One take-away was to investigate any mismatch all the way down to the source data. During the code migration we discovered that some of our legacy functionality was written incorrectly. For example, our legacy code stored some aggregated data in sets, unintentionally masking duplicate data. When we re-implemented the same aggregations for the new service stack, they were correctly implemented as lists, producing a mismatch in the data. During our investigation, instead of simply matching the new data to how the legacy service behaved, we went back to the origin data and ran the calculations manually through the Scala REPL. In doing so, we found that the new service was correct and our legacy code was wrong. Fortunately, the bug in the legacy code was a simple fix; once we applied it, the mismatch disappeared.
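
As a contrived Scala REPL illustration of how a Set can silently mask duplicates in an aggregation (the numbers are made up, not our report data):

scala> val lineItemValues = List(25.0, 25.0, 50.0)   // two line items happen to share a value
lineItemValues: List[Double] = List(25.0, 25.0, 50.0)

scala> lineItemValues.sum                             // the List keeps both 25.0 entries
res0: Double = 100.0

scala> lineItemValues.toSet.sum                       // the Set drops the duplicate, skewing the total
res1: Double = 75.0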

Other Take-aways:

An important team take-away was to be very upfront and declarative about the work the migration would require. Our investigation into the migration involved not only setting up a new technology stack for MySQL, but also changing our build tool from Maven to SBT, introducing a Flyway + jOOQ plugin to enforce type safety throughout the migration, designing a new data model (which was ultimately the driving factor for doing the migration in the first place), and upgrading our code to the newest Scala version to leverage all of the previous changes. Ultimately, we severely underestimated, and under-ticketed, the work necessary to start our migration.

It is also important to keep in mind that every team is different and has different needs. When having conversations about database migrations, take the time to do a proper risk assessment for the work ahead. Keep these conversations going during the migration as well. As a team, we ended up prioritizing new feature requests and non-migration related bugs because the migration felt orthogonal to our production environment.

A further take-away is that we could have saved ourselves a lot of time if we had assessed our users more realistically. In retrospect, the users of these reports were internal and would have been more lenient with small service outages, which would have allowed us to leverage our configurable services to migrate much sooner. At the expense of stability, we believe we could have had a quicker migration by forcing ourselves to fix problems forward instead of maintaining our legacy code for as long as we did. Still, most scenarios don’t have this luxury, and we hope the facade-based approach is of help to you.

Cross-Platform Mobile SDK Testing

This Bazaarvoice blog entry is co-authored by Tanvir Pathan as part of a Bazaarvoice internship project on the Bazaarvoice Mobile Team.

Automated testing of native mobile applications has long been a pain point in the world of mobile app development. If you are creating and distributing apps or open-source SDKs across two or more major platforms (Android and iOS in our case), you can easily find yourself duplicating effort to test the same source code and business logic across different technology stacks. For example, experienced developers and testers using Xcode for iOS apps may automate testing with Instruments Automation, whereas Android developers and testers may automate with Espresso or UIAutomator. Developing and maintaining parallel test suites becomes an increasingly expensive proposition as your test coverage grows.

Test strategy can also vary depending on the type of mobile app development your shop pursues: native, hybrid, cross-compile, mobile web. Hence, the selection of test tools will vary depending on how you build and deploy apps.

In this blog post, we’ll detail a novel solution to cross-platform testing of our native SDKs, along with some background of other mobile tool offerings. Our solution focuses on cross-platform open source mobile SDK testing utilizing Cordova to wrap our SDKs in a generic JavaScript interface, and Calabash to drive our cross-platform behavioral tests.

If you want to check out the full solution, the Cordova plugin and a description of how to execute Calabash can be found in our GitHub repository.

How to View Mobile Application Development

Non-mobile app developers typically don’t know the difference between a web app, a native app, and a hybrid app. If you work in any business that supports some kind of mobile solution (and you probably wouldn’t be reading this if you didn’t), it’s really important to understand some fundamental differences. It’s very easy to just throw out the word “mobile” in conversation and not realize there are multiple parts to this elephant!

blindmen

The table below presents four general categories of mobile application development. Keep these categories in mind when talking about “mobile” in general, and don’t fall into the trap of the blind men and the elephant.

Mobile Development Types & Tools

Native Application Development – Developers creating purely native apps write in the language supported by their target platform. For iOS, developers can write in Swift and/or Objective-C, while Android developers can write in Java (and C/C++ for lower-level execution).
Cross-compilation – Developers can also write apps for multiple platforms in a single language, such as JavaScript. Cross-compilation tools take the common language and convert it to the target language of the native platform; while developers aren’t writing in the native language, the tools produce real native apps. Some of the most common tools are Appcelerator (JavaScript), Xamarin (C#), and React Native (JavaScript).
Hybrid Applications – Hybrid apps use web views to display content, typically written with common web technologies such as JavaScript, HTML, and CSS. Hybrid apps typically have a “bridge” that allows JavaScript code to communicate with native libraries to do things like access the camera, location services, or contacts. Cordova (aka PhoneGap) serves as the application container, and developers choose their favorite UI layer to work with Cordova: Ionic, Sencha Touch, jQuery Mobile, Onsen, or Framework7.
Web Applications / Mobile Web – A web application isn’t so much an application as a mobile-optimized web site. You won’t find a web application in the App Store or Google Play; you just fire up your favorite mobile web browser and load the site.

Cross Platform Mobile UI Testing Tools

When developing for native mobile, developers typically write unit tests to check individual pieces of functionality and business logic, and perhaps employ mocking techniques to test networking and user-interface capabilities without the need for a full application. When it comes to full system testing of complete applications and SDKs, however, making the right selection can be a tough process. If cross-platform testing is your objective and you want to write all your tests in one common scripting language, the options narrow quickly.

While there are platform-specific UI automation frameworks for Android (Robotium, UIAutomator) and iOS (Instruments Automation, Keep It Functional, EarlGrey), there are currently only two (that we are aware of) that allow us to test cross-platform with a common script.

Appium – Appium lets developers write tests for applications without having to add any additional code to the applications. It works with native, hybrid, and mobile web applications.
Calabash – Calabash is owned and maintained by Xamarin and provides cross-platform testing for native or hybrid apps. Tests are written in Ruby with Cucumber.

2 Birds, One Stone

kill2birds3

Making the decision to use Cordova and Calabash was fairly easy. First, we already distribute our BVSDK via frameworks and libraries for iOS and Android. Second, we know some of our customers are creating hybrid applications with Cordova. So we immediately thought: what if we could create a test environment that not only tests our SDK deliverables, but also provides our clients with an easy avenue for integrating Bazaarvoice services into their own Cordova app? Win! Win! In addition, because we already use Cucumber extensively at Bazaarvoice, we decided to leverage our strong in-house expertise and use an automation framework that is internally familiar.

Calabash Unit Tests

Another great thing about using Calabash at Bazaarvoice is that we already have an internal framework developed on top of Cucumber. Because Calabash layers on top of Cucumber, the paradigm and philosophy of writing human-readable test cases still applies. The test cases utilize Behavior-Driven Development modeling tools to add meaning to your mobile app testing.

Let’s say you are creating the same app for multiple platforms. Typically, you would have to write completely different sets of code to run similar tests. With Calabash, this is not the case. You write one set of tests, make slight adjustments depending on the platform in question, and you are done! Best of all, in addition to Calabash being free, the test cases are super easy to write as a developer and even easier to read for others who may be interested in checking out the health of the project.

Needless to say, Calabash provides a lot of benefits for cross-platform testing. Let’s take a look at an example test case from the BVSDK Cordova Plugin project, going through a simple scenario based on the following app screenshots from the iOS simulator.

bvsdk_build_simulator bvsdk_running_simulator

Say you wanted to count the number of products that were recommended by our Product Recommendations API. If you were doing it manually you would go through the following steps:

  1. Wait for the app to launch
  2. Make sure you receive a success alert and press OK
  3. Click the Recommendations tab
  4. Then count how many products there are and compare them to what you were expecting

Now how would we code this? A Calabash test has two essential components: a feature file and a Ruby file. The feature file is where you write out the tests, and the Ruby file is used primarily to define custom steps if needed (although most of what you need comes right out of the box). So, returning to our problem, you simply write down those exact steps in the feature file:

Feature File
Feature: BVSDK Demo App
@recommendations_test
  Scenario: As a user, I want to get new recommendations
    Given the app has launched
    Then I should see "BVSDK has been built successfully."
    Then I press the OK button
    Then I press Recommendations tab
    Then I check number of products

mindblown

That’s really all there is to it. Of course, tests can also be written to be more platform-specific when needed.

Entering the Matrix – Travis CI

We use Travis CI for all our public repos at Bazaarvoice. It’s awesome. But we have to support multiple build tools on different virtual machines, and configuring all these build machines with custom tools and build scripts sounds really scary! Freak out!

matrix

The really slick thing about Travis CI is that you can test multiple configurations, variations, permutations, salutations, etc., etc., etc., by building a matrix in Travis’ config file (.travis.yml). For our testing, since Xcode only runs on OS X and is the only way to build for iOS, we must have an OS X image. For the Android Studio and Gradle build tools, we build against Linux. In addition, there’s some common tooling we can install on each build machine. The result is that we can use two different VMs, one per platform, with just one set of tests. Note in the test result below that the build jobs are defined by the environment variables declared in the Travis config file.

travis_matrix

The .travis.yml script looks like this, where we build a matrix with environment variables and platforms:

matrix:
  include:
    - language: android
      env: TO_TEST=ANDROID
      jdk: oraclejdk8
    - os: osx
      osx_image: xcode8
      env: TO_TEST=IOS
  fast_finish: true
android:
  components:
    - platform-tools
    - tools
    - android-23
    - build-tools-23.0.3
    - extra-android-m2repository
    - extra-google-m2repository
    - sys-img-armeabi-v7a-android-19
install:
  - rvm install 2.2.0
  - if [ "$TO_TEST" = "ANDROID" ]; then gem install calabash-android; fi
  - if [ "$TO_TEST" = "IOS" ]; then gem install calabash-cucumber; fi
before_script:
  - if [ "$TO_TEST" = "ANDROID" ]; then chmod 755 createEmulator.sh; fi
  - if [ "$TO_TEST" = "ANDROID" ]; then ./createEmulator.sh; fi
script:
  - if [ "$TO_TEST" = "ANDROID" ]; then chmod 755 androidTest.sh; fi
  - if [ "$TO_TEST" = "ANDROID" ]; then ./androidTest.sh; fi
  - if [ "$TO_TEST" = "IOS" ]; then chmod 755 iosTest.sh; fi
  - if [ "$TO_TEST" = "IOS" ]; then ./iosTest.sh; fi

BVSDK Cordova Plugin Features

So what if I want to try out the BVSDK Cordova plugin? If you want more info or want to check out the source code for the plugin and unit tests, just head over to our Cordova GitHub repo. There’s plenty of info in the README for running the examples and unit tests.

Open Source Contributions

If you are building a Cordova-based application and want to see other things added, just let us know, or better yet, submit a pull request and we’ll be happy to review it!

Quick and Easy Web Service Load Testing with JMeter

What is Load Testing and Why Should I Care?

Somewhere between the disciplines of Dev Operations, Database Management, Software Design and Testing, there’s a Venn diagram where at its crunchy, peanut-butter filled center lies the discipline of performance testing.

crunky
 Herein lies the performant (sic)

Which is to say, professional performance testers have a very specific set of skills (that they have acquired over many years) that make them a nightmare for non-performing web services. This helps them to answer the following question from our clients:

“Just what is the level of performance we can expect from your service?”

What if your project team doesn’t have access to anyone with a background in perf testing? What if you suddenly find yourself needing to answer the above question but don’t know where to begin?

Scary, right? Bazaarvoice’s Shopper Marketing team recently found itself in this exact situation (they weren’t being hunted by Liam Neeson so they had that going for them though).

takentester

The point of this article is not to bootstrap you, the reader, into a perf-testing, dev-ops version of the dude from Taken. Instead, it’s to show how a small team can quickly (and cheaply) performance test a web service in order to ensure it can meet a client’s needs.

Approach:

If you’ve never been involved in any sort of performance testing for web services before, there are essentially two different tracks performance testing can begin from:

Targeted Testing – you have a pre-defined level of service/latency that you need to reach. Generally, you already have an understanding of what your service’s current performance baseline is.

Exploratory Testing – The current performance baseline isn’t really known. Here, the goal is to find out at what point and how quickly performance degrades.

Typically, with small-team-oriented projects, you’ll often find that the team starts with the latter path and then progresses to the former – as was the case with Shopper Marketing’s efforts here.

Our Setup:

We have a RESTful web API (built in Java) which handles requests for shopper profile information stored and sorted across multiple types of data stores. This API services a JavaScript-based front-end widget deployed to a client’s home page to display product data. The client’s home page receives approximately 20 simultaneous unique views per second on average. Can our API service the client at that level?

To test this, we constructed a load test in JMeter that would do the following:

  1. Execute a series of continuous requests to the API that mimic those that will come from the client front end.
  2. Enable the test to run parallel requests to simulate a specific user load
  3. Use real, sampled data so that the requests and responses will be as close to real-world as possible
  4. Measure the response time of the API over time
  5. Measure the number of successful responses vs failures from the API while under load

Why JMeter:

So why are we conducting our test using JMeter? Isn’t that thing two days older than dirt?

_original
Dirt: it’s old.

Well, for one, JMeter is free.

jmeter
JMeter: It’s older and free-er

We could just leave it at that but wait, there’s more:

JMeter is a load testing tool that has been around for many years. It is developed and maintained by the Apache Software Foundation, the same group responsible for the Apache Web Server. It specializes not only in sending and receiving HTTP requests (you know, like a web server) but also comes with monitoring and reporting tools and a wealth of plugins.

50_banner

Sure, there are better (cough! more expensive cough!) tools out there that specialize in load testing but in our case, we needed to determine metrics quickly and with a tool that could be easily set up and re-run numerous times (heavy emphasis on quick and cheap).

Performance testing is a very repetitive process. You will be executing tests, reporting findings, then modifying your service in order to improve performance – followed by a lot of washing, rinsing and repeating to further refine performance. Whatever load testing tool you choose, make sure it is one that allows you to quickly and easily modify, re-run and re-report findings as you will be living your own form of dev-ops Groundhog Day when you take on this endeavor.

Groundhog_Day_(movie_poster)
 A thought provoking documentary on the life and times
 of a performance tester

But enough memes – let’s get down to how we programmed JMeter to show us pretty performance graphs!

Downloading the Application:

You can download JMeter from the Apache Software Foundation here: http://jmeter.apache.org/download_jmeter.cgi

Note – JMeter requires Java 6 or above (you are using Java 8+, right?) and you should have your JAVA_HOME environment variable set up on your local environment (or wherever you plan on deploying and executing your load tests from).

Latest Java download:
http://java.com/en/download/

Setting up your local Java environment:
https://docs.oracle.com/cd/E19182-01/820-7851/inst_cli_jdk_javahome_t/

Once the JMeter binary package is downloaded and unzipped to your test machine, start JMeter by running ./jmeter from the command line within the application’s bin/ directory.

Configuring Jmeter:

Regardless of what load testing tool you prefer to use, its technical merits will always be tied to its reporting capability. JMeter’s default reporting capabilities are pretty limited; however, there is a wealth of plugins to augment them. Before going any further, you will want to install JMeter Plugins Extras and JMeter Plugins Extras Lib in order to get the results you’ll want from JMeter.

http://jmeter-plugins.org/downloads/all/

68263351

Unarchive the contents of these files and place them in the lib/ext directory within your JMeter installation.

Once finished, re-start JMeter.

Note – you can install, update and manage plugins for JMeter using the JMeter plugin manager. This feature is in Beta so your mileage may vary. More on the JMeter plugin manager here: http://jmeter-plugins.org/wiki/PluginsManager/

Designing a Test:

For those new to JMeter, setting up a test is rather simple but there’s a little bit of jargon to explain. A basic JMeter test consists of the following:

Test Plan – A collection of all of the elements that make up your load test

Thread Group – Controls the number of threads and their ramp-up/ramp-down time. Think of each thread as a unique visitor to your site/request to your service.

Listeners – These will be attached to your thread group and will generate your reports

Config Elements – These contain a variety of base information required to execute the test – mainly domain, IP or port information related to the service’s location. Optionally, some config elements can be used to handle situations like having to authenticate through LDAP during tests.

Samplers – These elements are used to generate and handle data as well as options and arguments during test (e.g. request payloads and arguments).

Building the Test – Step by Step:

1. Click on your test plan and assign it a name and check the Run Teardown option
2. Right click on the test plan and select Add > Threads > Thread Group

jm1

3. Enter a name for the thread group (e.g. load test 1)
a. Set the number of threads option to the maximum desired number of requests you want to field to the API per second (simultaneously)
b. Set the ramp-up period option to the number of seconds you wish the test to take before it reaches the maximum number of threads set above (e.g. setting the thread count to 100 and the ramp-up to 60 will start the test with 1 thread and add an additional thread per second. After 1 minute, the test will be at a maximum of 100 concurrent requests per second).
c. Set the Loop option to the number of cycles of maximum requests you wish the test to repeat once it reaches its maximum number of threads. Once this loop finishes, the test will end.
d. Check the forever option if you wish the test to continue to execute at its max thread count indefinitely. Note – this will require you to manually shut the test down.

jm2

4. Right click on the Thread Group and select Add > Config Element > HTTP Request Defaults
5. Set the Server Name or IP Address (and optionally the Port) fields to the domain/IP/port your service can be reached at (e.g. http://my.network.bazaarvoice.com)

jm3

Now we’re ready to program our test – using the options in the HTTP Request element, we’ll construct what we want each request from each thread to contain.

1. Right click on the thread group and select Add > Sampler > HTTP Request
2. In the HTTP Request config panel, set the implementation to HTTPClient 4
3. Set your protocol (http or https) and your method type (in this case, GET)
4. Set the path option to the endpoint you wish to send your request to – do not include any HTTP arguments (e.g. /path/sub-path1/sub-path2/endpoint)
5. Next, we’ll configure each and every HTTP argument we need to pass within our request.
6. Do this by clicking into the first line of the send parameters table.
7. Enter your first argument name into the name field and the value into the value field, click the include equals option and, if need be, click the encode option if your argument value needs to be URL encoded.
8. Click Add and repeat this process for each key-value pair you need to send with your request

jm4
 Your HTTP Request sampler should look something like this.

Now would be a good time to save your test!

leo
 Don’t be this man

Adding Listeners:

Next, we need to add listeners (JMeter-speak for report generators) in order to report our findings during and after the load test.

Right click on the thread group and select Add > Listeners > and then pick your choice of listener.

The choice of test listeners is quite deep, especially if you installed the reporting add-ons as noted above. You can configure whatever listeners you feel you need, though here are some you may want to add to your test:

View Results Tree – This listener will tabulate each request response it receives during the test, as well as collect its response type, headers and content. I highly recommend configuring two of these listeners, assigning one to successes and one to failures. This will help sort your response types, allow you to debug your tests in case of authentication errors or malformed requests, and help troubleshoot issues if your API should suddenly start returning 500 errors.

Response Times Vs Threads – If you’re configuring your test to ramp up its load over time, this listener will provide a chart which you can use to measure the responsiveness of your API over time as the request load is increased. Again, I recommend configuring multiple instances of this listener – one to record requests and another to record error latency if you choose to use this listener.

Response Times Over Time – If your test is configured to provide a constant load over a period of time, this listener can help chart the performance of the API under a steady load. It’s helpful for spotting issues such as inadequate load balancing or rate limiting of your requests, depending on how aggressive your service architecture is in that regard (cough, cough – load balancers – cough).

jm5
 Example of response time over time graph setup (successful responses only)

Now would be another great time to save your progress.

Kicking off a Test:

OK – the moment you’ve been waiting for (be honest). Let’s kick this pig and bask in the glory of our performant API!

Click the big, green button to start the test. Note that on the upper right-hand side of the JMeter UI, you’ll have an indicator showing the number of threads currently running (out of the total max to ramp up to) as well as an indicator of any warnings or errors being thrown.

jm6
 GO!

Click on the Results Tree listener to view a table of your responses. If you’re seeing errors, click on an error instance in the results tree to view the error type, body and content.

jm7
 10 out of 10 threads running, no exceptions

Once you’re ready to stop a test, click the big, red, X icon to shut the test down.

jm8
 STOP!

Modifying a Test:

You’re probably thinking, “Hey, that was easy, and oh look! Our test results are coming in and they look pretty good. There’s got to be more to it than this.” …And you would be right. Remember that comment about load balancers above? Well, in most modern web service architectures, you’ll encounter some form of load balancing, whether it’s part of the web server’s features or an intermediary. In our case, Mashery would have cached our static request after a few seconds at maximum load. After that, we weren’t even talking to the API directly; Mashery simply sent us the cached response for the request. Our results in JMeter may have looked good, but they were lying to us.

khaled

Fortunately, JMeter allows us to inject some form of randomness into our requests to circumvent this issue.

One way of accomplishing this is to inject a randomized ID into your HTTP arguments – especially if your API accepts a random, serialized load ID as an argument. Here’s how you can do that:

1. Right click on the thread group and select Add > Config Elements > Random Variable

jm9

2. On the Random Variable config screen, set a value in the Variable Name field (e.g. my_random_id)
3. Set a minimum and maximum value to define the range your random variable will take (e.g. 1000 and 9999)

jm10

4. Set the Per Thread option to true (this will ensure a random value will be set for each thread).

vegita
 And this joke is over plaaaaaaaaaaayed!!!!

5. Next, we’ll need to click on the HTTP Sampler and include our newly added random variable in our test. Let’s assume our API accepts an argument called ‘loadId’ which corresponds to a random number in that range.

6. In this case, click on the Send Parameters table and add a new key-value pair with the name set to ‘loadId’ and the value set to ‘${my_random_id}’ (or whatever you’ve named your variable in the config element screen).

jm11

One of the requirements of our load test request is that we must provide a specific profile ID that relates to a profile to be returned by the API. For our purposes, we exported a list of existing IDs (over 90,000) from the Cassandra database our API reads from and writes to, imported that into our JMeter test and instructed the HTTP Request sampler to randomly grab an ID and include it as the argument for every individual request.

We configured this by doing the following:

1. Right click on the thread group and select Add > Config Element > CSV Data Set Config

jm12

2. In the CSV data set config options, set the file name option to the path of the CSV file that contains your working data
3. In the variable name field, provide a name by which the test sampler will refer to each instance of your data (e.g. myRandomID)
4. Enter ‘,’ into the delimiter option field
5. Set Recycle on EOF to true, Stop on EOF to false and Sharing Mode to All Threads

jm13

This last set of options will ensure that if the test cycles through all elements in your CSV (which it will use for each and every thread) it will simply start back at the top of the list.

Next, click on your HTTP Sampler. Here you will need to add a shell-style variable reference to the sampler so that it automatically pulls the shared data variable from your CSV config element (e.g. if you named your variable in the CSV config element “myRandomID”, you need to inject the value ${myRandomID} into the sampler somewhere). Where exactly will depend on the nature of your API. In our case, we simply appended it to our API endpoint, placing the ID variable between the API domain/endpoint path and the HTTP arguments in the URI.

jm14

Yup – good time to save your game – I mean test. After that…

giphy-1
 Ratta-tat-tat yo!

Reading Results:

We’ve gone over how to build and run a performance test but once the test has concluded and you have gathered results, you need to understand what you’re looking at.

To view the results of a particular listener, just click on it in the JMeter UI.

The Results Tree reports are self-explanatory, but what about the other reports? In particular, let’s look at the Response Times Over Time listener. Here is the graph output for this listener from our initial performance test:

3-15-16-load-test-2
 Average response time for successful requests – 1.6 seconds

This listener was configured to only measure the time per successful request in order to obtain more focused results. In the graph you can see that over time there was a great deal of variance, with the majority of requests taking around 1.6 seconds to resolve. Note the highest and lowest points on the graph – these are the outlying deviations in the test results, as opposed to the concentrated area of red (the average time per request).

Generally speaking, the tighter the graph, the more consistent the API’s performance and of course, the lower the average number, the faster the performance.

Large spikes with pronounced peaks and valleys usually indicate there is an issue with the service’s load balancing features or something “mechanical” getting in the way of the test.

Long periods of plateauing are another indicator to watch for. These may indicate some form of rate limiting or timeout.

Caveats and Next Steps:

Now you’re ready to send off your newly minted beast of a load test to go show that MCP who’s boss. Before you go and press that button – some advice.

On Test Performance:

JMeter is a great tool and all, especially for the price you pay, but it is old and not necessarily the most high-performance perf testing tool out there (oh the irony). When launching tests off your local machine or a server, keep in mind that each thread you configure for your test will be another thread your CPU will need to handle. You can quickly and easily create a test that will push your local environment to its limit. Doing so can, at times, crash your test (and dump your results into the ether – engage sad panda face). Start with a small-to-medium performance load and build up from there.

5679299

Starting and Stopping a Test:

When stopping a test, manually or automatically, you might notice a sudden uptick in errors and latency at the very end of the test (and possibly at the beginning as well). This is normal behavior – when a test is started and stopped you can experience some level of thread abandonment (which JMeter will record as an error because those last requests never receive proper responses). These errors can be ignored when viewing results.

Basically, the test results are kind of like a loaf of bread – no one wants the ends.

bread
One of life's great mysteries

A Word of Caution:

JMeter is basically a multi-threaded web request generator and manager. The traffic patterns it generates can resemble those seen during a DoS attack – especially for very large tests. If there are any internal or external web security policies in place within the network you’re testing, be careful as to not set these off (i.e. just because you can run a test on a production server that sends 400,000 simultaneous requests to a google web service – which then gets your whole office IP range banned from said service – doesn’t mean you should and no, the author of this piece has absolutely no knowledge of any similar event ever happening, ever…).

From Latency to Where?

The above performance graph was from the very first performance test against our internal Shopper Marketing recommendations API. Using the test, its results, and monitoring tools like Datadog, we were able to find where we needed to improve our service, in both the code base and the hosting environment, to reach our performance goal.

After several repeated tests, along with re-provisioning new Elasticsearch clusters and a lot of code refactoring, we eventually arrived at the following test result:

perftestfinal
Average response time for successful requests – 100 milliseconds

Going from an average response time of 1.6 seconds to 100 milliseconds is a pretty big leap in performance. Ultimately, our client was pretty happy with our answer.

This is by no means an exhaustive method of load testing but merely a way of doing quick and easy exploratory testing that delivered a good deal of value for our team.

Have fun testing!