Category Archives: Uncategorized

Publishing Load Test Results with Jmeter, Jenkins and S3

A while ago, I published a blog post that presented a tutorial overview of how to use Jmeter for load testing a typical RESTful API.

This post builds upon that original post with handy information on some updated reporting features of Jmeter as well a quick dive into how you can better propagate your load results in a continuous build environment (ala Jenkins).

if you’re completely new to Jmeter, please read my previous blog post, linked above before proceeding.

For those of you experienced with Jmeter (hopefully you already have some tests you’re wanting to deploy somewhere rather than let them live contentedly on your local machine), get ready because you’re gonna love this

67295777

It’s truly amazing the gifs you can find on the web.

Meet the New Jmeter, Same as the Old Jmeter (not really)

The previous load test tutorial covered setting up a load test from scratch using an early version of Jmeter.  One of the benefits of using Jmeter, as outlined in the original post was that it has been around for quite a long time, is well known, stable and free.  However, one of the major cons in using Jmeter is that its’ well, old – and has gone for a long while without major updates – especially when it comes to report generation.

Well, welcome to Jmeter 3!  It still comes with that same, great, vintage 90s UI but now with a hoard a bug fixes, stability upgrades and – wait for it – native HTML report generation, wooo!  Check out version 3’s documentation here:

Installation and Setup

So, let’s get down to it!  You can find the latest build of Jmeter 3 here: https://jmeter.apache.org/download_jmeter.cgi

Steps to setting up Jmeter 3 locally are similar to the installation steps covered in my original blog post.  Pre-installation requirements are the same:

  • Java JDK (version 7 or later)
  • Be sure your Java HOME local environment or path variables are set before execution.

Functionally, Jmeter 3 appears identical to older versions of the application but there are a lot of changes under the hood.  I recommend unarchiving and installing Jmeter 3 in its own directory, separate from any previous version of Jmeter you may have previously installed.

Migrating your tests

If you’re following along with this tutorial and wish to create a new load test from scratch, please follow the test setup steps in the original post’s tutorial however, be sure to SKIP the 3rd party report and listener portion of the test setup (guess what, you won’t be needing those plugins anymore).

If you have an existing test you wish to simply fire up in Jmeter 3, your mileage will vary in this case.  Some of the 3rd party listeners and other tools you may have used in your setup may be incompatible with the newer version of Jmeter.  If you have any custom listeners embedded in your test I recommend doing the following:

  • Open your original test in your older version of Jmeter
  • Remove 3rd party listeners, samplers or elements from your test (right-click and select remove)
  • Save the updated test as a new test
  • Open the newly created test in Jmeter 3

From this point, you won’t really need any of your original report listeners configured in your test, however, for debugging purposes, I do recommend keeping a simple response listener in your test – one for successful API responses, another separate for failed responses – just to help with debugging any issues your test might run into.

In a few cases, I’ve found despite removing any 3rd party plugins that some tests have compatibility issues with Jmeter 3.  Sadly, the easiest course of action is to rebuild the test from scratch.  Such is life (sigh).

Such is life...

Such is life…

Executing Tests with Report Generation

Given you have your initial test configured for Jmeter 3, let’s start up the app and view it in the new UI (though it looks like the old UI).  Once you start Jmeter froim the command line, you’ll see a message like this in your terminal:

================================================================================
Don't use GUI mode for load testing, only for Test creation and Test debugging !
For load testing, use NON GUI Mode & adapt Java Heap to your test requirements
================================================================================

You don’t say!?  If you’re a long-time user of earlier version of Jmeter, you’re probably thinking, “Well, no duh!” right?

Open your load test in the UI and give it a look-over to make sure its configured the way you intend.  It should be minimal simple thread group, your HTTP samplers, your results tree report and any CSV data resources you might need.

Your test may look something like this

Your test may look something like this

Next, save your test and close Jmeter (we’re now done with the Jmeter UI).

Now, we are going to use Jmeter’s command line interface to execute the load test and generate the report for the test.

Do the following:

Open your CLI terminal and navigate to your Jmeter 3 bin directory.  Execute the following command:

jmeter -n -t <test JMX file> -l <test log file> -e -o <Path to output folder>

Be sure to set the ‘test JMX file’ argument to the path where your load test’s .JMX file resides, relative to the Jmeter bin directory.

The test log file should be an arbitrary log file path/name that currently does not exist.

The output folder path is where your report will be written to.  This must be an empty directory.

A couple of notes:

If you configured your test to run for a set period, just sit back and wait.  The report will be generated in the output directory you provided once the test execution finishes.

If you configured it to run in perpetuity – you will need to formally stop the test before report generation will execute (simply killing your terminal session won’t help you here).

To stop your Jmeter test from the command line, simply open a new terminal window and enter one of the following commands:

Control + . – This stops Jmeter, halting any active listeners.

Control + ,  – This sends a shutdown command to Jmeter – it will wait until all active threads have finished their current tasks before closing shop (this is the recommended method to use unless you’re impatient

or there is an emergency – like, oh you forgot you pointed your test at production).

Are you really going to argue with Chuck?

Are you really going to argue with Chuck?

Once your test has gracefully exited, navigate to the output folder you specified and open your newly minted report in your web browser of choice.

Jmeter 3 HTML report

Jmeter 3 HTML report

Pretty sweet huh?
sxf8ww

If you haven’t seen Jmeter’s dashboard report before, you’re probably thinking, “golly, there’s a lot of stuff here” – and there is.  The dashboard report contains a pretty wide variety of data on your test execution.  Enumerating through all of that here would be exhausting however – if you’re interested in all the gory details of Jmeter’s dashboard metrics, check out this in-depth explanation here:

https://jmeter.apache.org/usermanual/generating-dashboard.html

Show and Tell (mostly show)

So now you have this fancy load test report but hoarding all this metric goodness on your local development workstation is no fun for the rest of your team.  But that’s OK because if you have access to a continuous integration build environment (ala – Jenkins), the rest of this blog post will walk you through taking your Jmeter test and throwing it into a Jenkins build that allows you to not only run your test remotely but also save and propagate your results to the rest of your agile team.

Make your test available

First things first, you’re going to need to take your Jmeter test, get it off your local machine and into some form of web-accessible delivery.  If you have AWS access, you could place your .JMX files and resources into a bucket and be done with it.  In our case, I recommend creating a private Github repository for your tests and committing your Jmeter working directory, tests and resource files to that repository.  There’s a couple of reasons for this:

  • It’s easy to access a given Github repository with Jenkins
  • You can manage updates and versions of your tests just like source code
  • You can manage variants of your load tests using Github’s branch management features

That said, before tossing your tests into Github’s gaping maw, here’s an important note regarding information security:

Depending on your type of load test, you may have some API or other service access keys stored in your test.  You don’t want to check in API or other access keys into your repository (even if it is marked private).  But don’t just take my word for it:

https://softwareengineering.stackexchange.com/questions/205606/strategy-for-keeping-secret-info-such-as-api-keys-out-of-source-control

If you’ve already committed your tests to your repository and you still have some software keys in there and are thinking, “Oh boy… What now!?”, here’s a link to some handy notes on how to extract some sensitive values from your test or code using some handy features of Git:

https://gist.github.com/derzorngottes/3b57edc1f996dddcab25

Getting Up and Running with Jenkins

Jenkins is one of the most widely available CI build tools in use today.  As such, we’re going to cover how to set up a Jenkins task to run your load tests.  If you’re using some other build system (e.g. Team City) this walk-through won’t be as pertinent but essentially, the tasks will be the same.

Step 1: Get access to Jenkins:

Assuming you already have a Jenkins build server of some sort available to your organization – let’s log into it.  If you don’t have access to Jenkins, talk to your friendly neighborhood dev ops engineer.  Note, that for this walkthrough, you’re going to need some level of administrator access to set up everything we need to execute our tests.

Step 2: Create and Setup your Job with Parameters:

Do the following:

  • From the Jenkins home page, choose a directory you wish to run your load test jobs from (or create one).
  • Click the “New Item” link and then choose “Freestyle Project”
  • Give your project a name and description
  • Choose the “This project is parameterized” option – we’ll use parameters to allow you to configure your load test If needed.
  • Select the “New string parameter” option and create a new parameter named, “thread”
  • Set the default value to whatever you want your test’s default number of threads to be (e.g. 40).
  • Create another string parameter and name it “ramp” – set its default value to the number of seconds you wish your test to take to ramp up to its maximum load.

If your test is going to loop, create another string parameter named, “loops” and set its default, numeric value.

If you have an API key (or keys) you need for your test to execute, as mentioned above, and have already removed said keys from your test (substituting them for a placeholder value) you can create additional string parameters here (naming them key1, key2, etc.).  You can then set your API key(s) to the default value field.  Granted, you are propagating your API key here but this is a much better practice in terms of security than leaving them in a potentially publicly accessible repository as your Jenkins instance should be private and protected.

5dbe72d3d6a5db0d23a62eea640d7c5a_1457280669769jpg-keymaster-gatekeeper-meme_250-250

Step 3: Git Jenkins to Talk to Git:

While still in the Jenkins job configuration screen, under the Source Code Management section, select “Git” and enter your repository’s URL alias into the Repository URL field.

If this is your first time doing this, Jenkins expects a specific format for your Github URL strings.  It should look something like this:

https://github.com:<your main repository name>/<your project name>

If you’re using a publicly accessible Github repository, skip the next step, enter your preferred branch name (or */master if you don’t need a specific branch).

Next, select the appropriate credentials option from the relative drop down menu to allow Jenkins to properly introduce itself to your Github repo (if you’re having trouble with this, ask your dev ops engineer or Jenkins server admin for help – and maybe do something nice for them while you’re at it like get them a thank you card or some cookies or something, because at this point, you’d be out of luck without them, right?).

bp4

Step 5: Build Settings:

You can kick off your tests manually however, it’s probably better to set up timer or other method to handle this for you.  Try the following:

Under the Build Triggers heading, check the Build periodically option

Enter your time range you wish the build to trigger at (this will be in Cron notation – which is tricky if you’re seeing this for the first time).

bp5

This article does a decent job explain Cron’s formatting: http://www.nncron.ru/help/EN/working/cron-format.htm

Now, click on the Add build step option and from the dropdown menu, select ‘Execute shell’.

Here, we will define the command we will use to execute the Jmeter test once Jenkins pulls the test resources from Github. This will be like the shell command we used to run Jmeter locally as described above.

bp7

Syncing to Amazon Web Services

Now we need to make sure we’re saving our generated report somewhere so the rest of our team can review it.  Jenkins has a Jmeter performance plugin that can archive our results.  Also, we could just have Jenkins archive the output directory Jmeter generates the report in (though handing all the recursive directories it will build might be a pain).

Given you have an AWS account, you have the options to sync your reports to an AWS bucket and host them like a static web page.  This method would be beneficial if you need to expose your load test results to team members or other parties that do not or should not have access to your build systems (e.g. product manager, support team members, etc.).

If you don’t have access to AWS, aren’t sure you have access to AWS or may have issues with your build server communicating to AWS, once again, now is a fine time to make friends with (*cough* bribe *cough*) your friendly neighborhood dev ops engineer.

Let’s say you’ve taken care of that and you and your dev ops guru are best buddies now.  Next, you’ll need to further configure your Jenkins job by adding another Execute shell build step.  Make sure this one is ordered sequentially after your Jmeter test execution step.

In the Execution shell, simply add a command to change to the output directory your report will be in (defined in your test execution command) and use the AWS sync command to send your HTML report to AWS:

cd bin/target
 aws s3 sync . s3://mybucketroot/mybucketname

Save your Jenkins job.  You’re done (with Jenkins at least).

Now, in the above step, we are assuming you have a bucket already created in AWS where you can sync your data to.  You’re also going to need access to modify how the bucket works in terms of its permissions.

Making Your Test Results Readable from the Web:

By default, your AWS bucket is out there on the web, you can read from it, write from it, but your other team members will have trouble reading the HTML report from that location.  We need to configure the bucket so that it will display its contents like a static web page.  Here’s how:

Log into your AWS account and access your specified bucket.

Click on the bucket and select the Properties option

Click the Static website hosting panel

bp9

Enable the static hosting option

Specify your index.html file location (this should be the relative path to the main HTML file for your uploaded report).

bp10

Save your changes and boom – now you’re done.

Save and run your job.  If your account permissions are all set up correctly, your test should execute and the test results will be uploaded to the specified AWS bucket.

From here, simply access your AWS bucket using the following styled web link:

https://s3.amazonaws.com/myBucketRoot/myBucketName

995f09e2e0291e1c61e3d22a62438b95f23395d966984d472eb5b43ea04f76beYour test results should now be viewable just as if they were a static web page by anyone on your team.

Wrapping it Up:

Now that you have your load tests residing in a Github repository, feel free to leverage Github’s branching features to allow you to manage multiple iterations of your load tests:

Commit different variations of the same branch to new branches

Use Jenkin’s copy feature to copy your existing job, and just point it to your different branches to allow you to execute multiple test types.

Use Jmeter’s UI locally to copy/edit your existing tests to make additional iterations.  Save these to your repo and commit them just like you would any other coding project.

Again, use Jenkin’s copy feature to create variations of your test run jobs – this time, just edit your shell command to execute your desired .JMX test file per a given execution.

Thanks for reading and I hope this give you further ideas and insight into how you can leverage Jmeter to further improve your software projects through load testing.

I want to be a UX Designer. Where do I start?

So many folks are wonder what they need to do to make a career of User Experience Design. As someone who interviewed many designers before, I’d say the only gate between you and a career in UX that really matters is your portfolio. Tech moves too fast and is too competitive to worry about tenure and experience and degrees. If you can bring it, you’re in!

That doesn’t mean school is a waste of time, though. Some of the best UX Design candidates I’ve interviewed came from Carnegie Mellon. We have a UX Research intern from the University of Texas on staff right now, and I’m blown away by her knowledge and talent. A good academic program can help you skip a lot of trial-by-fire and learning things the painful way. But most of all, a good academic program can feed you projects to use as samples in your portfolio. But goodness, choose your school carefully! I’ve also felt so bad for another candidate whose professors obviously had no idea what they were talking about.

Okay, so that portfolio… what should it demonstrate? What sorts of samples should it include? Well, that depends on what sort of UX Designer you want to be.

Below is a list of to-dos, but before you jump into your project, I strongly suggest forming a little product team. Your product team can be your your knitting circle, your best friend and next-best-friend, a fellow UX-hopeful. It doesn’t really matter so long as your team is comprised of humans.

I make this suggestion because I’ve observed that many UX students actually have projects under their belt, but they are mostly homework assignments they did solo. So they are going through the motions of producing journey maps, etc., but without really knowing why. So then they imagine to themselves that these deliverables are instructions. This is how UX Designers instruct engineers on what to do. Nope.

The truth is, deliverables like journey maps and persona charts and wireframes help other people instruct us. In real life, you’ll work with a team of engineers, and those folks must have opportunites to influence the design; otherwise, they won’t believe in it. And they won’t put their heart and soul into building it. And your mockups will look great, and the final product will be a mess of excuses.

So, if you can demonstrate to a hiring manager that you know how to collaborate, dang. You are ahead of the pack. So round up your jackass friends, come up with a fun team name, and…

If you want to be a UX Researcher,

Demonstrate product discovery.

  • Identify a market you want to affect, for example, people who walk their dogs.
  • Interview potential customers. Learn what they do, how they go about doing it, and how they feel at each step. (Look up “user journey” and “user experience map”)
  • Organize customers into categories based on their behaviors. (Look up “personas”)
  • Determine which persona(s) you can help the most.
  • Identify major pain points in their journey.
  • Brainstorm how you can solve these pain points with technology.

Demonstrate collaboration.

  • Allow the customers you interview to influence your understanding of the problem.
  • Invite others to help you identify pain points.
  • Invite others to help you brainstorms solutions.

If you want to be a UI designer,

Demonstrate ideation.

  • Brainstorm multiple ways to solve a problem
  • Choose the most compelling/feasible solution.
  • Sketch various ways that solution could be executed.
  • Pick the best concept and wireframe the most basic workflow. (Look up “hero flow”
  • Be aware of the assumptions your concept is based upon. Know that if you cannot validate them, you might need to go back to the drawing board. (Look up “product pivoting”)

Demonstrate collaboration.

  • Invite other people to help you brainstorm.
  • Let others vote on which concept to pursue.
  • Use a whiteboard to come up with the execution plan together.
  • Share your wireframes with potential customers and to see if the concept actually resonates with them.

If you want to be an IX Designer and Information Architect,

Demonstrate prototyping skill.

  • Build a prototype. The type of prototype depends on what you want to test. If you are trying to figure out how to organize the screens in your app, just labeled cards would work. (Look up “card sorting). If you want to test interactions, a coded version of the app with dummy content is nice, but clickable wireframes might be sufficient.
  • Plan your test. List the fundamental tasks people must be able to perform for your app to even make sense.
  • Correct the aspects of your design that throw people off or confuse people.

Demonstrate collaboration.

  • Allow customers to test-drive your prototype. (Look up “usability testing”)
  • Ask others to help you think of the best ways to revise your design based on the usability test results.

If you want to be a visual designer,

Demonstrate that you are paying attention.

  • Collect inspiration and media that you think your customers would like. Hit up dribbble and muzli and medium and behance and google image search and, and, and.
  • Organize all this media by mood: the pale ones, the punchy ones, the fun ones, whatever.
  • Pick the mood that matches the way you want people to feel when they use your app.
  • Style the wireframes with colors and graphics to match that mood.
  • Bonus: create a marketing page, a logo, business cards, and other graphic design assets that show big thinking.

Demonstrate collaboration.

  • Ask customers what media and inspiration they like. Let them help you collect materials.
  • Ask customers how your mood boards make them feel, in their own words.

Whew! That’s a lot of work! I know. At the very least, school buys you time to do all this stuff. And it’s totally okay to focus on just UX Research or just Visual Design and bill yourself as a specialist. Anyway, if you honestly enjoy UX Design, it will feel like playing. And remember to give your brain breaks once in a while. Go outside and ride your bike; it’ll help you keep your creative energy high.

Hope that helps, and good luck!

This article was originally published on Medium, “How to I break into UX Design?

WHERE WE’RE GOING, WE DON’T NEED ROADS… BUT WE WILL NEED FLAGS

In the previous blog post I introduced our stripe-ctf-2-vm, a self-contained capture the flag puzzle ladder in one vm. In this post, I’d like to talk about how we used the vm to introduce the security mindset to our developers here at Bazaarvoice.

One of the tenets in R&D is to responsibly “fail fast, win fast” with data-driven decisions. It was time for a grand experiment. We put on our lab coats and called for a dozen volunteers for a test training session. The format of this session ran like this:

  • Each participant was a lone competitor.
  • Discussion was encouraged to promote collaboration.
  • Each puzzle was timeboxed based on the feel of the room. 
  • Each puzzle/timebox would conclude with a discussion of the solution.  

sshot-104

What We Learned, Part 1

Gamifying learning is hard. There’s a whole field dedicated to this for a reason. We felt like we hit the broad strokes by providing: an increasingly difficult ladder of challenges, competition, social interaction (almost, see below), and timely feedback. The experiment led us to some very important tweaks…

  • Lone competitors don’t collaborate effectively, for obvious reasons. Attendees should form small teams of three or four for a better social element. Consider mixing technical skill sets and/or functional teams. That really helps.
  • There’s a balance to be had between freedom to play with a puzzle and frustration. It’s a good idea to track the time closely and cut to the solution before tables are flipped.
  • You also need to track the time so you can make an educated guess on how long future training sessions should be.
  • Proctoring and encouragement is needed for those new to CTF or security puzzles. This is easier to do when small teams are formed.
  • Prizes rewarded per level don’t motivate very much.

Taking It to the Big Leagues

Armed with what we learned and bolstered by testimonials from some of the attendees, we proposed to management a way to take this further. Now, BV R&D management is very pro-learning and our VP takes security very seriously. We didn’t have to sell the Why very much, just the How. Since this education was seen as vital to the growth of our engineering effort, we made the case that all one hundred and twenty plus of our engineers needed to train. We provided a plan for about nine sessions with about twenty engineers each. Once we had management buy-in, we did a two minute elevator-pitch to all of engineering so they had context for the meeting invites that followed.

If you don’t have rapport with your management about the value of secure code, you have your work cut out for you. We suggest you leverage your own talent to locate a few security flaws either through code review, pen testing, or fuzzing tools if you’ve got the skills. The likely low-hanging fruit for a web app exploit are unsanitized inputs, cross-site scripting, and SQL injection. Put together a presentation that educates on the rising number of web app breaches, the cost of fixing flaws in production, and the fact that you just found flaws in production. Paying for external pen testing or security training, or security tools can be expensive. Use what leverage you get from the presentation to instigate a bottom-up approach like a book club or training like this CTF session to cultivate security mavens who can advise the engineers around them.

tumblr_l9ofytzjp61qb7unno1_500

The Pattern for a Session

Once you have roomful of engineers ready get their security on, what do you do? We divided them up into small teams and tried to get a mix between front-end, back-end, and QA skill sets when possible. The next thirty or so minutes were spent on helping the participants setup their VMs and log into them via SSH. Most of the issues you will run into will likely stem from mix-ups with localhost VBox adapters or trying to run and log into the VM directly from the VBox UI rather than via their terminal and shell script. When your engineers are logged into the VM, you’re ready to run through the introductory slides.

As soon as fingers are poised to take on the puzzles, give everyone the password and present the staging slide for Puzzle 0 (0, because we are programmers). Set up the puzzle and let the teams go to town on it. It’s a good idea to remind everyone to be vocal about looting the puzzle. And make sure they know they can look at the source code in the VM or in the github repo (no, the code you can see in the levels/0 directory is not the running instance of it so don’t modify it and expect to see the changes). A few minutes later, show the hint slide. Feel out your audience though. Easier puzzles may not need a hint, harder puzzles may need more breathing room before or after the hint is shown.

After you have a winner, or better yet, wait until you have two victorious teams (you don’t want to move too fast), show the solution and remediation slides and talk through ways the vulnerability could have been avoided. Any examples specific to your stack or technology you can use to bolster the remediation discussion will strengthen the discussion enormously. Once all the teams have unlocked the next level, repeat until you run out of time or until you solve all the puzzles. The latter means you are a bunch of badasses. Srsly!

maxresdefault

Go as far as you can, and reward the team that conquered the most puzzles with something highly coveted: Pokemon figures. Or money. Whichever.

Your mileage will very of course, but we found the following timing worked for BV R&D across our three hour training sessions:

ctf-timings

What We Learned, Part 2

Here’s what we learned from conducting nine sessions:

  • Scheduling 120+ people for a three hour training session is hard. Set aside admin time to handle rescheduling requests and plan for a make-up session.
  • Generate a VM per session so the loot is different each time.
  • For our engineers, a three hour session allowed the class to work through the first 5 or 6 puzzles as a group which made for a good introduction to security vulnerabilities.
  • Send out instructions and links to your generated VMs in the meeting invite.
  • Even so, plan on spending the first thirty to forty minutes setting up VMs.
  • Don’t send out the ctf user password in the invite; some clever, motivated individual will work through all the problems before the session. 🙂  If they do that, they should proctor the session with you.
  • If your company uses HipChat, Slack, or similar, set up an invite-only room per session for that session’s attendees only. It helps to have a means of cut/pasting the solutions to the group.
  • Don’t forget to work the room to give encouragement and redirect rabbit holing. Each team is different and some prefer to be left alone. Learn to feel that and respect it.
  • Some engineers will get hung up on the implementation details of the puzzles: “I don’t need to learn bloody PHP.” Try to impress upon them the larger vulnerability exercised by the puzzle. Try leading them through the thought exercise of how the security flaw (and not the language-specific flaw) may be at work in your company’s code today.
  • Some teams will race ahead after unlocking the next level. Most of the time that will even out as the CTF continues. Occasionally you have an ace security-minded team. If they continually blaze ahead, you may need to get their buy-in to take them out of direct competition with the other teams for morale’s sake. Perhaps they can help proctor the others?
  • Use something simple like a GoogleDoc form to take a quick poll of participants after each session. You will get invaluable feedback on how to better customize this to your organization.

We hope you find this unique way of introducing the secure coding mindset to your engineering organization useful and fun. It is just the tip of the iceberg. Once your engineers are thinking this way, you will need to develop further plans to encourage more education on the common secure coding patterns themselves. Consider periodic secure coding code reviews or pair programming with the experienced security-minded folks on your team. Forming teams to compete in other CTF competitions and internal security bug bounties might also help. Please don’t hesitate to contact us if you want tips on how to implement this for yourself or if you just want to let us know how it worked for you.

Intern Demo Day

As the summer comes to an end, so do the internships for numerous university students here at Bazaarvoice. This past week, the interns were given an opportunity to present a summary of their accomplishments. This afternoon of presentations, known as the Bazaarvoice “Intern Demo Day”, highlighted the various achievements throughout the company, not just in the R&D department.

The following is a short summary of the great work our interns complete this summer as well as some images from the “Intern Demo Day”.

CHASE PORTER: My project, which I have named “The Great (red)Shift”, is intended to improve data accessibility for computed aggregated counts of various canonical events written to HBase. To do this I designed a data warehouse in Amazon Redshift that I loaded with transformed aggregated counts extracted from the tables in HBase. This makes the counts readily SQL query-able in an incredibly fast system whereas before they had to be computed with performance heavy queries from Raw Logs generated by Cookie Monster. The biggest block for this project was in processing the data from HBase which was stored as serialized bytes and needed to be handled uniquely for different types of canonical events (i.e. pageviews, impressions, features) to translate into a readable form for Redshift.

BEN DEVORE: My product is web crawler written in node.js that scrapes clients’ webpages for product data in order to build their product feeds for them. For many of Bazaarvoice’s smaller clients, building and maintaining their product feed is a significant obstacle in the onboarding process. This tool aims to clear that obstacle by taking this task out of there hands.

STONEY MCCRARY: So I have been fortunate enough to get to work on several different pieces in curations but I am going to talk on what I have been hammering on for the last couple of weeks. More and more of our high volume clients are receiving millions of hits a day and this has caused performance to become a higher priority problem for them. In response to this, we are focusing our efforts on building a new display with performance in mind. Performance for the display centers around only providing the minimal amount of data needed and supply the rest as necessary. The piece I will be showing is the display carousel and how it dynamically loads and dumps the data to allow for faster loading and to keep browser memory low.

ZESHAN ANWAR: Eagle is a dashboard built for our Incubator team. With so many moving parts, it was important we had a summarized ‘birds-eye’ view of the team in one place. Eagle was initially meant to be an aggregation of all our Jenkin builds; a single view of all our jobs across our different Jenkins environments. However, it grew to also include JIRA and GitHub statistics. My other project was optimizing our UI tests by having them run concurrently. Our old UI tests were extremely slow, and by running them in parallel we drastically reduced test times.

ZESHAN_Eagle(1)

ZESHAN_Eagle(3)

BRENDON KELLEY: Testing Framework: This summer my project was to help build out a new testing framework for Curations. The current automation tests used for Curations is Saladhands. Before my internship, there wasn’t much if any automation tests for the submission/direct upload capability of Curations. I worked on creating tests and a CI environment for submission in a new testing framework called Intern. One of tests includes a language translation test using mongoDB as an endpoint to store the various languages’ text. Intern is a javascript based testing framework which will allow developers to contribute to writing tests since Curations is mostly javascript. I’ve also worked on updating and creating new console tests in this framework. The foundation built this summer in Intern will enable the ability to further contribute to the framework.

KRYSTINA DIAO: My main project for the summer was to analyze and report the effectiveness of the implementation of the new Connections Knowledgebase. Through Salesforce, I collected and analyzed the number of cases, time spent on each case, etc. After drawing my conclusions, I decided to present my findings via data visualization methods (JavaScript’s C3 and D3 libraries) and provide actionable insights on how this information can be leveraged. This information is valuable in that it can be used for future product KB decisions, as well as understanding how much time, manpower, and money is saved by having a KB.

Krystina Diaos (3)

Krystina Diaos (2)

Krystina Diaos

MARKO SAVIC: Over the summer, I was a part of the SEO Team. I managed to create a tool on pagesManaged and keywordsManaged feed for every Spotlights client. Generated feeds will be consumed by SeoClarity tools on a daily basis. This helps in identifying search rank gains on the specified keywords and pages where Spotlights are present. The SeoClarity reporting will help in proving out Spotlights value and eventually lead to Spotlights renewal/upticks.Also, I created algorithm tweaks on the PINS (Post Interaction Notification System) Generator that take into account product activeness, product relevancy and review count, and use them to ask the user to write reviews on the most relevant products.

TREVOR NELLIGAN: Here is a description of my project: I worked on the Aperture Component library and many of the projects it supports. Aperture is build in React, and its purpose is to be used as an internal Bazaarvoice tool for constructing web pages. Using Aperture, anyone at Bazaarvoice can easily create a functional, intuitive, Bazaarvoice themed webpage, all with the building blocks Aperture provides.

Using the Aperture library, I helped the construction of numerous pages for the curations beta console. I personally built the interface for a new client-facing template builder, which will allow clients to create curations templates quickly and easily without having to go through an implementation engineer and a long process, as was the case previously. I also supplied custom Aperture components for several projects, like the content curation beta page.

RAMIE RAUFDEEN: The mixer is a component of our product recommendations engine which differentiates shoppers, and optimizes recommendations for them. This is primarily derived from their shopping behavior – in real time. Prior to the mixer, product recommendations were aggregated from multiple sources, using the same algorithm for every shopper. Shoppers are now categorized based off of a set of rules (using the shopper’s profile data), each of the rules map to a plan (which you can think of as an ‘algorithm’). A plan defines how recommendations should be mixed from each of the sources. For example, if source B has proven to have a higher conversion rate for ‘heavy-shoppers’, the plan for ‘heavy-shopper’ would give a higher weighting to source B. We can now target specific types of shoppers when it comes to product recommendations. This also sets the groundwork for a more granular machine learning implementation in the future.

Science4Science3Science2

We want to thank all the interns who spent time with us this summer and wish you the best back at school. We look forward to hearing about all the great things you all develop in the future.

If you are interested in an internship at Bazaarvoice, please contact kindall.heye@bazaarvoice.com.

HackerX event hosted at Bazaarvoice

The Bazaarvoice headquarters hosted the July 20th HackerX event in Austin, Texas. The event featured not only Bazaarvoice, but also included Facebook, Amazon, and Indeed. 70+ engineers participated in onsite interviews and networking. HackerX commented that “this was one of the most successful events” they have ever seen.

Gary Allison, Executive Vice President of Engineering, kicked off the event with a compelling message about Bazaarvoice and why this is an awesome place to work.

HackerX started in 2012 with invite-only, face-to-face recruiting events that connect tech talent with some of the world’s most innovative companies. Currently, they operate over 100+ events in 40+ cities, 15+ countries annually.

See www.hackerx.org for additional information.

hackerx2

hackerx4

hackerx1

Conversations API Deprecation for Versions 5.2 and 5.3

Today we are announcing an important change to our Conversations API service:

  • On April 30, 2017 service will end for Conversations API versions 5.2 and 5.3

By deprecating older versions of our API service, we can refocus our energies on the current and future API services, which we feel offer the most benefits to our customers. Please visit our Upgrade Guide to learn more about the Conversations API, our API versioning, and the steps necessary to support the upgrade.

We understand that this change will require effort on your part. Bazaarvoice is committed to making this transition easy for you. We are prepared to assist you in a number of ways:

  • Pre-notification: You have more than 12 months to plan for and implement the change.
  • Documentation: We have specific documentation to help you.
  • Support: Our support team is ready to address any questions you may have.
  • Services: Our services teams are available to provide additional assistance.

In summary, on April 30, 2017, Conversations API versions released before 5.4 will no longer be available. Applications and websites using versions before 5.4 will no longer function properly after April 30, 2017. If your custom application or website is making API calls to Conversations API versions 5.2 or 5.3 you will need to upgrade to the current Conversations API (5.4). Applications using Conversations API versions 5.4 and later will continue to receive uninterrupted API service.

If you have any questions about this notice, please submit a case in Spark. We will periodically update this blog and our developer Twitter feed (@BazaarvoiceDev) as we move closer to the change of service date.

Visit the following page Coversations API 2017 Deprecation to learn more.

Thank you for your partnership,
Chris Kauffman
Manager, Product Management

What does a data scientist do?

More and more companies and industries are grappling with the challenges of extracting value from large amounts of data. Data scientists, the people whose job it is to overcome these challenges, are becoming more prominent, yet what it is they do, and how they’re different than software engineers, is still a mystery to a lot of people. The goal of this article is to explain one of the most important tools that data scientists use: machine learning (ML). The bottom-line is: using ML is slow, costly, and error-prone, but it allows companies to achieve business objectives that are unattainable any other way.

Just like a software engineer, the goal of a data scientist is to develop programs that perform business functions. In software engineering, the engineer writes a program that encodes all of the “rules” for what the program is supposed to do, based on the requirements. For example, take the task of returning all of the product reviews for a given product ID. The rules here include things like how to make sure the product ID is valid (and what to do when it’s not), how to query a database of reviews with the product ID, and how to format the reviews to be returned. A reasonably skilled software engineer can easily write a program that encodes all of these rules.

However, there are many problems where it is not feasible for anyone to write down all of the rules required for a software program to perform some task. Sometimes this is because the rules are simply not known, and other times it’s because there are way too many rules. Several good example of the latter type come from natural language processing (NLP), like the problem of predicting the sentiment of movie reviews (i.e. did the reviewer like the movie or not?). Like nearly all NLP problems, this is something that a human could do reasonably well, but it doesn’t mean that a person could easily write down a set of rules for how they made their decisions. If you had to, you’d probably start by listing key words or phrases that would indicate the reviewer’s sentiment, like “great”, “liked”, “smash hit”, “boring” and “terrible”, and use their appearance in a review to judge the sentiment. This will only work so far, because language is much more complex than that. You’d need additional rules to get around the fact that many of the key words can mean different things in different contexts, more rules to cover all of they ways that you can express sentiment without using a key word, and even more rules to detect when sentiment gets flipped by a negating phrase. Add to this the fact that people don’t actually write in perfect English, and you’d end up with a seemingly endless list of rules and an impossible software engineering problem.

ML has completely different approach: instead of writing out the rules in a top-down fashion, ML attempts to infer the rules in a bottom-up way, from data. For the problem of predicting the sentiment of movie reviews, you’d take a set of movie reviews with the actual sentiment of each, and feed them into a ML program. This ML program then literally outputs another program that takes in a movie review and outputs a prediction of its sentiment (with some expected accuracy). One reason that ML can work (not perfectly) on NLP problems is because where a human would have a very hard time creating millions and millions of rules, a computer has no such limitation.

Of course, there are a few catches to using ML. First, the data is not always cheap. Sentiment analysis of movie reviews has been popular topic only because many movie reviews online come with ratings (usually on a scale of 1 to 5), which tell you the answer — the term for this in ML is “ground truth”. For many problems, the ground truth is not readily available and can be very costly to figure it out. A recent Kaggle competition on this topic used a dataset of 50,000 movie reviews — imagine needing to read every single one of these to determine the ground truth sentiment.

Another catch, which I’ve already mentioned, is that ML will rarely produce a program with perfect accuracy. This is because for most real-world problems, it’s impossible to provide the ML program with all of the relevant information. For NLP problems, humans have a wealth of knowledge about what words mean, while computers do not. Many other real-world problems involve predicting human behavior, but it’s impossible to observe everything that’s going on in people’s heads. While ML algorithms are designed to make the most out of limited information, they’re only viable as a solution when the business objectives can tolerate some amount of error.

ML solutions are also slow to develop, even more so than standard software engineering solutions. As mentioned earlier, ML solutions need to work with limited information, which means that it’s impossible to know whether ML will meet the business’s requirements beforehand. Effective data scientists will make educated guesses about the data, ML algorithms, and algorithm parameters that are most likely to succeed based on the problem, but experimentation is always required. This can mean multiple iterations refining the data, algorithms, and parameters before a definitive answer can be reached about whether an ML solution will work.

Last, ML solutions in production are costly to maintain. Their performance needs to be continuously monitored, because their performance can change over time as the characteristics of the data they’re analyzing changes. In the case of predicting the sentiment of movie reviews, just changes in writing style could drop the accuracy significantly. Processes and instrumentation are required to evaluate a statistically significant sample of the solution’s predictions, and take actions to improve performance whenever it drops such as creating a new program using newer data.

I hope this de-mystifies some of what data scientists do, and explains one of the important tools they use to extract value from large amounts of data.

Automating a Git Rebase Workflow

When I started on the Firebird team at Bazaarvoice, I was happy to learn that they host their code on GitHub and review and land changes via pull requests. I was less happy to learn that they merged pull requests with the big green button. I was able to convince the team to try out a new, rebase-oriented, workflow that keeps the mainline branch linear and clean. While the new workflow was a hit with the team, it was much more complicated than just clicking a button, so I automated the workflow with a simple git extension, git land, which we have released as an open source tool.

What’s Wrong With the Big Green Button?

The big green button is the “Merge pull request” button that GitHub provides to merge pull requests. Clicking it prompts the user to enter a commit message (or accept a default provided by GitHub) and then confirm the merge. When the user confirms the merge, the pull request branch is merged using the –no-ff option, which always creates a merge commit. Finally, GitHub closes the pull request.

For example, given a master branch like this:

git log of an example master branch with three commits

An example master branch

 

…and a feature branch that diverges from the second commit:

An example feature branch started from the second commit

A feature branch started from the second commit

 

…this is the result of doing a –no-ff merge:

The result of merging the examples with the --no-ff option

The result of merging the examples with the –no-ff option. Note that the third commit on master is interleaved with the merge commit and the feature branch commits.

 

Merging with the big green button is frowned upon by many; for detailed discussions of why this is, see Isaac Z. Schlueter and Benjamin Sandofsky. In addition to the problems with merge commits that Isaac and Benjamin point out, the big green button has another downside: it merges the pull request without an opportunity to squash commits or otherwise clean up the branch.

This causes a couple of problems. First, because only the pull request author can clean up the PR branch, merging often became a tedious and drawn out process as reviewers cajoled the author to update their branch to a state that would keep `master`’s history relatively clean. Worse, sometimes messy pull requests were hastily or mistakenly merged.

As a result, the team was encouraged to keep their pull requests squashed into one or two clean commits at all times. This solved one problem, but introduced another: when an author responds to comments by pushing up a new version of the pull request, the latest changes are squashed together into one or two commits. As a result, reviewers had to hunt through the entire diff to ensure that their comments were fully addressed.

An Alternate Workflow

After some lively discussion, the team adopted a new workflow centered on fast-forward merging squashed and rebased pull request branches. Developers create topic branches and pull requests as before, but when updating their pull request, they never squash commits. This preserves detailed history of the changes the author makes in response to review feedback.

When the PR is ready to be merged, the merger interactively rebases it on the latest master, squashes it down to one or two commits, and does a fast-forward merge. The result is a clean, linear, and atomic history for `master`.

The result of merging the example feature branch into master by rebasing and doing a --ff-only merge

The result of merging the example feature branch into master using the described workflow.

One hiccup is that GitHub can’t easily tell that the rebased and squashed commit contains the changes in the pull request, so it doesn’t close the PR automatically. Fortunately, GitHub will close pull requests that contain special keywords. So, the merger has a final task: adding “[closes #<PR number>]” to one of the squashed commit’s message.

git-land

The biggest downside to the new workflow is that it transformed merging a PR from a simple operation (pushing a button) to a somewhat tricky multi-step process:

  • update local master to latest from upstream
  • check out pull request branch
  • do an interactive rebase on top of master, squashing down to one or two commits
  • add “[closes #<PR number>]” to the last commit message for the most recent squashed commit
  • do a fast-forward merge of the pull request branch into master
  • push local master to upstream

This process was too lengthy and error-prone to be reliable unless automated. To address this problem, I created a simple git extension: git-land. The Firebird team has been using this tool for a little over a year with very few problems. In fact, it has spread to other teams at Bazaarvoice. We are excited to release it as an open source tool for the public to use.

Front End Application Testing with Image Recognition

One of the many challenges of software testing has always been cross-browser testing. Despite the web’s overall move to more standards compliant browser platforms, we still struggle with the fact that sometimes certain CSS values or certain JavaScript operations don’t translate well in some browsers (cough, cough IE 8).

In this post, I’m going to show how the Curations team has upgraded their existing automation tools to allow for us to automate spot checking the visual display of the Curations front end across multiple browsers in order to save us time while helping to build a better product for our clients.

The Problem: How to save time and test all the things

The Curations front end is a highly configurable product that allows our clients to implement the display of moderated UGC made available through the API from a Curations instance.

This flexibility combined with BV’s browser support guidelines means there are a very large number ways Curations content can be rendered on the web.

Initially, rather than attempt to test ‘all the things’, we’ve codified a set of possible configurations that represent general usage patterns of how Curations is implemented. Functionally, we can test that content can be retrieved and displayed however, when it comes whether that the end result has the right look-n-feel in Chrome, Firefox and other browsers, our testing of this is largely manual (and time consuming).

How can we better automate this process without sacrificing consistency or stability in testing?

Our solution: Sikuli API

Sikuli is an open-source Java-based application and API that allows users to automate web, mobile and OS applications across multiple platforms using image recognition. It’s platform based and not browser specific, so it enables us to circumvent limitations with screen capture and compare features in other automation tools like Webdriver.

Imagine writing a test script that starts with clicking the home button within an iOS simulator, simply by providing the script a .png of the home button itself. That’s what Sikuli can do.

You can read more about Sikuli here. You can check out their project here on github.

Installation:

Sikuli provides two different products for your automation needs – their stand-alone scripting engine and their API. For our purposes, we’re interested in the Sikuli API with the goal to implement it within our existing Saladhands test framework, which uses both Webdriver and Cucumber.

Assuming you have Java 1.6 or greater installed on your workstation, from Sikuli.org’s download page, follow the link to their standalone setup JAR

http://www.sikuli.org/download.html

Download the JAR file and place it in your local workstation’s home directory, then open it.

Here, you’ll be prompted by the installer to select an installation type. Select option 3 if wish to use Sikuli in your Java or Jython project as well as have access to its command line options. Select option 4 if you only plan on using Sikuli within the scope of your Java or Jython project.

Once the installation is complete, you should have a sikuli.jar file in your working directory. You will want to add this to your collection of external JARs for your installed JRE.

For example, if you’re using Eclipse, go to Preferences > Java > Installed JREs, select your JRE version, click Edit and add Sikuli.jar to the collection.

Alternately, if you are using Maven to build your project, you can add Sikuli’s API to your project by adding the following to your POM.XML file:

<dependency>
    <groupId>org.sikuli</groupId>
    <artifactId>sikuli-api</artifactId>
    <version>1.2.0</version>
</dependency>

Clean then build your project and now you’re ready to roll.

Implementation:

Ultimately, we wanted a method we could control using Cucumber that allows us to articulate a web application using Webdriver that could take a screen shot of a web application (in this case, an instance of Curations) and compare it to a static screen shot of specific web elements (e.g. Ratings and Review stars within the Curations display).

This test method would then make an assumption that either we could find a match to the static screen element within the live web application or have TestNG throw an exception (test failure) if no match could be found.

First, now that we have the ability to use Sikuli, we created a new helper class that instantiates an object from their API so we can compare screen output.

import org.sikuli.api.*;
import java.io.IOException;
import java.io.File;
/**
* Created by gary.spillman on 4/9/15.
*/
public class SikuliHelper {

public boolean screenMatch(String targetPath) {
new ImageTarget(new File(targetPath));

Once we import the Sikuli API, we create a simple class with a single class method. In this case, screenMatch is going to accept a path within the Java project relative to a static image we are going to compare against the live browser window. True or false will be returned depending on if we have a match or not.

//Sets the screen region Sikuli will try to match to full screen
ScreenRegion fullScreen = new DesktopScreenRegion();

//Set your taret to compare from
Target target = new ImageTarget(new File(targetPath));

The main object type Sikuli wants to handle everything with is ScreenRegion. In this case, we are instantiating a new screen region relative to the entire desktop screen area of whatever OS our project will run on. Without passing any arguments to DesktopScreenRegion(), we will be defining the region’s dimension as the entire viewable area of our screen.

double fuzzPercent = .9;

try {
    fuzzPercent = Double.parseDouble(PropertyLoader.loadProperty(&quot;fuzz.factor&quot;));
}
catch (IOException e) {
    e.printStackTrace();
}
new ImageTarget(new File(targetPath));

Sikuli allows you to define a fuzzing factor (if you’ve ever used ImageMagick, this should be a familiar concept). Essentially, rather than defining a 1:1 exact match, you can define a minimal acceptable percentage you wish your screen comparison to match. For Sikuli, you can define this within a range from 0.1 to 1 (ie 10% match up to 100% match).

Here we are defining a default minimum match (or fuzz factor) of 90%. Additionally, we load in from a set of properties in Saladhand’s test.properties file a value which, if present can override the default 90% match – should we wish to increase or decrease the severity of test criteria.

target.setMinScore(fuzzPercent);
new ImageTarget(new File(targetPath));

Now that we know what fuzzing percentage we want to test with, we use target’s setMinScore method to set that property.

ScreenRegion found = fullScreen.find(target);

//According to code examples, if the image isn't found, the screen region is undefined
//So... if it remains null at this point, we're assuming there's no match.

if(found == null) {
    return false;
}
else {
    return true;
}
new ImageTarget(new File(targetPath));

This is where the magic happens. We create a new screen region called found. We then define that using fullScreen’s find method, providing the path to the image file we will use as comparison (target).

What happens here is that Sikuli will take the provided image (target) and attempt to locate any instance within the current visible screen that matches target, within the lower bound of the fuzzing percentage we set and up to a full, 100% match.

The find method either returns a new screen region object, or returns nothing. Thus, if we are unable to find a match to the file relative to target, found will remain undefined (null). So in this case, we simply return false if found is null (no match) or true of found is assigned a new screen region (we had a match).

Putting it all together:

To completely incorporate this behavior into our test framework, we write a simple cucumber step definition that allows us to call our Sikuli helper method, and provide a local image file as an argument for which to compare it against the current, active screen.

Here’s what the cucumber step looks like:

public class ScreenShotSteps {

    SikuliHelper sk = new SikuliHelper();

    //Given the image &quot;X&quot; can be found on the screen
    @Given(&quot;^the image \&quot;([^\&quot;]*)\&quot; can be found on the screen$&quot;)
    public void the_image_can_be_found_on_the_screen(String arg1) {

        String screenShotDir=null;

        try {
            screenShotDir = PropertyLoader.loadProperty(&quot;screenshot.path&quot;).toString();
        }
        catch (IOException e) {
            e.printStackTrace();
        }

        Assert.assertTrue(sk.screenMatch(screenShotDir + arg1));
    }
    new ImageTarget(new File(targetPath));
}

We’re referring to the image file via regex. The step definition makes an assertion using TestNG that the value returned from our instance of SikuliHelper’s screen match method is true (Success!!!). If not, TestNG throws an exception and our test will be marked as having failed.

Finally, since we already have cucumber steps that let us invoke and direct Webdriver to a live site, we can write a test that looks like the following:

Feature: Screen Shot Test
As a QA tester
I want to do screen compares
So I can be a boss ass QA tester

Scenario: Find the nav element on BV's home page
Given I visit &quot;http://www.bazaarvoice.com&quot;
Then the image &quot;screentest1.png&quot; can be found on the screen
new ImageTarget(new File(targetPath));

In this case, the image we are attempting to find is a portion of the nav element on BV’s home page:

screentest1

Considerations:

This is not a full-stop solution to cross browser UI testing. Instead, we want to use Sikuli and tools like it to reduce overall manual testing as much as possible (as reasonably as possible) by giving the option to pre-warn product development teams of UI discrepancies. This can help us make better decisions on how to organize and allocate testing resources – manual and otherwise.

There are caveats to using Sikuli. The most explicit caveat is that tests designed with it cannot run heedlessly – the test tool requires a real, actual screen to capture and manipulate.

Obviously, the other possible drawback is the required maintenance of local image files you will need to check into your automation project as test artifacts. How deep you will be able to go with this type of testing may be tempered by how large of a file collection you will be able to reasonably maintain or deploy.

Despite that, Sikuli seems to have a large number of powerful features, not limited to being able to provide some level of mobile device testing. Check out the project repository and documentation to see how you might be able to incorporate similar automation code into your project today.

Predictively Scaling EC2 Instances with Custom CloudWatch Metrics

One of the chief promises of the cloud is fast scalability, but what good is snappy scalability without load prediction to match? How many teams out there are still manually switching group sizes when load spikes? If you would like to make your Amazon EC2 scaling more predictive, less reactive and hopefully less expensive it is my intention to help you with this article.

Problem 1: AWS EC2 Autoscaling Groups can only scale in response to metrics in CloudWatch and most of the default metrics are not sufficient for predictive scaling.

For instance, by looking at the CloudWatch Namespaces reference page we can see that Amazon SQS queues, EC2 Instances and many other Amazon services post metrics to CloudWatch by default.

From SQS you get things like NumberOfMessagesSent and SentMessageSize. EC2 Instances post metrics like CPUUtilization and DiskReadOps. These metrics are helpful for monitoring. You could also use them to reactively scale your service.

The downside is that by the time you notice that you are using too much CPU or sending too few messages, you’re often too late. EC2 instances take time to start up and instances are billed by the hour, so you’re either starting to get a backlog of work while starting up or you might shut down too late to take advantage of an approaching hour boundary and get charged for a mostly unused instance hour.

More predictive scaling would start up the instances before the load became business critical or it would shut down instances when it becomes clear they are not going to be needed instead of when their workload drops to zero.

Problem 2: AWS CloudWatch default metrics are only published every 5 minutes.

In five minutes a lot can happen, with more granular metrics you could learn about your scaling needs quite a bit faster. Our team has instances that take about 10 minutes to come online, so 5 minutes can make a lot of difference to our responsiveness to changing load.

Solution 1 & 2: Publish your own CloudWatch metrics

Custom metrics can overcome both of these limitations, you can publish metrics related to your service’s needs and you can publish them much more often.

For example, one of our services runs on EC2 instances and processes messages off an SQS queue. The load profile can vary over time; some messages can be handled very quickly and some take significantly more time. It’s not sufficient to simply look at the number of messages in the queue as the average processing speed can vary between 2 and 60 messages per second depending on the data.

We prefer that all our messages be handled within 2 hours of being received. With this in mind I’ll describe the metric we publish to easily scale our EC2 instances.

ApproximateSecondsToCompleteQueue = MessagesInQueue / AverageMessageProcessRate

The metric we publish is called ApproximateSecondsToCompleteQueue. A scheduled executor on our primary instance runs every 15 seconds to calculate and publish it.

private AmazonCloudWatchClient _cloudWatchClient = new AmazonCloudWatchClient();
_cloudWatchClient.setRegion(RegionUtils.getRegion("us-east-1"));

...

PutMetricDataRequest request = new PutMetricDataRequest()
  .withNamespace(CUSTOM_SQS_NAMESPACE)
  .withMetricData(new MetricDatum()
  .withMetricName("ApproximateSecondsToCompleteQueue")
  .withDimensions(new Dimension()
                    .withName(DIMENSION_NAME)
                    .withValue(_queueName))
  .withUnit(StandardUnit.Seconds)
  .withValue(approximateSecondsToCompleteQueue));

_cloudWatchClient.putMetricData(request);

In our CloudFormation template we have a parameter calledDesiredSecondsToCompleteQueue and by default we have it set to 2 hours (7200 seconds). In the Auto Scaling Group we have a scale up action triggered by an Alarm that checks whether DesiredSecondsToCompleteQueue is less than ApproximateSecondsToCompleteQueue.

"EstimatedQueueCompleteTime" : {
  "Type": "AWS::CloudWatch::Alarm",
  "Condition": "HasScaleUp",
  "Properties": {
    "Namespace": "Custom/Namespace",
    "Dimensions": [{
      "Name": "QueueName",
      "Value": { "Fn::Join" : [ "", [ {"Ref": "Universe"}, "-event-queue" ] ] }
    }],
    "MetricName": "ApproximateSecondsToCompleteQueue",
    "Statistic": "Average",
    "ComparisonOperator": "GreaterThanThreshold",
    "Threshold": {"Ref": "DesiredSecondsToCompleteQueue"},
    "Period": "60",
    "EvaluationPeriods": "1",
    "AlarmActions" : [{
      "Ref": "ScaleUpAction"
    }]
  }
}

 

Visualizing the Outcome

What’s a cloud blog without some graphs? Here’s what our load and scaling looks like after implementing this custom metric and scaling. Each of the colors in the middle graph represents a service instance. The bottom graph is in minutes for readability. Note that our instances terminate themselves when there is nothing left to do.

Screen Shot 2015-04-17 at 11.37.21 AM

I hope this blog has shown you that it’s quite easy to publish your own CloudWatch metrics and scale your EC2 AutoScalingGroups accordingly.