I want to be a UX Designer. Where do I start?

So many folks are wondering what they need to do to make a career of User Experience Design. As someone who has interviewed many designers, I’d say the only gate between you and a career in UX that really matters is your portfolio. Tech moves too fast and is too competitive to worry about tenure and experience and degrees. If you can bring it, you’re in!

That doesn’t mean school is a waste of time, though. Some of the best UX Design candidates I’ve interviewed came from Carnegie Mellon. We have a UX Research intern from the University of Texas on staff right now, and I’m blown away by her knowledge and talent. A good academic program can help you skip a lot of trial-by-fire and learning things the painful way. But most of all, a good academic program can feed you projects to use as samples in your portfolio. But goodness, choose your school carefully! I’ve also felt so bad for another candidate whose professors obviously had no idea what they were talking about.

Okay, so that portfolio… what should it demonstrate? What sorts of samples should it include? Well, that depends on what sort of UX Designer you want to be.

Below is a list of to-dos, but before you jump into your project, I strongly suggest forming a little product team. Your product team can be your knitting circle, your best friend and next-best-friend, a fellow UX-hopeful. It doesn’t really matter so long as your team is made up of humans.

I make this suggestion because I’ve observed that many UX students actually have projects under their belt, but they are mostly homework assignments they did solo. So they are going through the motions of producing journey maps, etc., but without really knowing why. So then they imagine to themselves that these deliverables are instructions. This is how UX Designers instruct engineers on what to do. Nope.

The truth is, deliverables like journey maps and persona charts and wireframes help other people instruct us. In real life, you’ll work with a team of engineers, and those folks must have opportunities to influence the design; otherwise, they won’t believe in it. And they won’t put their heart and soul into building it. And your mockups will look great, and the final product will be a mess of excuses.

So, if you can demonstrate to a hiring manager that you know how to collaborate, dang. You are ahead of the pack. So round up your jackass friends, come up with a fun team name, and…

If you want to be a UX Researcher,

Demonstrate product discovery.

  • Identify a market you want to affect, for example, people who walk their dogs.
  • Interview potential customers. Learn what they do, how they go about doing it, and how they feel at each step. (Look up “user journey” and “user experience map”)
  • Organize customers into categories based on their behaviors. (Look up “personas”)
  • Determine which persona(s) you can help the most.
  • Identify major pain points in their journey.
  • Brainstorm how you can solve these pain points with technology.

Demonstrate collaboration.

  • Allow the customers you interview to influence your understanding of the problem.
  • Invite others to help you identify pain points.
  • Invite others to help you brainstorm solutions.

If you want to be a UI designer,

Demonstrate ideation.

  • Brainstorm multiple ways to solve a problem.
  • Choose the most compelling/feasible solution.
  • Sketch various ways that solution could be executed.
  • Pick the best concept and wireframe the most basic workflow. (Look up “hero flow”)
  • Be aware of the assumptions your concept is based upon. Know that if you cannot validate them, you might need to go back to the drawing board. (Look up “product pivoting”)

Demonstrate collaboration.

  • Invite other people to help you brainstorm.
  • Let others vote on which concept to pursue.
  • Use a whiteboard to come up with the execution plan together.
  • Share your wireframes with potential customers to see if the concept actually resonates with them.

If you want to be an IX Designer and Information Architect,

Demonstrate prototyping skill.

  • Build a prototype. The type of prototype depends on what you want to test. If you are trying to figure out how to organize the screens in your app, simple labeled cards would work. (Look up “card sorting”). If you want to test interactions, a coded version of the app with dummy content is nice, but clickable wireframes might be sufficient.
  • Plan your test. List the fundamental tasks people must be able to perform for your app to even make sense.
  • Correct the aspects of your design that throw people off or confuse people.

Demonstrate collaboration.

  • Allow customers to test-drive your prototype. (Look up “usability testing”)
  • Ask others to help you think of the best ways to revise your design based on the usability test results.

If you want to be a visual designer,

Demonstrate that you are paying attention.

  • Collect inspiration and media that you think your customers would like. Hit up dribbble and muzli and medium and behance and google image search and, and, and.
  • Organize all this media by mood: the pale ones, the punchy ones, the fun ones, whatever.
  • Pick the mood that matches the way you want people to feel when they use your app.
  • Style the wireframes with colors and graphics to match that mood.
  • Bonus: create a marketing page, a logo, business cards, and other graphic design assets that show big thinking.

Demonstrate collaboration.

  • Ask customers what media and inspiration they like. Let them help you collect materials.
  • Ask customers how your mood boards make them feel, in their own words.

Whew! That’s a lot of work! I know. At the very least, school buys you time to do all this stuff. And it’s totally okay to focus on just UX Research or just Visual Design and bill yourself as a specialist. Anyway, if you honestly enjoy UX Design, it will feel like playing. And remember to give your brain breaks once in a while. Go outside and ride your bike; it’ll help you keep your creative energy high.

Hope that helps, and good luck!

This article was originally published on Medium, “How do I break into UX Design?”


In the previous blog post I introduced our stripe-ctf-2-vm, a self-contained capture the flag puzzle ladder in one VM. In this post, I’d like to talk about how we used the VM to introduce the security mindset to our developers here at Bazaarvoice.

One of the tenets in R&D is to responsibly “fail fast, win fast” with data-driven decisions. It was time for a grand experiment. We put on our lab coats and called for a dozen volunteers for a test training session. The format of this session ran like this:

  • Each participant was a lone competitor.
  • Discussion was encouraged to promote collaboration.
  • Each puzzle was timeboxed based on the feel of the room. 
  • Each puzzle/timebox would conclude with a discussion of the solution.  


What We Learned, Part 1

Gamifying learning is hard. There’s a whole field dedicated to this for a reason. We felt like we hit the broad strokes by providing: an increasingly difficult ladder of challenges, competition, social interaction (almost, see below), and timely feedback. The experiment led us to some very important tweaks…

  • Lone competitors don’t collaborate effectively, for obvious reasons. Attendees should form small teams of three or four for a better social element. Consider mixing technical skill sets and/or functional teams. That really helps.
  • There’s a balance to be had between freedom to play with a puzzle and frustration. It’s a good idea to track the time closely and cut to the solution before tables are flipped.
  • You also need to track the time so you can make an educated guess on how long future training sessions should be.
  • Proctoring and encouragement are needed for those new to CTF or security puzzles. This is easier to do when small teams are formed.
  • Prizes awarded per level don’t motivate very much.

Taking It to the Big Leagues

Armed with what we learned and bolstered by testimonials from some of the attendees, we proposed to management a way to take this further. Now, BV R&D management is very pro-learning and our VP takes security very seriously. We didn’t have to sell the Why very much, just the How. Since this education was seen as vital to the growth of our engineering effort, we made the case that all one hundred and twenty plus of our engineers needed to train. We provided a plan for about nine sessions with about twenty engineers each. Once we had management buy-in, we gave a two-minute elevator pitch to all of engineering so they had context for the meeting invites that followed.

If you don’t have rapport with your management about the value of secure code, you have your work cut out for you. We suggest you leverage your own talent to locate a few security flaws through code review, pen testing, or fuzzing tools if you’ve got the skills. The likely low-hanging fruit for a web app exploit are unsanitized inputs, cross-site scripting, and SQL injection. Put together a presentation that educates on the rising number of web app breaches, the cost of fixing flaws in production, and the fact that you just found flaws in production. Paying for external pen testing, security training, or security tools can be expensive. Use what leverage you get from the presentation to instigate a bottom-up approach like a book club or training like this CTF session to cultivate security mavens who can advise the engineers around them.
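To make the SQL injection item concrete, here is a minimal Python sketch using the standard-library sqlite3 module. The table, the column names, and both lookup functions are hypothetical illustrations; they are not taken from the CTF VM or from any Bazaarvoice code.

```python
import sqlite3


def make_db():
    # In-memory database with a hypothetical users table.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
    conn.executemany(
        "INSERT INTO users VALUES (?, ?)",
        [("alice", "a-secret"), ("bob", "b-secret")],
    )
    return conn


def lookup_unsafe(conn, name):
    # Vulnerable: user input is concatenated straight into the SQL text.
    query = "SELECT secret FROM users WHERE name = '" + name + "'"
    return [row[0] for row in conn.execute(query)]


def lookup_safe(conn, name):
    # Safer: the value is bound as a parameter, so it is never parsed as SQL.
    rows = conn.execute("SELECT secret FROM users WHERE name = ?", (name,))
    return [row[0] for row in rows]


conn = make_db()
payload = "nobody' OR '1'='1"
print(lookup_unsafe(conn, payload))  # the OR clause matches every row
print(lookup_safe(conn, payload))    # no user has this literal name
```

A side-by-side like this can work well in the presentation: the same input leaks every row through the concatenated query and matches nothing through the parameterized one.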


The Pattern for a Session

Once you have a roomful of engineers ready to get their security on, what do you do? We divided them up into small teams and tried to get a mix between front-end, back-end, and QA skill sets when possible. The next thirty or so minutes were spent helping the participants set up their VMs and log into them via SSH. Most of the issues you will run into will likely stem from mix-ups with localhost VBox adapters or trying to run and log into the VM directly from the VBox UI rather than via their terminal and shell script. When your engineers are logged into the VM, you’re ready to run through the introductory slides.

As soon as fingers are poised to take on the puzzles, give everyone the password and present the staging slide for Puzzle 0 (0, because we are programmers). Set up the puzzle and let the teams go to town on it. It’s a good idea to remind everyone to be vocal about looting the puzzle. And make sure they know they can look at the source code in the VM or in the GitHub repo (no, the code you can see in the levels/0 directory is not the running instance of it, so don’t modify it and expect to see the changes). A few minutes later, show the hint slide. Feel out your audience though. Easier puzzles may not need a hint; harder puzzles may need more breathing room before or after the hint is shown.

After you have a winner, or better yet, two victorious teams (you don’t want to move too fast), show the solution and remediation slides and talk through ways the vulnerability could have been avoided. Any examples specific to your stack or technology will strengthen the remediation discussion enormously. Once all the teams have unlocked the next level, repeat until you run out of time or until you solve all the puzzles. The latter means you are a bunch of badasses. Srsly!


Go as far as you can, and reward the team that conquered the most puzzles with something highly coveted: Pokemon figures. Or money. Whichever.

Your mileage will vary, of course, but we found the following timing worked for BV R&D across our three-hour training sessions:


What We Learned, Part 2

Here’s what we learned from conducting nine sessions:

  • Scheduling 120+ people for a three-hour training session is hard. Set aside admin time to handle rescheduling requests and plan for a make-up session.
  • Generate a VM per session so the loot is different each time.
  • For our engineers, a three-hour session allowed the class to work through the first 5 or 6 puzzles as a group, which made for a good introduction to security vulnerabilities.
  • Send out instructions and links to your generated VMs in the meeting invite.
  • Even so, plan on spending the first thirty to forty minutes setting up VMs.
  • Don’t send out the ctf user password in the invite; some clever, motivated individual will work through all the problems before the session. 🙂  If they do that, they should proctor the session with you.
  • If your company uses HipChat, Slack, or similar, set up an invite-only room per session for that session’s attendees only. It helps to have a means of cut/pasting the solutions to the group.
  • Don’t forget to work the room to give encouragement and redirect rabbit holing. Each team is different and some prefer to be left alone. Learn to feel that and respect it.
  • Some engineers will get hung up on the implementation details of the puzzles: “I don’t need to learn bloody PHP.” Try to impress upon them the larger vulnerability exercised by the puzzle. Try leading them through the thought exercise of how the security flaw (and not the language-specific flaw) may be at work in your company’s code today.
  • Some teams will race ahead after unlocking the next level. Most of the time that will even out as the CTF continues. Occasionally you have an ace security-minded team. If they continually blaze ahead, you may need to get their buy-in to take them out of direct competition with the other teams for morale’s sake. Perhaps they can help proctor the others?
  • Use something simple like a GoogleDoc form to take a quick poll of participants after each session. You will get invaluable feedback on how to better customize this to your organization.

We hope you find this unique way of introducing the secure coding mindset to your engineering organization useful and fun. It is just the tip of the iceberg. Once your engineers are thinking this way, you will need to develop further plans to encourage more education on the common secure coding patterns themselves. Consider periodic secure coding code reviews or pair programming with the experienced security-minded folks on your team. Forming teams to compete in other CTF competitions and internal security bug bounties might also help. Please don’t hesitate to contact us if you want tips on how to implement this for yourself or if you just want to let us know how it worked for you.

Intern Demo Day

As the summer comes to an end, so do the internships for numerous university students here at Bazaarvoice. This past week, the interns were given an opportunity to present a summary of their accomplishments. This afternoon of presentations, known as the Bazaarvoice “Intern Demo Day”, highlighted the various achievements throughout the company, not just in the R&D department.

The following is a short summary of the great work our interns completed this summer as well as some images from the “Intern Demo Day”.

CHASE PORTER: My project, which I have named “The Great (red)Shift”, is intended to improve data accessibility for computed aggregated counts of various canonical events written to HBase. To do this I designed a data warehouse in Amazon Redshift that I loaded with transformed aggregated counts extracted from the tables in HBase. This makes the counts readily SQL query-able in an incredibly fast system whereas before they had to be computed with performance heavy queries from Raw Logs generated by Cookie Monster. The biggest block for this project was in processing the data from HBase which was stored as serialized bytes and needed to be handled uniquely for different types of canonical events (i.e. pageviews, impressions, features) to translate into a readable form for Redshift.

BEN DEVORE: My product is a web crawler written in Node.js that scrapes clients’ webpages for product data in order to build their product feeds for them. For many of Bazaarvoice’s smaller clients, building and maintaining their product feed is a significant obstacle in the onboarding process. This tool aims to clear that obstacle by taking this task out of their hands.

STONEY MCCRARY: So I have been fortunate enough to get to work on several different pieces in curations, but I am going to talk about what I have been hammering on for the last couple of weeks. More and more of our high-volume clients are receiving millions of hits a day, and this has caused performance to become a higher priority problem for them. In response to this, we are focusing our efforts on building a new display with performance in mind. Performance for the display centers around providing only the minimal amount of data needed and supplying the rest as necessary. The piece I will be showing is the display carousel and how it dynamically loads and dumps the data to allow for faster loading and to keep browser memory low.

ZESHAN ANWAR: Eagle is a dashboard built for our Incubator team. With so many moving parts, it was important we had a summarized ‘birds-eye’ view of the team in one place. Eagle was initially meant to be an aggregation of all our Jenkins builds; a single view of all our jobs across our different Jenkins environments. However, it grew to also include JIRA and GitHub statistics. My other project was optimizing our UI tests by having them run concurrently. Our old UI tests were extremely slow, and by running them in parallel we drastically reduced test times.



BRENDON KELLEY: Testing Framework: This summer my project was to help build out a new testing framework for Curations. Curations currently uses Saladhands for its automation tests. Before my internship, there were few, if any, automated tests for the submission/direct upload capability of Curations. I worked on creating tests and a CI environment for submission in a new testing framework called Intern. One of the tests is a language translation test that uses MongoDB as an endpoint to store the various languages’ text. Intern is a JavaScript-based testing framework, which will allow developers to contribute to writing tests since Curations is mostly JavaScript. I’ve also worked on updating and creating new console tests in this framework. The foundation built this summer in Intern will make it easier to contribute further to the framework.

KRYSTINA DIAO: My main project for the summer was to analyze and report the effectiveness of the implementation of the new Connections Knowledgebase. Through Salesforce, I collected and analyzed the number of cases, time spent on each case, etc. After drawing my conclusions, I decided to present my findings via data visualization methods (JavaScript’s C3 and D3 libraries) and provide actionable insights on how this information can be leveraged. This information is valuable in that it can be used for future product KB decisions, as well as understanding how much time, manpower, and money is saved by having a KB.

MARKO SAVIC: Over the summer, I was a part of the SEO Team. I created a tool that generates a pagesManaged and keywordsManaged feed for every Spotlights client. Generated feeds will be consumed by SeoClarity tools on a daily basis. This helps in identifying search rank gains on the specified keywords and pages where Spotlights are present. The SeoClarity reporting will help in proving out Spotlights value and eventually lead to Spotlights renewal/upticks. Also, I created algorithm tweaks on the PINS (Post Interaction Notification System) Generator that take into account product activeness, product relevancy, and review count, and use them to ask the user to write reviews on the most relevant products.

TREVOR NELLIGAN: Here is a description of my project: I worked on the Aperture Component library and many of the projects it supports. Aperture is built in React, and its purpose is to be used as an internal Bazaarvoice tool for constructing web pages. Using Aperture, anyone at Bazaarvoice can easily create a functional, intuitive, Bazaarvoice-themed webpage, all with the building blocks Aperture provides.

Using the Aperture library, I helped construct numerous pages for the curations beta console. I personally built the interface for a new client-facing template builder, which will allow clients to create curations templates quickly and easily without having to go through an implementation engineer and a long process, as was the case previously. I also supplied custom Aperture components for several projects, like the content curation beta page.

RAMIE RAUFDEEN: The mixer is a component of our product recommendations engine which differentiates shoppers and optimizes recommendations for them. This is primarily derived from their shopping behavior, in real time. Prior to the mixer, product recommendations were aggregated from multiple sources, using the same algorithm for every shopper. Shoppers are now categorized based on a set of rules (using the shopper’s profile data), and each rule maps to a plan (which you can think of as an ‘algorithm’). A plan defines how recommendations should be mixed from each of the sources. For example, if source B has proven to have a higher conversion rate for ‘heavy-shoppers’, the plan for ‘heavy-shopper’ would give a higher weighting to source B. We can now target specific types of shoppers when it comes to product recommendations. This also sets the groundwork for a more granular machine learning implementation in the future.
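The rule-to-plan idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the profile field, the threshold, the source names, the weights, and the weighted round-robin mixing strategy are all hypothetical and are not the mixer's actual implementation.

```python
# Hypothetical plans: each maps a recommendation source to a mixing weight.
PLANS = {
    "heavy-shopper": {"source_a": 1, "source_b": 3},
    "default": {"source_a": 1, "source_b": 1},
}


def choose_plan(profile):
    # Hypothetical rule: ten or more recent orders marks a heavy shopper.
    if profile.get("recent_orders", 0) >= 10:
        return "heavy-shopper"
    return "default"


def mix(recs_by_source, weights, n):
    # Weighted round-robin: on each pass, a source contributes up to
    # `weight` items, so higher-weighted sources fill more of the slots.
    queues = {s: list(recs_by_source.get(s, [])) for s in weights}
    mixed = []
    while len(mixed) < n and any(queues.values()):
        for source, weight in weights.items():
            mixed.extend(queues[source][:weight])
            queues[source] = queues[source][weight:]
    return mixed[:n]


recs = {
    "source_a": ["a1", "a2", "a3"],
    "source_b": ["b1", "b2", "b3", "b4"],
}
heavy = mix(recs, PLANS[choose_plan({"recent_orders": 12})], 4)
casual = mix(recs, PLANS[choose_plan({"recent_orders": 1})], 4)
print(heavy)   # ['a1', 'b1', 'b2', 'b3']
print(casual)  # ['a1', 'b1', 'a2', 'b2']
```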


We want to thank all the interns who spent time with us this summer and wish you the best back at school. We look forward to hearing about all the great things you all develop in the future.

If you are interested in an internship at Bazaarvoice, please contact

HackerX event hosted at Bazaarvoice

The Bazaarvoice headquarters hosted the July 20th HackerX event in Austin, Texas. The event featured not only Bazaarvoice, but also included Facebook, Amazon, and Indeed. 70+ engineers participated in onsite interviews and networking. HackerX commented that “this was one of the most successful events” they have ever seen.

Gary Allison, Executive Vice President of Engineering, kicked off the event with a compelling message about Bazaarvoice and why this is an awesome place to work.

HackerX started in 2012 with invite-only, face-to-face recruiting events that connect tech talent with some of the world’s most innovative companies. Currently, they operate 100+ events annually in 40+ cities across 15+ countries.

See for additional information.




Conversations API Deprecation for Versions 5.2 and 5.3

Today we are announcing an important change to our Conversations API service:

  • On April 30, 2017 service will end for Conversations API versions 5.2 and 5.3

By deprecating older versions of our API service, we can refocus our energies on the current and future API services, which we feel offer the most benefits to our customers. Please visit our Upgrade Guide to learn more about the Conversations API, our API versioning, and the steps necessary to support the upgrade.

We understand that this change will require effort on your part. Bazaarvoice is committed to making this transition easy for you. We are prepared to assist you in a number of ways:

  • Pre-notification: You have more than 12 months to plan for and implement the change.
  • Documentation: We have specific documentation to help you.
  • Support: Our support team is ready to address any questions you may have.
  • Services: Our services teams are available to provide additional assistance.

In summary, on April 30, 2017, Conversations API versions released before 5.4 will no longer be available. Applications and websites using versions before 5.4 will no longer function properly after April 30, 2017. If your custom application or website is making API calls to Conversations API versions 5.2 or 5.3 you will need to upgrade to the current Conversations API (5.4). Applications using Conversations API versions 5.4 and later will continue to receive uninterrupted API service.

If you have any questions about this notice, please submit a case in Spark. We will periodically update this blog and our developer Twitter feed (@BazaarvoiceDev) as we move closer to the change of service date.

Visit the following page, Conversations API 2017 Deprecation, to learn more.

Thank you for your partnership,
Chris Kauffman
Manager, Product Management

What does a data scientist do?

More and more companies and industries are grappling with the challenges of extracting value from large amounts of data. Data scientists, the people whose job it is to overcome these challenges, are becoming more prominent, yet what it is they do, and how they’re different from software engineers, is still a mystery to a lot of people. The goal of this article is to explain one of the most important tools that data scientists use: machine learning (ML). The bottom line is: using ML is slow, costly, and error-prone, but it allows companies to achieve business objectives that are unattainable any other way.

Just like a software engineer, the goal of a data scientist is to develop programs that perform business functions. In software engineering, the engineer writes a program that encodes all of the “rules” for what the program is supposed to do, based on the requirements. For example, take the task of returning all of the product reviews for a given product ID. The rules here include things like how to make sure the product ID is valid (and what to do when it’s not), how to query a database of reviews with the product ID, and how to format the reviews to be returned. A reasonably skilled software engineer can easily write a program that encodes all of these rules.
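As a rough illustration of what "encoding the rules" looks like, here is a minimal Python sketch of the review lookup described above. The in-memory `REVIEWS` dictionary stands in for a real database, and the product IDs, validation rule, and output format are all hypothetical.

```python
# Hypothetical stand-in for a reviews database, keyed by product ID.
REVIEWS = {
    "p-100": [
        {"rating": 5, "text": "Great"},
        {"rating": 2, "text": "Broke in a week"},
    ],
}


def get_reviews(product_id):
    # Rule 1: validate the product ID, and decide what to do when it's bad.
    if not isinstance(product_id, str) or not product_id.strip():
        raise ValueError("product ID must be a non-empty string")
    # Rule 2: query the store of reviews with the product ID.
    rows = REVIEWS.get(product_id, [])
    # Rule 3: format the reviews to be returned.
    return [f"{row['rating']}/5: {row['text']}" for row in rows]


print(get_reviews("p-100"))
print(get_reviews("p-999"))  # unknown product: no reviews
```

Every rule here is written by hand, and the full list is short enough to fit on one screen.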

However, there are many problems where it is not feasible for anyone to write down all of the rules required for a software program to perform some task. Sometimes this is because the rules are simply not known, and other times it’s because there are way too many rules. Several good examples of the latter type come from natural language processing (NLP), like the problem of predicting the sentiment of movie reviews (i.e. did the reviewer like the movie or not?). Like nearly all NLP problems, this is something that a human could do reasonably well, but it doesn’t mean that a person could easily write down a set of rules for how they made their decisions. If you had to, you’d probably start by listing key words or phrases that would indicate the reviewer’s sentiment, like “great”, “liked”, “smash hit”, “boring” and “terrible”, and use their appearance in a review to judge the sentiment. This will only work so far, because language is much more complex than that. You’d need additional rules to get around the fact that many of the key words can mean different things in different contexts, more rules to cover all of the ways that you can express sentiment without using a key word, and even more rules to detect when sentiment gets flipped by a negating phrase. Add to this the fact that people don’t actually write in perfect English, and you’d end up with a seemingly endless list of rules and an impossible software engineering problem.
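A first attempt at the key word approach might look like the sketch below. It is deliberately naive: the word lists reuse the single-word examples from the paragraph above (the phrase “smash hit” is left out because this version only looks at individual words), and the function name and labels are hypothetical.

```python
POSITIVE = {"great", "liked"}
NEGATIVE = {"boring", "terrible"}


def keyword_sentiment(review):
    # Lowercase, strip basic punctuation, and split into words.
    words = review.lower().replace(",", " ").replace(".", " ").split()
    # Score = positive key words seen minus negative key words seen.
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "unknown"


print(keyword_sentiment("I liked it, great cast."))  # positive
print(keyword_sentiment("This was not great."))      # positive (wrong: negation)
print(keyword_sentiment("A waste of two hours."))    # unknown (no key word)
```

The second and third reviews show two of the gaps described above: a negating phrase flips the meaning, and sentiment can be expressed without any key word at all. Each gap needs another hand-written rule.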

ML has a completely different approach: instead of writing out the rules in a top-down fashion, ML attempts to infer the rules in a bottom-up way, from data. For the problem of predicting the sentiment of movie reviews, you’d take a set of movie reviews with the actual sentiment of each, and feed them into an ML program. This ML program then literally outputs another program that takes in a movie review and outputs a prediction of its sentiment (with some expected accuracy). One reason that ML can work (not perfectly) on NLP problems is that where a human would have a very hard time creating millions and millions of rules, a computer has no such limitation.
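The "program that outputs a program" idea can be sketched with a tiny Naive Bayes classifier in plain Python. This is one simple ML technique chosen for illustration, not necessarily what any production system uses; the four training reviews and the `pos`/`neg` labels are made up, and a real model would need far more data.

```python
import math
from collections import Counter


def train(labeled_reviews):
    """Learn word statistics from (text, label) pairs; return a predictor."""
    word_counts = {"pos": Counter(), "neg": Counter()}
    label_counts = Counter()
    for text, label in labeled_reviews:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocab = set(word_counts["pos"]) | set(word_counts["neg"])
    n_reviews = sum(label_counts.values())

    def predict(review):
        scores = {}
        for label in ("pos", "neg"):
            total = sum(word_counts[label].values())
            # Start from the log prior for this label.
            score = math.log(label_counts[label] / n_reviews)
            for word in review.lower().split():
                if word in vocab:  # words never seen in training are skipped
                    # Laplace-smoothed log likelihood of the word.
                    count = word_counts[label][word] + 1
                    score += math.log(count / (total + len(vocab)))
            scores[label] = score
        return max(scores, key=scores.get)

    return predict


# Hypothetical ground-truth data: reviews paired with their sentiment.
predict = train([
    ("a great film i liked it", "pos"),
    ("great cast and a smash hit", "pos"),
    ("boring and terrible", "neg"),
    ("a terrible boring plot", "neg"),
])
print(predict("great movie"))          # pos
print(predict("terrible and boring"))  # neg
```

Nobody wrote a rule saying "great" is positive; `train` inferred a weight for every word from the labeled examples and returned `predict` as the resulting program.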

Of course, there are a few catches to using ML. First, the data is not always cheap. Sentiment analysis of movie reviews has been a popular topic only because many movie reviews online come with ratings (usually on a scale of 1 to 5), which tell you the answer; the term for this in ML is “ground truth”. For many problems, the ground truth is not readily available and can be very costly to figure out. A recent Kaggle competition on this topic used a dataset of 50,000 movie reviews. Imagine needing to read every single one of these to determine the ground truth sentiment.

Another catch, which I’ve already mentioned, is that ML will rarely produce a program with perfect accuracy. This is because for most real-world problems, it’s impossible to provide the ML program with all of the relevant information. For NLP problems, humans have a wealth of knowledge about what words mean, while computers do not. Many other real-world problems involve predicting human behavior, but it’s impossible to observe everything that’s going on in people’s heads. While ML algorithms are designed to make the most out of limited information, they’re only viable as a solution when the business objectives can tolerate some amount of error.

ML solutions are also slow to develop, even more so than standard software engineering solutions. As mentioned earlier, ML solutions need to work with limited information, which means that it’s impossible to know whether ML will meet the business’s requirements beforehand. Effective data scientists will make educated guesses about the data, ML algorithms, and algorithm parameters that are most likely to succeed based on the problem, but experimentation is always required. This can mean multiple iterations refining the data, algorithms, and parameters before a definitive answer can be reached about whether an ML solution will work.

Last, ML solutions in production are costly to maintain. Their performance needs to be continuously monitored, because it can change over time as the characteristics of the data they’re analyzing change. In the case of predicting the sentiment of movie reviews, changes in writing style alone could drop the accuracy significantly. Processes and instrumentation are required to evaluate a statistically significant sample of the solution’s predictions and to take action to improve performance whenever it drops, such as creating a new program using newer data.
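One way to think about "a statistically significant sample" is to put a confidence interval around the accuracy measured on a hand-checked sample of predictions. The sketch below uses the normal approximation for a 95% interval; the function names, the retraining rule, and the 90% target are assumptions made for illustration.

```python
import math


def accuracy_interval(correct, sample_size, z=1.96):
    # Normal-approximation confidence interval for a proportion
    # (z = 1.96 corresponds to roughly 95% confidence).
    p = correct / sample_size
    margin = z * math.sqrt(p * (1 - p) / sample_size)
    return max(0.0, p - margin), min(1.0, p + margin)


def needs_retraining(correct, sample_size, target_accuracy):
    # Hypothetical policy: act only when even the upper end of the
    # interval falls below the accuracy the business requires.
    _, high = accuracy_interval(correct, sample_size)
    return high < target_accuracy


# Same observed accuracy (85%), different sample sizes.
print(needs_retraining(85, 100, 0.90))    # False: the interval is too wide to tell
print(needs_retraining(850, 1000, 0.90))  # True: the drop is unlikely to be noise
```

The two calls show why sample size matters: with 100 checked predictions an observed 85% is still compatible with the 90% target, while with 1,000 it is not.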

I hope this de-mystifies some of what data scientists do, and explains one of the important tools they use to extract value from large amounts of data.

Automating a Git Rebase Workflow

When I started on the Firebird team at Bazaarvoice, I was happy to learn that they host their code on GitHub and review and land changes via pull requests. I was less happy to learn that they merged pull requests with the big green button. I was able to convince the team to try out a new, rebase-oriented, workflow that keeps the mainline branch linear and clean. While the new workflow was a hit with the team, it was much more complicated than just clicking a button, so I automated the workflow with a simple git extension, git land, which we have released as an open source tool.

What’s Wrong With the Big Green Button?

The big green button is the “Merge pull request” button that GitHub provides to merge pull requests. Clicking it prompts the user to enter a commit message (or accept a default provided by GitHub) and then confirm the merge. When the user confirms the merge, the pull request branch is merged using the --no-ff option, which always creates a merge commit. Finally, GitHub closes the pull request.

For example, given a master branch like this:

An example master branch with three commits (git log output)


…and a feature branch that diverges from the second commit:

An example feature branch started from the second commit


…this is the result of doing a –no-ff merge:

[Figure: The result of merging the examples with the --no-ff option. Note that the third commit on master is interleaved with the merge commit and the feature branch commits.]


Merging with the big green button is frowned upon by many; for detailed discussions of why this is, see Isaac Z. Schlueter and Benjamin Sandofsky. In addition to the problems with merge commits that Isaac and Benjamin point out, the big green button has another downside: it merges the pull request without an opportunity to squash commits or otherwise clean up the branch.

This caused a couple of problems for the team. First, because only the pull request author can clean up the PR branch, merging often became a tedious and drawn-out process as reviewers cajoled the author to update their branch to a state that would keep `master`’s history relatively clean. Worse, sometimes messy pull requests were hastily or mistakenly merged.

As a result, the team was encouraged to keep their pull requests squashed into one or two clean commits at all times. This solved one problem, but introduced another: when an author responds to comments by pushing up a new version of the pull request, the latest changes are squashed together into one or two commits. As a result, reviewers had to hunt through the entire diff to ensure that their comments were fully addressed.

An Alternate Workflow

After some lively discussion, the team adopted a new workflow centered on fast-forward merging squashed and rebased pull request branches. Developers create topic branches and pull requests as before, but when updating their pull request, they never squash commits. This preserves detailed history of the changes the author makes in response to review feedback.

When the PR is ready to be merged, the merger interactively rebases it on the latest master, squashes it down to one or two commits, and does a fast-forward merge. The result is a clean, linear, and atomic history for `master`.

[Figure: The result of merging the example feature branch into master using the described workflow (a rebase followed by a --ff-only merge)]

One hiccup is that GitHub can’t easily tell that the rebased and squashed commit contains the changes in the pull request, so it doesn’t close the PR automatically. Fortunately, GitHub will close pull requests when a commit message contains special keywords. So, the merger has a final task: adding “[closes #<PR number>]” to the message of one of the squashed commits.


The biggest downside to the new workflow is that it transformed merging a PR from a simple operation (pushing a button) to a somewhat tricky multi-step process:

  • update local master to latest from upstream
  • check out pull request branch
  • do an interactive rebase on top of master, squashing down to one or two commits
  • add “[closes #<PR number>]” to the commit message of the most recent squashed commit
  • do a fast-forward merge of the pull request branch into master
  • push local master to upstream
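
To make the manual steps concrete, here is a small sketch that lists one plausible git command for each of them. It is not how git-land itself is implemented; the class name is hypothetical, and it assumes the upstream remote is called origin and that the pull request branch already exists locally.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper that lists the git commands behind the manual steps. */
final class LandPlan {

    /** Builds the command sequence for landing prBranch; "origin" as the upstream remote is an assumption. */
    static List<String> commands(String prBranch) {
        return Arrays.asList(
            "git checkout master",
            "git pull --ff-only origin master",   // update local master from upstream
            "git checkout " + prBranch,           // check out the pull request branch
            "git rebase --interactive master",    // squash down to one or two commits
            "git commit --amend",                 // add "[closes #<PR number>]" to the message
            "git checkout master",
            "git merge --ff-only " + prBranch,    // fast-forward only, so no merge commit
            "git push origin master");            // publish the updated master
    }

    public static void main(String[] args) {
        commands("my-feature").forEach(System.out::println);
    }
}
```

Eight commands, two of which open an editor, is a fair picture of why doing this by hand invites mistakes.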

This process was too lengthy and error-prone to be reliable unless automated. To address this problem, I created a simple git extension: git-land. The Firebird team has been using this tool for a little over a year with very few problems. In fact, it has spread to other teams at Bazaarvoice. We are excited to release it as an open source tool for the public to use.

Front End Application Testing with Image Recognition

One of the many challenges of software testing has always been cross-browser testing. Despite the web’s overall move to more standards-compliant browser platforms, we still struggle with the fact that certain CSS values or JavaScript operations don’t translate well in some browsers (cough, cough IE 8).

In this post, I’m going to show how the Curations team has upgraded their existing automation tools so that we can automate spot checks of the Curations front end’s visual display across multiple browsers, saving us time while helping to build a better product for our clients.

The Problem: How to save time and test all the things

The Curations front end is a highly configurable product that allows our clients to implement the display of moderated UGC made available through the API from a Curations instance.

This flexibility, combined with BV’s browser support guidelines, means there are a very large number of ways Curations content can be rendered on the web.

Initially, rather than attempt to test ‘all the things’, we codified a set of configurations that represent general usage patterns of how Curations is implemented. Functionally, we can test that content can be retrieved and displayed. However, when it comes to whether the end result has the right look and feel in Chrome, Firefox and other browsers, our testing is largely manual (and time-consuming).

How can we better automate this process without sacrificing consistency or stability in testing?

Our solution: Sikuli API

Sikuli is an open-source Java-based application and API that allows users to automate web, mobile and OS applications across multiple platforms using image recognition. It’s platform based and not browser specific, so it enables us to circumvent limitations with screen capture and compare features in other automation tools like Webdriver.

Imagine writing a test script that starts with clicking the home button within an iOS simulator, simply by providing the script a .png of the home button itself. That’s what Sikuli can do.

You can read more about Sikuli here. You can check out their project here on GitHub.


Sikuli provides two different products for your automation needs – their stand-alone scripting engine and their API. For our purposes, we’re interested in the Sikuli API with the goal to implement it within our existing Saladhands test framework, which uses both Webdriver and Cucumber.

Assuming you have Java 1.6 or greater installed on your workstation, follow the link on Sikuli’s download page to their standalone setup JAR.

Download the JAR file and place it in your local workstation’s home directory, then open it.

Here, you’ll be prompted by the installer to select an installation type. Select option 3 if you wish to use Sikuli in your Java or Jython project as well as have access to its command line options. Select option 4 if you only plan on using Sikuli within the scope of your Java or Jython project.

Once the installation is complete, you should have a sikuli.jar file in your working directory. You will want to add this to your collection of external JARs for your installed JRE.

For example, if you’re using Eclipse, go to Preferences > Java > Installed JREs, select your JRE version, click Edit and add sikuli.jar to the collection.

Alternately, if you are using Maven to build your project, you can add Sikuli’s API to your project by adding the following to your POM.XML file:


Clean then build your project and now you’re ready to roll.


Ultimately, we wanted a method, controllable from Cucumber, that drives a web application with Webdriver, takes a screenshot of it (in this case, an instance of Curations) and compares that to a static screenshot of specific web elements (e.g. Ratings and Review stars within the Curations display).

This test method would then either find a match for the static screen element within the live web application, or have TestNG throw an exception (test failure) if no match could be found.

First, now that we have the ability to use Sikuli, we created a new helper class that instantiates an object from their API so we can compare screen output.

import java.io.File;
import java.io.IOException;

import org.sikuli.api.*;

/**
 * Created by gary.spillman on 4/9/15.
 */
public class SikuliHelper {

    public boolean screenMatch(String targetPath) {
        // The body of this method is built up in the snippets that follow.

Once we import the Sikuli API, we create a simple class with a single class method. In this case, screenMatch is going to accept a path within the Java project relative to a static image we are going to compare against the live browser window. True or false will be returned depending on if we have a match or not.

//Sets the screen region Sikuli will try to match to full screen
ScreenRegion fullScreen = new DesktopScreenRegion();

//Set your target to compare from
Target target = new ImageTarget(new File(targetPath));

The main object type Sikuli wants to handle everything with is ScreenRegion. In this case, we are instantiating a new screen region relative to the entire desktop screen area of whatever OS our project will run on. Without passing any arguments to DesktopScreenRegion(), we will be defining the region’s dimension as the entire viewable area of our screen.

double fuzzPercent = .9;

try {
    fuzzPercent = Double.parseDouble(PropertyLoader.loadProperty("fuzz.factor"));
} catch (IOException e) {
    // fuzzPercent keeps its default of .9 if the property cannot be loaded
}

Sikuli allows you to define a fuzzing factor (if you’ve ever used ImageMagick, this should be a familiar concept). Essentially, rather than requiring a 1:1 exact match, you can define the minimum acceptable percentage by which your screen comparison must match. For Sikuli, you can define this within a range from 0.1 to 1 (i.e. a 10% match up to a 100% match).

Here we are defining a default minimum match (or fuzz factor) of 90%. Additionally, we load a value from Saladhands’ properties file which, if present, overrides the default 90% match, should we wish to increase or decrease the severity of the test criteria.
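
The default-plus-override idea can be tried on its own with plain java.util.Properties. This is only a sketch: FuzzFactor is a hypothetical stand-in, since Saladhands’ PropertyLoader is not shown here, and unlike the snippet above it also falls back to the default when the value is not a number.

```java
import java.util.Properties;

/** Hypothetical stand-in for the property lookup; PropertyLoader itself is not shown in the post. */
final class FuzzFactor {

    static final double DEFAULT_MIN_SCORE = 0.9;

    /** Returns the configured fuzz.factor, or the 90% default when it is absent or not a number. */
    static double resolve(Properties props) {
        String raw = props.getProperty("fuzz.factor");
        if (raw == null) {
            return DEFAULT_MIN_SCORE;
        }
        try {
            return Double.parseDouble(raw.trim());
        } catch (NumberFormatException e) {
            return DEFAULT_MIN_SCORE;
        }
    }

    public static void main(String[] args) {
        Properties props = new Properties();
        props.setProperty("fuzz.factor", "0.75");
        System.out.println(resolve(props));              // configured value wins
        System.out.println(resolve(new Properties()));   // falls back to the default
    }
}
```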

target.setMinScore(fuzzPercent);

Now that we know what fuzzing percentage we want to test with, we use target’s setMinScore method to set that property.

ScreenRegion found = fullScreen.find(target);

//According to code examples, if the image isn't found, the screen region is undefined
//So... if it remains null at this point, we're assuming there's no match.

if(found == null) {
    return false;
} else {
    return true;
}

This is where the magic happens. We create a new screen region called found. We then define that using fullScreen’s find method, providing the path to the image file we will use as comparison (target).

What happens here is that Sikuli will take the provided image (target) and attempt to locate any instance within the current visible screen that matches target, within the lower bound of the fuzzing percentage we set and up to a full, 100% match.

The find method either returns a new screen region object, or returns nothing. Thus, if we are unable to find a match for the image file behind target, found will remain undefined (null). So in this case, we simply return false if found is null (no match) or true if found is assigned a new screen region (we had a match).

Putting it all together:

To completely incorporate this behavior into our test framework, we write a simple cucumber step definition that allows us to call our Sikuli helper method, and provide a local image file as an argument for which to compare it against the current, active screen.

Here’s what the cucumber step looks like:

public class ScreenShotSteps {

    SikuliHelper sk = new SikuliHelper();

    //Given the image "X" can be found on the screen
    @Given("^the image \"([^\"]*)\" can be found on the screen$")
    public void the_image_can_be_found_on_the_screen(String arg1) {

        String screenShotDir=null;

        try {
            screenShotDir = PropertyLoader.loadProperty("screenshot.path").toString();
        } catch (IOException e) {
            // screenShotDir stays null if the property cannot be loaded
        }

        Assert.assertTrue(sk.screenMatch(screenShotDir + arg1));
    }
}

We’re referring to the image file via regex. The step definition makes an assertion using TestNG that the value returned from our instance of SikuliHelper’s screen match method is true (Success!!!). If not, TestNG throws an exception and our test will be marked as having failed.

Finally, since we already have cucumber steps that let us invoke and direct Webdriver to a live site, we can write a test that looks like the following:

Feature: Screen Shot Test
As a QA tester
I want to do screen compares
So I can be a boss ass QA tester

Scenario: Find the nav element on BV's home page
Given I visit &quot;;
Then the image "screentest1.png" can be found on the screen

In this case, the image we are attempting to find is a portion of the nav element on BV’s home page:



This is not a full-stop solution to cross browser UI testing. Instead, we want to use Sikuli and tools like it to reduce overall manual testing as much as possible (as reasonably as possible) by giving the option to pre-warn product development teams of UI discrepancies. This can help us make better decisions on how to organize and allocate testing resources – manual and otherwise.

There are caveats to using Sikuli. The most explicit caveat is that tests designed with it cannot run headlessly: the test tool requires a real, actual screen to capture and manipulate.

Obviously, the other possible drawback is the required maintenance of local image files you will need to check into your automation project as test artifacts. How deep you will be able to go with this type of testing may be tempered by how large of a file collection you will be able to reasonably maintain or deploy.

Despite that, Sikuli seems to have a large number of powerful features, not limited to being able to provide some level of mobile device testing. Check out the project repository and documentation to see how you might be able to incorporate similar automation code into your project today.

Predictively Scaling EC2 Instances with Custom CloudWatch Metrics

One of the chief promises of the cloud is fast scalability, but what good is snappy scalability without load prediction to match? How many teams out there are still manually switching group sizes when load spikes? If you would like to make your Amazon EC2 scaling more predictive, less reactive and hopefully less expensive, this article is intended to help.

Problem 1: AWS EC2 Autoscaling Groups can only scale in response to metrics in CloudWatch and most of the default metrics are not sufficient for predictive scaling.

For instance, by looking at the CloudWatch Namespaces reference page we can see that Amazon SQS queues, EC2 Instances and many other Amazon services post metrics to CloudWatch by default.

From SQS you get things like NumberOfMessagesSent and SentMessageSize. EC2 Instances post metrics like CPUUtilization and DiskReadOps. These metrics are helpful for monitoring. You could also use them to reactively scale your service.

The downside is that by the time you notice that you are using too much CPU or sending too few messages, you’re often too late. EC2 instances take time to start up and instances are billed by the hour, so you’re either starting to get a backlog of work while starting up or you might shut down too late to take advantage of an approaching hour boundary and get charged for a mostly unused instance hour.

More predictive scaling would start up the instances before the load became business critical or it would shut down instances when it becomes clear they are not going to be needed instead of when their workload drops to zero.

Problem 2: AWS CloudWatch default metrics are only published every 5 minutes.

A lot can happen in five minutes; with more granular metrics you could learn about your scaling needs quite a bit faster. Our team has instances that take about 10 minutes to come online, so 5 minutes can make a lot of difference to our responsiveness to changing load.

Solution 1 & 2: Publish your own CloudWatch metrics

Custom metrics can overcome both of these limitations: you can publish metrics related to your service’s needs, and you can publish them much more often.

For example, one of our services runs on EC2 instances and processes messages off an SQS queue. The load profile can vary over time; some messages can be handled very quickly and some take significantly more time. It’s not sufficient to simply look at the number of messages in the queue as the average processing speed can vary between 2 and 60 messages per second depending on the data.

We prefer that all our messages be handled within 2 hours of being received. With this in mind I’ll describe the metric we publish to easily scale our EC2 instances.

ApproximateSecondsToCompleteQueue = MessagesInQueue / AverageMessageProcessRate

The metric we publish is called ApproximateSecondsToCompleteQueue. A scheduled executor on our primary instance runs every 15 seconds to calculate and publish it.
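
As a minimal sketch of the calculation and the 15-second schedule, the following uses only the JDK. The queue depth and processing rate are placeholder numbers, and printing stands in for the actual publish call, so treat it as an illustration rather than our production code.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

final class QueueEstimate {

    /** ApproximateSecondsToCompleteQueue = MessagesInQueue / AverageMessageProcessRate. */
    static double approximateSecondsToCompleteQueue(long messagesInQueue, double messagesPerSecond) {
        if (messagesPerSecond <= 0) {
            throw new IllegalArgumentException("processing rate must be positive");
        }
        return messagesInQueue / messagesPerSecond;
    }

    public static void main(String[] args) throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        // Placeholder inputs; a real service would read the queue depth and measured rate.
        Runnable publish = () -> System.out.println(
            approximateSecondsToCompleteQueue(36_000, 2.0));
        scheduler.scheduleAtFixedRate(publish, 0, 15, TimeUnit.SECONDS);
        Thread.sleep(100);          // let the first run happen, then stop the demo
        scheduler.shutdownNow();
    }
}
```

With 36,000 queued messages, a rate of 2 messages per second gives 18,000 seconds (5 hours), while 60 messages per second gives 600 seconds, which is why queue depth alone is not a good scaling signal.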

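private AmazonCloudWatchClient _cloudWatchClient = new AmazonCloudWatchClient();

// queueName and secondsToComplete are placeholder variables; the original snippet is truncated
PutMetricDataRequest request = new PutMetricDataRequest()
  .withNamespace("Custom/Namespace")
  .withMetricData(new MetricDatum()
    .withMetricName("ApproximateSecondsToCompleteQueue")
    .withDimensions(new Dimension()
      .withName("QueueName")
      .withValue(queueName))
    .withValue(secondsToComplete));

_cloudWatchClient.putMetricData(request);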

In our CloudFormation template we have a parameter called DesiredSecondsToCompleteQueue and by default we have it set to 2 hours (7200 seconds). In the Auto Scaling Group we have a scale up action triggered by an Alarm that checks whether DesiredSecondsToCompleteQueue is less than ApproximateSecondsToCompleteQueue.

"EstimatedQueueCompleteTime" : {
  "Type": "AWS::CloudWatch::Alarm",
  "Condition": "HasScaleUp",
  "Properties": {
    "Namespace": "Custom/Namespace",
    "Dimensions": [{
      "Name": "QueueName",
      "Value": { "Fn::Join" : [ "", [ {"Ref": "Universe"}, "-event-queue" ] ] }
    }],
    "MetricName": "ApproximateSecondsToCompleteQueue",
    "Statistic": "Average",
    "ComparisonOperator": "GreaterThanThreshold",
    "Threshold": {"Ref": "DesiredSecondsToCompleteQueue"},
    "Period": "60",
    "EvaluationPeriods": "1",
    "AlarmActions" : [{
      "Ref": "ScaleUpAction"
    }]
  }
}
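
To see what this alarm condition amounts to, here is a simplified sketch of the comparison: the average of the datapoints in one 60-second period against the 7200-second threshold. The sample values are invented, and real CloudWatch evaluation has details (such as missing data handling) that this ignores.

```java
import java.util.Arrays;

final class ScaleUpAlarm {

    /** Mirrors "Statistic": "Average" with "ComparisonOperator": "GreaterThanThreshold". */
    static boolean inAlarm(double[] datapointsInPeriod, double thresholdSeconds) {
        double average = Arrays.stream(datapointsInPeriod).average()
            .orElseThrow(() -> new IllegalArgumentException("no datapoints in period"));
        return average > thresholdSeconds;
    }

    public static void main(String[] args) {
        double desired = 7200;
        double[] oneMinute = {7000, 7100, 7300, 7500};   // four 15-second samples (invented)
        System.out.println(inAlarm(oneMinute, desired));
    }
}
```

Here the four samples average 7225 seconds, which is above 7200, so the scale up action would fire even though two of the individual samples are below the threshold.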


Visualizing the Outcome

What’s a cloud blog without some graphs? Here’s what our load and scaling looks like after implementing this custom metric and scaling. Each of the colors in the middle graph represents a service instance. The bottom graph is in minutes for readability. Note that our instances terminate themselves when there is nothing left to do.


I hope this blog has shown you that it’s quite easy to publish your own CloudWatch metrics and scale your EC2 AutoScalingGroups accordingly.

Upgrading Dropwizard 0.6 to 0.7

At Bazaarvoice we use Dropwizard for a lot of our Java-based SOA services. Recently I upgraded our Dropwizard dependency from 0.6 to the newer 0.7 version on a few different services. Based on this experience I have some observations that might help any other developers attempting to do the same thing.

Package Name Change
The first change to look at is the new package naming. The new io.dropwizard package replaces com.yammer.dropwizard. If you are using codahale’s metrics library as well, you’ll need to change com.yammer.metrics to com.codahale.metrics. I found that this was a good place to start the migration: if you remove the old dependencies from your pom.xml you can start to track down all the places in your code that will need attention (if you’re using a sufficiently nosy IDE).

- com.yammer.dropwizard -> io.dropwizard
- com.yammer.dropwizard.config -> io.dropwizard.setup
- com.yammer.metrics -> com.codahale.metrics
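
If you want to script a first pass over your imports, the mapping above can be applied with a longest-prefix-first lookup, as in this sketch. PackageRenamer is a hypothetical helper, and it is only a starting point: individual classes did not all land in the package this mapping suggests, so expect to fix some imports by hand.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PackageRenamer {

    // Insertion order matters: the more specific prefix must be tried first.
    private static final Map<String, String> RENAMES = new LinkedHashMap<>();
    static {
        RENAMES.put("com.yammer.dropwizard.config", "io.dropwizard.setup");
        RENAMES.put("com.yammer.dropwizard", "io.dropwizard");
        RENAMES.put("com.yammer.metrics", "com.codahale.metrics");
    }

    /** Rewrites a fully qualified name using the first matching old prefix. */
    static String rename(String qualifiedName) {
        for (Map.Entry<String, String> e : RENAMES.entrySet()) {
            String oldPrefix = e.getKey();
            if (qualifiedName.equals(oldPrefix) || qualifiedName.startsWith(oldPrefix + ".")) {
                return e.getValue() + qualifiedName.substring(oldPrefix.length());
            }
        }
        return qualifiedName;
    }

    public static void main(String[] args) {
        System.out.println(rename("com.yammer.dropwizard.config.Environment"));
        System.out.println(rename("com.yammer.dropwizard.Bundle"));
    }
}
```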

Class Name Change
aka: where did my Services go?

Something you may notice quickly is that the Service class is gone; it has been renamed to Application.

- Service -> Application

Configuration Changes
The Configuration object hierarchy and yaml organization has also changed. The http section in yaml has moved to server with significant working differences.

Here’s an old http configuration:

http:
  port: 8080
  adminPort: 8081
  connectorType: NONBLOCKING
  requestLog:
    console:
      enabled: true
    file:
      enabled: true
      archive: false
      currentLogFilename: target/request.log

and here is a new server configuration:

server:
  applicationConnectors:
    - type: http
      port: 8080
  adminConnectors:
    - type: http
      port: 8081
  requestLog:
    appenders:
      - type: console
      - type: file
        currentLogFilename: target/request.log
        archive: true

There are at least two major things to notice here:

  1. You can create multiple connectors for either the admin or application context. You can now serve several different protocols on different ports.
  2. Logging is now appender based, and you can configure a list of appenders for the request log.

Speaking of appender-based logging, the logging configuration has changed as well.

Here is an old logging configuration:

logging:
  console:
    enabled: true
  file:
    enabled: true
    archive: false
    currentLogFilename: target/diagnostic.log
  level: INFO
  loggers:
    "org.apache.zookeeper": WARN
    "com.sun.jersey.spi.container.servlet.WebComponent": ERROR

and here is a new one:

logging:
  level: INFO
  loggers:
    "org.apache.zookeeper": WARN
    "com.sun.jersey.spi.container.servlet.WebComponent": ERROR
  appenders:
    - type: console
    - type: file
      archive: false
      currentLogFilename: target/diagnostic.log

Now that you can configure a list of logback appenders, you can write your own or get one from a library. Previously this kind of logging configuration was not possible without significant hacking.

Environment Changes
The whole environment API has been redesigned for more logical access to different components. Rather than just making calls to methods on the environment object, there are now six component-specific environment objects to access.

JerseyEnvironment jersey = environment.jersey();
ServletEnvironment servlets = environment.servlets();
AdminEnvironment admin = environment.admin();
LifecycleEnvironment lifecycle = environment.lifecycle();
MetricRegistry metrics = environment.metrics();
HealthCheckRegistry healthCheckRegistry = environment.healthChecks();

AdminEnvironment extends ServletEnvironment since it’s just the admin servlet context.

By treating the environment as a collection of libraries rather than a Dropwizard monolith, fine-grained control over several configurations is now possible and the underlying components are easier to interact with.

Here is a short rundown of the changes:

Lifecycle Environment
Several common methods were moved to the lifecycle environment, and the pattern for building Executor services has changed.

Dropwizard 0.6:

     ExecutorService service = environment.managedExecutorService("worker-%", minPoolSize, maxPoolSize, keepAliveTime, duration);
     ExecutorServiceManager esm = new ExecutorServiceManager(service, shutdownPeriod, unit, poolname);
     ScheduledExecutorService scheduledService = environment.managedScheduledExecutorService("scheduled-worker-%", corePoolSize);

Dropwizard 0.7:

     ExecutorService service = environment.lifecycle().executorService("worker-%").build();
     ExecutorServiceManager esm = new ExecutorServiceManager(service, Duration.seconds(shutdownPeriod), poolname);
     ScheduledExecutorService scheduledExecutorService = environment.lifecycle().scheduledExecutorService("scheduled-worker-%").build();

Other Miscellaneous Environment Changes
Here are a few more common environment configuration methods that have changed:

Dropwizard 0.6:

environment.addHealthCheck(new DeadlockHealthCheck());

environment.addFilter(new LoggerContextFilter(), "/loggedpath");

environment.addServlet(PingServlet.class, "/ping");

Dropwizard 0.7:

environment.healthChecks().register("deadlock-healthcheck", new ThreadDeadlockHealthCheck());

environment.servlets().addFilter("loggedContextFilter", new LoggerContextFilter()).addMappingForUrlPatterns(EnumSet.allOf(DispatcherType.class), true, "/loggedpath");

environment.servlets().addServlet("ping", PingServlet.class).addMapping("/ping");

Object Mapper Access

It can be useful to access the objectMapper for configuration and testing purposes.

Dropwizard 0.6:

ObjectMapper objectMapper = bootstrap.getObjectMapperFactory().build();

Dropwizard 0.7:

ObjectMapper objectMapper = bootstrap.getObjectMapper();

Access to the HTTP configuration has changed a lot; it is much more configurable and not quite as simple as before.

Dropwizard 0.6:

HttpConfiguration httpConfiguration = configuration.getHttpConfiguration();
int applicationPort = httpConfiguration.getPort();

Dropwizard 0.7:

HttpConnectorFactory httpConnectorFactory = (HttpConnectorFactory) ((DefaultServerFactory) configuration.getServerFactory()).getApplicationConnectors().get(0);
int applicationPort = httpConnectorFactory.getPort();

Test Changes
The functionality provided by extending ResourceTest has been moved to ResourceTestRule.

Dropwizard 0.6:

import com.yammer.dropwizard.testing.ResourceTest;

public class Dropwizard6ServiceResourceTest extends ResourceTest {
  @Override
  protected void setUpResources() throws Exception {
    addFeature("booleanFeature", false);
    addProperty("integerProperty", new Integer(1));
  }
}

Dropwizard 0.7:

import io.dropwizard.testing.junit.ResourceTestRule;
import org.junit.Rule;

public class Dropwizard7ServiceResourceTest {

  @Rule
  public ResourceTestRule resources = setUpResources();

  protected ResourceTestRule setUpResources() {
    return ResourceTestRule.builder()
      .addFeature("booleanFeature", false)
      .addProperty("integerProperty", new Integer(1))
      .build();
  }
}

Dependency Changes

Dropwizard 0.7 has new dependencies that might affect your project. I’ll go over some of the big ones that I ran into during my migrations.

Guava 18.0 has a few API changes:

  • Closeables.closeQuietly now only accepts an InputStream or a Reader instead of anything implementing Closeable.
  • All the methods on HashCodes have been migrated to HashCode.
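
If your code relied on quietly closing arbitrary Closeable objects, one possible replacement is a small helper of your own, sketched below with a hypothetical class name. Try-with-resources is often the better fix; this is just the smallest drop-in for existing call sites.

```java
import java.io.ByteArrayOutputStream;
import java.io.Closeable;
import java.io.IOException;

final class QuietCloser {

    /** Closes any Closeable, swallowing IOException; null is ignored. */
    static void closeQuietly(Closeable closeable) {
        if (closeable == null) {
            return;
        }
        try {
            closeable.close();
        } catch (IOException e) {
            // Deliberately ignored here; a real helper might log the failure instead.
        }
    }

    public static void main(String[] args) {
        closeQuietly(new ByteArrayOutputStream());   // not an InputStream, still accepted
        closeQuietly(() -> { throw new IOException("boom"); });   // exception is swallowed
        System.out.println("closed quietly");
    }
}
```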

Metrics 3.0.2 is a pretty big revision to the old version: there is no longer a static Metrics object available as the default registry. Now MetricRegistries are instantiated objects that need to be managed by your application. Dropwizard 0.7 handles this by giving you a place to put the default registry for your application: bootstrap.getMetricRegistry().

Compatible library version changes
These libraries changed versions but required no other code changes. Some of them are changed to match Dropwizard dependencies, but are not directly used in Dropwizard.



Coursera Metrics-Datadog


Apache Curator

Amazon AWS SDK

Future Concerns
Dropwizard 0.8
The newest version of Dropwizard is now 0.8. Once it is proven stable, we’ll start migrating. Hopefully I’ll find time to write another post when that happens.

Thank You For Reading
I hope this article helps.