Three Takeaways from CSSConf 2016

This year Bazaarvoice sponsored CSSConf 2016 in beautiful Boston, MA, USA and I was able to attend!
Here are my three top takeaways from CSSConf 2016:

Flexy Flexy Flexbox

A little over a year ago, our application team wasn’t sure how “stable” Flexbox or its spec were: there was already an old syntax, a new syntax, and a weird IE10 “tweener” syntax.

The layout advantages Flexbox brought were strong enough (*cough* vertical centering) that we decided to move forward with it and prefix all the things. Now browser support is so good that if you can drop IE8 and work around some known IE11 bugs, there is no reason not to use Flexbox in your designs right now.

A great reference I keep going back to for Flexbox is this css-tricks guide. Here are some other tips and tricks from the conference:

  • Flexbox is now available in Bootstrap 4
  • Use CSS Grid (when it becomes available) for major page layout, and Flexbox for UI elements
  • For mobile / small screens: add a media query and set the flex-direction to column to stack your cells instead (see the sketch after this list)
  • Do as much as you can on the container to keep your code DRY (Don’t Repeat Yourself)
  • We can finally get rid of that pesky col-2, col-8, col-crapIAddedWrong grid system!
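As a minimal sketch of the media-query and keep-it-on-the-container tips above (the class name and 600px breakpoint are made up for illustration):

```css
/* Do as much as you can on the container; stack the cells on small screens. */
.cards {
  display: flex;
  justify-content: space-between;
}

@media (max-width: 600px) {
  .cards {
    flex-direction: column; /* stack the cells instead of laying them out in a row */
  }
}
```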

For more information, I recommend CSS4 Grid: True Layout Finally Arrives by Jen Kramer and It’s Time To Ditch The Grid System by Emily Hayman.

Stop Thinking In Pixels

The talk Stop Thinking In Pixels by Keith Grant was particularly enlightening.
The basic premise was not to micromanage your CSS:

Without fully understanding what CSS is doing for us, we try to push through it to control exactly what is going on in the browser.

Driving this point home, Keith recommended that we stop thinking in pixels because

The pixels don’t matter. Let the browser do it.

You should instead be thinking in terms of the em and rem. Tools that simply convert px to em aren’t the answer either — you’re still thinking in terms of pixels and therefore missing out on some important benefits. Instead, learn from something like type scale and approach measurements with a fresh perspective.

I recommend watching the talk in full, but a quick cheatsheet follows:

  • font-size: rem
  • padding, margin, border-radius, etc.: em
  • border-width: px

When in doubt, use em.
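In CSS, that cheatsheet looks something like this (a minimal sketch; the specific values are arbitrary examples, not recommendations):

```css
.card {
  font-size: 1.25rem;     /* rem: relative to the root font size */
  padding: 1em;           /* em: scales with this element's own font-size */
  border-radius: 0.5em;
  border: 1px solid #ccc; /* px: hairline borders don't need to scale */
}
```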

To summarize,

Ems are the most powerful when you fully embrace them.

Apps vs Documents

In this day and age we are all used to thinking in terms of “apps”. But the trinity of HTML, CSS, and JS was not conceived in this day and age. Two great quotes I wrote down from Component-Based Style Reuse by Pete Hunt are

CSS is great for documents, maybe not 2016 Apps


If you sat down and created styling in 2016, you would not come up with CSS

Our newest applications are written in React, which encourages developers to think of things in terms of components — pieces of UI that are reusable in different contexts. The Cascading part of CSS interferes with that, however: depending on the context your component is dropped into, it may look drastically different across usages. When that is not what you want, Pete’s ideas center around reusing components, not CSS classes.
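As a toy illustration of that idea (not necessarily Pete's exact approach, just one way it shows up in React code), a component can carry its styling with it so it looks the same wherever it is dropped in:

```js
const React = require('react');

// The styling travels with the component instead of cascading in from a class name.
const buttonStyle = {
  padding: '0.5em 1em',
  borderRadius: '4px',
  background: '#003366',
  color: '#fff',
};

function PrimaryButton(props) {
  return React.createElement('button', { style: buttonStyle }, props.children);
}

module.exports = PrimaryButton;
```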

As you can imagine, this idea is controversial at a conference with a name like CSSConf, but I will continue to keep my eye on it. Pete’s thought leadership on this topic inspires me to challenge norms and dare to envision things differently. After all, if we’re constantly fighting with our tool (CSS), that tool may not be right for the job.

Thanks for reading! For a full list of talks and slides from the conference, check out the CSSConf 2016 site.

The one thing you need that will make you and your team successful

Over the last 20 years of my career, I have worked with a lot of different people and lot of different teams. Some were very successful, and some were not. I am always trying to understand what makes successful people tick, and what I can do differently to be more successful.

Your Energy

The one thing I have consistently found to be the determining factor in whether a person or team will be successful is their energy. Energy takes a lot of different forms and states. There is high energy and low energy, positive energy and negative energy. Everything in the universe is energy. Everything has a force and a pull and a gravity. Every person has an energy. Some people call it a “spark”, or “Spirit”, or “vibration”. If you pay close enough attention, you can see it, and sometimes you can even feel it. Have you ever walked into a room where a group of people had been discussing something serious and felt the negative energy in the room? Have you ever worked with a very successful, charismatic leader who just seemed to attract winners? Why is that? What is that?

The most successful people and teams that I have encountered or had the privilege to work with have very positive and high energy. Everyone on the team is excited and passionate to be working there. They love what they do, they love working with the people on their team, they know they are going to win, they know they are making a positive difference in the world, and they have fun winning.

That kind of positive high energy feeds and builds on itself. The more positive high energy people you get together, the better the team will be. It’s like waves in an ocean oscillating at the same frequency. They multiply the goodness. These are the people who see the future, and the solutions and the answers and choose not to focus on the past or the problems.

How does your energy affect others around you? How does others’ energy affect you? How does your boss’s energy affect you?

Emotional Contagion

The opposite of positive high energy is negative low energy. It manifests itself in people who fixate on the past and the problems. They are the people who think that it can’t be done. They are the “We’ve tried that before and it didn’t work” people. They are the naysayers. They typically start all new requests for change with “No!”. They are always bickering or blaming others or finger pointing or micromanaging.

There are many possible reasons for a person’s negativity. It could be their FUD (Fear Uncertainty and Doubt) about the situation. It could be their fear of failure, or their fear of embarrassment, or their fear of looking stupid in front of their peers, or their insecurity in their position.

The truth is that your emotional state is affected by those around you. Just like positive energy builds on itself, so does negative energy. It’s like a rotten apple, or a cancer. It grows and infects others who are near it. I have seen one negative energy person bring down an entire team.

And as with any cancer, it has to be cut out quickly. It’s easy to say, but hard to do. I have seen too many good managers whom I respected wait far too long, letting a negative energy person fester in the team.

It’s Up to You

The reality is that you have the power to choose how you exert your own energy. You have the power to choose who you want to work with. Not everyone is positive high energy all the time. We all have our good days and bad days, but you can choose to be passionate and excited and have a high positive energy, or you can be the opposite.

So how can you get started growing your positive high energy?

  • Acknowledge that we aren’t perfect, but we can try to make things better.
  • Ignore the naysayers
  • Don’t fixate on what other people think (or what you think they think about you)
  • Do what you think is right
  • Stay in the present, and dream of the future. You can’t live in the past.
  • Acknowledge it, learn from it, and move on
  • Imagine an amazing future. What would that look like? Feel like?
  • Exercise and eat healthy
  • Use positive words. Say Thank You.
  • Smile and laugh
  • Play games and have fun.

Next Up

Tell me what part of our story you want to hear next. How do you build a team and culture that enables you to execute on your vision? Follow me on twitter @bchagoly and @bazaarvoicedev to be the first to read new related posts and to join the conversation.

Intern Demo Day

As the summer comes to an end, so do the internships for numerous university students here at Bazaarvoice. This past week, the interns were given an opportunity to present a summary of their accomplishments. This afternoon of presentations, known as the Bazaarvoice “Intern Demo Day”, highlighted the various achievements throughout the company, not just in the R&D department.

The following is a short summary of the great work our interns completed this summer, as well as some images from the “Intern Demo Day”.

CHASE PORTER: My project, which I have named “The Great (red)Shift”, is intended to improve data accessibility for computed aggregated counts of various canonical events written to HBase. To do this I designed a data warehouse in Amazon Redshift that I loaded with transformed aggregated counts extracted from the tables in HBase. This makes the counts readily SQL query-able in an incredibly fast system, whereas before they had to be computed with performance-heavy queries from Raw Logs generated by Cookie Monster. The biggest blocker for this project was processing the data from HBase, which was stored as serialized bytes and needed to be handled uniquely for different types of canonical events (i.e. pageviews, impressions, features) to translate into a readable form for Redshift.

BEN DEVORE: My product is a web crawler written in Node.js that scrapes clients’ webpages for product data in order to build their product feeds for them. For many of Bazaarvoice’s smaller clients, building and maintaining their product feed is a significant obstacle in the onboarding process. This tool aims to clear that obstacle by taking the task out of their hands.

STONEY MCCRARY: So I have been fortunate enough to work on several different pieces in Curations, but I am going to talk about what I have been hammering on for the last couple of weeks. More and more of our high-volume clients are receiving millions of hits a day, and this has caused performance to become a higher-priority problem for them. In response, we are focusing our efforts on building a new display with performance in mind. Performance for the display centers around providing only the minimal amount of data needed and supplying the rest as necessary. The piece I will be showing is the display carousel and how it dynamically loads and dumps data to allow for faster loading and to keep browser memory low.

ZESHAN ANWAR: Eagle is a dashboard built for our Incubator team. With so many moving parts, it was important we had a summarized ‘birds-eye’ view of the team in one place. Eagle was initially meant to be an aggregation of all our Jenkins builds; a single view of all our jobs across our different Jenkins environments. However, it grew to also include JIRA and GitHub statistics. My other project was optimizing our UI tests by having them run concurrently. Our old UI tests were extremely slow, and by running them in parallel we drastically reduced test times.



BRENDON KELLEY: Testing Framework: This summer my project was to help build out a new testing framework for Curations. The current automation tests for Curations use Saladhands. Before my internship, there weren’t many, if any, automation tests for the submission/direct upload capability of Curations. I worked on creating tests and a CI environment for submission in a new testing framework called Intern. One of the tests is a language translation test that uses MongoDB as an endpoint to store the various languages’ text. Intern is a JavaScript-based testing framework, which will allow developers to contribute to writing tests since Curations is mostly JavaScript. I’ve also worked on updating and creating new console tests in this framework. The foundation built this summer in Intern will enable further contributions to the framework.

KRYSTINA DIAO: My main project for the summer was to analyze and report the effectiveness of the implementation of the new Connections Knowledgebase. Through Salesforce, I collected and analyzed the number of cases, time spent on each case, etc. After drawing my conclusions, I decided to present my findings via data visualization methods (JavaScript’s C3 and D3 libraries) and provide actionable insights on how this information can be leveraged. This information is valuable in that it can be used for future product KB decisions, as well as understanding how much time, manpower, and money is saved by having a KB.


MARKO SAVIC: Over the summer, I was a part of the SEO Team. I created a tool that generates pagesManaged and keywordsManaged feeds for every Spotlights client. The generated feeds will be consumed by SeoClarity tools on a daily basis. This helps in identifying search rank gains on the specified keywords and pages where Spotlights are present. The SeoClarity reporting will help in proving out Spotlights value and eventually lead to Spotlights renewals/upticks. Also, I made algorithm tweaks to the PINS (Post Interaction Notification System) Generator that take into account product activeness, product relevancy, and review count, and use them to ask the user to write reviews on the most relevant products.

TREVOR NELLIGAN: Here is a description of my project: I worked on the Aperture component library and many of the projects it supports. Aperture is built in React, and its purpose is to be used as an internal Bazaarvoice tool for constructing web pages. Using Aperture, anyone at Bazaarvoice can easily create a functional, intuitive, Bazaarvoice-themed webpage, all with the building blocks Aperture provides.

Using the Aperture library, I helped construct numerous pages for the curations beta console. I personally built the interface for a new client-facing template builder, which will allow clients to create curations templates quickly and easily without having to go through an implementation engineer and a long process, as was the case previously. I also supplied custom Aperture components for several projects, like the content curation beta page.

RAMIE RAUFDEEN: The mixer is a component of our product recommendations engine which differentiates shoppers and optimizes recommendations for them. This is primarily derived from their shopping behavior – in real time. Prior to the mixer, product recommendations were aggregated from multiple sources using the same algorithm for every shopper. Shoppers are now categorized based on a set of rules (using the shopper’s profile data), and each rule maps to a plan (which you can think of as an ‘algorithm’). A plan defines how recommendations should be mixed from each of the sources. For example, if source B has proven to have a higher conversion rate for ‘heavy-shoppers’, the plan for ‘heavy-shopper’ would give a higher weighting to source B. We can now target specific types of shoppers when it comes to product recommendations. This also sets the groundwork for a more granular machine learning implementation in the future.


We want to thank all the interns who spent time with us this summer and wish you the best back at school. We look forward to hearing about all the great things you all develop in the future.

If you are interested in an internship at Bazaarvoice, please contact us.

Don’t Look Back in Anger: Retrospectives, Software Development and How Your Team Can Improve

Retrospective – This term can elicit a negative response (verbal and physical) from people in the software development industry.  After all, it is a bit of a loaded term.  Looking back can be painful, especially since that usually means looking back at mistakes, missteps and decisions we might want to take back.

I have worked for over a decade in software development and information technology, and during that time I’ve been involved in many meetings you could label as ‘retrospective’.  Some of these have been great, others terrible.  Some were over-long sessions of moaning and wailing resulting in a great deal of sound and fury, signifying little, while others were productive discussions that led to positive, foundational change in the way a team operated.

Looking back at all of that, I’ve realized a few common truths to the process of retrospectives and how it relates to building software:

  • They’re necessary – there’s always room for improvement.
  • Don’t focus on the negative – focus on the constructive (i.e., “how we can improve”)
  • You need an actionable to-do list – figure out what to change, quickly, then act on it

For the past five months, my team has been employing a process to facilitate retrospectives based around the three bullet points above to foment positive change in the way our team works.  This blog post will detail how we do this.  This is not to say, “you should perform retrospectives and you should do them this exact way”, because every software team works differently.  This however is to say, “here’s an example of how the retrospective process was done with success”.  Maybe it could work for you or give you ideas on how you can form your own process that works best for your team.

Step 1: Recognize!








OK, let’s get the grisly stuff out of the way first:  Your team needs to improve.  And that is OK – because no team is perfect, and no matter how well we execute on our goals, there’s always room to improve.  In fact, if your team is not assessing or reassessing how you design software at some regular interval, you’re probably ‘doing it wrong’ because (can you smell the theme I’m cooking up here yet?) – there’s always room to improve.  If you’re exploring or actively participating in management concepts like Kanban/continuous improvement, this process is essential.

Step 2: Togetherness

OK – you want to improve your process and are ready to discuss how that can be done.  How does the rest of your team feel about that?  You can drag a horse to water and you can drag your teammates into a meeting, but you can’t make either drink (unless you’re looking to rack up some workplace violations and future sensitivity training).

It’s important to get a consensus from the entire team in question before proceeding with a retrospective.  This will actually be easier than some might think.  After all – in software engineering, we’re all trying to solve complex problems.  This is just another problem that needs figuring out.

Before going further, you’re going to need a few things:

  • Somewhere to meet
  • Someone to take notes
  • Someone to keep time
  • Somewhere to post your meeting notes (Jira, Confluence page, post-it notes, whiteboard, whatever works for you)
  • Some time (30 minutes to 1 hour, depending on the size of your team)

Step 3: Good Vibrations


Come on, you know you loved this song back in the day.

The tough stuff is out of the way at this point.  You and your team have decided to fine-tune your process and have met in a room together to do so (the 2 boxes of Shipley’s donuts you brought with you are barely a tangent in this case).









When we started performing retrospectives regularly on the Shopper Marketing team, we started our discussion by talking about our recent successes as a team.

In this case, we as a team went around the room one by one, and each of us offered one thing we felt the team did well during the past period (sprint, release cycle, week, microfortnight, etc. – the interval doesn’t matter, whatever works for your team).  Doing this has a couple of functions:

  • It helps the team jog its memory (a lot can happen in a short amount of time).
  • It helps celebrate what the team has done recently (wallow in your success!).
  • It sets the right tone (before we focus on what needs to change, let’s give a shout-out to what we’re doing right).

Note that, especially with larger teams, sometimes one’s perception of how well something might have gone might be very different from someone else’s.  If Jane, your engineering head for your web service says she thought the release of the bulk processor was on time and under budget but Bill, your UI lead contends that it in fact wasn’t on time, wasn’t under budget and wiped out an entire village and made some kids cry, then this is the point where you should pause the retro and have a brief discussion about the matter and possibly a follow-up later.  This leads us to our next step…


Nick Young asking the important questions of our time.

Step 4: What Could Be Better?

Now that you’re full of happy thoughts about what you’ve done well as a team (and full of donuts), it’s time to discuss change.  This process is the same as the previous step, only this time each team member must call out one thing they think the team could have done better.


Mmmmmmm... Donuts...

Note that this isn’t about what was bad, or what sucked, or so-and-so is a doofus and has bad breath – but what can the team do better.  In order to keep the discussion productive and expedient, each person is limited to 1 item and that item has to be something the team has direct agency over.  You also might find it handy to nominate a moderator for this point of discussion.

Here’s an example of a real issue that was brought up in a retrospective at a previous company I worked for and the right and wrong way to frame it when it comes to focusing on ‘your team’s agency’.

“Marketing and sales sold our client a bunch of features that our product doesn’t do and we had to work 90+ hours this week just to deliver it”

– Or –

“We should work and communicate closer with marketing and sales so they understand our product features as well as the effort required in developing said features”

While the first statement may be true, it’s the second one that actually encapsulates the problem in a way the team has a means of dealing with.  I found that focusing strongly on the positive aspects of a retrospective discussion helps teams self-align toward the kind of perspective found in the latter statement rather than the former.

Other teams, I found, realized the need to appoint a moderator to help keep the discussion focused.  It’s important to figure out what sort of need your team has in this regard early in the retrospective process.  This might seem tough at first (and emotionally charged) and that’s OK.  What’s important is to keep a positive frame of mind to work through this discussion.

As previously stated, if there appears to be a major disconnect between team members regarding issues that could have been done better, this is a good time to discuss that and hopefully iron out any misunderstandings or make plans to do so soon after the retrospective.

Whoever is taking notes, I hope you’re getting all of this because we’re about to tie it all together.


Step 5: Decide

By now, you and your team will have come up with a good deal of input on what the team did well and what could be improved.  You might be surprised, but having done this numerous times, I’ve found that a very clear pattern of what the team thought went well and what could be improved often appears, and does so quickly.

As a team, go over the list of items to potentially improve on and note which ones are the strongest common denominators.  Pick 1 or 2 of these and they will provide some focus for the team for the next period of work (sprint, cycle, dog year, etc.).  These can be process-oriented intangibles or even technical issues.  Here are some examples:

“Our front end engineers were idle during much of the sprint waiting on services to be updated”


“We spend 1-2 hours doing process Y manually every week and we should probably automate it”

The first item has to do with improving the team’s process of how it interacts with itself.  The other is a clear need that can be solved with a technical solution.


A high-valued issue.

Once the team has identified 1 or 2 high-valued issues they feel need to be addressed, it’s time to do just that.

Step 6: Act!

This is the most important step.  It’s one thing to identify a problem to be solved, another to actually act on it.  Hold a discussion as to what potential solutions to your issues might be.  The timekeeper at this point should be keeping an eye on the clock to help the team keep the meeting moving forward in a productive manner.  At this step, try and keep the following in mind:

  • By now, whatever issues the team has identified that need to be addressed, it should be something the team has full agency to solve (the matter is firmly in the team’s hands)
  • Depending on the issue at hand, it may be something complex and the team might need a couple of tries to solve it. Don’t get discouraged if the issue appears gargantuan or if you can’t solve it within your next period of work alone.







Once you’ve brainstormed on solutions to how you and your team can make improvements, it’s time to really act.  In this case, you should have some ideas that can be transformed into action items (you get a Jira ticket and you get a Jira ticket and…).  The retro should conclude with the team being able to break down a solution for improving into 1 or more workable tasks and then assigning them to team members for the next sprint, cycle or galactic year.

The key to making this all work is that last part.  What comes out of the retrospective should be treated as any other item of work your team normally commits to in the confines of your regular working period (sprint, release, Mayan lunar month, etc.) with one or more team members responsible for delivering the solution, whether technical or otherwise.

Now you’re ready to move forward with improving as a team and dive into that post-donut food coma.


Woooo! Food coma!

Step 7: Going Forward

Whether you use this process or some other framework to run a retrospective, repeating that process is very important.  To what degree or frequency your team does that is for the team to decide.  Some teams I’ve worked with performed retrospectives after every sprint, others only after major releases and some just whenever the team felt it was necessary.  The key is for the team to decide on that interval and after your initial retrospective, be sure to devote some time at the beginning of subsequent retrospectives to review your action items from the previous session (did you complete them and were the issues related to them resolved?).

Hopefully this post has given you an idea how your team can not only perform retrospectives but also improve your team’s working process and even look forward to looking back.

And on that note…

HackerX event hosted at Bazaarvoice

The Bazaarvoice headquarters hosted the July 20th HackerX event in Austin, Texas. The event featured not only Bazaarvoice but also Facebook, Amazon, and Indeed. 70+ engineers participated in onsite interviews and networking. HackerX commented that this was “one of the most successful events” they have ever seen.

Gary Allison, Executive Vice President of Engineering, kicked off the event with a compelling message about Bazaarvoice and why this is an awesome place to work.

HackerX started in 2012 with invite-only, face-to-face recruiting events that connect tech talent with some of the world’s most innovative companies. Currently, they operate 100+ events in 40+ cities and 15+ countries annually.

See the HackerX website for additional information.




Injecting Applications onto Third-Party Webpages Made Easy

Bazaarvoice’s Small Web App Technologies (SWAT) team is pleased to announce that we are open sourcing swat-proxy – a tool to inject applications onto third-party webpages.

In third-party web application development it is difficult to be certain how our applications will look and behave on a client’s webpage until they are implemented. Any number of things could interfere – including other third-party applications! Delivering applications that don’t work can obviously have a severe negative impact on both our clients and us.

One solution is to inject – or proxy – our applications onto the client’s web page. This way we can ensure they work correctly – before they go into production. I wrote swat-proxy to do exactly that, acting as a man-in-the-middle between browser and web server. As the browser requests web pages from the server, swat-proxy intercepts the response and proxies our application into it. The browser renders the web page as if it contained our application all along – exactly simulating the client having implemented it. Now we can be certain how our applications will look and behave.
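To make the idea concrete, here is a toy sketch of a response-rewriting proxy built on nothing but Node's core http module. It is not swat-proxy's actual implementation or API (swat-proxy targets DOM elements with CSS selectors rather than the naive string replacement below), and the host, port, and script URL are made-up examples:

```js
const http = require('http');

// Hypothetical values: the client site we are proxying and the script we want to inject.
const TARGET_HOST = 'www.example-client.com';
const INJECTED_TAG = '<script src="http://localhost:8000/my-app.js"></script>';

http.createServer((clientReq, clientRes) => {
  // Forward the browser's request to the real web server (GET-only toy version).
  const proxyReq = http.request(
    { host: TARGET_HOST, path: clientReq.url, method: 'GET', headers: { host: TARGET_HOST } },
    (proxyRes) => {
      let body = '';
      proxyRes.on('data', (chunk) => { body += chunk; });
      proxyRes.on('end', () => {
        // Inject our application before the closing body tag, then respond to the browser.
        const rewritten = body.replace('</body>', `${INJECTED_TAG}</body>`);
        clientRes.writeHead(proxyRes.statusCode, { 'content-type': 'text/html' });
        clientRes.end(rewritten);
      });
    }
  );
  proxyReq.end();
}).listen(8080); // Point the browser (or an /etc/hosts entry) at localhost:8080.
```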

Other tools exist to accomplish this task, but none are as front-end developer-friendly as swat-proxy: it is written entirely in Javascript – plugging in nicely to our existing workflows – and uses familiar CSS selectors to target DOM elements when injecting content. It is run locally using NodeJS and is very easy to use.

We have found swat-proxy to be incredibly useful when rapidly iterating on prototypes and ensuring the behavior of our applications before they are released to production – we hope you do too! We are releasing it to the larger world as open source, under the Apache 2.0 license. Please download it, try it out, and let us know what you think (in comments below, or as issues or pull requests on Github).

Why we do weekly demos

If you are part of an agile, or lean, or kanban development team, you probably do or have done demos at one point. Some people call them “end of sprint” demos. Some people call them “stakeholder” demos. We are pretty informal and irreverent about it at Bazaarvoice, and we just call them “demos” because giving them too formal of a name, or process will defeat the purpose. Demos are an amazingly valuable part of the development process, and I highly recommend that your team start doing them weekly.

Why not to do demos

There are a lot of reasons why not to do demos. If you are doing demos because of any (or all) of these reasons, you are probably doing it wrong. (Oh, did I mention that this is my personal opinionated viewpoint?)

  • We are forced to do demos so that management can keep an eye on us.
  • We do demos so management can make sure that each person is actually doing work.
  • We do polished demos for our sales and marketing team and they have to be perfect.
  • I don’t really know why we do demos, but my boss told me to, so I just show up.

Prep Work

Before you can do effective demos that help accelerate your development, your delivery, your quality and your culture, you need to set the stage for success. So how do you do that?

At Bazaarvoice, we believe in hiring smart, passionate owners that we can trust to do the right thing for the company, for our customers, and most importantly for our consumers. If everyone is bought into the vision and mission that you are all working to accomplish together, and if everyone is engaged and proactively helping to find creative solutions to those problems, then it is a lot easier to have productive weekly demos. You want to have an environment where everyone can speak without judgment, and where they know that their random idea will be heard and appreciated and not shot down. If everyone is coming from this place of openness, transparency and trust, then you can get some great demos.


We believe that demos are part of the creative, collaborative and iterative process of developing amazing software solutions. The point of the demo is for it to be a time when everyone on the team (and stakeholders) can get together and celebrate incremental progress. It’s a way to bring all the current work out into the light and to discuss it: what’s good, what still needs work, what did we learn, what do we need to change, and how could we make it even better. It lets us course correct earlier, before we fly into the mountain. And it gives others awareness of what’s going on and brings the “aha” moments.

It’s about learning, and questioning, and celebrating. Everyone who demos gets applause. All work that gets done is important and should be celebrated. All work and everyone means Engineers, QA, UX, Product Managers, Marketing, Documentation, Sales, Management, and anyone who completed something that helped move the cause forward.


I like to do demos at 4pm on Fridays. Ouch right?! Well that’s kind of the point. Doesn’t everyone just want to check out early for the weekend? Nope… Well let’s hope not. If people are leaving early and complaining, then pause and go back and re-read the “Prep Work” section.

We actually use that time because it’s the end of the week. It gives us the most time during the week to get things done, and it gives us free space after the meeting if we need to run long for a brainstorming session or to discuss things more deeply. We have 14 engineers/UX/QA on our team, and we schedule demos for 30 minutes. If we finish early, or not, we usually stick around for beers and games anyway until 6-7pm because we just like hanging out with each other. Imagine that.

So isn’t that great? You get to hang out with your friends, show off your accomplishments, and leave work for the week feeling pride in what was accomplished both personally and by the entire team. You are hopefully excited about the future, and probably brainstorming new ideas over the weekend from what you learned from the demos.


We do planning for our week on Monday mornings. That part is important too, and it’s the yin to the demo’s yang. People get into the office on Monday morning fresh, ready to go, and hungry to know what they can pick up next, so it’s the perfect time to set the stage for the week. What are our top priorities for the week? What do we absolutely have to get done by the end of the week (aka by demos on Friday afternoon)?

Planning is a great venue to ensure everyone is in sync with the vision and Product priorities. Product shares upcoming business epics, UX walks us through new mockups and user findings, Engineering discusses new platform capabilities that we should consider, and QA reminds us of areas we need to test harder. It helps form this continuous cycle of planning and validation. It’s a bit odd, because we are very Kanban and flow oriented, but we have found that having some timeboxes around planning and demoing gives the team a sense of closure and accomplishment. It’s important to occasionally step back and assess the progress. We have found this sets a really good and sustainable cadence for a productive team.

Next Up

Tell me what part of our story you want to hear next. How do you build a team and culture that enables you to execute on your vision? Follow me on twitter @bchagoly and @bazaarvoicedev to be the first to read new related posts and to join the conversation.

How to seamlessly move 300 Million shoppers to a highly scalable architecture, part 2

Divide and Conquer

As engineers, we often like nice clean solutions that don’t carry along what we call technical debt.  Technical debt is literally stuff that we have to go back and fix or rewrite later, or that requires significant ongoing maintenance effort.  In a perfect world, we fire up the new platform and move all the traffic over.  If you find that perfect world, please send an Uber for me. Add to this the scale of traffic we serve at Bazaarvoice, and it’s obvious it would take time to harden the new system.

The secret to how we pulled this off lies in the architecture choice to break the challenge apart into two parts: frontend and backend.  While we reengineered the front end into the new JavaScript solution, there were still thousands of customers using the template-based front end.  So, we took the original server-side rendering code and turned it into a service talking to our new Polloi service.  This enabled us to handle requests from client sites exactly like the Classic original system.

Also, we created a service that improved upon the original API but was compatible from a specific version forward.  We chose not to try to be compatible for all versions for all time, as all APIs go through evolution and deprecation.  We naturally chose the version that was compatible with the new JavaScript front end.  With these choices made, we could independently decide when and how to move clients to the new backend architecture, irrespective of the front-end service they were using.

A simplified view of this architecture looks like this:


With the above in place, we can switch a JavaScript client to use the new version of the API just by changing the endpoint for the API key.  For a template-based client, we can change the endpoint to the new rendering service through a configuration in our CDN, Akamai.

Testing for compatibility is a lot of work, though not particularly difficult. API compatibility is pretty straightforward, while testing whether a template page renders correctly is a little more involved, especially since those pages can be highly customized.  Since it was a one-time event, we found the most effective way to accomplish the latter was manual inspection to be sure that the pages rendered exactly the same on our QA clusters as they did in the production Classic system.

We found early success by moving cohorts of customers to the new system together. At first we would move a few at a time, making absolutely sure the pages rendered correctly, monitoring system performance, and looking for any anomalies.  If we saw a problem, we could move them back quickly by reversing the change in Akamai. At first much of this was manual, so in parallel we had to build up tooling to handle the switching of customers, which even included working with Akamai to enhance their API so we could automate changes in the CDN.

From moving a few clients at a time, we progressed to moving tens of clients at a time. Through a tremendous engineering effort, in parallel we improved the scalability of our ElasticSearch clusters and other systems, which allowed us to move hundreds of clients at a time, then 500 clients at a time. As of this writing, we’ve moved over 5,000 sites and 100% of our display traffic is now being served from our new architecture.

More than just serving the same traffic as before, we have been able to move over display traffic for new services like our Curations product, which takes in and processes millions of tweets, Instagram posts, and other social media feeds.  That our new architecture could handle this additional, large-scale use case without change is a testament to innovative engineering and determination by our team over the last 2+ years. Our largest future opportunities are enabled because we’ve successfully been able to realize this architectural transformation.

Rearchitecting the Team

In addition to rearchitecting the service to scale, we also had to rearchitect our team. As we set out on this journey to rebuild our solution into a scalable, cloud based service oriented architecture, we had to reconsider the very way our teams are put together.  We reimagined our team structure to include all the ingredients the team needs to go fast.  This meant a big investment in devops – engineers that focus on new architectures, deployment, monitoring, scalability, and performance in the cloud.

A critical part of this was a cultural transformation where the service is completely owned by the team, from understanding the requirements, to code, to automated test, to deployment, to 24×7 operation.  This means building out a complete monitoring and alerting infrastructure and rotating on-call duty through all members of the team.  The result is that the team becomes 100% aligned around the success of the service and there is no “wall” to throw anything over – the commitment and ownership stay with the team.

For this team architecture to succeed, the critical element is to ensure the team has all the skills and team players needed to succeed.  This means platform services to support the team, strong product and program managers, talented QA automation engineers that can build on a common automation platform, gifted technical writers, and of course highly talented developers.  These teams are built to learn fast, build fast, and deploy fast, completely independent of other teams.

Supporting the service-oriented teams, a key element is the Platform Infrastructure team we created to provide a common set of cloud services to all our teams.  Platform Infrastructure is responsible for the virtual private cloud (VPC) supporting the new services running in Amazon Web Services. This team handles the overall concerns of security, network, service discovery, and other common services within the VPC. They also set up a set of best practices, such as ensuring all cloud instances are tagged with the name of the team that started them.

To ensure the best practices are followed, the Platform Infrastructure team created “Beavers” (a play on words describing an engineer at Bazaarvoice, a “BVer”). An idea borrowed from Netflix’s chaos monkeys, these are automated processes that run and examine our cloud environment in real time to ensure the best practices are followed.  For example, the “Conformity Beaver” runs regularly and checks to make sure all instances and buckets are tagged with team names. If it finds one that is not, it infers the owner and emails team aliases about the problem.  If not corrected, Conformity Beaver can terminate the instance.  This is just one example of the many Beavers we have created to help maintain consistency in a world where we have turned teams loose to move as quickly as possible.
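A minimal sketch of that kind of conformity check, using the AWS SDK for JavaScript; the tag key, region, and what happens to offenders are assumptions for illustration, not the actual Conformity Beaver code:

```js
const AWS = require('aws-sdk');

const ec2 = new AWS.EC2({ region: 'us-east-1' }); // assumed region

// Find instances that are missing a "team" tag (assumed tag key).
async function findUntaggedInstances() {
  const data = await ec2.describeInstances().promise();
  const untagged = [];
  for (const reservation of data.Reservations) {
    for (const instance of reservation.Instances) {
      const tags = instance.Tags || [];
      if (!tags.some((tag) => tag.Key === 'team')) {
        untagged.push(instance.InstanceId);
      }
    }
  }
  return untagged;
}

findUntaggedInstances().then((ids) => {
  // A real Beaver would infer the owner and email the team alias here.
  console.log('Instances missing a team tag:', ids);
});
```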

An additional key common capability created by the Platform Infrastructure team is our Badger monitoring service. Badger enables teams to easily plug in a common healthcheck monitoring capability and can automatically discover nodes as they are started in the cloud. This service enables teams to easily implement healthchecks that are captured in a common place and escalated through a notification system in the event of a service degradation.
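The healthcheck endpoint a service exposes for a monitor like Badger might look something like the following sketch using Express; the route, port, and response shape are assumptions, not Badger's actual contract:

```js
const express = require('express');

const app = express();

// A monitoring service polls this endpoint; anything other than a 200 signals degradation.
app.get('/healthcheck', (req, res) => {
  res.json({
    status: 'OK',
    uptimeSeconds: Math.round(process.uptime()),
  });
});

app.listen(8080, () => console.log('healthcheck listening on 8080'));
```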

The Proof is in the Pudding

The Black Friday and holiday shopping season of 2015 was one of the smoothest ever in the history of Bazaarvoice, all while serving record traffic. From Black Friday to Cyber Monday, we saw over 300 million visitors.  At peak on Black Friday, we were seeing over 97,000 requests per second as we served up over 2.6 billion review impressions, a 20% increase over the year before.  Years of hard work and innovation preceded this success, and it is a testament to what our new architecture is capable of delivering.

Keys to success

A few ingredients we’ve found to be important to successfully pull off a large scale rearchitecture such as described here:

  • Brilliant people. There is no replacement for brilliant engineers who are fearless in adopting new technologies and tackling what some will say can’t be done.
  • Strong leaders – and the right leaders at the right time. Often the leaders that sell the vision and get an undertaking like this going will need to be supplemented with those that can finish strong.
  • Perseverance and Determination – building a new platform using new technologies is going to be a much bigger challenge than you can estimate, requiring new skills, new approaches, and lots of mistakes. You must be completely determined and focused on the end game.
  • Tie back to business benefit – keep the business informed of the benefits and ensure that those benefits are delivered continuously rather than in a big bang. It will be a large investment, and it is important that the business sees some level of return as quickly as possible.
  • Make space for innovation – create room for engineers to learn and grow. We support this through organizing hackathons and time for growth projects that benefit the individual, team, and company.

Rearchitecture is a Journey

One piece of advice: don’t be too critical of yourself along the way; celebrate each step of the rearchitecture journey. As software engineers, we are driven to see things “complete”, wrapped up nice and neat, finished with a pretty bow. When replacing an existing system of significant complexity, this ideal is a trap, because in reality you will never be complete.  It has taken us over 3 years of hard work to reach this point, and there are more things we are in the process of moving to newer architectures. Once we complete the things in front of us now, there will be more steps to take, since we live in an ever evolving landscape. It is important to remember that we can never truly be complete as there will be new technologies and new architectures that deliver more capabilities to your customers, faster, and at a lower cost. It’s a journey.

Perhaps that is the reason many companies can never seem to get started. They understandably want to know “When will it be done?” and “What is it going to cost?”, and the unpopular answers are, of course, never and more than you could imagine.  The solution to this puzzle is to identify and articulate the business value to be delivered as a step in the larger design of a software platform transformation.  The trouble, of course, is that you may only realistically be able to design the first few steps of your platform rearchitecture, leaving a lot of technical uncertainty ahead. Get comfortable with it and embrace it as a journey.  Engineer solid solutions in a service oriented way with clear interfaces and your customers will be happy, never knowing they were switched to the next generation of your service.

authored by Gary Allison

How to seamlessly move 300 Million shoppers to a highly scalable architecture, part 1

At Bazaarvoice, we’ve pulled off an incredible feat, one that is such an enormous task that I’ve seen other companies hesitate to take on. We’ve learned a lot along the way and I wanted to share some of these experiences and lessons in hopes they may benefit others facing similar decisions.

The Beginning

Our original Product Ratings and Review service served us well for many years, though it eventually encountered severe scalability challenges. There were several aspects we wanted to change: a monolithic Java code base, fragile custom deployment, and server-side rendering. Creative use of tenant partitioning, data sharding and horizontal read scaling of our MySQL/Solr based architecture allowed us to scale well beyond our initial expectations. We’ve documented how we accomplished this scaling on our developer blog in several past posts if you’d like to understand more. Still, time marches on, and our clients have grown significantly in number and content over the years. New use cases have come along since the original design: emphasis on the mobile user and responsive design, accessibility, the emphasis on a growing network of consumer generated content flowing between brands and retailers, and the taking on of new social content that can come in floods from Twitter, Instagram, Facebook, etc.

As you can imagine, since the product ratings and reviews in our system are displayed on thousands of retailer and brand websites around the world, the read traffic from review display far outweighs the write traffic from new reviews being created. So, the addition of clusters of Solr servers that are highly optimized for fast queries was a great scalability addition to our solution.

A highly simplified diagram of our classic architecture:

Highly simplified view of our Classic Architecture

However, in addition to fast review display when a consumer visited a product page, another challenge started emerging out of our growing network of clients. This network is comprised of Brands like Adidas and Samsung who collect reviews on their websites from consumers who purchased the product and then want to “syndicate” those reviews to a set of retailer ecommerce sites where shoppers can benefit from them. Aside from the challenges of product matching, which are very interesting, under the MySQL architecture this meant the reviews could be copied over and over throughout this network. This approach worked for several years, but it was clear we needed a plan for the future.

As we grew, so did the challenge of an expanding volume of data in the master databases to serve across an expanding network of clients. This, together with the need to deliver more front-end web capability to our customers, drove us to what I hope you will find is a fascinating story of rearchitecture.

The Journey Begins

One of the first things we decided to tackle was to start moving analytics and reporting off the existing platform so that we could deliver new insights to our clients showing how reviews are used by shoppers in their purchase decisions. This choice also enabled us to decouple the architecture and spin up parallel teams to speed delivery. To deliver these capabilities, we adopted big data architectures based on Hadoop and HBase to be able to assimilate hundreds of millions of web visits into analytics that would paint the full shopper journey picture for our clients. By running map reduce over the large set of review traffic and purchase data, we are able to give our clients insight into these shopper behaviors and help our clients better understand the return on investment they receive from consumer generated content. As we built out this big data architecture, we also saw the opportunity to offload reporting from the review display engine. Now, all our new reporting and insight efforts are built off this data and we are actively working to move existing reporting functionality to this big data architecture.

On the front end, flexibility and mobile were huge drivers in our rearchitecture. Our original template-driven, server-side rendering can provide flexibility, but that ultimate flexibility is only required in a small number of use cases. For the vast majority, client-side rendering via JavaScript, with behavior that can be configured through a simple UI, would yield a better mobile-enabled shopping experience that’s easier for clients to control. We made the call early on not to try to force migration of clients from one front end technology to another. For one thing, it’s not practical for a first version of a product to be 100% feature-compatible with its predecessor. For another, there was simply no reason to make clients choose. Instead, as clients redesigned their sites and as new clients were onboarded, they opted in to the new front end technology.

We attracted some of the top JavaScript talent in the country to this ambitious undertaking. There are some very interesting details of the architecture we built that have been described on our developer blog and that are available as open source projects in our bazaarvoice GitHub organization. Look for the post describing our Scoutfile architecture in March of 2015. The BV team is committed to giving back to the Open Source community and we hope this innovation helps you in your rearchitecture journey.

On the backend, we took inspiration from both Google and Netflix. It was clear that we needed to build an elastic, scalable, reliable, cloud-based data store and query layer. We needed to reorganize our engineering team into autonomous service oriented teams that could move faster. We needed to hire and build new skills in new technologies. We needed to be able to roll this out as transparently as possible to our clients while serving live shopping traffic so no one knows its happening at all. Needless to say, we had our work cut out for us.

For the foundation of our new architecture, we chose Cassandra, an Open Source NoSQL data solution influenced by ideas from Google and their BigTable architecture. Cassandra had been battle hardened at Netflix and was a great solution for a cloud resilient, reliable storage engine. On this foundation we built a service we call Emo, originally intended for sentiment analysis. As we made progress towards delivery, we began to understand the full potential of Cassandra and its NoSQL based architecture as our primary display storage.

With Emo, we have solved the potential data consistency issues of Cassandra and guarantee ACID database operations. We can also seamlessly replicate and coordinate a consistent view of all the rating and review data across AWS availability zones worldwide, providing a scalable and resilient way to serve billions of shoppers. We can also be selective in the data that replicates for example from the European Union (EU) so that we can provide assurances of privacy for EU based clients. In addition to this consistency capability, Emo provides a databus that allows any Bazaarvoice service to listen for the kinds of changes the service particularly needs, perfect for a new service oriented architecture. For example, a service can listen for the event of a review passing moderation which would mean that it should now be visible to shoppers.
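As a toy illustration of that databus pattern, here is the listen-only-for-what-you-need idea expressed with Node's built-in EventEmitter; this is not Emo's actual API, just the shape of the interaction:

```js
const { EventEmitter } = require('events');

// Stand-in for the databus: services subscribe to just the changes they care about.
const databus = new EventEmitter();

// A display-oriented service only cares about reviews that pass moderation.
databus.on('review.approved', (review) => {
  console.log(`Review ${review.id} is now visible to shoppers`);
});

// Elsewhere, a moderation service announces the state change.
databus.emit('review.approved', { id: 'r-123', rating: 5 });
```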

While Emo/Cassandra gave us many advantages, its NoSQL query capability is limited to Cassandra’s key-value paradigm. We learned from our experience with Solr that having a flexible, scalable query layer on top of the master datastore resulted in significant performance advantages for calculating on-demand results of what to display during a shopper visit. This query layer naturally had to provide distributed advantages to match Emo/Cassandra. We chose ElasticSearch for our architecture and implemented a flexible rules engine we call Polloi to abstract the indexing and aggregation complexities away from engineers on teams that would use this service. Polloi hooks up to the Emo databus and provides near real time visibility to changes flowing into Cassandra.
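For a sense of what that query layer buys you, here is a hypothetical lookup against the kind of index a layer like Polloi might maintain in ElasticSearch; the index and field names are made up, and in practice Polloi's rules engine abstracts this away from service teams:

```js
const elasticsearch = require('elasticsearch');

const client = new elasticsearch.Client({ host: 'localhost:9200' });

// Fetch the latest approved reviews for a product -- the kind of on-demand
// question a display service needs answered during a shopper visit.
client.search({
  index: 'reviews',                      // hypothetical index name
  body: {
    query: {
      bool: {
        filter: [
          { term: { productId: 'ABC123' } },
          { term: { moderationStatus: 'APPROVED' } },
        ],
      },
    },
    sort: [{ submissionTime: 'desc' }],
    size: 10,
  },
}).then((results) => {
  console.log(`Found ${results.hits.hits.length} displayable reviews`);
});
```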

The rest of the monolithic code base was reimplemented into services as part of our service oriented architecture. Since your code is a direct reflection of the team, as we took on this challenge we formed autonomous teams that owned everything full cycle from initial conception to operation in production. We built the teams with all the skills needed for success: product owners, developers, QA engineers, UX designers (for front end), DevOps engineers, and tech writers. We built services that managed the product catalog, UI Configuration, syndication edges, content moderation, review feeds, and many more. We have many of these rearchitected services now in production and serving live traffic. Some examples include services that perform the real time calculation of what Brands are syndicating consumer generated content to which Retailers, services that process client product catalog feeds for 100s of millions of products, new API services, and much more.

To make all of the above more interesting, we also created this service-oriented architecture to leverage the full power of Amazon’s AWS cloud. It was clear we had the uncommon opportunity to build the platform from the ground up to run in the cloud with monitoring, elastic resiliency, and security capabilities that were unavailable in previous data center environments. With AWS, we can take advantage of new hardware platforms with a push of a button, create multi datacenter failover capabilities, and use new capabilities like elastic MapReduce to deliver big data analytics to our clients. We build auto-scaling groups that allow our services to automatically add compute capacity as client traffic demands grow. We can do all of this with a highly skilled team that focuses on delivering customer value instead of hardware procurement, configuration, deployment, and maintenance.

So now after two plus years of hard work, we have a modern, scalable service-oriented solution that can mirror exactly the original monolithic service. But more importantly, we have a production hardened new platform that we will scale horizontally for the next 10 years of growth. We can now deliver new services much more quickly leveraging the platform investment that we have made and deliver customer value at scale faster than ever before.

So how did we actually move 300 million shoppers without them even knowing?  We’ll take a look at this in an upcoming post!

authored by Gary Allison


Quick and Easy Web Service Load Testing with JMeter

What is Load Testing and Why Should I Care?

Somewhere between the disciplines of Dev Operations, Database Management, Software Design and Testing, there’s a Venn diagram where at its crunchy, peanut-butter filled center lies the discipline of performance testing.

 Herein lies the performant (sic)

Which is to say, professional performance testers have a very specific set of skills (that they have acquired over many years) that make them a nightmare for non-performing web services. This helps them to answer the following question from our clients:

“Just what is the level of performance we can expect from your service?”

What if your project team doesn’t have access to anyone with a background in perf testing? What if you suddenly find yourself needing to answer the above question but don’t know where to begin?

Scary, right? Bazaarvoice’s Shopper Marketing team recently found itself in this exact situation (they weren’t being hunted by Liam Neeson so they had that going for them though).

The point of this article is not to bootstrap you, the reader, into a perf-testing, dev-ops version of the dude from Taken. Instead, it’s to show how a small team can quickly (and cheaply) performance test a web service in order to ensure they can meet a client’s needs.


If you’ve never been involved in any sort of performance testing for web services before, there are essentially two different tracks performance testing can begin from:

Targeted Testing – you have a pre-defined level of service/latency that you need to reach. Generally, you already have an understanding of what your service’s current performance baseline is.

Exploratory Testing – The current performance baseline isn’t really known. Here, the goal is to find out at what point and how quickly performance degrades.

Typically, with small-team projects, you’ll often find that the team starts with the latter path and then progresses to the former – as was the case with Shopper Marketing’s efforts here.

Our Setup:

We have a RESTful web API (built in Java) which handles requests for shopper profile information stored and sorted across multiple types of data stores. This API will service a JavaScript-based front-end widget deployed to a client’s home page to display product data. The client’s home page receives approximately 20 simultaneous unique views per second on average. Can our API service the client at that level?

To test this, we constructed a load test in JMeter that would do the following (a rough sketch of the same flow, outside JMeter, follows the list):

  1. Execute a series of continuous requests to the API that mimic those that will come from the client front end.
  2. Enable the test to run parallel requests to simulate a specific user load
  3. Use real, sampled data so that the requests and responses will be as close to real-world as possible
  4. Measure the response time of the API over time
  5. Measure the number of successful responses vs failures from the API while under load
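Before we get into JMeter itself, here is a bare-bones sketch of that flow in plain Python, just to make the mechanics concrete: fire concurrent requests that mimic the widget’s calls, time each one, and tally successes versus failures. The endpoint and parameters below are placeholders, not our real API.

```python
# A bare-bones illustration of what the JMeter test plan does for us.
# The URL and params are placeholders, not the real Shopper Marketing API.
import time
from concurrent.futures import ThreadPoolExecutor

import requests

ENDPOINT = "https://api.example.com/path/endpoint"

def one_request(_):
    start = time.time()
    try:
        resp = requests.get(ENDPOINT, params={"profileId": "12345"}, timeout=10)
        return resp.ok, time.time() - start
    except requests.RequestException:
        return False, time.time() - start

# ~20 concurrent "shoppers", 1,000 total requests.
with ThreadPoolExecutor(max_workers=20) as pool:
    results = list(pool.map(one_request, range(1000)))

successes = [latency for ok, latency in results if ok]
failures = len(results) - len(successes)
print(f"successes={len(successes)} failures={failures} "
      f"avg={sum(successes) / max(len(successes), 1):.3f}s")
```

JMeter gives us all of this (plus ramp-up control, listeners, and reporting) without having to maintain scripts like the one above – which is exactly why we reached for it.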

Why JMeter:

So why are we conducting our test using JMeter? Isn’t that thing two days older than dirt?


Dirt: it’s old.

Well, for one, JMeter is free.


JMeter: it’s older and free-er

We could just leave it at that but wait, there’s more:

JMeter is a load testing tool that has been around for many years. It’s developed and maintained by the Apache Software Foundation – the same organization behind the Apache Web Server. It specializes not only in sending and receiving HTTP requests (you know, like a web server) but also comes with monitoring and reporting tools, as well as a wealth of plugins.


Sure, there are better (cough! more expensive cough!) tools out there that specialize in load testing, but in our case we needed to determine metrics quickly, with a tool that could be easily set up and re-run numerous times (heavy emphasis on quick and cheap).

Performance testing is a very repetitive process. You will be executing tests, reporting findings, then modifying your service in order to improve performance – followed by a lot of washing, rinsing and repeating to further refine performance. Whatever load testing tool you choose, make sure it is one that allows you to quickly and easily modify, re-run and re-report findings as you will be living your own form of dev-ops Groundhog Day when you take on this endeavor.

 A thought provoking documentary on the life and times
 of a performance tester

But enough memes – let’s get down to how we programmed JMeter to show us pretty performance graphs!

Downloading the Application:

You can download JMeter from the Apache Software Foundation here:

Note – JMeter requires Java 6 or above (you are using Java 8+, right?) and you should have your JAVA_HOME environment variable set up on your local environment (or wherever you plan on deploying and executing your load tests from).

Latest Java download:

Setting up your local Java environment:

Once the JMeter binary package is downloaded and unzipped to your test machine, start JMeter by running ./jmeter from the command line within the application’s bin/ directory.

Configuring Jmeter:

Regardless of what load testing tool you prefer to use, its technical merits will always be tied to its reporting capability. JMeter’s default reporting capabilities are pretty limited; however, there is a wealth of plugins to augment them. Before going any further, you will want to install JMeter Plugins Extras and JMeter Plugins Extras Lib in order to get the results you’ll want from JMeter.


Unarchive the contents of these files and place them in the lib/ext directory within your JMeter installation.

Once finished, re-start JMeter.

Note – you can install, update and manage plugins for JMeter using the JMeter plugin manager. This feature is in Beta so your mileage may vary. More on the JMeter plugin manager here:

Designing a Test:

For those new to JMeter, setting up a test is rather simple but there’s a little bit of jargon to explain. A basic JMeter test consists of the following:

Test Plan – A collection of all of the elements that make up your load test

Thread Group – Controls the number of threads and their ramp-up/ramp-down time. Think of each thread as a unique visitor to your site/request to your service.

Listeners – These will be attached to your thread group and will generate your reports

Config Elements – These contain a variety of base information required to execute the test – mainly domain, IP or port information related to the service’s location. Optionally, some config elements can be used to handle situations like having to authenticate through LDAP during tests.

Samplers – These elements generate and handle data, as well as options and arguments, during the test (e.g. request payloads and arguments).

Building the Test – Step by Step:

1. Click on your test plan and assign it a name and check the Run Teardown option
2. Right click on the test plan and select Add > Threads > Thread Group


3. Enter a name for the thread group (e.g. load test 1)
a. Set the number of threads option to the maximum desired number of requests you want to field to the API per second (simultaneously)
b. Set the ramp-up period option to the number of seconds you wish the test to take before it reaches the maximum number of threads set above (e.g. setting the thread count to 100 and the ramp-up to 60 will start the test with 1 thread and add an additional thread per second. After 1 minute, the test will be at a maximum of 100 concurrent requests per second).
c. Set the Loop option to the number of cycles of maximum requests you wish the test to repeat once it reaches its maximum number of threads. Once this loop finishes, the test will end.
d. Check the forever option if you wish the test to continue to execute at its max thread count indefinitely. Note – this will require you to manually shut the test down.


4. Right click on the Thread Group and select Add > Config Element > HTTP Request Defaults
5. Set the Server Name or IP Address (and optionally the Port) fields to the domain/IP/port your service can be reached at (e.g. your API’s host name and port).


Now we’re ready to program our test – using the options in the HTTP Request element, we’ll construct what we want each thread’s request to contain (a quick plain-Python rendering of the end result follows these steps).

1. Right click on the thread group and select Add > Sampler > HTTP Request
2. In the HTTP Request config panel, set the implementation to HTTPClient 4
3. Set your protocol (http or https) and your method type (in this case, GET)
4. Set the path option to the endpoint you wish to send your request to – do not include any HTTP arguments (e.g. /path/sub-path1/sub-path2/endpoint)
5. Next, we’ll configure each and every HTTP argument we need to pass within our request.
6. Do this by clicking into the first line of the send parameters table.
7. Enter your first argument name into the name field and the value into the value field, click the include equals option and, if need be, click the encode option if your argument value needs to be URL encoded.
8. Click Add and repeat this process for each key-value pair you need to send with your request

 Your HTTP Request sampler should look something like this.
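If it helps to see what the sampler’s parameter table boils down to, it is essentially building a URL-encoded query string appended to your path. Here is a tiny illustration (the parameter names and values are made up); the encode checkbox in JMeter does the same job urlencode does below.

```python
# What the sampler's parameter table boils down to: a URL-encoded query string.
# The parameter names and values are made up for illustration.
from urllib.parse import urlencode

params = {
    "passkey": "YOUR_KEY",                 # placeholder argument
    "filter": "category:shoes & boots",    # value that needs encoding
}
print("/path/sub-path1/sub-path2/endpoint?" + urlencode(params))
# -> /path/sub-path1/sub-path2/endpoint?passkey=YOUR_KEY&filter=category%3Ashoes+%26+boots
```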

Now would be a good time to save your test!

 Don’t be this man

Adding Listeners:

Next, we need to add listeners (JMeter-speak for report generators) in order to report our findings during and after the load test.

Right click on the thread group and select Add > Listeners > and then pick your choice of listener.

The choice of test listeners is quite deep, especially if you installed the reporting add-ons as noted above. You can configure whatever listeners you feel you need, though here are some you may want to add to your test:

View Results Tree – This listener will tabulate each request response it receives during the test as well as collect its response type, headers and content. I highly recommend configuring two of these listeners and assigning one to successes and one to failures. This will help sort your response types, allow you to debug your tests in case of authentication errors or malformed requests, and help you troubleshoot issues if your API should suddenly start returning 500 errors.

Response Times Vs Threads – If you’re configuring your test to ramp up its load over time, this listener will provide a chart which you can use to measure the responsiveness of your API as the request load is increased. Again, if you choose to use this listener, I recommend configuring multiple instances of it – one to record successful requests and another to record error latency.

Response Times Over Time – If your test is configured to provide a constant load over a period of time, this listener can help chart the performance of the API under that steady load. It’s helpful for spotting issues such as inadequate load balancing or rate limiting of your requests, depending on how aggressive your service architecture is about that sort of thing (cough, cough – load balancers – cough).

 Example of response time over time graph setup (successful responses only)

Now would be another great time to save your progress.

Kicking off a Test:

OK – the moment you’ve been waiting for (be honest). Let’s kick this pig and bask in the glory of our performant API!

Click the big, green button to start the test. Note that on the upper right hand side of the JMeter UI, you’ll have an indicator showing the number of threads currently running (out of the total max to ramp up to) as well as an indicator of any warnings or errors being thrown.


Click on the Results Tree listener to view a table of your responses. If you’re seeing errors, click on an error instance in the results tree to view the error type, body and content.

 10 out of 10 threads running, no exceptions

Once you’re ready to stop a test, click the big, red, X icon to shut the test down.


Modifying a Test:

You’re probably thinking, “Hey, that was easy and oh look! Our test results are coming in and they look pretty good. There’s got to be more to it than this.” …And you would be right. Remember that comment about load balancers above? Well, in most modern web service architectures, you’ll encounter some form of load balancing, whether it’s part of the web server’s features or an intermediary. In our case, Mashery would have cached our static request after a few seconds at maximum load. After that, we weren’t even talking to the API directly; rather, Mashery simply sent us the cached response for the request. Our results in JMeter may have looked good, but it was lying to us.


Fortunately, JMeter allows us to inject some form of randomness into our requests to circumvent this issue.

One way of accomplishing this is to inject a randomized ID into your HTTP arguments – especially if your API accepts a random, serialized load ID as an argument. Here’s how you can do that:

1. Right click on the thread group and select Add > Config Elements > Random Variable


2. On the Random Variable config screen, set a value in the Variable Name field (e.g. my_random_id)
3. Set a minimum and maximum value to define the range your random variable will take (e.g. 1000 and 9999)


4. Set the Per Thread option to true (this will ensure a random value is set for each thread).

 And this joke is over plaaaaaaaaaaayed!!!!

5. Next, we’ll need to click on the HTTP Sampler and include our newly added random variable in our test. Let’s assume our API accepts an argument called ‘loadId’ which corresponds to a random number in the range we configured above.

6. In this case, click on the Send Parameters table and add a new key-value pair with the name set to ‘loadId’ and the value set to ‘${my_random_id}’ (or whatever you’ve named your variable in the config element screen). A quick non-JMeter illustration of the idea follows.
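Conceptually, ${my_random_id} just gives every request its own cache-busting value. Outside of JMeter, the same idea looks something like this (the loadId argument and the endpoint are the hypothetical examples from above):

```python
# Equivalent idea outside JMeter: give each request a fresh loadId
# so an edge cache can't hand back the same response every time.
import random

import requests

params = {"loadId": str(random.randint(1000, 9999))}  # matches the 1000-9999 range above
response = requests.get("https://api.example.com/path/endpoint", params=params, timeout=10)
print(response.status_code)
```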


One of the requirements of our load test request is that we must provide a specific profile ID that relates to a profile to be returned by the API. For our purposes, we exported a list of existing IDs (over 90,000) from the Cassandra database our API reads from and writes to, imported that list into our JMeter test, and instructed the HTTP Request sampler to grab a different ID from it and include it as the argument for each individual request.

We configured this by doing the following:

1. Right click on the thread group and select Add > Config Element > CSV Data Set Config


2. In the CSV Data Set Config options, set the filename option to the path to your CSV file that contains your working data
3. In the variable name field, provide a name by which the test sampler will refer to each instance of your data (e.g. myRandomID)
4. Enter ‘,’ into the delimiter option field
5. Set Recycle on EOF to true, Stop on EOF to false, and Sharing Mode to All Threads


This last set of options will ensure that if the test cycles through all of the elements in your CSV (which it will, since every thread draws from the same list), it will simply start back at the top of the list.

Next, click on your HTTP Sampler. Here you will need to add a shell-style variable reference to the sampler so it automatically pulls the shared data variable from your CSV config element (e.g. if you named your variable in the CSV config element “myRandomID”, you need to inject the value ${myRandomID} into the sampler somewhere). Where exactly will depend on the nature of your API. In our case, we simply appended it to our API endpoint, placing the ID variable between the API domain/endpoint and the HTTP arguments in the URI.
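For the curious, here is roughly what the CSV Data Set Config plus the ${myRandomID} path variable accomplish, sketched outside of JMeter: walk a shared list of real profile IDs, wrap around when you hit the end of the file, and drop each ID into the request URI. The file name, endpoint, and loadId argument are placeholders.

```python
# Roughly what the CSV Data Set Config + ${myRandomID} path variable achieve:
# walk a shared list of real profile IDs, wrapping around at the end of the file.
import csv
import itertools

import requests

with open("profile_ids.csv", newline="") as f:             # placeholder export file
    profile_ids = [row[0] for row in csv.reader(f) if row]

id_cycle = itertools.cycle(profile_ids)                     # "Recycle on EOF" behaviour

for _ in range(100):                                        # a handful of sample requests
    profile_id = next(id_cycle)
    url = f"https://api.example.com/path/endpoint/{profile_id}"
    requests.get(url, params={"loadId": "1234"}, timeout=10)
```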


Yup – good time to save your game – I mean test. After that…

 Ratta-tat-tat yo!

Reading Results:

We’ve gone over how to build and run a performance test but once the test has concluded and you have gathered results, you need to understand what you’re looking at.

To view the results of a particular listener, just click on it in the JMeter UI.

The Results Tree reports are self-explanatory, but what about the other reports? In particular, let’s look at the Response Times Over Time listener. Here is the graph output for this listener from our initial performance test:

 Average response time for successful requests – 1.6 seconds

This listener was configured to only measure the time per successful request in order to obtain more focused results. In the graph you can see that over time there was a great deal of variance, with the majority of requests taking around 1.6 seconds to resolve. Note the highest and lowest points on the graph – these are the outlying deviations in the test results, as opposed to the concentrated area of red (the average time per request).

Generally speaking, the tighter the graph, the more consistent the API’s performance and of course, the lower the average number, the faster the performance.

Large spikes with pronounced peaks and valleys usually indicate there is an issue with the service’s load balancing features or something “mechanical” getting in the way of the test.

Long periods of plateauing are another indicator to watch for. These may indicate some form of rate limiting or timeout.
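If you also save a listener’s results to a CSV/JTL file, a few lines of scripting will pull the headline numbers out of a run for your report. The sketch below assumes JMeter’s default CSV columns (an ‘elapsed’ time in milliseconds and a ‘success’ flag) and a results file named results.jtl; adjust to match however you’ve configured your output.

```python
# Summarize a JMeter CSV/JTL results file: average and 95th percentile latency
# for successful samples, plus an error count. Assumes default CSV column names.
import csv
import statistics

elapsed_ok, errors = [], 0
with open("results.jtl", newline="") as f:
    for row in csv.DictReader(f):
        if row["success"].lower() == "true":
            elapsed_ok.append(int(row["elapsed"]))   # milliseconds
        else:
            errors += 1

elapsed_ok.sort()
p95 = elapsed_ok[max(int(len(elapsed_ok) * 0.95) - 1, 0)]
print(f"samples={len(elapsed_ok)} errors={errors} "
      f"avg={statistics.mean(elapsed_ok):.0f}ms p95={p95}ms")
```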

Caveats and Next Steps:

Now you’re ready to send off your newly minted beast of a load test to go show that MCP who’s boss. Before you go and press that button – some advice.

On Test Performance:

JMeter is a great tool and all – especially for the price you pay – but it is old and not necessarily the most high-performance perf testing tool out there (oh, the irony). When launching tests off your local machine or a server, keep in mind that each thread you configure for your test will be another thread your CPU will need to handle. You can quickly and easily create a test that pushes your local environment to its limit. Doing so can, at times, crash your test (and dump your results into the ether – engage sad panda face). Start with a small-to-medium performance load and build up from there.


Starting and Stopping a Test:

When stopping a test, manually or automatically, you might notice a sudden uptick in errors and latency at the very end of the test (and possibly at the beginning as well). This is normal behavior – when a test is started and stopped you can experience some level of thread abandonment (which JMeter will record as an error because those last requests never receive proper responses). These errors can be ignored when viewing results.

Basically, the test results are kind of like a loaf of bread – no one wants the ends.


One of life's great mysteries

A Word of Caution:

JMeter is basically a multi-threaded web request generator and manager. The traffic patterns it generates can resemble those seen during a DoS attack – especially for very large tests. If there are any internal or external web security policies in place within the network you’re testing, be careful not to set these off (i.e. just because you can run a test on a production server that sends 400,000 simultaneous requests to a Google web service – which then gets your whole office IP range banned from said service – doesn’t mean you should, and no, the author of this piece has absolutely no knowledge of any similar event ever happening, ever…).

From Latency to Where?

The performance graph above was from the very first performance test against our internal Shopper Marketing recommendations API. Using the test, its results, and monitoring tools like DataDog, we were able to find where we needed to improve our service – in both the code base and the hosting environment – to reach our performance goal.

After several repeated tests along with re-provisioning new Elastic Search clusters and a lot of code refactoring, we eventually arrived at the following test result:


Average response time for successful requests – 100 milliseconds

Going from an average response time of 1.6 seconds to 100 milliseconds is a pretty big leap in performance. Ultimately, our client was pretty happy with our answer.

This is by no means an exhaustive method of load testing but merely a way of doing quick and easy exploratory testing that delivered a good deal of value for our team.

Have fun testing!