The Tools We Use to Innovate in Bazaarvoice Labs (Part 2)

In the previous post, I provided a rundown on what Bazaarvoice Labs is, our process and why it is important to have flexibility in our toolset choices. I now want to give you some tool examples in the following categories:

  • Operational Tools
  • Server-side Application Development Environments
  • Data Storage and Management
  • Client-side Tools
  • Measurement Tools

Operational Tools

  • Amazon EC2: Well, duh. I mentioned that we need to seamlessly transition from internal prototypes to live running pilots. By using EC2 and Elastic Load Balancer and creating a set of mostly standardized AMIs, we’re able to get a machine up and running to demo a prototype, or scale out to support hundreds of thousands of requests, almost instantly. Key to our use of EC2 is the fact that it has a very robust API and tools like boto, so we can automate just about everything that we do. This is important since it’s well documented that EC2 instances can go up and down without rhyme or reason. Which brings me to my next operational tool…
  • Cloudkick: We use Cloudkick for basic monitoring. Its UI is simple and it just plain works. Given how frequently we take services and applications up and down in EC2, it’s really nice to have an easily configurable, straightforward monitoring solution to rely on.

Server-side Application Development Environments

  • Ruby on Rails and Django: While we’ve experimented with microframeworks like Flask, sometimes when you’re moving fast and prototyping, you don’t know exactly what you need or when you’re going to need it. You may not want to think about what ORM or templating language to use, or to re-invent how user sessions are handled, and it’s times like these that a nice full-stack web application framework comes in handy. Why both though? Well, quite simply, some engineers on our team prefer Ruby and some (most) prefer Python. This is where our one-engineer, one-project approach comes in handy. We work with the tools that will make us fastest. Ultimately, if someone needs to step up and lend a hand on a project when someone is on vacation, we’re all polyglots and can get our hands dirty in any language or framework necessary. The Facebook apps referenced in the previous post were written in Rails and the very, very high traffic pilot that we ran with TurboTax was written with Django (as was our Customer Intelligence product).
  • Node.js: The evented, asynchronous server built on Google’s V8 JavaScript engine. Node is a great tool to use when you’re building an application that needs to pull data in from multiple HTTP-based APIs and mash it together. Its performance is remarkable and it allows a developer to work in the same language on both the client and the server. While some people think server-side JS is a fad, I think Node is leading a revolution in how people build and think about web applications. Please note that Node is useful for much more than just building web apps. It can also be used, for example, as a very effective proxy (see Joe Stump’s answer about what technologies SimpleGeo uses on Quora). Data for Travelocity’s Social Connect Discovery pilot is served from Node.js backed with the Bazaarvoice Developer API and custom indices stored in Redis.
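
To illustrate the "pull data from multiple APIs and mash it together" pattern, here is a minimal sketch. It is written in Python's asyncio rather than Node, purely to show the evented style, and it is not our production code. The two fetch functions are hypothetical stand-ins that simulate HTTP calls with a short sleep; their names and return shapes are assumptions made for the example.

```python
import asyncio


async def fetch_reviews(product_id):
    """Hypothetical stand-in for an HTTP call to a reviews API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"product_id": product_id, "reviews": ["Great lens", "Sharp images"]}


async def fetch_social(product_id):
    """Hypothetical stand-in for an HTTP call to a social API."""
    await asyncio.sleep(0.01)  # simulate network latency
    return {"product_id": product_id, "friends_who_reviewed": 2}


async def mash_up(product_id):
    # Start both requests at once; the event loop interleaves the waits
    # instead of blocking on each call in turn.
    reviews, social = await asyncio.gather(
        fetch_reviews(product_id), fetch_social(product_id)
    )
    # Merge the two responses into a single payload for the page.
    return {**reviews, **social}


def build_page_data(product_id):
    """Synchronous entry point that runs the concurrent mash-up."""
    return asyncio.run(mash_up(product_id))
```

Calling `build_page_data("p1")` returns one dictionary that combines both responses. Because the waits overlap, the total latency is roughly that of the slowest call rather than the sum of all calls.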

Data Storage and Management

  • ElasticSearch: We’re no strangers to Lucene-based search and data stores at Bazaarvoice. Most of our core platform’s displays are backed by queries made to SOLR. However, unlike SOLR, ElasticSearch is schema-free and therefore really nice to use for prototyping and pilots where you’re not sure of the kinds of data that you’ll want to index. There are some gotchas with this approach but for Labs projects, we’ll take the flexibility it offers. As a side note, it’s amazing how often Lucene-based tools are left out of the NoSQL discussion (in fact, my colleague RC Johnson did a SXSWi presentation on this). The search functionality in our Ask and Answer for Facebook pilot with Nikon is driven out of ElasticSearch.
  • MongoDB: We’ve used MongoDB in any number of Labs pilots at this point. Most notably, it drives the leaderboard and newsfeed functionality in our Ratings and Reviews for Facebook pilot with Benefit Cosmetics and also the majority of our new product discovery pilot application that we’re running with Sam’s Club in Facebook.
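
To show why schema-free indexing is handy while prototyping, here is a toy sketch in Python. It is a tiny in-memory inverted index, not ElasticSearch's API or implementation; the class name `SchemaFreeIndex`, its methods and the sample documents are all hypothetical. The point is only that documents with different fields can be indexed without declaring anything up front.

```python
from collections import defaultdict


class SchemaFreeIndex:
    """Toy inverted index that accepts documents with arbitrary fields."""

    def __init__(self):
        # Maps (field, token) -> set of document ids containing that token.
        self._postings = defaultdict(set)

    def index(self, doc_id, doc):
        # No schema is declared: whatever fields arrive get indexed.
        for field, value in doc.items():
            for token in str(value).lower().split():
                self._postings[(field, token)].add(doc_id)

    def search(self, field, term):
        # Return a copy so callers cannot mutate the index.
        return set(self._postings.get((field, term.lower()), set()))
```

For example, after indexing `{"question": "Does the lens zoom", "brand": "Nikon"}` and then `{"question": "Battery life", "rating": 5}`, both the `brand` and `rating` fields are searchable even though neither was defined in advance.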

Client-side Tools

  • Dust: Dust is a JavaScript templating library well suited to asynchronous applications. We like Dust because it’s flexible and easy to use, it integrates well with server-side JS tools like Node, and it allows you to pre-compile your templates for great performance.
  • Protovis: Protovis is an excellent visualization library. It’s declarative and makes it very easy to build complex, interactive visualizations while still giving you a high degree of flexibility over how those visualizations are rendered. We use Protovis in our Customer Intelligence product to create visualizations that I believe are way beyond what is typical for an analytics tool.

Measurement Tools

  • Google Analytics: It’d be tough to tell where we’d be without Google Analytics. It’s got its obvious uses, but also has comprehensive APIs that allow you to call custom events, set variables and then suck the data back out as necessary. This allows us to track specific actions that a user takes and to set up funnels based on those actions (even when the actions are clicks within a page vs. full page views).
  • Mixpanel: Mixpanel is a great alternative to Google Analytics. Many of our projects in Bazaarvoice Labs take the form of JavaScript plugins or widgets that don’t conform to the traditional page-view-first mentality of most web analytics. Mixpanel focuses much more on tracking individual actions that a user takes, either in-page or across pages. Its API for doing this is very easy to use and it has the added benefit of being real-time, which means you don’t have to wait a few hours to start seeing results from that code change that you just launched.
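
To make the event-and-funnel idea concrete, here is a minimal sketch in Python. It is a toy in-memory tracker, not the Google Analytics or Mixpanel API; the class name `EventTracker`, its methods and the event names are all hypothetical.

```python
from collections import defaultdict


class EventTracker:
    """Toy in-memory event log used to compute a simple funnel."""

    def __init__(self):
        # Maps user id -> list of event names in the order they occurred.
        self._events = defaultdict(list)

    def track(self, user_id, event):
        self._events[user_id].append(event)

    def funnel(self, steps):
        """Count the users who reached each step, in the given order."""
        counts = [0] * len(steps)
        for events in self._events.values():
            position = 0  # index of the next funnel step this user must hit
            for event in events:
                if position < len(steps) and event == steps[position]:
                    counts[position] += 1
                    position += 1
        return counts
```

A funnel such as `["open_widget", "read_review", "write_review"]` then reports how many users made it to each stage, even though none of those actions is a full page view.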

Of course, no project, prototype or pilot would get off the ground in Bazaarvoice Labs if we couldn’t get at our customers’ data. In order to maintain agility, all Bazaarvoice Labs projects are written as free-standing applications that are not part of our core application stack (a somewhat traditional J2EE application built on Spring MVC). Early on in Labs, even though we had direct access to our databases, we knew we needed to maintain separation between our core stack and Labs applications. Since we maintain a very complex set of business rules around content submission and display that are configurable on a per-client basis, writing directly to the databases would have carried a high risk of compromising data integrity. Generally, we’d use our existing XML API for submission (because it was obvious that trying to write data into the DBs from a separate application was a recipe for disaster) but we’d still use replicas of our core MySQL database clusters for display. That was okay, but there were still some business logic mistakes made in the display of content (unacceptable when your pilot clients are some of the biggest online retailers around). In order to get around this, we created a new API that supported a significantly higher degree of queryability, offered JSON and JSON-P data formats and had much lighter-weight responses. This allows Bazaarvoice Labs to talk to our core data sets in a much more efficient manner and be assured that business rules are followed. This new API has now been productized as The Bazaarvoice Developer API. We will often create new, experimental method calls or create application-local data indexes, but every single Bazaarvoice Labs project leverages this API heavily.
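
For readers unfamiliar with JSON-P: the server wraps the JSON payload in a call to a function named by the client, so a browser can load the data across origins with a plain script tag. The sketch below shows the general idea in Python. It is not the Bazaarvoice Developer API's implementation; the function name `render_response`, the payload and the callback names are assumptions made for the example.

```python
import json
import re

# Accept only dotted JavaScript identifiers (e.g. "bv.onReviews") so that a
# caller cannot inject arbitrary script through the callback name.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][A-Za-z0-9_$]*(\.[A-Za-z_$][A-Za-z0-9_$]*)*")


def render_response(payload, callback=None):
    """Serialize payload as JSON, or as JSON-P when a callback is given."""
    body = json.dumps(payload, separators=(",", ":"))  # compact output
    if callback is None:
        return body
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSON-P callback name")
    return f"{callback}({body});"
```

Without a callback the function returns plain JSON; with one, the same data comes back wrapped in a function call that the client page defines.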

I hope I’ve given you a good overview of how Bazaarvoice Labs operates and the tools that keep us humming. It’s great to be able to work in an environment where exploration of new ideas and technologies is supported and encouraged. Operating the Bazaarvoice Labs team off-stack gives the Labs Engineers a chance to give input, in a very low-risk way, not only into what new products get built but also into what technologies get used to build them.

The Tools We Use to Innovate in Bazaarvoice Labs (Part 1)

Hi everyone! This is my first post to the Bazaarvoice Developer blog and I’d like to take this opportunity to shed some light on some of the tools Bazaarvoice Labs has recently found very useful in creating the pilots and prototypes that ultimately morph into new products and features on the Bazaarvoice platform. Before I talk about our toolset though, I’d like to give you a quick rundown on what Bazaarvoice Labs is, our process and why it’s important for us to be flexible in our toolset choices.

Bazaarvoice Labs is the new product research and development group at Bazaarvoice, with emphasis on the new and the research. We are actually a team of engineers who report to our Product Management team (rather than through the engineering group) and help our Product Managers realize their wildest (and potentially most game-changing) ideas. Every quarter we evaluate and prioritize new ideas proposed by our Product Management team, customers and Bazaarvoicers around the company in order to research and create prototypes. The ideas we prioritize highest are those that come with big hairy assumptions but could change our business if they work. By building prototypes we’re able to suss out where the trouble might lie if we were to introduce the new product or feature to our entire customer base. We currently have over one thousand of the world’s biggest brands hosting their user-generated content in our platform, and many large services organizations to boot. The introduction of even a small new feature can have very large consequences for our organization. So on the risky stuff, we like to know where the gotchas lie. Some of the products spawned out of this process include BrandAnswers and Ratings and Reviews for Facebook (part of our SocialConnect Suite).

In order to build a prototype, we assign an engineer to work directly with a Product Manager or Product Designer. These two work together in an agile manner (agile with a little-a, not a capital A) in order to create a tangible prototype that demonstrates the Product Manager’s idea unencumbered by writing lots of requirements or unnecessary process. It’s this one-to-one relationship that makes this process hum, gives the creative process a kick in the pants and really lets these ideas properly gestate. Once more people get involved in a project, due to the network effect, managing the project gets exponentially harder with each person you add and the need for process increases as a way to mitigate risk. By imposing a one-to-one structure on our prototyping teams we strip away any unnecessary obstacles to creativity and give real creative ownership to our Product Managers, Designers and Engineers. In a way, these teams become entrepreneurial cofounders as they attempt to prove their ideas. Additionally, by artificially constraining the initial project team to one engineer, we keep the team focused on building out the Minimum Viable Product needed to prove their assumptions and build a business case before further investment is required.

Another nice side effect of this style of working is that it allows the engineer working on the project to choose their own tool-chain for each new project. Since they’re working alone on a project, there’s no need to constrain the tool choices to the lowest common denominator of what every team member might already know or be familiar with. Of course, the engineer is free to reuse code or tools that may already be in use at Bazaarvoice, but that choice ultimately lies with the engineer, and the engineer knows to optimize for speed of creation over other organizational considerations.

A further benefit of an engineer being able to choose a new tool-chain with every project is that, in addition to proving business and product ideas, emerging technologies can be realistically evaluated and, where appropriate, integrated into our core engineering stack (this happened with requireJS, which has become an integral part of how we deploy JavaScript on our customers’ sites).

Sometimes simply building a prototype may not answer the questions that we have around the viability of the product and we need to take further steps. For example, we needed to answer the following question for Ratings and Reviews for Facebook: “Will people be willing to read and write Product reviews inside a Facebook app?” In this case, a prototype wasn’t enough. We needed to progress to the next phase of the process and actually pilot the application we had built with a couple of customers. For this reason, the “prototypes” we build need to be more robust than what you might initially think. Yes, we’re building concept cars in Labs but our concept cars actually need to run. We generally launch these pilots with three to five customers and generally won’t internationalize them. These restrictions keep us agile and help make sure we don’t have to build too much customizability into the pilots. Even though these pilots will only be launched with a handful of customers, some of them will be placed in some very high profile, high traffic places (some getting over 100,000 hits per day). Examples of running pilots right now include Nikon’s Ask and Answer for Facebook, Travelocity’s Social Connect Discovery Pilot (see the “Traveler Reviews” link) and TurboTax’s People Like You review search tool. Of course, when we launch pilots we track the data and rapidly move to improve the product and build out a suitable business case for productization. In the pilot phase the engineers are free to launch new code whenever they choose and must play the roles of UX, server-side and operations engineer. Being chief cook and bottle washer on these projects frees the owning engineer (there’s still only one per project) to push releases as frequently as necessary to build that business case and observe how changes affect the project’s KPIs.

So what tools do we use to build software in Labs? The tools we choose to build with in Bazaarvoice Labs must support two phases of a project:

  • Prototyping: When the engineer needs to build a usable, tangible artifact targeted for internal consumption and demonstrations for clients.
  • Pilots: Where we launch our new ideas with a few select clients and measure results to build a business case. Pilots must be stable and scale yet the engineer still has to rapidly iterate on the feature set.

Because our development cycles are so short at Bazaarvoice, projects must also be able to transition between the prototype and pilot phase seamlessly. The tools we select must therefore support the requirements mentioned above. Generally we can divide our tool-chain into a few broad categories:

  • Operational Tools: Tools that help us keep things up and running
  • Server-side Application Development Environments: Application containers, full-stack and micro frameworks. Tools to build web apps with.
  • Data Storage and Management: SQL, NoSQL and whatever else you need
  • Client-side Tools: Because there’s a lot you can do with just a browser nowadays
  • Measurement Tools: Without the data to back up our hypotheses, there’s no science

In my next blog post, I’m going to step through each of these categories and talk about a couple of tools that we use and the projects that we’ve used them in. This will not be an exhaustive list since we’re always evaluating new tools, but it should give you some insight into how and why we pick the tools that we do.