Jolt Command Line Interface

Some of y’all may have caught our previous blog post announcing the release of our Java JSON transformation library, Jolt.

Jolt is a powerful tool that can accomplish a variety of useful transformations on JSON data, and even chain multiple transformations together. Jolt has additional functionality that is useful for working with JSON including the ability to intelligently diff JSON documents and sort JSON documents. Users of Jolt can now transform, diff and sort JSON via the command line using the Jolt CLI. The CLI even allows you to string multiple commands together via standard in:

curl -s "http://some.host.com/stuff/data.json" | jolt sort | jolt diffy moreData.json 

The Jolt CLI supports the following sub-commands:

Diffy

The Jolt Diffy sub-command is an excellent way to compare JSON documents at the command line. It gives you a lot of friendly human-readable output, or you can have it run silently and examine the exit code to determine if any differences were found.

Have you ever tried using diff to detect differences in a JSON document? Due to the nature of JSON data the regular diff command can sometimes be inadequate. For Example, consider the following two JSON documents:

diff1.json

{
  "someData": "dude",
  "moreData": "sweet"
}

diff2.json

{
  "moreData": "sweet",
  "someData": "dude"
}

Running the diff command from the command line returns the following:

user@computer:~/Projects/blog-post$ diff diff1.json diff2.json
2,3c2,3
< "someData": "dude",
< "moreData": "sweet"
---
> "moreData": "sweet",
> "someData": "dude"

This really isn’t helpful. Since the example data is in the form of a map, then two documents are essentially equal. However, because the entries are ordered differently, diff detects differences. Diffy ignores the ordering of map entries:

user@computer:~/Projects/blog-post$ jolt diffy diff1.json diff2.json
Diffy found no differences

Diffy does a recursive tree walk to find differences throughout the JSON document, so it can detect differences N levels deep.

diff3.json

{
  "aMap": {
    "stuff": "yeah"
  },
  "someData": "whatever",
  "matchingData": "cool"
}

diff4.json

{
  "aMap": {
    "differentStuff": "woah"
  },
  "differentData": "bleargh",
  "matchingData": "cool"
}
user@computer:~/Projects/blog-post$ jolt diffy diff3.json diff4.json
Differences found. Input #1 contained this:
{
  "someData" : "whatever",
  "aMap" : {
    "stuff" : "yeah"
  }
}
Input #2 contained this:
{
  "differentData" : "bleargh",
  "aMap" : {
    "differentStuff" : "woah"
  }
}

Diffy does flag differences in array ordering. Consider the following two JSON documents:

array1.json

{
  "arrayData": [
    "one",
    "two",
    "three",
    "four"
  ]
}

array2.json

{
  "arrayData": [
    "one",
    "three",
    "two",
    "four"
  ]
}

Diffy detects the differences in the array:

user@computer:~/Projects/blog-post$ jolt diffy array1.json array2.json
Differences found. Input #1 contained this:
{
  "arrayData" : [ null, "two", "three", null ]
}
Input #2 contained this:
{
  "arrayData" : [ null, "three", "two", null ]
}

If for some reason you are crazy (like some of us at Bazaarvoice) and you want to ignore array order, you can use the -a flag for those occasions.

Transform

The Jolt Transform sub-command allows you to perform transforms on JSON documents provided via standard in or file. Transform also takes a spec, which is a JSON document that contains one or more Jolt specs to indicate what transformations should be done on the document. Transform has the option to produce the results with or without pretty print formatting.

You can read more about Jolt transforms here.

Sort

The Jolt Sort sub-command will sort JSON input. The sort order is standard alphabetical ascending, with a special case for “~” prefixed keys to be bumped to the top. This can be useful for debugging when you need to manually inspect the contents of a JSON document. Sort has the option to produce the results with or without pretty print formatting.

That does it for today. Hopefully you have an idea of what the Jolt CLI does in broad strokes. If you’re curious about Jolt, you can read much more about it here.

BV I/O: Peter Wang – Architecting for Data

Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.

In this presentation Peter Wang, co-founder and president of Continuum Analytics, discusses data analysis, the challenges presented by big data, and opportunities technology provides to overcome those challenges. He also discusses the importance of performance and visualization as well as advances the concept of “engineering on principle” which he demonstrates by discussing the design of the A-10 Thunderbolt and SAGE computerized command and control center for United States air defense. Peter ends his talk by discussing the Python programming language and its suitability for data analysis tasks. The full talk is below.

BV I/O: Dr. Jason Baldridge – Scaling Models for Text Analysis

Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conferences was themed “unlocking the power of our data.” You can read more about it here.

The following video is of Dr. Jason Baldridge, currently an associate professor in the Linguistics Dept. at University of Texas and co-founder of People Pattern. Dr. Baldridge presented on the subject of text analysis. During his hour long talk he identified the desirable traits of a good text analysis function and focused on the problems of performing text categorization tasks given different amounts of labeled data. Big thanks to Dr. Baldridge for his informative presentation. The full talk is below:

Conversations API Inspector

An all too familiar scenario

Imagine you’re a developer working for Widgets n’More. The marketing team just came up with a new cross platform social media promotion. It’s going to involve collecting user generated content in the form of ratings and reviews. As luck would have it you remember your friend on the Ecom Team had mentioned working with a company called Bazaarvoice last year. Widgets n’More partnered with Bazaarvoice specifically for the purpose of collecting and displaying reviews.

In short order you find yourself at developer.bazaarvoice.com where you start reading the Conversations API documentation. There are a lot of fields and parameters and content types. What’s more, some of them appear to be customizable. They offer custom rating fields, tags, context data questions, and additional free text fields. One company might have “rating_value” while another might be using “rating_quality”. It’s also not immediately clear how any those should be displayed in a webform. The fields can even have customizable properties like min and max length.

So, you call your friend hoping she’ll be able to shed some light on the situation. She explains that Bazaarvoice can even configure fields based on content type, like reviews, questions or answers, and make different custom fields available depending on the parent category of a product. Unfortunately it’s been so long since the initial Bazaarvoice implementation that she doesn’t remember what was set up. If only there was an easy way for you to see exactly what fields are available taking all those factors into account…

Conversations API Inspector to the rescue

The Conversations API Inspector was created with the above scenario in mind. It is a web based app that shows what fields can be submitted to the Bazaarvoice platform using the Conversations API for any API key + content type + ID combination. With the Conversations API Inspector our imaginary developer would be able to see what fields are available, how they must be submitted in an HTTP request, meta-data about each field and much more

The Conversations API Inspector is ready to use and publicly available at http://api-inspector.bazaarvoice.com/. It is well documented at our Developer Portal, so instead of repeating that here I’ll leave you with some screenshots.

noHistory topNoFields topWithFields
fields fieldDetailsProperties fieldDetailsSubmission

Epilogue

Even without the Conversations API Inspector all would not have been lost for our imaginary developer. He could have used the API itself to determine what fields are available. In fact this is exactly how the Conversations API Inspector does it. Of course, the Inspector provides a much more user friendly and interactive GUI than the raw JSON or XML returned by the Conversations API. You can read more about how the inspector works at the documentation under the heading “How it works”.

Using the Bazaarvoice Conversations API to build a UGC template

(This post is by Devin Carr, one of our Summer 2013 interns.)

LightReviewDisplaySmallWorking as a Developer Advocate intern on the Bazaarvoice Developer Relations team has been a great learning opportunity. At the beginning of the Summer I discussed with my mentors, Chas Peacock and Frederick Feibel, what I wanted to learn while interning. We decided on front-end web-development because it is an area in which I had minimal experience. For the rest of the Summer I spent my time creating a simple review display template using the Bazaarvoice Conversations API and some really cool front-end frameworks & libraries. I also researched other ratings and reviews implementations to learn about common industry best practices and incorporated them into what I am calling my LightReviewDisplay Template.

LightReviewDisplay_jsComing from a background limited to Java, I quickly discovered that Javascript is quite a different programming language. I took a few days to learn basic Javascript and HTML from Codecademy to get a grasp of the syntax and functions. Using that knowledge I created the first version of the LightReviewDisplay. I put all of the Javascript and dynamic HTML in one large file that handled loading the reviews and displaying them on the page. During my first code review Chas and Frederick suggested my code was lacking in structure and layout best practices, such as including an Object Oriented Programing model. Instead of most of my code being in just a few functions they recommended I implement RequireJS, self described as a “JavaScript file and module loader”, to encapsulate my code within a maintainable structure. They also suggested that I use Mustache, a HTML template library. Using these two libraries I could divide my code into multiple modules and avoid having large amounts of HTML embedded within my Javascript.

LightReviewDisplay_js_filterThe second version that I made was starting to look more like a modern and modular Javascript project. Chas and Frederick agreed that the increased readability of my refactored project was an improvement, but they felt it could be further improved by imposing more structure. They suggested I research some MVC (Model, View, Controller) frameworks. I looked into Backbone, KnockoutJS, Angular, and Stapes, all very versatile with separated modules for the model (data), the view (page), and the controller (user input). I ended up choosing Stapes because it is lightweight and is a very minimal framework that allows lots of customization.

LightReviewDisplayRatingsSummaryAt this point the project was functional but the user experience needed improvement. I decided to do some research into how a product review page should actually look. I inspected several retailers to see what was common about their layouts and what would be the best way to integrate reviews onto a product page. They all followed a two part system: a summary section at the top of the page and a more detailed section further down. The summary section shows the product’s price, description and a review ratings summary. It typically had the average rating for the product, the total number of reviews, and a rating breakdown per-star. Then towards the bottom of the product page, there was a review module that held all of the consumer reviews. I combined what I had learned about layout with Twitter Bootstrap resulting in an efficient and usable template for displaying UGC.

My Summer project was a great way to get started with Javascript and get familiar with the Bazaarvoice Conversations API. Throughout my internship I learned a lot about being a software engineer, expanded my CS knowledge and got some firsthand experience working amongst a small team of developers. This internship also gave me the opportunity to learn with the latest front-end libraries, architectural patterns, and work with really great developers. A special thank you goes out to Chas and Frederick for taking time out of their days to assist me with any problem I came across.

More about the LightReviewDisplay Template:
Bazaarvoice Inspiration Gallery: https://developer.bazaarvoice.com/inspiration_gallery/lightreviewdisplay
Source: https://github.com/DevinCarr/LightReviewDisplay

BV I/O: Adrian Cockroft

As part of our internal BV I/O conference we’ve previously profiled on the blog, we had Adrian Cockroft, Cloud Architect at Netflix, come give us an overview of a lot of Netflix’s architecture as well as information on their multitude of open source projects and the ways Netflix is engaging the community to contribute. He also talked about some of his personal sources of inspiration when it’s come to things his teams have developed at Netflix. Big thanks to Adrian for taking the time out to come visit with us in Austin. The full talk is below:

Making of the Bazaarvoice SDK for Windows Phone 8

A page displaying item name, item information, and a list of reviews.

Hi, my name is Ralph Pina, I am a Summer ’13 intern and UT-Austin Computer Science student. During this summer I had the privilege of working with another intern, Devin Carr, on Bazaarvoice’s .NET SDK for Windows Phone 8 and Windows 8 Store apps. Our goal was to provide convenient access to our Conversations API and make the app development experience even better. We want our customers’ developers to spend more time figuring out how to make better use of the data in our network and innovating on ways to increase engagement with their brands, products, and services.

We currently provide SDKs for many other platforms including iOS, Android, so moving to cover the third largest mobile platform was a natural extension. As for Windows 8 Store apps, they are not as numerous or popular as traditional Windows Desktop apps, but as the install base for Windows 8+ devices grows, and developers get more accustomed to working with a touch interface on all their devices, their numbers should increase. This will provide an opportunity for first movers to grab a spot in million (billions?) of user’s Start Screen. We’ve got your back.

We tried to implement best practices in our development. Below are some of the technical challenges we experienced:

  • Networking: we used the newly ported Microsoft HttpClient. This is the most popular client in other .NET platforms and Microsoft is actively working to optimize it for Windows Phone. Our first implementation used the RestSharp library, a popular, open source Http client. However, like the Apache HttpClient in Android, this library loaded all data into a byte array before sending. While Windows Phone 8 has a higher max heap size for apps than Android, there is still a limit you may cross with a large image or video. Switching to HttpClient did not completely solve our problems however. While in other .NET platforms the library will buffer and stream content, this seems to be a limitation for Windows Phone. However, we hope this changes in the near future.
  • The getting started .NET app walks a developer through the installation of the SDK and  how to make a simple http request and dump the JSON response on the screen.Windows Phone does not have a native JSON library available, so we used the Newton JSON.NET library. I especially like the array syntax to access items in the JSON hierarchy: for example JSONObj[“tag1”][“tag2”][“tag3”] will access a property named “tag3” which is inside the JSON object named “tag2”, which is itself inside the JSON object named “tag1”, etc, etc.

So you say there are better ways of doing this, or you want a specific implementation for your enterprise to use across various apps/brands/divisions? You are in luck! Our .NET SDK is open sourced and can be found on GitHub: https://github.com/bazaarvoice/bv-.net-sdk. So head over, hack it, maybe even submit a pull request to lay your claim to glory. While you are in GitHub, browse all the other awesome repos we’ve open sourced over the years.

Below I have included some screenshots of a couple of sample apps that demonstrate how to use the .NET SDK to submit and display sample data from our API.

A review details page using the title, user nickname, data, image, and review text.A example of a product page showing ratings.An example of a review submission page

Cheers,
Ralph Pina

Bazaarvoice SDK for Windows Phone 8 has been Open Sourced!

The Bazaarvoice Mobile Team is happy to announce our newest mobile SDK for Windows Phone 8. It is a .NET SDK that supports Windows Phone 8 as well as Windows 8 Store apps. This will add to our list of current open-source mobile SDKs for iOS, Android and Appcelerator Titanium.

The SDK will allow developers to more quickly build applications for Windows Phone 8 and Windows 8 Store that use the Bazaarvoice API. While the code is the same for both platforms, they each need their own compiled DLL to be used in their perspective Visual Basic projects. As Windows Phone 8 and Windows 8 Store apps gain more traction in the marketplace, we hope the Bazaarvoice SDK will prove a valuable resource to developers using our API.

Learn more at our Windows Phone 8 SDK home page, check out our repo in GitHub, and contribute!

The SDK was developed by two summer interns: Devin Carr from Texas A&M and Ralph Pina from UT-Austin. Go horns!

In the next few days we’ll publish a more in-depth post about how the Windows Phone 8 SDK was built, some of design decisions and challenges.

Project Lassie: who let the dog out!

The Bazaarvoice Platform Infrastructure Team recently open sourced project Lassie. Lassie is a Java library that can manipulate the new DataDog screenboards. The Lassie library can create, get, update, and delete the DataDog screenboards via the REST API.

We use DataDog across various teams to collect metrics at both a system-wide and application level to give our teams a clearer view of what’s happening across all environments.

The project was developed by a Bazaarvoice summer intern, Mykal Thomas. Mykal is a senior at Georgia Tech.

Check out the Github for more information: https://github.com/bazaarvoice/lassie

Documentation for DataDog’s Screenboard API: http://docs.datadoghq.com/api/screenboards/

Maintenance Notice: Bazaarvoice Developer Portal

We’ll soon be giving our Developer Portal some love in the way of functionality and style updates. To facilitate this, we’ll be taking the portal offline for approximately 6 hours on Thursday, August 15th; the estimated times are listed in a few timezones below:

Timezone Start date/time End date/time
UTC Thursday, August 15, 2:00 AM Thursday, August 15, 8:00 AM
EDT Wednesday, August 14, 10:00 PM Thursday, August 15, 4:00 AM
PDT Wednesday, August 14, 7:00 PM Thursday, August 15, 1:00 AM
AEST Thursday, August 15, 12:00 PM Thursday, August 15, 6:00 PM
IST Thursday, August 15, 7:30 AM Thursday, August 15, 1:30 PM

Bookmarks to existing documentation will work through at least part of the downtime, but please excuse any styling inconsistencies as we work through the update.

If you need something quickly, tweet us @BazaarvoiceDev and we will do our best to get you the assistance you need.

Thanks for your patience!