
Project Lassie: who let the dog out!

The Bazaarvoice Platform Infrastructure Team recently open sourced Project Lassie, a Java library for working with the new DataDog screenboards. Lassie can create, get, update, and delete screenboards via DataDog's REST API.

We use DataDog across various teams to collect metrics at both the system-wide and application levels, giving our teams a clearer view of what’s happening across all environments.

The project was developed by a Bazaarvoice summer intern, Mykal Thomas. Mykal is a senior at Georgia Tech.

Check out the GitHub repository for more information: https://github.com/bazaarvoice/lassie

Documentation for DataDog’s Screenboard API: http://docs.datadoghq.com/api/screenboards/

Jolt released to the world

We are pleased to announce a new open source contribution: Jolt, a Java-based JSON-to-JSON transformation tool.

Jolt grew out of a BV Platform API project to migrate the backend from Solr/MySQL to Cassandra/ElasticSearch. As a result, we needed to do a lot of data transformations from the new ElasticSearch JSON format to the BV Platform API JSON format.

Prior to Jolt, there were three general strategies for doing JSON-to-JSON transforms:

  1. Convert to XML, use XSLT, convert back to JSON
  2. Use your input JSON and a template language to build your output JSON
  3. Write custom code

Those options were rather unpalatable, so we went with option “4”: write reusable custom code.

The key insight was that a transform involves several separable concerns, and that part of the reason the XSLT and template approaches are unpalatable is that they force you to deal with all of them at once.

Jolt tackles each concern individually:

  1. Identify the pieces of the input data that you care about and place them in the output JSON
    • Jolt provides a transform, “shift”, that has its own JSON-based declarative DSL (domain-specific language)
  2. Make sure the output JSON looks correct (apply defaults to the output JSON)
    • Jolt provides a transform, “default”, with its own JSON-based declarative DSL
  3. Handle all the JSON text formatting (commas, closing curly braces, etc.)
    • Jolt operates on “hydrated” JSON data (Map<String,Object> and List<Object>) and leverages the Jackson library to handle serialization / JSON text formatting
  4. Verify the transform for data and format correctness
    • Jolt provides a test tool called Diffy so that you can unit test your transforms for data and format correctness
    • For format correctness, this is not as strong an answer as an XML DTD, but you could pull in JSON Schema if you wanted
  5. Perform arbitrary custom data manipulations like adding fields together or performing date conversions
    • Jolt provides an interface where you can implement your own custom logic to be run in series with the other transforms
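
To make the separation of concerns above a little more concrete, here is a minimal sketch in plain Java. It is not Jolt's implementation and does not use Jolt's API or spec DSL. The class name, the helper methods (shiftLike, defaultLike, runChain), and the field names are all hypothetical, chosen only for illustration. It shows the general idea of working on “hydrated” JSON (a Map<String,Object>) and running a shift-like step, a default-like step, and a custom step in series.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Hypothetical illustration only: none of these names come from Jolt.
class TransformChainSketch {

    // Shift-like step: copy the top-level input keys we care about to new output keys.
    // Keys that are not listed in the mapping are dropped.
    static UnaryOperator<Map<String, Object>> shiftLike(Map<String, String> keyMapping) {
        return input -> {
            Map<String, Object> output = new LinkedHashMap<>();
            for (Map.Entry<String, String> entry : keyMapping.entrySet()) {
                if (input.containsKey(entry.getKey())) {
                    output.put(entry.getValue(), input.get(entry.getKey()));
                }
            }
            return output;
        };
    }

    // Default-like step: fill in a value only where the key is absent (or null).
    static UnaryOperator<Map<String, Object>> defaultLike(Map<String, Object> defaults) {
        return input -> {
            Map<String, Object> output = new LinkedHashMap<>(input);
            defaults.forEach(output::putIfAbsent);
            return output;
        };
    }

    // Run every step in order, feeding each step's output into the next one.
    static Map<String, Object> runChain(Map<String, Object> input,
                                        List<UnaryOperator<Map<String, Object>>> steps) {
        Map<String, Object> current = input;
        for (UnaryOperator<Map<String, Object>> step : steps) {
            current = step.apply(current);
        }
        return current;
    }

    // Builds a small made-up input and pushes it through three steps.
    static Map<String, Object> demo() {
        Map<String, Object> input = Map.of(
                "helpful_votes", 3,
                "unhelpful_votes", 1,
                "internal_id", "abc");

        UnaryOperator<Map<String, Object>> shift = shiftLike(Map.of(
                "helpful_votes", "helpfulVotes",
                "unhelpful_votes", "unhelpfulVotes"));

        UnaryOperator<Map<String, Object>> defaults = defaultLike(Map.of("locale", "en_US"));

        // Custom step: add two fields together, the kind of logic a declarative spec cannot express.
        UnaryOperator<Map<String, Object>> sumVotes = in -> {
            Map<String, Object> out = new LinkedHashMap<>(in);
            int helpful = (Integer) in.getOrDefault("helpfulVotes", 0);
            int unhelpful = (Integer) in.getOrDefault("unhelpfulVotes", 0);
            out.put("totalVotes", helpful + unhelpful);
            return out;
        };

        List<UnaryOperator<Map<String, Object>>> steps = List.of(shift, defaults, sumVotes);
        return runChain(input, steps);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In this sketch the result holds helpfulVotes, unhelpfulVotes, a defaulted locale, and a computed totalVotes, while internal_id is dropped because the shift-like step never mentions it. Serializing the resulting map back to JSON text would be left to a library such as Jackson, which is the division of labor described in the list above.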

The code is now available on GitHub, and jar artifacts are being published to Maven Central.