Jolt Command Line Interface

Some of y’all may have caught our previous blog post announcing the release of our Java JSON transformation library, Jolt.

Jolt is a powerful tool that can perform a variety of useful transformations on JSON data, and even chain multiple transformations together. It also has functionality beyond transforms, including the ability to intelligently diff and sort JSON documents. Users of Jolt can now transform, diff and sort JSON from the command line using the Jolt CLI. The CLI even lets you string multiple commands together by piping through standard input:

curl -s "http://some.host.com/stuff/data.json" | jolt sort | jolt diffy moreData.json 

The Jolt CLI supports the following sub-commands:

Diffy

The Jolt Diffy sub-command is an excellent way to compare JSON documents at the command line. It gives you a lot of friendly human-readable output, or you can have it run silently and examine the exit code to determine if any differences were found.

Have you ever tried using diff to detect differences between JSON documents? Due to the nature of JSON data, the regular diff command can be inadequate. For example, consider the following two JSON documents:

diff1.json

{
  "someData": "dude",
  "moreData": "sweet"
}

diff2.json

{
  "moreData": "sweet",
  "someData": "dude"
}

Running the diff command from the command line returns the following:

user@computer:~/Projects/blog-post$ diff diff1.json diff2.json
2,3c2,3
< "someData": "dude",
< "moreData": "sweet"
---
> "moreData": "sweet",
> "someData": "dude"

This really isn’t helpful. Because each document is a map (a JSON object), the two documents are essentially equal. However, because the entries are ordered differently, diff detects differences. Diffy ignores the ordering of map entries:

user@computer:~/Projects/blog-post$ jolt diffy diff1.json diff2.json
Diffy found no differences
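
To see why ignoring entry order is the natural behavior once the JSON has been parsed, here is a minimal sketch in plain Java. It does not use Jolt at all; the class name MapOrderDemo and the helper ordered are made up for illustration, and the maps are built by hand to mirror diff1.json and diff2.json.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: not part of Jolt. Names here are hypothetical.
final class MapOrderDemo {
    // Builds a map that remembers insertion order, much like a parsed JSON object.
    static Map<String, Object> ordered(String k1, Object v1, String k2, Object v2) {
        Map<String, Object> m = new LinkedHashMap<>();
        m.put(k1, v1);
        m.put(k2, v2);
        return m;
    }

    public static void main(String[] args) {
        Map<String, Object> diff1 = ordered("someData", "dude", "moreData", "sweet");
        Map<String, Object> diff2 = ordered("moreData", "sweet", "someData", "dude");

        // Map.equals compares the entries, so insertion order does not matter.
        System.out.println(diff1.equals(diff2));                       // true
        // The printed text differs, which is all a line-based diff gets to see.
        System.out.println(diff1.toString().equals(diff2.toString())); // false
    }
}
```

A text diff is effectively doing the second comparison, while a structure-aware tool can do the first.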

Diffy does a recursive tree walk to find differences throughout the JSON document, so it can detect differences N levels deep.

diff3.json

{
  "aMap": {
    "stuff": "yeah"
  },
  "someData": "whatever",
  "matchingData": "cool"
}

diff4.json

{
  "aMap": {
    "differentStuff": "woah"
  },
  "differentData": "bleargh",
  "matchingData": "cool"
}

Diffy reports only the entries that differ:

user@computer:~/Projects/blog-post$ jolt diffy diff3.json diff4.json
Differences found. Input #1 contained this:
{
  "someData" : "whatever",
  "aMap" : {
    "stuff" : "yeah"
  }
}
Input #2 contained this:
{
  "differentData" : "bleargh",
  "aMap" : {
    "differentStuff" : "woah"
  }
}
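
The recursive walk can be sketched roughly as follows. This is not Diffy's actual implementation, just one plausible way to get the same shape of result under the assumption that the JSON has been parsed into nested maps. The class TreeDiffSketch and the method leftOnly are hypothetical names, and lists are compared as whole values here rather than element by element.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Illustration only: a hand-rolled sketch, not Diffy's real code.
final class TreeDiffSketch {
    /** Returns the parts of left that have no equal counterpart in right. */
    @SuppressWarnings("unchecked")
    static Map<String, Object> leftOnly(Map<String, Object> left, Map<String, Object> right) {
        Map<String, Object> remaining = new LinkedHashMap<>();
        for (Map.Entry<String, Object> e : left.entrySet()) {
            Object l = e.getValue();
            Object r = right.get(e.getKey());
            if (l instanceof Map && r instanceof Map) {
                // Both sides are objects: recurse and keep only the differing part.
                Map<String, Object> nested =
                    leftOnly((Map<String, Object>) l, (Map<String, Object>) r);
                if (!nested.isEmpty()) {
                    remaining.put(e.getKey(), nested);
                }
            } else if (!right.containsKey(e.getKey()) || !Objects.equals(l, r)) {
                // Key is missing on the other side, or the values differ.
                remaining.put(e.getKey(), l);
            }
        }
        return remaining;
    }
}
```

Calling leftOnly(a, b) gives the "Input #1 contained this" half and leftOnly(b, a) gives the "Input #2" half. With maps mirroring diff3.json and diff4.json, the matching "matchingData" entry drops out of both.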

Diffy does flag differences in array ordering. Consider the following two JSON documents:

array1.json

{
  "arrayData": [
    "one",
    "two",
    "three",
    "four"
  ]
}

array2.json

{
  "arrayData": [
    "one",
    "three",
    "two",
    "four"
  ]
}

Diffy detects the differences in the array:

user@computer:~/Projects/blog-post$ jolt diffy array1.json array2.json
Differences found. Input #1 contained this:
{
  "arrayData" : [ null, "two", "three", null ]
}
Input #2 contained this:
{
  "arrayData" : [ null, "three", "two", null ]
}

If for some reason you are crazy (like some of us at Bazaarvoice) and you want to ignore array order, you can use the -a flag.
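
For a feel of what "ignoring array order" means, here is a small plain-Java sketch that compares two lists by counting elements instead of checking positions. It is an assumption about the idea, not a description of how the -a flag is implemented; ArrayOrderSketch, counts and sameIgnoringOrder are hypothetical names.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Illustration only: not how Jolt implements the -a flag.
final class ArrayOrderSketch {
    /** Counts how often each element occurs, which throws away position. */
    static Map<Object, Integer> counts(List<?> items) {
        Map<Object, Integer> c = new HashMap<>();
        for (Object item : items) {
            c.merge(item, 1, Integer::sum);
        }
        return c;
    }

    /** True when both lists hold the same elements the same number of times. */
    static boolean sameIgnoringOrder(List<?> a, List<?> b) {
        return counts(a).equals(counts(b));
    }
}
```

With the contents of array1.json and array2.json, List.equals reports a difference while sameIgnoringOrder returns true. Note that this sketch only relaxes ordering at one level: a list nested inside an element is still compared position by position.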

Transform

The Jolt Transform sub-command allows you to perform transforms on JSON documents provided via standard input or a file. Transform also takes a spec, which is a JSON document containing one or more Jolt specs that indicate which transformations should be applied to the document. Transform can produce its results with or without pretty-print formatting.

You can read more about Jolt transforms here.

Sort

The Jolt Sort sub-command will sort JSON input. The sort order is standard alphabetical ascending, with one special case: keys prefixed with “~” are bumped to the top. This can be useful for debugging when you need to manually inspect the contents of a JSON document. Sort can produce its results with or without pretty-print formatting.
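
The “~” special case is worth a quick illustration, because in plain character order “~” sorts after every letter, so it has to be handled explicitly. The sketch below is plain Java and makes no claim about Jolt's internals: TildeFirstSort and TILDE_FIRST are hypothetical names, and "alphabetical" is assumed to mean Java's natural String order.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustration only: a sketch of the ordering rule, not Jolt's sort code.
final class TildeFirstSort {
    // Keys starting with "~" come first (false sorts before true);
    // everything else falls back to natural String order.
    static final Comparator<String> TILDE_FIRST =
        Comparator.comparing((String k) -> !k.startsWith("~"))
                  .thenComparing(Comparator.<String>naturalOrder());

    /** Returns a copy of the parsed JSON with every object's keys sorted. */
    static Object sorted(Object node) {
        if (node instanceof Map) {
            Map<String, Object> out = new TreeMap<>(TILDE_FIRST);
            for (Map.Entry<?, ?> e : ((Map<?, ?>) node).entrySet()) {
                out.put(String.valueOf(e.getKey()), sorted(e.getValue()));
            }
            return out;
        }
        if (node instanceof List) {
            // Array order is data, so only the elements' own keys get sorted.
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) node) {
                out.add(sorted(item));
            }
            return out;
        }
        return node;
    }
}
```

Sorting an object with the keys "b", "a" and "~z" this way yields the key order "~z", "a", "b".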

That does it for today. Hopefully you have an idea of what the Jolt CLI does in broad strokes. If you’re curious about Jolt, you can read much more about it here.

Jolt released to the world

We are pleased to announce a new open source contribution, a Java-based JSON-to-JSON transformation tool named Jolt.

Jolt grew out of a BV Platform API project to migrate the backend from Solr/MySQL to Cassandra/ElasticSearch. That meant we were going to be doing a lot of data transformations from the new ElasticSearch JSON format to the BV Platform API JSON format.

Prior to Jolt, there were three general strategies for doing JSON-to-JSON transforms:

  1. Convert to XML, use XSLT, convert back to JSON
  2. Use your input JSON and a template language to build your output JSON
  3. Write custom code

Those options were rather unpalatable, so we went with option “4”: write reusable custom code.

The key insight was that there are actually separable concerns when doing a transform, and that part of the reason the XSLT or template approaches are unpalatable is that they force you to deal with them all together.

Jolt tackles each separate concern individually:

  1. Identify the pieces of the input data that you care about and place them in the output JSON
    • Jolt provides a transform, “shift”, that has its own JSON-based declarative DSL (domain-specific language)
  2. Make sure the output JSON looks correct (apply defaults to the output JSON)
    • Jolt provides a transform, “default”, with its own JSON-based declarative DSL
  3. Handle all the JSON text formatting (commas, closing curly braces, etc.)
    • Jolt operates on “hydrated” JSON data (Map<String,Object> and List<Object>) and leverages the Jackson library to handle serialization / JSON text formatting
  4. Verify the transform for data and format correctness
    • Jolt provides a test tool called Diffy so that you can unit test your transforms for data and format correctness
    • For format correctness, this is not as good an answer as an XML DTD, but you could pull in JSON Schema if you wanted
  5. Perform arbitrary custom data manipulations like adding fields together or performing date conversions
    • Jolt provides an interface where you can implement your own custom logic to be run in series with the other transforms
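
To make "hydrated data" and "run in series" a bit more concrete, here is a rough plain-Java sketch. It deliberately does not use Jolt's DSL or its real interfaces: ChainSketch, withDefault, sum and run are hypothetical names, and the two steps only loosely stand in for a "default"-style transform and a custom data manipulation.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.UnaryOperator;

// Illustration only: hypothetical names, not Jolt's API.
final class ChainSketch {
    // "default"-like step: fill in a key only when it is absent (or null).
    static UnaryOperator<Map<String, Object>> withDefault(String key, Object value) {
        return in -> {
            Map<String, Object> out = new LinkedHashMap<>(in);
            out.putIfAbsent(key, value);
            return out;
        };
    }

    // Custom step: add two numeric fields together into a new field.
    static UnaryOperator<Map<String, Object>> sum(String a, String b, String target) {
        return in -> {
            Map<String, Object> out = new LinkedHashMap<>(in);
            int total = ((Number) in.get(a)).intValue() + ((Number) in.get(b)).intValue();
            out.put(target, total);
            return out;
        };
    }

    // Runs each step on the output of the previous one.
    static Map<String, Object> run(Map<String, Object> input,
                                   List<UnaryOperator<Map<String, Object>>> steps) {
        Map<String, Object> current = input;
        for (UnaryOperator<Map<String, Object>> step : steps) {
            current = step.apply(current);
        }
        return current;
    }
}
```

Because every step takes and returns plain maps, none of them has to think about commas or braces; that is left to whatever serializes the final map back to JSON text.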

The code is now available on GitHub, and jar artifacts are now being published to Maven Central.