Open sourcing cloudformation-ruby-dsl

CloudFormation is a powerful tool for building large, coordinated clusters of AWS resources. It has a sophisticated API, capable of supporting many different enterprise use cases and scaling to thousands of stacks and resources. However, there is a downside: the JSON interface for specifying a stack can be cumbersome to manipulate, especially as your organization grows and code reuse becomes more necessary.

To address this and other concerns, Bazaarvoice engineers have built cloudformation-ruby-dsl, which turns your static CloudFormation JSON into dynamic, refactorable Ruby code.

https://github.com/bazaarvoice/cloudformation-ruby-dsl

The DSL closely mimics the structure of the underlying API, but with enough syntactic sugar to make building CloudFormation stacks less painful.
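To make the idea of "templates as code" concrete, here is a minimal sketch in Python. It is not the syntax of cloudformation-ruby-dsl; the helper names, the AMI ID and the instance types are all made up for illustration. It only shows how a template built from ordinary functions and data structures can avoid the copy-and-paste that hand-written JSON tends to need.

```python
import json


def instance(ami_id, instance_type):
    """Return one EC2 instance resource as a plain dict (hypothetical helper)."""
    return {
        "Type": "AWS::EC2::Instance",
        "Properties": {"ImageId": ami_id, "InstanceType": instance_type},
    }


def build_template(roles, ami_id="ami-12345678"):
    """Build a template with one instance per (name, instance type) pair.

    The AMI ID is a placeholder, not a real image.
    """
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Resources": {
            name: instance(ami_id, instance_type)
            for name, instance_type in roles.items()
        },
    }


template = build_template({"WebServer": "t2.micro", "Worker": "m3.large"})
print(json.dumps(template, indent=2, sort_keys=True))
```

Adding a third instance is then a one-line change to the dictionary passed to build_template, rather than another block of JSON to copy and edit.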

We use cloudformation-ruby-dsl in many projects across Bazaarvoice. Now that it has proven its value and gained some degree of maturity, we are releasing it to the larger world as open source, under the Apache 2.0 license. It is still an early-stage project and may undergo some further refactoring prior to its v1.0 release, but we don’t anticipate major API changes. Please download it, try it out, and let us know what you think (in the comments below, or as issues or pull requests on GitHub).

A big thanks to Shawn Smith, Dave Barcelo, Morgan Fletcher, Csongor Gyuricza, Igor Polishchuk, Nathaniel Eliot, Jona Fenocchi, and Tony Cui, for all their contributions to the code base.

Output from bv.io

Looks like everyone had a blast at bv.io this year! Thank yous go out to the conference speakers and hackathon participants for making this year outstanding. Here are some tweets and images from the conference:


HTTP/RESTful API troubleshooting tools

As a developer I’ve used a variety of APIs, and as a Developer Advocate at Bazaarvoice I help developers use our APIs. As a result I am keenly aware of the importance of good tools and of using the right tool for the job. The right tool can save you time and frustration. With the recent release of the Conversations API Inspector, an in-house web app built to help developers use our Conversations API, it seemed like the perfect time to survey tools that make using APIs easier.

The tools

This post is a survey of several tools for interacting with HTTP-based APIs. In it I introduce the tools and briefly explain how to use them. Each one has its advantages, and all do some combination of the following:

  • Construct and execute HTTP requests
  • Make requests other than GET, like POST, PUT, and DELETE
  • Define HTTP headers, cookies and body data in the request
  • See the response, possibly formatted for easier reading

Firefox and Chrome

Yes, a web browser can be a tool for experimenting with APIs, so long as the API request only requires basic GET operations with query string parameters. At our developer portal we embed sample URLs in our documentation where possible, which makes it super easy for developers to see examples.

Basic GET

http://api.example.com/resource/1?passkey=12345&apiversion=2

Browsers don’t necessarily present the response in a format that is easy for humans to read. Firefox users already get nicely formatted XML. To see similarly formatted JSON there is an extension called JSONView, and to see the response headers Live HTTP Headers will do the trick. Chrome also has a version of JSONView, and for XML there’s XML Tree. Both browsers offer built-in consoles that provide network information like headers and cookies.

cURL

The venerable cURL is possibly the most flexible of these tools while at the same time being the least usable. Because it is a command line tool, some developers will balk at using it, but cURL’s simplicity and portability (*nix, PC, Mac) make it an appealing tool. cURL can make just about any request, assuming you can figure out how. These tutorials provide some easy to follow examples and the man page has all the gory details.

I’ll cover a few common usages here.

Basic GET

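Note the use of quotes. Without them the shell would treat the & in the query string as a control operator and cut the URL short.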

$ curl "http://api.example.com/resource/1?passkey=12345&apiversion=2"
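If you would rather script the same request, here is a minimal sketch using only Python’s standard library. The host, passkey and apiversion values are the placeholders from the example above, so the request is built and inspected but the line that would send it is left commented out.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder values copied from the example URL above.
params = {"passkey": "12345", "apiversion": "2"}
url = "http://api.example.com/resource/1?" + urlencode(params)

# No body is attached, so the method defaults to GET.
request = Request(url)
print(request.get_method(), request.full_url)

# To actually send it (api.example.com is a placeholder and will not answer):
# from urllib.request import urlopen
# with urlopen(request) as response:
#     print(response.status, response.read().decode())
```

Building the query string with urlencode also sidesteps the quoting issue, since the & never passes through a shell.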

Basic POST

Much more useful is making POST requests. The following submits data the same way a web form would (the default Content-Type is application/x-www-form-urlencoded). The -d option supplies the data sent in the request body.

$ curl -d "key1=some value&key2=some other value" http://api.example.com/resource/1
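One thing to keep in mind is that -d sends the string as given, without encoding it for you, so the spaces above go over the wire as literal spaces (cURL also has a --data-urlencode option for that). As a rough illustration of what a properly encoded form body looks like, here is a small Python sketch; the field names and URL are the placeholders from the command above.

```python
from urllib.parse import urlencode
from urllib.request import Request

fields = {"key1": "some value", "key2": "some other value"}

# urlencode applies form encoding, so each space becomes '+'.
body = urlencode(fields)
print(body)

# Attaching a body switches the default method from GET to POST.
request = Request("http://api.example.com/resource/1", data=body.encode("ascii"))
print(request.get_method())
```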

POST with JSON body

Many APIs expect data formatted in JSON or XML instead of encoded key=value pairs. This cURL command sends JSON in the body by using -H 'Content-Type: application/json' to set the appropriate HTTP header.

$ curl -H 'Content-Type: application/json' -d '{"key": "some value"}' http://api.example.com/resource/1
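The equivalent request from a script might look like the sketch below, again using only Python’s standard library and the placeholder URL and payload from the command above. Serializing with json.dumps instead of typing the JSON by hand avoids the quoting mistakes that are easy to make on the command line.

```python
import json
from urllib.request import Request

# Serialize the payload instead of hand-writing the JSON string.
payload = json.dumps({"key": "some value"}).encode("utf-8")

request = Request(
    "http://api.example.com/resource/1",  # placeholder URL from the example
    data=payload,
    headers={"Content-Type": "application/json"},
    method="POST",
)

print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```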

POST with a file as the body

The previous example can get unwieldy quickly as the size of your request body grows. Instead of adding the data directly to the command line you can instruct cURL to upload a file as the body. This is not the same as a “file upload.” It just tells cURL to use the contents of a file as the request body.

$ curl -H 'Content-Type: application/json' -d @myfile.json http://api.example.com/resource/1

One major drawback of cURL is that the response is displayed unformatted. The next command line tool solves that problem.

HTTPie

HTTPie is a Python-based command line tool similar to cURL in usage. According to the GitHub page, “Its goal is to make CLI interaction with web services as human-friendly as possible.” This is accomplished with “simple and natural syntax” and “colorized responses.” It runs on Linux, Mac OS X and Windows, and supports JSON, uploads and custom headers, among other things.

The documentation seems pretty thorough so I’ll just cover the same examples as with cURL above.

Basic GET

$ http "http://api.example.com/resource/1?passkey=12345&apiversion=2"

Basic POST

HTTPie assumes JSON as the default content type. Use --form to send the data as Content-Type: application/x-www-form-urlencoded instead.

$ http --form POST api.example.com/resource/1 key1='some value' key2='some other value'

POST with JSON body

The = is for strings and := indicates raw JSON.

$ http POST api.example.com/resource/1 key='some value' parameter2:=2 parameter3:=false parameter4:='["http", "pies"]'
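To see what the two separators mean for the request body, here is a small Python sketch that imitates them. The build_json_body helper is hypothetical and is not how HTTPie is implemented; it only mirrors the rule that = produces a string and := is parsed as raw JSON.

```python
import json


def build_json_body(items):
    """Hypothetical helper imitating HTTPie's '=' and ':=' request items."""
    body = {}
    for item in items:
        name, separator, value = item.partition("=")
        if not separator:
            raise ValueError(f"not a key=value item: {item!r}")
        if name.endswith(":"):
            # 'name:=value' means the value is raw JSON (number, bool, array...).
            body[name[:-1]] = json.loads(value)
        else:
            # 'name=value' means the value is always a string.
            body[name] = value
    return body


body = build_json_body([
    "key=some value",
    "parameter2:=2",
    "parameter3:=false",
    'parameter4:=["http", "pies"]',
])
print(json.dumps(body))
```

The printed object has a string, a number, a boolean and an array, which is the shape of the body the command above sends.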

POST with a file as the body

HTTPie looks for a local file to include in the body after the < symbol.

$ http POST api.example.com/resource/1 < resource.json

Postman Chrome extension

My personal favorite is the Postman extension for Chrome. In my opinion it hits the sweet spot between functionality and usability by providing most of the HTTP functionality needed for testing APIs via an intuitive GUI. It also offers built-in support for several authentication protocols, including OAuth 1.0. There are a few things it can’t do because of restrictions imposed by Chrome, although there is a Python-based proxy to get around that if necessary.

Basic GET

The column on the left stores recent requests so you can redo them with ease. The results of any request will be displayed in the bottom half of the right column.

[Screenshot: a basic GET request in Postman]

Basic POST

It’s possible to POST files, application/x-www-form-urlencoded data, and your own raw data.

[Screenshot: a basic POST request in Postman]

POST with JSON body

Postman doesn’t support loading a request body from a local file, but doing so isn’t necessary thanks to its easy to use interface.

[Screenshot: a POST request with a JSON body in Postman]

Runscope.com

Runscope is a little different than the others, but no less useful. It’s a web service instead of a tool and is not open source, although they do offer a free option. It can be used much like the other tools to manually create and execute various HTTP requests, but that is not what makes it so useful.

Runscope acts as a proxy for API requests. Requests are made to Runscope, which passes them on to the API provider and then passes the responses back. In the process Runscope logs the requests and responses. At that point, to use their words, “you can view the request/response details, share requests with others, edit and retry requests from the web.”

Below is a quick example of what a Runscopeified request looks like. Read their official documentation to learn more.

before: $ curl "http://api.example.com/resource/1?passkey=12345&apiversion=2"
after: $ curl "http://api-example-com-bucket_key.runscope.net/resource/1?passkey=12345&apiversion=2"
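The rewrite in that example can be sketched in a few lines of Python. This is based only on the before and after URLs above (dots in the host become dashes, then the bucket key and runscope.net are appended); the runscope_url name is made up, and ports and any other rules Runscope applies are deliberately ignored, so check their documentation before relying on it.

```python
from urllib.parse import urlsplit, urlunsplit


def runscope_url(url, bucket_key):
    """Rewrite a URL the way the before/after example above suggests.

    Hypothetical helper: it assumes a plain host name with no port.
    """
    parts = urlsplit(url)
    host = parts.hostname.replace(".", "-") + "-" + bucket_key + ".runscope.net"
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


original = "http://api.example.com/resource/1?passkey=12345&apiversion=2"
print(runscope_url(original, "bucket_key"))
```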

Conclusion

If you’re an API consumer you should use some or all of these tools. When I’m helping developers troubleshoot their Bazaarvoice API requests I use the browser when I can get away with it and switch to Postman when things start to get hairy. There are other tools; I know because I omitted some of them. Feel free to mention your favorite in the comments.

(A version of this post was previously published at the author’s personal blog)

BV I/O: Nick Bailey – Cassandra

Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.

Nick Bailey is a software developer for DataStax, the company that develops commercially supported, enterprise-ready solutions based on the open source Apache Cassandra database. In his BV I/O talk he introduces Cassandra, discusses several useful approaches to data modeling and presents a couple of real-world use cases.

Jolt Command Line Interface

Some of y’all may have caught our previous blog post announcing the release of our Java JSON transformation library, Jolt.

Jolt is a powerful tool that can accomplish a variety of useful transformations on JSON data, and even chain multiple transformations together. Jolt has additional functionality that is useful for working with JSON, including the ability to intelligently diff and sort JSON documents. Users of Jolt can now transform, diff and sort JSON via the command line using the Jolt CLI. The CLI even allows you to string multiple commands together via standard input:

curl -s "http://some.host.com/stuff/data.json" | jolt sort | jolt diffy moreData.json 

The Jolt CLI supports the following sub-commands:

Diffy

The Jolt Diffy sub-command is an excellent way to compare JSON documents at the command line. It gives you a lot of friendly human-readable output, or you can have it run silently and examine the exit code to determine if any differences were found.

Have you ever tried using diff to detect differences in a JSON document? Due to the nature of JSON data the regular diff command can sometimes be inadequate. For example, consider the following two JSON documents:

diff1.json

{
  "someData": "dude",
  "moreData": "sweet"
}

diff2.json

{
  "moreData": "sweet",
  "someData": "dude"
}

Running the diff command from the command line returns the following:

user@computer:~/Projects/blog-post$ diff diff1.json diff2.json
2,3c2,3
< "someData": "dude",
< "moreData": "sweet"
---
> "moreData": "sweet",
> "someData": "dude"

This really isn’t helpful. Because the example data is a map, the two documents are essentially equal. However, because the entries are ordered differently, diff detects differences. Diffy ignores the ordering of map entries:

user@computer:~/Projects/blog-post$ jolt diffy diff1.json diff2.json
Diffy found no differences
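The underlying idea is easy to demonstrate outside of Jolt. In the Python sketch below, the two documents from above differ as text, but once they are parsed into maps the key order no longer matters. This illustrates the principle only and says nothing about how Diffy is implemented.

```python
import json

# The same two documents as diff1.json and diff2.json above.
doc1 = '{"someData": "dude", "moreData": "sweet"}'
doc2 = '{"moreData": "sweet", "someData": "dude"}'

# A text comparison sees two different strings...
text_equal = doc1 == doc2

# ...but the parsed maps hold the same entries.
parsed_equal = json.loads(doc1) == json.loads(doc2)

print(text_equal, parsed_equal)
```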

Diffy does a recursive tree walk to find differences throughout the JSON document, so it can detect differences N levels deep.

diff3.json

{
  "aMap": {
    "stuff": "yeah"
  },
  "someData": "whatever",
  "matchingData": "cool"
}

diff4.json

{
  "aMap": {
    "differentStuff": "woah"
  },
  "differentData": "bleargh",
  "matchingData": "cool"
}

user@computer:~/Projects/blog-post$ jolt diffy diff3.json diff4.json
Differences found. Input #1 contained this:
{
  "someData" : "whatever",
  "aMap" : {
    "stuff" : "yeah"
  }
}
Input #2 contained this:
{
  "differentData" : "bleargh",
  "aMap" : {
    "differentStuff" : "woah"
  }
}
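A recursive walk of this kind can be sketched in a few lines of Python. The diff_maps function below is a hypothetical stand-in written for this post, not Jolt’s code, and it only handles nested maps and simple values. It drops every entry that matches and keeps what is left on each side.

```python
def diff_maps(left, right):
    """Return (left_only, right_only) with all matching entries removed."""
    left_only, right_only = {}, {}
    # Visit the left keys first, then keys that exist only on the right.
    for key in list(left) + [k for k in right if k not in left]:
        if key in left and key in right:
            a, b = left[key], right[key]
            if isinstance(a, dict) and isinstance(b, dict):
                # Recurse so differences any number of levels deep are found.
                sub_a, sub_b = diff_maps(a, b)
                if sub_a:
                    left_only[key] = sub_a
                if sub_b:
                    right_only[key] = sub_b
            elif a != b:
                left_only[key], right_only[key] = a, b
        elif key in left:
            left_only[key] = left[key]
        else:
            right_only[key] = right[key]
    return left_only, right_only


diff3 = {"aMap": {"stuff": "yeah"}, "someData": "whatever", "matchingData": "cool"}
diff4 = {"aMap": {"differentStuff": "woah"}, "differentData": "bleargh", "matchingData": "cool"}

only_in_3, only_in_4 = diff_maps(diff3, diff4)
print(only_in_3)
print(only_in_4)
```

As in the Diffy output above, matchingData disappears from both sides and only the entries that differ remain.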

Diffy does flag differences in array ordering. Consider the following two JSON documents:

array1.json

{
  "arrayData": [
    "one",
    "two",
    "three",
    "four"
  ]
}

array2.json

{
  "arrayData": [
    "one",
    "three",
    "two",
    "four"
  ]
}

Diffy detects the differences in the array:

user@computer:~/Projects/blog-post$ jolt diffy array1.json array2.json
Differences found. Input #1 contained this:
{
  "arrayData" : [ null, "two", "three", null ]
}
Input #2 contained this:
{
  "arrayData" : [ null, "three", "two", null ]
}

If for some reason you are crazy (like some of us at Bazaarvoice) and you want to ignore array order, you can use the -a flag for those occasions.
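Both behaviors can be imitated in Python to show the difference. This is a sketch of the idea rather than Jolt’s implementation, and the function names are made up. The first function compares arrays position by position and blanks out the matches, as in the output above. The second treats each array as a bag of elements, which is roughly what ignoring order means.

```python
import json
from collections import Counter
from itertools import zip_longest


def diff_arrays(left, right):
    """Positional comparison: matching elements become None (JSON null)."""
    left_out, right_out = [], []
    for a, b in zip_longest(left, right):
        same = a == b
        left_out.append(None if same else a)
        right_out.append(None if same else b)
    return left_out, right_out


def same_elements(left, right):
    """Order-insensitive comparison: count each element, ignore positions."""
    def canonical(item):
        return json.dumps(item, sort_keys=True)
    return Counter(map(canonical, left)) == Counter(map(canonical, right))


array1 = ["one", "two", "three", "four"]
array2 = ["one", "three", "two", "four"]

left_diff, right_diff = diff_arrays(array1, array2)
print(json.dumps(left_diff))
print(json.dumps(right_diff))
print(same_elements(array1, array2))
```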

Transform

The Jolt Transform sub-command allows you to perform transforms on JSON documents provided via standard input or a file. Transform also takes a spec, which is a JSON document that contains one or more Jolt specs to indicate what transformations should be done on the document. Transform has the option to produce the results with or without pretty print formatting.

You can read more about Jolt transforms here.

Sort

The Jolt Sort sub-command will sort JSON input. The sort order is standard alphabetical ascending, with a special case for “~” prefixed keys to be bumped to the top. This can be useful for debugging when you need to manually inspect the contents of a JSON document. Sort has the option to produce the results with or without pretty print formatting.
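The ordering rule is worth a quick illustration, because in plain ASCII order “~” sorts after every letter, so a normal alphabetical sort would push those keys to the bottom. The Python sketch below applies the rule as described here. The sort_json name is made up, and sorting maps nested inside arrays while leaving the array order alone is an assumption on my part, not something taken from Jolt.

```python
import json


def sort_json(value):
    """Sort map keys alphabetically, with '~' prefixed keys first."""
    if isinstance(value, dict):
        # False sorts before True, so '~' keys come out ahead of the rest.
        keys = sorted(value, key=lambda k: (not k.startswith("~"), k))
        return {k: sort_json(value[k]) for k in keys}
    if isinstance(value, list):
        # Assumption: keep array order, but sort any maps inside the array.
        return [sort_json(item) for item in value]
    return value


document = {"b": 1, "~id": 2, "a": {"z": 0, "y": 1}}
sorted_document = sort_json(document)
print(json.dumps(sorted_document, indent=2))
```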

That does it for today. Hopefully you have an idea of what the Jolt CLI does in broad strokes. If you’re curious about Jolt, you can read much more about it here.

BV I/O: Peter Wang – Architecting for Data

Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.

In this presentation Peter Wang, co-founder and president of Continuum Analytics, discusses data analysis, the challenges presented by big data, and the opportunities technology provides to overcome those challenges. He also discusses the importance of performance and visualization and advances the concept of “engineering on principle,” which he demonstrates by discussing the design of the A-10 Thunderbolt and the SAGE computerized command and control center for United States air defense. Peter ends his talk by discussing the Python programming language and its suitability for data analysis tasks. The full talk is below.

BV I/O: Dr. Jason Baldridge – Scaling Models for Text Analysis

Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.

The following video is of Dr. Jason Baldridge, currently an associate professor in the Linguistics Dept. at the University of Texas and co-founder of People Pattern. Dr. Baldridge presented on the subject of text analysis. During his hour-long talk he identified the desirable traits of a good text analysis function and focused on the problems of performing text categorization tasks given different amounts of labeled data. Big thanks to Dr. Baldridge for his informative presentation. The full talk is below: