Check out Bazaarvoice IO 2014 Technical Conference keynote speaker Otis Gospodnetic, @otisg, Founder of Sematext, Committer on Lucene, Solr, Nutch, Mahout, OpenRelevance, and author of Lucene in Action, discussing the Open Source Search Evolution.
Check out Bazaarvoice IO 2014 Technical Conference keynote speaker Bob Metcalfe, @BobMetcalfe, Professor of Innovation at the University of Texas, discussing Metcalfe’s Law After 40 Years of Ethernet.
CloudFormation is a powerful tool for building large, coordinated clusters of AWS resources. It has a sophisticated API, capable of supporting many different enterprise use cases and scaling to thousands of stacks and resources. However, there is a downside: the JSON interface for specifying a stack can be cumbersome to manipulate, especially as your organization grows and code reuse becomes more necessary.
To address this and other concerns, Bazaarvoice engineers have built cloudformation-ruby-dsl, which turns your static CloudFormation JSON into dynamic, refactorable Ruby code.
https://github.com/bazaarvoice/cloudformation-ruby-dsl
The DSL closely mimics the structure of the underlying API, but with enough syntactic sugar to make building CloudFormation stacks less painful.
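To make the reuse argument concrete, here is a minimal Python sketch of generating a CloudFormation-style template from code. It is not the Ruby DSL and does not use its API; the helper names, resource names, instance type, and AMI id are all hypothetical placeholders.

```python
import json


def ec2_instance(instance_type, image_id):
    """Return one resource definition as a plain dict (hypothetical helper)."""
    return {
        "Type": "AWS::EC2::Instance",
        "Properties": {"InstanceType": instance_type, "ImageId": image_id},
    }


def build_template(node_count):
    """Build a template with node_count identical instances."""
    # In static JSON each of these resources would be copied by hand.
    resources = {
        f"Node{i}": ec2_instance("m3.medium", "ami-12345678")  # placeholder values
        for i in range(1, node_count + 1)
    }
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Example stack generated from code",
        "Resources": resources,
    }


template = build_template(3)
print(json.dumps(template, indent=2))
```

Changing the node count or the instance type is then a one-line change instead of an edit to every copied JSON block, which is the kind of refactoring a code-based template makes possible.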
We use cloudformation-ruby-dsl in many projects across Bazaarvoice. Now that it has proven its value and gained some degree of maturity, we are releasing it to the larger world as open source, under the Apache 2.0 license. It is still an early-stage project, and may undergo some further refactoring prior to its v1.0 release, but we don’t anticipate major API changes. Please download it, try it out, and let us know what you think (in comments below, or as issues or pull requests on GitHub).
A big thanks to Shawn Smith, Dave Barcelo, Morgan Fletcher, Csongor Gyuricza, Igor Polishchuk, Nathaniel Eliot, Jona Fenocchi, and Tony Cui, for all their contributions to the code base.
Looks like everyone had a blast at bv.io this year! Thank yous go out to the conference speakers and hackathon participants for making this year outstanding. Here are some tweets and images from the conference:
RT @bazaarbrett: Hackathon is kicking off, very glad to be here! #bvhackathon pic.twitter.com/q8dnfqlQxh
— Bazaarvoice (@Bazaarvoice) April 2, 2014
BV.IO 2014 is on! #bvhackathon pic.twitter.com/OkRnOsoGo3
— Benton Porter (@bentonporter) April 2, 2014
As a developer I’ve used a variety of APIs, and as a Developer Advocate at Bazaarvoice I help developers use our APIs. As a result I am keenly aware of the importance of good tools and of using the right tool for the right job. The right tool can save you time and frustration. With the recent release of the Conversations API Inspector, an in-house web app built to help developers use our Conversations API, it seemed like the perfect time to survey tools that make using APIs easier.
This post is a survey covering several tools for interacting with HTTP based APIs. In it I introduce the tools and briefly explain how to use them. Each one has its advantages and all do some combination of the following:
Yes, a web browser can be a tool for experimenting with APIs, so long as the API request only requires basic GET operations with query string parameters. At our developer portal we embed sample URLs in our documentation where possible to make seeing examples super easy for developers.
http://api.example.com/resource/1?passkey=12345&apiversion=2
Some browsers don’t necessarily present the response in a format easily readable by humans. Firefox users already get nicely formatted XML. To see similarly formatted JSON there is an extension called JSONView. To see the response headers, LiveHTTP Headers will do the trick. Chrome also has a version of JSONView, and for XML there’s XML Tree. Both browsers offer built-in consoles that provide network information like headers and cookies.
The venerable cURL is possibly the most flexible while at the same time being the least usable. As a command line tool some developers will balk at using it, but cURL’s simplicity and portability (*nix, PC, Mac) make it an appealing tool. cURL can make just about any request, assuming you can figure out how. These tutorials provide some easy-to-follow examples and the man page has all the gory details.
I’ll cover a few common usages here.
Note the use of quotes.
$ curl "http://api.example.com/resource/1?passkey=12345&apiversion=2"
Much more useful is making POST requests. The following submits data the same as if a web form were used (default Content-Type: application/x-www-form-urlencoded). Note -d "" is the data sent in the request body.
$ curl -d "key1=some value&key2=some other value" http://api.example.com/resource/1
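For reference, this is what a form-encoded body looks like once the values are properly escaped. The sketch below is a minimal illustration using only the Python standard library, with the field names taken from the cURL example above.

```python
from urllib.parse import urlencode, parse_qs

# The same two fields used in the cURL example.
fields = {"key1": "some value", "key2": "some other value"}

# urlencode() escapes each value; spaces become '+'.
body = urlencode(fields)
print(body)

# parse_qs() reverses the encoding; each key maps to a list of values.
decoded = parse_qs(body)
print(decoded)
```

This matters because -d sends the string exactly as given. If a value contains characters such as & or =, encode it first, or use cURL's --data-urlencode option instead.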
Many APIs expect data formatted in JSON or XML instead of encoded key=value pairs. This cURL command sends JSON in the body by using -H 'Content-Type: application/json' to set the appropriate HTTP header.
$ curl -H 'Content-Type: application/json' -d '{"key": "some value"}' http://api.example.com/resource/1
The previous example can get unwieldy quickly as the size of your request body grows. Instead of adding the data directly to the command line you can instruct cURL to upload a file as the body. This is not the same as a “file upload.” It just tells cURL to use the contents of a file as the request body.
$ curl -H 'Content-Type: application/json' -d @myfile.json http://api.example.com/resource/1
One major drawback of cURL is that the response is displayed unformatted. The next command line tool solves that problem.
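If you stay with cURL, one workaround is to reformat the response yourself. Here is a minimal Python sketch of that idea; the raw string stands in for a response body and is made up for illustration.

```python
import json

# Stand-in for an unformatted JSON response (hypothetical content).
raw = '{"id":1,"reviews":[{"rating":5,"title":"Great"}],"hasErrors":false}'

# Parse and re-serialize with indentation and sorted keys.
pretty = json.dumps(json.loads(raw), indent=2, sort_keys=True)
print(pretty)
```

The standard library exposes the same formatting as a command line filter, so the output of a cURL command can be piped through python -m json.tool.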
HTTPie is a Python-based command line tool similar to cURL in usage. According to the GitHub page, “Its goal is to make CLI interaction with web services as human-friendly as possible.” This is accomplished with “simple and natural syntax” and “colorized responses.” It runs on Linux, Mac OS X and Windows, and supports JSON, uploads and custom headers, among other things.
The documentation seems pretty thorough so I’ll just cover the same examples as with cURL above.
$ http "http://api.example.com/resource/1?passkey=12345&apiversion=2"
HTTPie assumes JSON as the default content type. Use --form to indicate Content-Type: application/x-www-form-urlencoded.
$ http --form POST api.example.com/resource/1 key1='some value' key2='some other value'
The = is for strings and := indicates raw JSON.
$ http POST api.example.com/resource/1 key='some value' parameter2:=2 parameter3:=false parameter4:='["http", "pies"]'
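To see what the two separators mean for the request body, here is a minimal Python sketch that mimics how such items could be turned into a JSON object. It is an illustration only, not HTTPie's own code, and it handles just the = and := forms; the function name is hypothetical.

```python
import json


def build_json_body(items):
    """Turn 'name=value' and 'name:=rawjson' items into a dict (illustrative only)."""
    body = {}
    for item in items:
        eq = item.index("=")  # the first '=' decides which separator was used
        if eq > 0 and item[eq - 1] == ":":
            # name:=value -> the value is parsed as raw JSON
            body[item[: eq - 1]] = json.loads(item[eq + 1:])
        else:
            # name=value -> the value is kept as a string
            body[item[:eq]] = item[eq + 1:]
    return body


items = [
    "key=some value",
    "parameter2:=2",
    "parameter3:=false",
    'parameter4:=["http", "pies"]',
]
body = build_json_body(items)
print(json.dumps(body))
```

With = alone, parameter2 would have been sent as the string "2" rather than the number 2, which is why the raw JSON form exists.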
HTTPie looks for a local file to include in the body after the < symbol.
$ http POST api.example.com/resource/1 < resource.json
My personal favorite is the Postman extension for Chrome. In my opinion it hits the sweet spot between functionality and usability by providing most of the HTTP functionality needed for testing APIs via an intuitive GUI. It also offers built-in support for several authentication protocols, including OAuth 1.0. There are a few things it can’t do because of restrictions imposed by Chrome, although there is a Python-based proxy to get around that if necessary.
The column on the left stores recent requests so you can redo them with ease. The results of any request will be displayed in the bottom half of the right column.
It’s possible to POST files, application/x-www-form-urlencoded data, and your own raw data.
Postman doesn’t support loading a BODY from a local file, but doing so isn’t necessary thanks to its easy to use interface.
Runscope is a little different from the others, but no less useful. It’s a web service instead of a tool and not open source, although they do offer a free option. It can be used much like the other tools to manually create and execute various HTTP requests, but that is not what makes it so useful.
Runscope acts as a proxy for API requests. Requests are made to Runscope, which passes them on to the API provider and then passes the responses back. In the process Runscope logs the requests and responses. At that point, to use their words, “you can view the request/response details, share requests with others, edit and retry requests from the web.”
Below is a quick example of what a Runscopeified request looks like. Read their official documentation to learn more.
before:
$ curl "http://api.example.com/resource/1?passkey=12345&apiversion=2"
after:
$ curl "http://api-example-com-bucket_key.runscope.net/resource/1?passkey=12345&apiversion=2"
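The rewrite in the example above can be sketched in a few lines of Python. This is a minimal illustration that covers only what the example shows (dots in the host become dashes, then the bucket key and runscope.net are appended); the function name is hypothetical, and hosts that contain dashes or ports are not handled, so check Runscope's documentation for the full rules.

```python
from urllib.parse import urlsplit, urlunsplit


def runscope_url(url, bucket_key):
    """Rewrite an API URL so the request passes through Runscope (sketch only)."""
    parts = urlsplit(url)
    # api.example.com -> api-example-com-<bucket_key>.runscope.net
    host = parts.netloc.replace(".", "-") + "-" + bucket_key + ".runscope.net"
    # Scheme, path, query string and fragment are left untouched.
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


original = "http://api.example.com/resource/1?passkey=12345&apiversion=2"
proxied = runscope_url(original, "bucket_key")
print(proxied)
```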
If you’re an API consumer you should use some or all of these tools. When I’m helping developers troubleshoot their Bazaarvoice API requests I use the browser when I can get away with it and switch to Postman when things start to get hairy. There are other tools, of course; I know because I omitted some of them. Feel free to mention your favorite in the comments.
(A version of this post was previously published at the author’s personal blog)
Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.
Nick Bailey is a software developer for DataStax, the company that develops commercially supported, enterprise-ready solutions based on the open source Apache Cassandra database. In his BV I/O talk he introduces Cassandra, discusses several useful approaches to data modeling and presents a couple of real-world use cases.
Some of y’all may have caught our previous blog post announcing the release of our Java JSON transformation library, Jolt.
Jolt is a powerful tool that can accomplish a variety of useful transformations on JSON data, and even chain multiple transformations together. Jolt has additional functionality that is useful for working with JSON including the ability to intelligently diff JSON documents and sort JSON documents. Users of Jolt can now transform, diff and sort JSON via the command line using the Jolt CLI. The CLI even allows you to string multiple commands together via standard in:
curl -s "http://some.host.com/stuff/data.json" | jolt sort | jolt diffy moreData.json
The Jolt CLI supports the following sub-commands:
The Jolt Diffy sub-command is an excellent way to compare JSON documents at the command line. It gives you a lot of friendly human-readable output, or you can have it run silently and examine the exit code to determine if any differences were found.
Have you ever tried using diff to detect differences in a JSON document? Due to the nature of JSON data the regular diff command can sometimes be inadequate. For example, consider the following two JSON documents:
diff1.json
{
  "someData": "dude",
  "moreData": "sweet"
}
diff2.json
{
  "moreData": "sweet",
  "someData": "dude"
}
Running the diff command from the command line returns the following:
user@computer:~/Projects/blog-post$ diff diff1.json diff2.json
2,3c2,3
<   "someData": "dude",
<   "moreData": "sweet"
---
>   "moreData": "sweet",
>   "someData": "dude"
This really isn’t helpful. Since the example data is in the form of a map, the two documents are essentially equal. However, because the entries are ordered differently, diff detects differences. Diffy ignores the ordering of map entries:
user@computer:~/Projects/blog-post$ jolt diffy diff1.json diff2.json
Diffy found no differences
Diffy does a recursive tree walk to find differences throughout the JSON document, so it can detect differences N levels deep.
diff3.json
{ "aMap": { "stuff": "yeah" }, "someData": "whatever", "matchingData": "cool" }
diff4.json
{ "aMap": { "differentStuff": "woah" }, "differentData": "bleargh", "matchingData": "cool" }
user@computer:~/Projects/blog-post$ jolt diffy diff3.json diff4.json
Differences found. Input #1 contained this:
{
  "someData" : "whatever",
  "aMap" : {
    "stuff" : "yeah"
  }
}
Input #2 contained this:
{
  "differentData" : "bleargh",
  "aMap" : {
    "differentStuff" : "woah"
  }
}
Diffy does flag differences in array ordering. Consider the following two JSON documents:
array1.json
{ "arrayData": [ "one", "two", "three", "four" ] }
array2.json
{ "arrayData": [ "one", "three", "two", "four" ] }
Diffy detects the differences in the array:
user@computer:~/Projects/blog-post$ jolt diffy array1.json array2.json
Differences found. Input #1 contained this:
{
  "arrayData" : [ null, "two", "three", null ]
}
Input #2 contained this:
{
  "arrayData" : [ null, "three", "two", null ]
}
If for some reason you are crazy (like some of us at Bazaarvoice) and you want to ignore array order, you can use the -a flag for those occasions.
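The behavior described in this section (map order ignored, differences found at any depth, array order significant) can be approximated with a short recursive function. The following Python sketch is an illustration of the idea only, not Jolt's implementation; the SAME sentinel and the function name are made up for this example, and the -a option is not modeled.

```python
SAME = object()  # sentinel meaning "no difference at this position"


def diff(a, b):
    """Return (left, right) holding only the differing parts, or (SAME, SAME)."""
    if isinstance(a, dict) and isinstance(b, dict):
        left, right = {}, {}
        for key in sorted(a.keys() | b.keys()):  # key order never matters
            if key not in b:
                left[key] = a[key]
            elif key not in a:
                right[key] = b[key]
            else:
                sub_left, sub_right = diff(a[key], b[key])  # recurse into values
                if sub_left is not SAME:
                    left[key], right[key] = sub_left, sub_right
        return (left, right) if left or right else (SAME, SAME)
    if isinstance(a, list) and isinstance(b, list) and len(a) == len(b):
        pairs = [diff(x, y) for x, y in zip(a, b)]  # position by position
        if all(l is SAME for l, _ in pairs):
            return SAME, SAME
        # Matching positions are shown as None (null), as in the output above.
        return ([None if l is SAME else l for l, _ in pairs],
                [None if l is SAME else r for l, r in pairs])
    # Scalars, mismatched types, and arrays of different lengths.
    return (SAME, SAME) if type(a) is type(b) and a == b else (a, b)


diff1 = {"someData": "dude", "moreData": "sweet"}
diff2 = {"moreData": "sweet", "someData": "dude"}
diff3 = {"aMap": {"stuff": "yeah"}, "someData": "whatever", "matchingData": "cool"}
diff4 = {"aMap": {"differentStuff": "woah"}, "differentData": "bleargh", "matchingData": "cool"}
array1 = {"arrayData": ["one", "two", "three", "four"]}
array2 = {"arrayData": ["one", "three", "two", "four"]}

print(diff(diff1, diff2) == (SAME, SAME))
print(diff(diff3, diff4))
print(diff(array1, array2))
```

Run against the sample documents from this section, the sketch reports no differences for the reordered map and isolates the same differing fragments that Diffy printed for the other two pairs.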
The Jolt Transform sub-command allows you to perform transforms on JSON documents provided via standard in or file. Transform also takes a spec, which is a JSON document that contains one or more Jolt specs to indicate what transformations should be done on the document. Transform has the option to produce the results with or without pretty print formatting.
You can read more about Jolt transforms here.
The Jolt Sort sub-command will sort JSON input. The sort order is standard alphabetical ascending, with a special case for “~” prefixed keys to be bumped to the top. This can be useful for debugging when you need to manually inspect the contents of a JSON document. Sort has the option to produce the results with or without pretty print formatting.
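The ordering rule can be sketched in Python as follows. This is a minimal illustration rather than Jolt's code: the function name is hypothetical, and leaving array order untouched while sorting the maps inside arrays is an assumption made for this example.

```python
import json


def sort_json(value):
    """Recursively sort map keys, placing '~'-prefixed keys first (sketch only)."""
    if isinstance(value, dict):
        # False sorts before True, so '~' keys come first, then alphabetical order.
        ordered = sorted(value, key=lambda k: (not k.startswith("~"), k))
        return {k: sort_json(value[k]) for k in ordered}
    if isinstance(value, list):
        # Assumption: array order is kept, but maps inside arrays are sorted.
        return [sort_json(item) for item in value]
    return value


doc = {"zebra": 1, "~meta": {"b": 2, "a": 1}, "apple": [{"y": 1, "x": 2}]}
result = sort_json(doc)
print(json.dumps(result, indent=2))
```

The special case is needed because "~" sorts after every letter in plain ASCII order, so without it those keys would land at the bottom instead of the top.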
That does it for today. Hopefully you have an idea of what the Jolt CLI does in broad strokes. If you’re curious about Jolt, you can read much more about it here.
Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.
In this presentation Peter Wang, co-founder and president of Continuum Analytics, discusses data analysis, the challenges presented by big data, and opportunities technology provides to overcome those challenges. He also discusses the importance of performance and visualization as well as advances the concept of “engineering on principle” which he demonstrates by discussing the design of the A-10 Thunderbolt and SAGE computerized command and control center for United States air defense. Peter ends his talk by discussing the Python programming language and its suitability for data analysis tasks. The full talk is below.
Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.
The following video is of Dr. Jason Baldridge, currently an associate professor in the Linguistics Dept. at the University of Texas and co-founder of People Pattern. Dr. Baldridge presented on the subject of text analysis. During his hour-long talk he identified the desirable traits of a good text analysis function and focused on the problems of performing text categorization tasks given different amounts of labeled data. Big thanks to Dr. Baldridge for his informative presentation. The full talk is below: