Category Archives: Uncategorized

Upgrading Dropwizard 0.6 to 0.7

At Bazaarvoice we use Dropwizard for a lot of our java based SOA services. Recently I upgraded our Dropwizard dependency from 0.6 to the newer 0.7 version on a few different services. Based on this experience I have some observations that might help any other developers attempting to do the same thing.

Package Name Change
The first change to look at is the new package naming. The new io.dropwizard package replaces com.yammer.dropwizard. If you are using codahale’s metrics library as well, you’ll need to change com.yammer.metrics to com.codahale.metrics. I found that this was a good place to start the migration: if you remove the old dependencies from your pom.xml you can start to track down all the places in your code that will need attention (if you’re using a sufficiently nosy IDE).

- com.yammer.dropwizard -> io.dropwizard
- com.yammer.dropwizard.config -> io.dropwizard.setup
- com.yammer.metrics -> com.codahale.metrics

Class Name Change
aka: where did my Services go?

Something you may notice quickly is that the Service interface is gone, it has been moved to a new name: Application.

- Service -> Application

Configuration Changes
The Configuration object hierarchy and yaml organization has also changed. The http section in yaml has moved to server with significant working differences.

Here’s an old http configuration:

http:
  port: 8080
  adminPort: 8081
  connectorType: NONBLOCKING
  requestLog:
    console:
      enabled: true
    file:
      enabled: true
      archive: false
      currentLogFilename: target/request.log

and here is a new server configuration:

server:
  applicationConnectors:
    - type: http
      port: 8080
  adminConnectors:
    - type: http
      port: 8081
  requestLog:
    appenders:
      - type: console
      - type: file
        currentLogFilename: target/request.log
        archive: true

There are at least two major things to notice here:

  1. You can create multiple connectors for either the admin or application context. You can now serve several different protocols on different ports.
  2. Logging is now appender based, and you can configure a list of appenders for the request log.

Speaking of appender-based logging, the logging configuration has changed as well.

Here is an old logging configuration:

logging:
  console:
    enabled: true
  file:
    enabled: true
    archive: false
    currentLogFilename: target/diagnostic.log
  level: INFO
  loggers:
    "org.apache.zookeeper": WARN
    "com.sun.jersey.spi.container.servlet.WebComponent": ERROR

and here is a new one:

logging:
  level: INFO
  loggers:
    "org.apache.zookeeper": WARN
    "com.sun.jersey.spi.container.servlet.WebComponent": ERROR
  appenders:
    - type: console
    - type: file
      archive: false
      currentLogFilename: target/diagnostic.log

Now that you can configure a list of logback appenders, you can write your own or get one from a library. Previously this kind of logging configuration was not possible without significant hacking.

Environment Changes
The whole environment API has been re-designed for more logical access to different components. Rather than just making calls to methods on the environment object, there are now six component specific environment objects to access.

JerseyEnvironment jersey = environment.jersey();
ServletEnvironment servlets = environment.servlets();
AdminEnvironment admin = environment.admin();
LifecycleEnvironment lifecycle = environment.lifecycle();
MetricRegistry metrics = environment.metrics();
HealthCheckRegistry healthCheckRegistry = environment.healthChecks();

AdminEnvironment extends ServletEnvironment since it’s just the admin servlet context.

By treating the environment as a collection of libraries rather than a Dropwizard monolith, fine-grained control over several configurations is now possible and the underlying components are easier to interact with.

Here is a short rundown of the changes:

Lifecycle Environment
Several common methods were moved to the lifecycle environment, and the build pattern for Executor services has changed.

0.6:

     environment.manage(uselessManaged);
     environment.addServerLifecycleListener(uselessListener);
     ExecutorService service = environment.managedExecutorService("worker-%", minPoolSize, maxPoolSize, keepAliveTime, duration);
     ExecutorServiceManager esm = new ExecutorServiceManager(service, shutdownPeriod, unit, poolname);
     ScheduledExecutorService scheduledService = environment.managedScheduledExecutorService("scheduled-worker-%", corePoolSize);

0.7:

     environment.lifecycle().manage(uselessManaged);
     environment.lifecycle().addServerLifecycleListener(uselessListener);
     ExecutorService service = environment.lifecycle().executorService("worker-%")
             .minThreads(minPoolSize)
             .maxThreads(maxPoolSize)
             .keepAliveTime(Duration.minutes(keepAliveTime))
             .build();
     ExecutorServiceManager esm = new ExecutorServiceManager(service, Duration.seconds(shutdownPeriod), poolname);
     ScheduledExecutorService scheduledExecutorService = environment.lifecycle().scheduledExecutorService("scheduled-worker-%")
             .threads(corePoolSize)
             .build();

Other Miscellaneous Environment Changes
Here are a few more common environment configuration methods that have changed:

0.6

environment.addResource(Dropwizard6Resource.class);

environment.addHealthCheck(new DeadlockHealthCheck());

environment.addFilter(new LoggerContextFilter(), "/loggedpath");

environment.addServlet(PingServlet.class, "/ping");

0.7

environment.jersey().register(Dropwizard7Resource.class);

environment.healthChecks().register("deadlock-healthcheck", new ThreadDeadlockHealthCheck());

environment.servlets().addFilter("loggedContextFilter", new LoggerContextFilter()).addMappingForUrlPatterns(EnumSet.allOf(DispatcherType.class), true, "/loggedpath");

environment.servlets().addServlet("ping", PingServlet.class).addMapping("/ping");

Object Mapper Access

It can be useful to access the objectMapper for configuration and testing purposes.

0.6

ObjectMapper objectMapper = bootstrap.getObjectMapperFactory().build();

0.7

ObjectMapper objectMapper = bootstrap.getObjectMapper();

HttpConfiguration
This has changed a lot, it is much more configurable and not quite as simple as before.
0.6

HttpConfiguration httpConfiguration = configuration.getHttpConfiguration();
int applicationPort = httpConfiguration.getPort();

0.7

HttpConnectorFactory httpConnectorFactory = (HttpConnectorFactory) ((DefaultServerFactory) configuration.getServerFactory()).getApplicationConnectors().get(0);
int applicationPort = httpConnectorFactory.getPort();

Test Changes
The functionality provided by extending ResourceTest has been moved to ResourceTestRule.
0.6

import com.yammer.dropwizard.testing.ResourceTest;

public class Dropwizard6ServiceResourceTest extends ResourceTest {
  @Override
  protected void setUpResources() throws Exception {
    addResource(Dropwizard6Resource.class);
    addFeature("booleanFeature", false);
    addProperty("integerProperty", new Integer(1));
    addProvider(HelpfulServiceProvider.class);
  }
}

0.7

import io.dropwizard.testing.junit.ResourceTestRule;
import org.junit.Rule;

public class Dropwizard7ServiceResourceTest {

  @Rule
  ResourceTestRule resources = setUpResources();

  protected ResourceTestRule setUpResources() {
    return ResourceTestRule.builder()
      .addResource(Dropwizard6Resource.class)
      .addFeature("booleanFeature", false)
      .addProperty("integerProperty", new Integer(1))
      .addProvider(HelpfulServiceProvider.class)
      .build();
  }
}

Dependency Changes

Dropwizard 0.7 has new dependencies that might affect your project. I’ll go over some of the big ones that I ran into during my migrations.

Guava
Guava 18.0 has a few API changes:

  • Closeables.closeQuietly only works on objects implementing InputStream instead of anything implementing Closeable.
  • All the methods on HashCodes have been migrated to HashCode.

Metrics
Metric 3.0.2 is a pretty big revision to the old version, there is no longer a static Metrics object available as the default registry. Now MetricRegistries are instantiated objects that need to be managed by your application. Dropwizard 0.7 handles this by giving you a place to put the default registry for your application: bootstrap.getMetricRegistry().

Compatible library version changes
These libraries changed versions but required no other code changes. Some of them are changed to match Dropwizard dependencies, but are not directly used in Dropwizard.

Jackson
2.3.3

Jersey
1.18.1

Coursera Metrics-Datadog
1.0.2

Jetty
9.0.7.v20131107

Apache Curator
2.4.2

Amazon AWS SDK
1.9.21

Future Concerns
Dropwizard 0.8
The newest version of Dropwizard is now 0.8, once it is proven stable we’ll start migrating. Hopefully I’ll find time to write another post when that happens.

Thank You For Reading
I hope this article helps.

Open sourcing cloudformation-ruby-dsl

Cloudformation is a powerful tool for building large, coordinated clusters of AWS resources. It has a sophisticated API, capable of supporting many different enterprise use-cases and scaling to thousands of stacks and resources. However, there is a downside: the JSON interface for specifying a stack can be cumbersome to manipulate, especially as your organization grows and code reuse becomes more necessary.

To address this and other concerns, Bazaarvoice engineers have built cloudformation-ruby-dsl, which turns your static Cloudformation JSON into dynamic, refactorable Ruby code.

The DSL closely mimics the structure of the underlying API, but with enough syntactic sugar to make building Cloudformation stacks less painful.

We use cloudformation-ruby-dsl in many projects across Bazaarvoice. Now that it’s proven its value, and gained some degree of maturity, we are releasing it to the larger world as open source, under the Apache 2.0 license. It is still an earlier stage project, and may undergo some further refactoring prior to it’s v1.0 release, but we don’t anticipate major API changes. Please download it, try it out, and let us know what you think (in comments below, or as issues or pull request on Github).

A big thanks to Shawn Smith, Dave Barcelo, Morgan Fletcher, Csongor Gyuricza, Igor Polishchuk, Nathaniel Eliot, Jona Fenocchi, and Tony Cui, for all their contributions to the code base.

Output from bv.io

Looks like everyone had a blast at bv.io this year! Thank yous go out to the conference speakers and hackathon participants for making this year outstanding. Here are some tweets and images from the conference:

Continue reading

HTTP/RESTful API troubleshooting tools

As a developer I’ve used a variety of APIs and as a Developer Advocate at Bazaarvoice I help developers use our APIs. As a result I am keenly aware of the importance of good tools and of using the right tool for the right job. The right tool can save you time and frustration. With the recent release of the Converstations API Inspector, an inhouse web app built to help developers use our Conversations API, it seemed like the perfect time to survey tools that make using APIs easier.

The tools

This post is a survey covering several tools for interacting with HTTP based APIs. In it I introduce the tools and briefly explain how to use them. Each one has its advantages and all do some combination of the following:

  • Construct and execute HTTP requests
  • Make requests other than GET, like POST, PUT, and DELETE
  • Define HTTP headers, cookies and body data in the request
  • See the response, possibly formatted for easier reading

Firefox and Chrome

Yes a web browser can be a tool for experimenting with APIs, so long as the API request only requires basic GET operations with query string parameters. At our developer portal we embed sample URLs in our documentation were possible to make seeing examples super easy for developers.

Basic GET

http://api.example.com/resource/1?passkey=12345&apiversion=2

Some browsers don’t necessarily present the response in a format easily readable by humans. Firefox users already get nicely formatted XML. To see similarly formatted JSON there is an extension called JSONView. To see the response headers LiveHTTP Headers will do the trick. Chrome also has a version of JSONview and for XML there’s XML Tree. They both offer built in consoles that provide network information like headers and cookies.

CURL

The venerable cURL is possibly the most flexable while at the same time being the least usable. As a command line tool some developers will balk at using it, but cURL’s simplicity and portability (nix, pc, mac) make it an appealing tool. cURL can make just about any request, assuming you can figure out how. These tutorials provide some easy to follow examples and the man page has all the gory details.

I’ll cover a few common usages here.

Basic GET

Note the use of quotes.

$ curl "http://api.example.com/resource/1?passkey=12345&apiversion=2"

Basic POST

Much more useful is making POST requests. The following submits data the same as if a web form were used (default Content-Type: application/x-www-form-urlencoded). Note -d "" is the data sent in the request body.

$ curl -d "key1=some value&key2=some other value" http://api.example.com/resource/1

POST with JSON body

Many APIs expect data formatted in JSON or XML instead of encoded key=value pairs. This cURL command sends JSON in the body by using -H 'Content-Type: application/json' to set the appropriate HTTP header.

$ curl -H 'Content-Type: application/json' -d '{"key": "some value"}' http://api.example.com/resource/1

POST with a file as the body

The previous example can get unwieldy quickly as the size of your request body grows. Instead of adding the data directly to the command line you can instruct cURL to upload a file as the body. This is not the same as a “file upload.” It just tells cURL to use the contents of a file as the request body.

$ curl -H 'Content-Type: application/json' -d @myfile.json http://api.example.com/resource/1

One major drawback of cURL is that the response is displayed unformatted. The next command line tool solves that problem.

HTTPie

HTTPie is a python based command line tool similar to cURL in usage. According to the Github page “Its goal is to make CLI interaction with web services as human-friendly as possible.” This is accomplished with “simple and natural syntax” and “colorized responses.” It supports Linux, Mac OS X and Windows, JSON, uploads and custom headers among other things.

The documentation seems pretty thorough so I’ll just cover the same examples as with cURL above.

Basic GET

$ http "http://api.example.com/resource/1?passkey=12345&apiversion=2"

Basic POST

HTTPie assumes JSON as the default content type. Use --form to indicate Content-Type: application/x-www-form-urlencoded

$ http --form POST api.example.org/resource/1 key1='some value' key2='some other value'

POST with JSON body

The = is for strings and := indicates raw JSON.

$ http POST api.example.com/resource/1 key='some value' parameter2:=2 parameter3:=false parameter4:='["http", "pies"]'

POST with a file as the body

HTTPie looks for a local file to include in the body after the < symbol.

$ http POST api.example.com/resource/1 < resource.json

PostMan Chrome extension

My personal favorite is the PostMan extension for Chrome. In my opinion it hits the sweet spot between functionality and usability by providing most of the HTTP functionality needed for testing APIs via an intuitive GUI. It also offers built in support for several authentication protocols including Oath 1.0. There a few things it can’t do because of restrictions imposed by Chrome, although there is a python based proxy to get around that if necessary.

Basic GET

The column on the left stores recent requests so you can redo them with ease. The results of any request will be displayed in the bottom half of the right column.

postman_get

Basic POST

It’s possible to POST files, application/x-www-form-urlencoded, and your own raw data

postman_post

POST with JSON body

Postman doesn’t support loading a BODY from a local file, but doing so isn’t necessary thanks to its easy to use interface.

postman_post_json

RunScope.com

Runscope is a little different than the others, but no less useful. It’s a webservice instead of a tool and not open source, although they do offer a free option. It can be used much like the other tools to manually create and execute various HTTP requests, but that is not what makes it so useful.

Runscope acts a proxy for API requests. Requests are made to Runscope, which passes them on to the API provider and then passes the responses back. In the process Runscope logs the requests and responses. At that point, to use their words, “you can view the request/response details, share requests with others, edit and retry requests from the web.”

Below is a quick example of what a Runscopeified request looks like. Read their official documentation to learn more.

before: $ curl "http://api.example.com/resource/1?passkey=12345&apiversion=2"
after: $ curl "http://api-example-com-bucket_key.runscope.net/resource/1?passkey=12345&apiversion=2"

Conclusion

If you’re an API consumer you should use some or all of these tools. When I’m helping developers troubleshoot their Bazaarvoice API requests I use the browser when I can get away with it and switch to PostMan when things start to get hairy. There are other tools, I know because I omitted some of them. Feel free to mention your favorite in the comments.

(A version of this post was previously published at the author’s personal blog)

BV I/O: Nick Bailey – Cassandra

Every year Bazaarvoice holds an internal technical conference for our engineers. Each conference has a theme and as a part of these conferences we invite noted experts in fields related to the theme to give presentations. The latest conference was themed “unlocking the power of our data.” You can read more about it here.

Nick Bailey is a software developer for datastax, the company that develops commercially supported, enterprise-ready solutions based on the open source Apache Cassandra database. In his BV I/O talk he introduces Cassandra, discusses several useful approaches to data modeling and presents a couple real world use-cases.

Bazaarvoice SDK for Windows Phone 8 has been Open Sourced!

The Bazaarvoice Mobile Team is happy to announce our newest mobile SDK for Windows Phone 8. It is a .NET SDK that supports Windows Phone 8 as well as Windows 8 Store apps. This will add to our list of current open-source mobile SDKs for iOS, Android and Appcelerator Titanium.

The SDK will allow developers to more quickly build applications for Windows Phone 8 and Windows 8 Store that use the Bazaarvoice API. While the code is the same for both platforms, they each need their own compiled DLL to be used in their perspective Visual Basic projects. As Windows Phone 8 and Windows 8 Store apps gain more traction in the marketplace, we hope the Bazaarvoice SDK will prove a valuable resource to developers using our API.

Learn more at our Windows Phone 8 SDK home page, check out our repo in GitHub, and contribute!

The SDK was developed by two summer interns: Devin Carr from Texas A&M and Ralph Pina from UT-Austin. Go horns!

In the next few days we’ll publish a more in-depth post about how the Windows Phone 8 SDK was built, some of design decisions and challenges.

Interns and graduates – Keys to job search success

Bazaarvoice R&D had a great year of intensive university recruiting with 12 interns joining our teams last summer and working side-by-side with the developers on our products. We have further expanded the program this year to accommodate two co-op positions for students from the University of Waterloo. The influx of fresh ideas and additional energy from these students has been great for the whole organization!

For many students, looking for an internship or graduate employment may be their first time experiencing the interview process and creating resumes, and I’d like to offer some advice for those of you in this position. These guidelines are intended to help you think about how to present your capabilities in the best possible light, and in my experience, apply to tech interviews at most companies.

What we’re looking for

For new graduate and internship positions, it often surprises students that tech companies are, in general, less focused on them knowing specific technologies or languages. They are more focused on determining whether you have:

  • solid CS fundamentals (data structures, algorithms, etc.)
  • passion for problem-solving and software development
  • an ability to learn quickly

It is generally expected that you have confidence in at least one programming language, with a solid grip on its syntax and common usage. It is also helpful for you to demonstrate an understanding of object-oriented concepts and design but, again, this can be independent of any specific language.

Resumes

Your resume is a critical chance to sell the reader on your abilities. While it can be uncomfortable to ‘toot your own horn’ it is important that you use your resume to try to differentiate yourself from the sea of other candidates. A dry list of courses or projects is unlikely to do this, so it really is worth investing a lot of thought in expressing on the resume what was particularly interesting, important, or impressive about what you did.

  • Definitely include details of any side projects that you’ve worked on, and don’t be afraid to demo them if you get the chance (mobile apps, websites, etc.). Some students are embarrassed because they are just hobby-projects and not commercial-grade applications – this doesn’t matter!
  • Include details of anything that you are proud of, or that you did differently or better than others.
  • If you mention group/team projects be sure to make it clear what YOU did, rather than just talking about what the team did. Which bits were you responsible for?
  • Don’t emphasize anything on your resume that you are not prepared for a detailed technical discussion on. For example, if you make a point of calling out a multi-threaded, C-programming project, you should be confident talking about threading, and being able to do impromptu coding on a whiteboard using threads. We’re not expecting perfection, but are looking for a solid grasp on fundamentals and syntax.
  • Leave out cryptic course numbers – it’s unlikely that the person reading your resume knows what ‘CS252’ means, but they will understand ‘Data Structures’.
  • Make sure you have good contact info – we do occasionally see resumes with no contact info, or where the contact info had typos.

Interview technique

While an interview can be nerve-wracking, interviewers love to see people do well and are there to be supportive.

  • Coding on a whiteboard is difficult (but the interviewer knows that) – a large chunk of most technical interviews is problem-solving or coding on a whiteboard. Interviewers are very understanding that coding on a whiteboard is not easy, so don’t worry about building neat code from the outset.
  • Don’t rush to put code on the board – think about the problem, ask clarifying questions, and maybe jot a few examples down to help you get oriented.
  • Talk through what you are thinking – a large part of a technical interview is understanding how the person is thinking, even if you’re running through approaches to eliminate. Getting to an answer is only part of what the interviewer is looking for, and they want to see what your thought process is.
  • Ask for help (but not too quickly!) – it’s OK that you don’t know everything, and sometimes get stuck. If you get stuck, explain what you are stuck on and the interviewer will be prepared to guide you.
  • Use what you are familiar with – you will likely be asked to code in the language you are most comfortable with. Do it! Some students think the interviewer is ‘expecting’ them to use a certain language because it’s one we used at the hiring company, but that’s not the case.
  • Perfection is not required – while no interviewer is ever going to complain if you do everything perfectly, forgetting a little piece of syntax, or a particular library function name is not fatal. It’s more important that you write code that is easy to follow and logically reasoned through. Also remember it’s OK to ask for help if you are truly stuck. At the same time, if your syntax is way off, and you’re asking for help on every line of code then you’re probably not demonstrating the level of mastery that is expected.
  • Consider bringing a laptop if you have code to show – while the interviewer may choose to focus on whiteboard problem-solving, it is a nice option to be able to offer showing them code you’ve written. Be sure to bring your best examples; ones that show off your strengths or originality. Make sure it is code you know inside-out as you will likely be questioned in detail about why you did things a certain way.
  • Come prepared with questions for the interviewer – the interview is an opportunity for you to get to know the company in more detail, and see if it’s somewhere you’d like to work. Think about the things that are important to you, and that you’d use to decide between different employment/internship offers.

Over my career I’ve found that these rules of thumb apply well in all technical interview/application processes, and hopefully they are useful guidance for students out there. Any other advice from readers?

5 Ways to Improve Your Mobile Submission Form

Here at Bazaarvoice, we’re constantly focused on improving the user experience for our products. From the initial email invitation, to the submission form, to the way in which reviews are presented, we want to make sure that our interfaces are as flexible and intuitive as possible.

Part of my job on the mobile team at Bazaarvoice is to make sure that our products reflect best practices when displayed on mobile devices. In reality, that means running hands-on user tests, A/B testing different designs, and gathering detailed information about the way in which users interact with our products.

Recently, we ran a test with Buckle, one of our partner clients, to experiment with various mobile-friendly submission forms. What follows are some of the takeaways from those experiments.

1. Handle Landscape Mode Gracefully

It is important that users are able to navigate forms easily while in landscape mode. It becomes particularly important to support landscape for form fields that solicit text input. We found that mobile users will, on average, input about 20% fewer words in their reviews than desktop users, so the last thing we want to do is to make it even more difficult to enter text. Many users prefer to type in landscape mode as it provides for a larger keyboard.

2. Make Interactions Easy

Generally, a desktop user with a mouse can interact much more precisely than a mobile user with a clumsy finger. Therefore, it is important to make sure that elements are large enough to be either clicked or tapped. Apple recommends that tappable elements be at least 44×44 pixels. In our experimental form, we intentionally oversized our radio buttons, selection drop-downs and sliders to make the form easier to interact with and to prevent form errors.

Additionally, mobile devices provide a number of different keyboard layouts for different types of inputs. For instance, an input type of “email” might surface the @ symbol to make it more readily accessible. In order to take advantage of the various keyboard layouts, be sure to properly specify the input type on your form elements.

3. Snappy Over Flashy

The first version of our experimental form involved a heavy amount of JavaScript to do things like alpha animations and transforms. While our animations generally ran smoothly on a desktop, they became sluggish on the iPhone and lower end Android devices.

When designing for mobile, be sure to prioritize function over flashiness. Slick animations can greatly improve the usability and “wow” factor of a site, but they should be used sparingly. If necessary, use hardware-accelerated transforms to minimize sluggishness.

4. Choose The Most Efficient Form Path

Overall, our goal is to allow the user to complete our form in the quickest, simplest manner possible. In our testing, we found that a surprising number of users preferred to navigate and exit form elements via the “done” button rather than using the next/previous buttons. This has several interesting consequences.

First, short forms are better than tall forms. While some users “tab” through fields, most users scroll. By minimizing the vertical spacing between elements, users do not need to scroll as far to get to the next field.

Second, for most users, the interaction with a select element will involve 3 clicks: open, select, and done. Therefore, if a user is selecting between just a few options, it is better to use oversized radio buttons than select elements.

5. Provide Instant Feedback

If a user submits an invalid form value such as a malformed email address, provide a clear error message that instructs the user how to fix the error. If possible, provide an error near the offending field. Additionally, once the form field becomes valid, notify the user immediately rather than requiring the user to submit the form again.

For our experimental form, we used the JQuery validation library, which makes basic form validation dead simple. Since it is all client side, it makes validation snappy as well.

Our tests are ongoing, so be on the lookout for more updates soon. Until then, hopefully these insights will be valuable to others as the Internet becomes more mobile-friendly.

SELECT developers FROM Bazaarvoice UNION SELECT flags FROM Stripe_CTF;

Stripe (https://stripe.com/) held their second capture the flag event, this time the CTF was dedicated to web-based vulnerabilities and exploits. As a new Security Engineer here at BV the timing of this was perfect. It allowed me to use it as a vehicle for awareness and to ramp up curiosity, interest and even excitement for web application security (webappsec).

Let’s start with some definitions to get us all on the same page.

Capture the flag is a traditional outdoor game where two teams each have a flag and the objective is to capture the other team’s flag, located at the team’s base. In computer security the flag is a piece of data stored somewhere on a vulnerable computer and the goal is to use tools and know how to “hack” the machine to find the flag.

Webappsec is the acronym for Web Application Security a branch of Information Security that deals specifically with security of websites and web applications. In my opinion there is more to it than just that. As we see from the OWASP Top 10 (https://www.owasp.org/index.php/Category:OWASP_Top_Ten_Project) and the emerging DevOps movement, web developers are doing more than just coding. They are now configuring full stacks to support their applications. So, webappsec needs to encompass more than just the application. It has to deal with developer and management education, configuration management, networking security, OS hardening, just to name a few areas, and of course not to forget the code.

Now that we know what CTF and webappsec are lets move on to the details of the event.

The CTF started on Wednesday, August 22nd, 2PM CDT today at 2PM CDT and ran for a week until Wednesday, August 29th, 2012 2PM CDT. One week == Seven Days == 168 hours and, for me at least, this was not enough. This capture the flag included 9 levels (0-8) to conquer before arriving at the actual flag. Each level was progressively more difficult and when completed provided you with the password for the next level.

The CTF was very challenging, yet no security specific training/knowledge was needed in order to participate. In my opinion the minimum requirements to participate and find it fun and enjoyable was some knowledge of web application programming and a willingness to learn and research possible vulnerabilities and potential exploits.

In total about 15 or so Bazaarvoice cohorts tried their hand at the CTF. Four actually captured the flag! When did they have the time? Well, we used our 20% time while at work and personal time when away from work. Late nights, lunch hours, weekend, etc.…believe me once you get started you are hooked. You end up thinking about it all day. You think about the level you are on, how to solve it and whether the previous levels offer up any hints or possible staging areas for potential exploits. If you are anything like me, first a developer at heart then a security professional, this type of activity is a great way to test out your chops and have some fun while learning new things and potentially bringing some of the lessons back to your teams. This is the perfect avenue for awareness – code wins after all!

Here are quotes from some of the BV participants:

I got to 8 last night. I don’t think I’ll get it in the next 4 hours, but it was a fun challenge. – Cory

Made it to level 8 last night, but not until after midnight. Don’t think I’ll be finishing, but it was fun all the same. Level 8 is really interesting, and I look forward to hearing how folks finished it. – RC

I’ve been having a lot of fun with it. – Jeremy

Overall everyone that tried it had a good time with it no matter how far they actually got.

All in all a total of four BVers completed the CTF and captured the final flag!

 

Congratulations to the following BVers for successfully capturing the flag:

 

 

For their last CTF Stripe made available the solutions and a working VM with each level. I hope they do the same for this one; this will give everyone the opportunity to learn and see what vulnerabilities were found and how they were exploited in order to complete each level and eventually capture the flag.

For now I leave you with a quote which I think embodies web application security, its education and use in the development community:

Secure programming is a mind-set. It may start with a week or two of training, but it will require constant reinforcement. – Karl Keller, President of IS Power

As with many things here at Bazaarvoice, the education and growth of our developers and their skills often take on a fun and enjoyable approach.

MongoDB Arrays and Atomicity

Over time, even technologies that are tried and true begin to show their age. This is especially true for data stores as the shear amount of data explodes and site traffic increases. Because of this, we are continually working with new technologies to determine whether they have a place in our primary stack.

To that end, we began using MongoDB for one of our new internal systems several months ago. Its quick setup and ease of use make it a great data store for starting a new project that’s constantly in flux. Its robustness and scalability make it a worthy contender for our primary data store, if it should prove capable of handling our use cases.

Being traditionally a MySQL shop, our team is used to working with databases in a certain way. When we write code that interacts with the DB, we use transactions and locks rather liberally. Therefore using MongoDB is a rather significant paradigm shift for our developers.

Though MongoDB doesn’t have true ACID compliance, it does support atomic updates to an individual object. And because a “single” object in Mongo can actually be quite complex, this is sufficient for many operations. For example, for the following complex object, setting values, incrementing numbers, adding and removing items from an array, can all be done at the same time atomically:

db.employees.update(
  {_id: ObjectId("4e5e4c0e945a66e30892e4e3")},
  {$set: {firstName: "Bobby", lastName: "Schmidt"}}
);

Even setting values of child objects can be done atomically:

db.employees.update(
  {_id: ObjectId("4e5e4c0e945a66e30892e4e3")},
  {$set: {"position.title": "Software Developer", "position.id": 1015}}
)

In fact, individual items in an array can be changed:

db.employees.update(
  {_id: ObjectId("4e5e4c0e945a66e30892e4e3")},
  {$set: {"employmentHistory.0.company": "Google, Inc."}}
);

However, one thing you may notice is that the code above to modify a specific item in the array requires us to know the specific index of the item in the array we wish to modify. That seems simple, because we can pull down the object, find the index of the item, and then change it. The difficulty with this is that other operations on an array do not necessarily guarantee that the order of the elements will be the same. For example, if an item in an array is removed and then added later, it will always be appended to the array. The only way to keep an array in a particular order is to $set it (replacing all the objects in it), which for a large array may not have the best performance.

This problem is best demonstrated with the following race conditions:

One solution to this problem is to $pull a particular item from the array, change it, and then $push it back onto the array. In order to avoid the race condition above, we decided that this was a safer solution. After all, these two operations should be doable atomically.

db.employees.update(
  {_id: ObjectId("4e5e4c0e945a66e30892e4e3")},
  {$pull: {badges: {type: "TOP_EMPLOYEE"}}, $push: {badges: {type: "TOP_EMPLOYEE", date: null}}}
);

So, what’s the problem? The issue is that MongoDB doesn’t allow multiple operations on the same property in the same update call. This means that the two operations must happen in two individually atomic operations. Therefore it’s possible for the following to occur:

That is bad enough, but it can be dealt with by updating an object only if it is in the correct state. Unfortunately, the following can happen as well:

That looks very odd to the reader, because at one moment it would see the item, then the next read would lose it, and then it would come back. To an end user that’s bad enough, but to another system that is taking actions based on the data, the results can be inconsistent at best, and deleterious at worst.

What we really want is the ability to modify an item (or items) in an array by query, effectively $set with $pull semantics:

db.employees.update(
  {_id: ObjectId("4e5e4c0e945a66e30892e4e3")},
  {$update: {badges: {$query: {type: "TOP_EMPLOYEE"}, $set: {date: null}}}}
);

Since that’s not yet supported by MongoDB, we decided to use a map. This makes a lot of sense for our use case since the structure happened to be a map in Java. Because MongoDB uses JSON-like structures, the map is an object just like any other, and the keys are simply the property names of the object.

This means that individual elements can be accessed and updated from the map:

db.employees.update(
  {_id: ObjectId("4e5e4c0e945a66e30892e4e3")},
  {$set: {"badges.TOP_EMPLOYEE.date": null}}
);

This seems like a rather elegant solution, so why didn’t we start out this way? One pretty major problem: if you don’t know all of the possible keys to the map, there is no good way to index the fields for faster querying. Indexes to child objects in MongoDB are created using dot notation, just like the query syntax. This works really well with arrays because MongoDB supports indexing fields of objects in arrays just like any other child object field. But for a map, the key is part of the path to the child field, so each key is another index:

// Simple fields
db.employees.ensureIndex({lastName: 1, firstName: 1})

// Child object fields
db.employees.ensureIndex({"position.id": 1})

// Array object fields
db.employees.ensureIndex({"employmentHistory.company": 1})

// Map fields, treating map values like child objects
db.employees.ensureIndex({"badges.TOP_EMPLOYEE.date": 1})

For our purposes, we are actually able to index every key individually, and treat the map as an object with known properties.

The sinister part of all of this is that you won’t normally run into any problems while doing typical testing of the system. And in the case of the pull/read/push diagram above, even typical load testing wouldn’t necessarily exhibit the problem unless the reader is checking every response and looking for inconsistencies between subsequent reads. In a complex system, where data is expected to change constantly, this can be a difficult bug to notice, and even more difficult to track down the root cause.

There are ways to make MongoDB work really well as the primary store, especially since there are so many ways to update individual records atomically. However, working with MongoDB and other NoSQL solutions requires a shift in how developers think about their data storage and the way they write their query/update logic. So far it’s been a joy to use, and getting better all the time as we learn to avoid these types of problems in the future.