Third-Party Front-end Performance: Part 1

Bazaarvoice is a third-party application provider. We have a growing number of applications running on our own domain, but our core business is providing user-generated content applications and widgets that are hosted by us, but run on our clients’ webpages. Scaling an application platform of our size certainly has its challenges at the data layer. We use all the coolest noSQL tools and post up big request per seconds numbers every month, but an oft forgotten part of the story lives long after all of that is cached and delivered:

On the front end.

Steve Souders, the guy who (literally) wrote the book on web performance, estimates that 80% of average page load times actually occur after the markup is completely downloaded. That means a 50% speed up in your front-end code is going to mean a lot more than a 50% speed up in your backend code. This is important stuff!

I’d like to go through some of our front-end performance considerations and explain the optimizations that we take, or the ones we’d like to start taking soon.

I recently read that “the history of web development is just increasingly difficult ways of concatenating strings together.” So we’ll start at that point: We’ve got everything ready to go on the server and now we have to put it on client’s site and make it work.

From this point, we’ll identify three distinct areas of the front-end performance:

The network
Parse and evaluate
Application responsiveness

This post will cover The Network and the other two will be posted as continued parts.

The Network

The network is by far the most commonly discussed area of optimization on the client. That’s for good reason, though, as it’s often the bottleneck. There is really no great ‘secret’ to making this fast, though. The smaller your payload is and the fewer amount of files that you request, the faster your page will generally be. There are lots of articles on reducing image sizes, and spriting css, and you should read those, but I wanted to focus on some less commonly talked about techniques that are especially important in third party applications.

The only non-obvious part of this equation to me (in regards to 3rd party code) is the caching as well as the cacheability of your resources. There are a few unique problems when doing this as a third party, but more or less, the higher percentage of cacheable static resources that make up your app, the more opportunity you have to make the network problem a lot less scary/slow. There are whole books and companies who specialize in this stuff, but we’ll go through at a high level.

— Obligatory Wu-tang Cache Rules Everything Around Me reference. —

Edge caching

One of the easiest wins is using a CDN that implements what’s known as ‘edge-caching.’ This is the practice of a putting a distributed system of cache servers around the world (or country, etc) and serving content from the geographically closest server to each request. This not only decreases load on individual servers, but it also vastly improves latency of requests.

Note that edge-caching is still not quite cached on a user’s computer, but still on a server in the CDN. Even if you have amazing cacheability in your app, you can still benefit from edge-caching for the initial load of the application (when the cache isn’t primed).

Bazaarvoice does this with almost all of its content, even relatively dynamic content. We are one of the largest users of the Akamai CDN, and well over 90% of our total traffic actually just hits Akamai servers and never makes it back to our origin servers.

You may be familiar with something like Amazon S3 to serve your static files. S3 is not distributed by default, but adding on the CloudFront option to your Amazon S3 buckets will achieve many of the same results (assuming you serve from the CloudFront URLs).

Maximum Cacheability

The content in most web apps changes frequently. However, the application really only changes when new code is written and pushed (read: hopefully far less frequently). In an ideal world, everything except for the data would be cached for a length of time (in the end-user’s browser).

The larger the percentage of your application that you can statically generate, the more you can cache.

This seems pretty easy at a glance, but can actually get a bit hairy in a third party environment. Due to the fact that the application is included via JavaScript, the markup for the page is not generally served in the initial request and is injected afterwards instead. This differs from normal webapps that can send their rendered markup along with the initial page request.

Dynamic Content

Currently, Bazaarvoice’s template system integrates with our Java backend, so after the static resources are loaded, we make a request for the rendered content of the page we’re on. This comes back to us as a string, which is saved to a javascript variable.

// simplified example - uncacheable data and templates
var html = "<ul id=\"reviews\"><li>review 1</li><li>review 2</li></ul> ";

Exactly how this works is unimportant, but the key is that this entire additional request is not cacheable (at least for any significant amount of time). Since the data in the rendered templates changes, we not only have to send the fresh data on each request, we also have to send up all the templates each time, multiplied times the number of times they are invoked. Given a significant amount of data, this can add quite a bit of weight to these requests. GZip can help quite a bit here with repeated data, but if you also factor in all the escaping characters to save these as strings in JavaScript, you can see where this could reduce performance by increasing the file size of noncacheable requests.

For the most part, in practice, we don’t serve so much data in an individual request to be effected by this problem on an alarming level, but it’s always nice to optimize further, and we’d love to save the money on bandwidth costs.

In our simple example above, we had the added duplication of a single set of <li> tags. The solution to only sending that once and caching it after the initial load is client-side templating.

If we are able to include our templates in our cached, static resources, you only need to request them once. So even though the library to render them has some cost, it usually pays off quickly (often still on the initial load).

// cacheable template
var tmpl = "<ul id=\"reviews\">{{#each reviews}}<li>{{name}}</li>{{/each}}</ul>";

// uncacheable data 
{ reviews: [{ name: "review 1" }, { name: "review 2" }] };

We’ll get into the actual performance and responsiveness of the app in a later post, but note that in production, these templates can actually be ‘pre-compiled’ into javascript functions that run very quickly (more or less as quickly as string concatenation can take you). Also in production, the templates and data are much larger, so this reduction pays off at scale.

In a third-party application that can cache its templates alongside the application code, the data (usually JSON) is the only unique and the only non-cached bytes going over the wire after the initial load. This is great. I’ll leave optimizing data payloads to a different blog post, but just make sure you only grab the minimum amount of data that you need.

For extreme performance (and SEO to boot!), a third party tool could also integrate the markup injection on the server-side. The markup would go in the initial payload and be instantly visible upon page load (great perceived performance). You would have to write your JavaScript in a way that could bind to pre-rendered templates, but it’s all possible. This is usually a more difficult thing to get clients to agree to. You end up duplicating the markup in loops, like in our first example, but you also save an http request.

Since the client can cache the request for every end user on their app server this ends up being by far the fastest integration in practice. This is a good complement for a high-performance third-party system, even if it can be tough to get clients to agree to.

Data Caching too!

If you want to go even further (and I very much encourage it), you may want to look into caching your data too. This wouldn’t do much for new page loads, but it could really help for common widgets across pages, or reloads/revisits of the same page. Anything that doesn’t take a significant amount of code that can reduce origin requests is probably going to be worth it eventually at scale.

Consider a “Most Recently Reviewed Products” widget that goes across every page on a site. This widget would list 5 products or so. The recency of the products in the list is important, but the exact accuracy is likely not important within a few minute period (unless you’re an unusually high-review-volume webpage). Unfortunately, though, people usually visit quite a few pages in that same time period.

As a third-party, you’re more than likely relying on JSONP to request this information for the widget. Each subsequent page load is highly cached (static resources) because you’re doing all the things we’ve mentioned previously, but you’re forced to re-request this redundant data each time.

I know what you’re thinking…

“Since JSONP injects a script tag under the hood, it is just a GET request for a “javascript file” that can be cached by the browser.” — you

While this is technically correct, it’s usually difficult to actually realize in the real-world for a couple of reasons.

At Bazaarvoice, and in many (read: most of the ones I’ve worked with) APIs, all requests are sent with a ‘no-cache’ header. This stops all caching from occuring. If you do have an API that allows browser caching you’re one step closer, but still usually have another issue.

Most people are using jQuery or other AJAX libraries in order to make JSONP requests to the server. Unfortunately for fans of API caching, these libraries almost always bust your cache, either intentionally, or unintentionally.

This code:

$.ajax({
  url : "http://bestreviewapiever.com/api", 
  data : {"filter": "recent"}, 
  dataType : "jsonp" 
});

actually results in a request to:

http://bestreviewapiever.com/api?filter=recents&callback=jQuery171984098_234234&_=98723498723

And every time you request it, those last two numbers will change. Even if you set the cache option to false – you still end up with unique callback values in each request. This will still bust your cache.

There’s a touch too much code to include here, but the solution is to set cache to false as well as override the jsonpCallback with a value that stays the same. However, you must then manage callback collisions yourself. That has some significant code behind it.

If you successfully implement all that, this means that, as long as edge cases don’t arise, the url will be the same each time, and the jsonp requests would be cache-hits for the cache TTL of the api server.

Another approach that we are in the process of testing and implementing at Bazaarvoice is utilizing localStorage and sessionStorage.

There is not a lot of detail needed, but in browsers that support these DOM Storage utilities, you can write logic to check them for data before making JSONP requests, and invalidate the data after a given time. Make sure you clean up old data when you invalidate, because space is limited, but this is a nice solution, because the fallback is just the natural case.

Currently I’d suggest using sessionStorage with an appropriate TTL on your data. That way it would get cleaned up automatically at the end of the ‘session.’ This solution seems a lot more elegant to me than api cache-headers and JSONP overrides.

Now, after the first request for our data, each subsequent page load has a cache-hit (for our allotted interval) and saves a request from occurring. FASTER!

BONUS TIP!!

If you have very real-time data, don’t count out this method.

One very slick ‘realtime-y’ touch is to have visibly updating data. In our example, even if we wanted to make an uncached request each time, we could instantly display the old data on page load from sessionStorage, wait until all the critical page elements were fully loaded, as to not block their speed, and then load in the new data. If the data was updated, you can put in a smooth update transition to let the user know that your app is awesome and real time. Best of both worlds!

Cache-busting, or “how long do I cache?”

“But Alex! How long do I cache my resources? If I cache them too long, then when I update my app, everyone will have different files!” — you, again.

Very insightful of you. In Bazaarvoice’s current system, each file has a relatively low cache max-age (it varies, but usually between no-cache and ~2 hours). This means that we can be sure that all the files that people have after 2 hours of pushing a change will be the most current one.

This actually ends up being fairly practical. For the most part, people are done browsing a site well before their cache runs out. Then the next time they just have to get it again one time. But we can do better! 😀

We’re working on a new system for caching that affords us more update control, and much higher cache-hit rates. Win-win!

The general idea is to have a single “bootstrap” JavaScript file that we refer to as a “scout” file. This file is usually unique for each our clients, and kept as small as we can keep it. It’s usually included via a script tag (partially for ease of implementation), but could be injected asynchronously, as well.

** Including it as a script tag (synchronously) has the added benefit of caching the dns lookup for the domain in a single request, which should actually speed up subsequent requests in many browsers.

This scout file should have a very low cache max-age — in our case, somewhere in the ballpark of 0 to 5 minutes — depending on the need for control.

<script src="http://bazaarvoice.com/static/scout.js"></script>

EVERYTHING ELSE IS CACHED FOREVER.

In this “scout” file, we kick off our real application by asynchronously injecting our built core javascript file. Concatenation of this main application file and its dependencies is well understood and necessary, so I’m not covering it in this post, but it is essential to good performance.

The trick is that at build time, we embed a version number into our scout file. Then the asynchronous inject function uses this version number as part of the request for the file. This can be done by putting each build in a folder with its version as the name of the folder, or it can just be done via url-rewrites.

I would encourage you to not use the query string (get params) in order to do cache busting, as VPNs and other old browser situations can sometimes ruin the cache, and get you into weird situations.

The affect of this setup is that we have a single, very small file that is constantly looking for updates (hence: “scout file”). It generally is still cached for the average ‘time on site’ of a consumer on one of our clients’ pages, but not much longer. However, if the version of the build doesn’t change, then all the requested scripts after that will be cached forever, or until the version changes again.

If you don’t push a new build, a visitor can still have cached content from years prior and only have to download the “scout” file, but at any point in time, if you run a build, their cache is busted and they get all the new files within the amount of time the scout file was cached for.

// An example of a simple scout file 
(function(){ 
  var VERSION = "12345";

  function injectScript(url){ /*… async script injection …*/} 

  injectScript("http://bazaarvoice.com/static/"+VERSION+"/widget.build.js"); 
})();

This obviously becomes more complex when you have multiple scripts loading. Bazaarvoice uses the require.js library, and its baseUrl configuration option is more or less the only place we end up having to update to make this work for us. It may be more complicated in more hand-rolled solutions, but still very worth it.

The numbers in the end are generally very much in your favor when taking average browsing habits into account. You end up with a few small non-cached hits, but a much better long tail of cacheability of the large application files — with the added benefit of quickly being able to force an update in the event of an emergency.

Conclusion

We glazed over the reducing file-size and concatenation section in half of a paragraph, but I’d like to reiterate that this is just not optional if you care about performance. Your mechanism for doing this may vary, and we don’t mind that, as long as it works for you.

I would however like to point out that you should be constantly testing the performance of your app and identifying your bottlenecks specifically. So often I see people byte-shaving licenses at the tops of their libraries while simultaneously sending improper cache-headers.

The key to tackling performance is always tackling the bottleneck first. Then you’ll have a new bottle neck to tackle.

You can measure network performance of all these techniques in your browser. Google Chrome (and other webkit based browsers) as well as the Firebug extension on Firefox can give you great insight in to the latency, time-to-load, and file size of your files (in the Network tab). You can export this data as a .har file and then import it into a variety of other evaluation tools if you need more info.

At Bazaarvoice we’re tackling our performance issues so we can help give the best experience for our clients’ users.

Remember: after you have a responsible amount of http requests and total file size, cache as much as you can, and then cache more. It’s as simple as that.

Next

Stay tuned for information on Parsing and evaluation, and application responsiveness.

bazaarvoice: engineering

The official blog of Bazaarvoice R&D

Third-Party Front-end Performance: Part 1