Making a faster website

Since I'm a nerd, and slightly more preoccupied with performance in apps and websites than most, I decided to put my then-present knowledge to the test by attempting to create a highly performant website, and maybe learn something in the process. So, like I said, the goal: build a website that loads insanely fast and scores a perfect 100 on both mobile and desktop in the Google Page Speed benchmark. In the end I did it, and you can see the resulting website here: http://kinderasweb.azurewebsites.net.

Note that this is my take on creating a fast website, with the specific goal of scoring 100/100 on the Google Page Speed benchmark. This might not be your goal, and your site might have other considerations, so this is no definitive answer.


The server setup

My server setup includes a Node.js server running an Express.js application hosted on a Windows Azure website.

My experience with Node.js and Express was somewhat limited before this project; I had mainly used Node with Socket.IO and had never really written a "full" Express application. So I learned a few things, and I'd like to share some of them now.

Firstly, Windows Azure. Man, I'm not exactly what you would call a Microsoft fan, but Azure is pretty much a joy to work with, and I highly recommend it if you're building a website or any kind of service for web or native applications. It's that good!

The Express application

The application running the site is pretty straightforward. There are a bunch of templates written in Jade and some routes rendering the content into those templates. The content comes from a heap of markdown files which are parsed and then rendered. Pretty standard really, but there were some challenges.

Server response time

Just to make it clear: this is not an Azure ad and I'm not sponsored in any way, shape or form. However, since Azure websites use a CDN infrastructure to distribute content, response times for Azure websites are pretty good. As long as you make sure that your actual server application responds quickly, Azure will handle the delivery pretty flawlessly. So my only advice on server response time is to stick your application on properly configured infrastructure. I do have some experience with this, and it really matters!

Low latency means that the server responds quickly and the users of your site will start to see the content sooner


Compression

Compression means that the server will compress the content (using «gzip» in most cases), resulting in the files transferred being much smaller.

Enabling compression in an Express application is easy peasy lemon squeezy. You enable it through the Connect middleware like so:

app.use(express.compress());

Note that it is critical to add this before any other middleware and routes in your Express application to ensure that everything is served compressed.

Browser caching

For me, this was the trickiest part. Browser caching is, in short, how the server sets certain HTTP headers on the content, allowing the browser to cache that content for a pre-defined amount of time. Setting these headers is quite easy on most servers; in Express you can do it when rendering the content, like so:

res.header("Cache-Control", "public, max-age=" + cacheTime);

The tricky part is deciding how long you are going to allow the browser to cache content. According to Google you should set the cache time for "static content" like JavaScript and CSS to up to a year. For "dynamic" content like HTML pages, however, browser caching is not recommended. Now, this of course depends on what kind of content you are serving and how important it is to get new content out quickly.

Static content

For my fictional website I decided on caching static files like CSS and JavaScript for one year, and then using «URL fingerprinting» to tell the browser when a file had changed. «URL fingerprinting» simply means that you append a value to the end of the URL, like «styles.css?cache=1.0.0», where the "1.0.0" part could be a version number or really anything that changes. In my app I simply exposed the «package version» to the main template and used that to bust the cache.

// in app.js
// Get the app version for cache busting
var version = pkg.version || (new Date().getTime());
// Expose the version number to the templates
app.use(function (req, res, next) {
    res.locals.version = version;
    next();
});

// in layout.jade
link(rel='stylesheet', href='/stylesheets/style.css?v=#{version}')

This, again, can be done in a multitude of ways depending on your server software and so on. Setting cache headers for static content in Express is easy and is done once for all content:

app.use(express.static(path.join(__dirname, 'public'), { maxAge: oneYear }));

Dynamic content (HTML)

Google prefers to treat HTML as dynamic content, and in most commercial cases I totally agree. In my case, the need for users to instantly see any changes was not that important, so I decided to set the cache for all HTML pages (templates) to one day. This is set via the cache control header mentioned above.
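As a sketch of how those two lifetimes could be picked (assuming the values from above: one year for static assets, one day for HTML; the helper name is my own, not from the site's actual code):

```javascript
// Hypothetical helper that picks a Cache-Control value per content type.
var ONE_DAY = 60 * 60 * 24;     // 86400 seconds
var ONE_YEAR = ONE_DAY * 365;   // 31536000 seconds

function cacheControlFor(contentType) {
  var maxAge = contentType === 'text/html' ? ONE_DAY : ONE_YEAR;
  return 'public, max-age=' + maxAge;
}

// In an Express route handler this would be used as:
// res.header('Cache-Control', cacheControlFor('text/html'));
console.log(cacheControlFor('text/html')); // → public, max-age=86400
console.log(cacheControlFor('text/css'));  // → public, max-age=31536000
```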

Server caching

In contrast to browser caching, server caching simply means that templates and data are cached on the server, avoiding database lookups and I/O operations every time a user requests a page. Most CMSes and web applications do this out of the box. In my case, the application needed to read a bunch of markdown files. These files were kept in their parsed form in memory until they changed or the server restarted; this way the content was mostly read from memory and the application could respond immediately. How you do this depends heavily on what your server application does, but a good goal is that the server should spend a maximum of 200 milliseconds before responding to any request.

Inlining styles

Apparently this is somewhat controversial; it is sometimes referred to as «critical path CSS». Basically it means that instead of linking to an external CSS file in the head of your HTML document, you inline the CSS needed to render parts of, or the entire, page. There are tools to help you detect which parts of your stylesheet should be inlined.

What this gives you is fewer server requests. For users on mobile networks this is a big deal: in a mobile network setting, latency is often measured in seconds. This means that requesting additional files adds a new round trip of latency before your page can be rendered, since browsers generally do not start to paint a page before the styles have been downloaded and parsed.

You can read more about Google's reasoning and how Google Page Speed looks at this specific measurement.

Inlining CSS with Jade

This is kind of a front-end thing, but in my opinion it is most practical to have the server do it for you. Using Jade with Express, it's as simple as including your CSS file directly in the layout.jade file:

| <style type='text/css'>
include ../public/stylesheets/style.css
| </style>

Front-end

80-90% of the end-user response time is spent on the frontend. 
Start there.
- Steve Souders

Mr. Souders coined the quote above as the performance golden rule back in 2012. I do actually think the man knows what he is talking about, but I want to stress that it is completely possible to utterly fuck up your entire site by creating a crappy server application. So do both the server part and the front-end part properly, is my advice.

There are many things you can do in the front-end to improve the performance of your site, let's start with minification.

Minification

Minification is the process of "compiling" static files like JavaScript, HTML and CSS files in a way that reduces their file size before serving them to the browser. This can be done by the server or before the files are uploaded to the server.

For my site I use the built-in functionality of Jade to minify the templates (HTML) when serving them to the browser. These minified HTML files are automatically cached by Express when you switch to production mode, so you don't have to worry about additional server response time here.

I use CoffeeScript (which is awesome) for scripts and UglifyJS to minify the resulting JavaScript file(s). For styles I use SCSS (which also rocks), which has a built-in "compressed" mode for outputting minified CSS. The tool I use to handle these files is CodeKit, which I highly recommend for front-end designers and developers!

CodeKit is a tool to handle front-end projects (click the image to read more)


Minification and compression (gzip)

If you're clever you might be thinking something like: "Hmm, if I enable gzip on the server, then why do I need to minify the files as well?".

For JavaScript the answer is easy: minification does more than just compress the file, it also renames variables to shorter names, removes comments, and so on. In other words, there is less content for the browser to read after the files have been downloaded. And therein lies the whole point. In transfer size you will not save that much by minifying and then compressing; compression pretty much does that job on its own. However, if you do not minify the file, the browser on the end-user's device will need to spend more time and memory parsing it, because, well, the file is bigger. So it is a good idea to both minify and compress static files.

Handling images

Images are static data and should be cached along the same lines as other static files like stylesheets. See the part about browser caching.

Also, images need to be optimized. There's a bunch of completely redundant metadata in image files which can be stripped away to achieve a smaller file size.

Images should be served to the browser at the size they will be rendered at; don't scale images in HTML or CSS. The exception is high-resolution (retina) images, which will be twice the size.

Serve the images from a different domain than the one your application is hosted on. This allows the browser to download more stuff in parallel, resulting in shorter load times.

More about optimizing images from Google.

Enter Cloudinary


Cloudinary is an end-to-end image management solution for your Web and mobile applications.
- Cloudinary website

All of the things above, Cloudinary does them and more. It is also a CDN, allowing for faster downloads of image files, and since it serves images from a separate domain, the browser can download them in parallel.

This service is for images what Azure is for application hosting. It is that good! Learn it, use it!

Wrapping up

There are other performance-related things my test site handles, like retina images, but more on that in a separate post.

I believe the main thing is to realize how much performance affects the users of your site, and to act accordingly. There is a bunch of research on this; in one famous case, Amazon found that they could achieve a 1% increase in revenue for every 100 milliseconds of load time they shaved off their product pages.

Turns out, performance matters.