Blazing fast Node.js: 10 performance tips from LinkedIn Mobile
In a previous post, we discussed how we test LinkedIn's mobile stack, including our Node.js mobile server. Today, we’ll tell you how we make this mobile server fast. Here are our top 10 performance takeaways for working with Node.js:
1. Avoid synchronous code
By design, Node.js is single-threaded. To allow a single thread to handle many concurrent requests, you can never allow the thread to wait on a blocking, synchronous, or long-running operation. A distinguishing feature of Node.js is that it was designed and implemented from top to bottom to be asynchronous. This makes it an excellent fit for evented applications.
Unfortunately, it is still possible to make synchronous/blocking calls. For example, many file system operations have both asynchronous and synchronous versions, such as writeFile and writeFileSync. Even if you avoid synchronous methods in your own code, it's still possible to inadvertently use an external library that has a blocking call. When you do, the impact on performance is dramatic.
Our initial logging implementation accidentally included a synchronous call to write to disk. This went unnoticed until we did performance testing. When benchmarking a single instance of Node.js on a developer box, this one synchronous call caused a performance drop from thousands of requests per second to just a few dozen!
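As a minimal sketch of the difference, the snippet below contrasts a blocking log write with a non-blocking one using Node's built-in `fs` module. The function names `logSync` and `logAsync` and the log file path are assumptions made for illustration, not the actual LinkedIn logger.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical log location, used only for this illustration.
const LOG_FILE = path.join(os.tmpdir(), 'example-app.log');

// Blocking: the single Node.js thread waits for the disk before it can
// do anything else, including serving other requests.
function logSync(line) {
  fs.appendFileSync(LOG_FILE, line + '\n');
}

// Non-blocking: the write is handed off and the thread moves on.
// The callback runs later, once the write has finished or failed.
function logAsync(line, callback) {
  fs.appendFile(LOG_FILE, line + '\n', callback);
}

logAsync('request handled', function (err) {
  if (err) console.error('log write failed:', err.message);
});
```

In a request handler, the asynchronous variant is the one to reach for; the synchronous one is shown only to make the contrast visible.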
2. Turn off socket pooling
The Node.js http client automatically uses socket pooling: by default, this limits you to 5 sockets per host. While the socket reuse may keep resource growth under control, it will be a serious bottleneck if you need to handle many concurrent requests that all need data from the same host. In these scenarios, it's a good idea to increase maxSockets or entirely disable socket pooling:
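A hedged sketch of both options using the built-in `http` module follows. The host name, path, and the limit of 100 are placeholder assumptions. The default of 5 sockets per host applies to the Node.js versions this post was written for; later releases removed that default limit, so check the behavior of the version you run.

```javascript
const http = require('http');

// Option 1: raise the per-host socket limit on a dedicated agent.
const pooledAgent = new http.Agent({ maxSockets: 100 });

// Hypothetical backend host and path, for illustration only.
const pooledOptions = {
  host: 'backend.example.com',
  path: '/profile',
  agent: pooledAgent
};

// Option 2: opt out of the shared pool for a request. With `agent: false`,
// Node creates a one-off agent with default values for that request.
const unpooledOptions = {
  host: 'backend.example.com',
  path: '/profile',
  agent: false
};

// Either object can then be passed to http.request(options, callback).
```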
3. Don't use Node.js for static assets
For static assets, such as CSS and images, use a standard webserver instead of Node.js. For example, LinkedIn mobile uses nginx. We also take advantage of Content Delivery Networks (CDNs), which copy the static assets to servers around the world. This has two benefits: (1) we reduce load on our Node.js servers and (2) CDNs allow static content to be delivered from a server close to the user, which reduces latency.
4. Render on the client-side
Let's quickly compare rendering a page server-side vs. client-side. If we have Node.js render server-side, we'll send back an HTML page like this for every request:
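For illustration, a server-side render of a simple greeting page might look like the sketch below. The markup, the image path, and the `renderPage` name are assumptions, not the actual LinkedIn page.

```javascript
// Builds the complete page on the server for every request.
// A real handler should HTML-escape `name` before interpolating it.
function renderPage(name) {
  return `<!DOCTYPE html>
<html>
  <head>
    <title>LinkedIn Mobile</title>
  </head>
  <body>
    <div class="header">
      <img src="/images/logo.png" alt="LinkedIn"/>
    </div>
    <div class="body">
      Hello ${name}!
    </div>
  </body>
</html>`;
}

console.log(renderPage('John'));
```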
Note that everything on this page, except for the user's name, is static: that is, it's identical for every user and page reload. So a much more efficient approach is to have Node.js return just the dynamic data needed for the page as JSON:
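A minimal sketch of such a JSON response, assuming a hypothetical `handleProfile` request handler and a hard-coded user in place of a real lookup:

```javascript
// Only the per-user data goes over the wire.
function profilePayload(user) {
  return JSON.stringify({ name: user.name });
}

// Hypothetical handler; `res` is a standard http.ServerResponse.
function handleProfile(req, res) {
  const body = profilePayload({ name: 'John' }); // stand-in for a real lookup
  res.writeHead(200, { 'Content-Type': 'application/json' });
  res.end(body);
}

console.log(profilePayload({ name: 'John' })); // prints {"name":"John"}
```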
The rest of the page - all the static HTML markup - can be put into a JavaScript template (such as an underscore.js template):
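The sketch below shows the idea with an underscore.js-style `<%= name %>` placeholder. To stay dependency-free it uses a tiny stand-in function, `renderTemplate`, which is an assumption for illustration; with underscore.js itself, `_.template(PAGE_TEMPLATE)(data)` would play this role. The markup and image path are likewise made up.

```javascript
// Static markup with one placeholder, in underscore.js-style syntax.
const PAGE_TEMPLATE = `<div class="header">
  <img src="/images/logo.png" alt="LinkedIn"/>
</div>
<div class="body">
  Hello <%= name %>!
</div>`;

// Minimal stand-in for a template engine: replaces each <%= key %>
// with data[key], or with an empty string when the key is missing.
// It does not HTML-escape values; a real engine should be used for that.
function renderTemplate(template, data) {
  return template.replace(/<%=\s*(\w+)\s*%>/g, function (match, key) {
    return key in data ? String(data[key]) : '';
  });
}

// On the client, the JSON from the server fills in the cached template.
console.log(renderTemplate(PAGE_TEMPLATE, { name: 'John' }));
```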
Here's where the performance benefit comes in: as per tip #3, the static JavaScript template can be served from your webserver (e.g. nginx) and, even better, from a CDN. Moreover, JavaScript templates can be cached in the browser or saved in LocalStorage, so after the initial page load, the only data sent to the client is the dynamic JSON, which is maximally efficient. This approach dramatically reduces the CPU, IO, and load on Node.js.
5. Use gzip
Most servers and clients support gzip to compress requests and responses. Make sure you take advantage of it, both when responding to clients and when making requests to remote servers:
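One way to sketch both directions with Node's built-in `zlib` module is shown below. The helper names `acceptsGzip`, `sendBody`, and `decodedStream` are assumptions, and the header check is deliberately simplified (it ignores quality values such as `gzip;q=0`).

```javascript
const zlib = require('zlib');

// True when the client's Accept-Encoding header mentions gzip.
function acceptsGzip(req) {
  const header = req.headers['accept-encoding'] || '';
  return /\bgzip\b/.test(header);
}

// Responding to clients: compress the body only if the client asked for it.
function sendBody(req, res, body) {
  if (!acceptsGzip(req)) {
    res.writeHead(200, { 'Content-Type': 'application/json' });
    res.end(body);
    return;
  }
  res.writeHead(200, {
    'Content-Type': 'application/json',
    'Content-Encoding': 'gzip',
    'Vary': 'Accept-Encoding'
  });
  // Streaming compression keeps the event loop free while data is compressed.
  const gzip = zlib.createGzip();
  gzip.pipe(res);
  gzip.end(body);
}

// Requests to remote servers: advertise gzip support in the request headers...
const outboundHeaders = { 'Accept-Encoding': 'gzip' };

// ...and decompress the response only when the server actually used gzip.
function decodedStream(res) {
  return res.headers['content-encoding'] === 'gzip'
    ? res.pipe(zlib.createGunzip())
    : res;
}
```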
6. Go parallel
Try to do all your blocking operations - that is, requests to remote services, DB calls, and file system access - in parallel. This will reduce latency to the slowest of the blocking operations rather than the sum of each one in sequence. To keep the callbacks and error handling clean, we use Step for flow control.
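The post uses the Step library for this. The sketch below swaps in the built-in `Promise.all` instead so that it has no dependencies. `fakeCall` and `loadPage` are hypothetical stand-ins for real remote calls, and the delays are arbitrary.

```javascript
// Hypothetical stand-in for a remote call: resolves with `value` after `ms`.
function fakeCall(value, ms) {
  return new Promise(function (resolve) {
    setTimeout(function () { resolve(value); }, ms);
  });
}

function loadPage() {
  // All three calls start immediately, so the total wait is close to the
  // slowest call (30 ms) rather than the sum of all three (60 ms).
  return Promise.all([
    fakeCall({ name: 'John' }, 10),
    fakeCall(['update 1', 'update 2'], 20),
    fakeCall({ unread: 3 }, 30)
  ]).then(function (results) {
    return { profile: results[0], updates: results[1], inbox: results[2] };
  });
}

loadPage().then(function (page) {
  console.log(page.profile.name, page.updates.length, page.inbox.unread);
});
```

If any one of the calls rejects, `Promise.all` rejects as a whole, which keeps the error handling in a single place.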
7. Go session-free
LinkedIn mobile uses the Express framework to manage the request/response cycle. Most Express examples include the following configuration:
By default, session data is stored in memory, which can add significant overhead to the server, especially as the number of users grows. You could switch to an external session store, such as MongoDB or Redis, but then each request incurs the overhead of a remote call to fetch session data. Where possible, the best option is to store no state on the server-side at all. Go session-free by NOT including the Express config above and you'll see better performance.
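The setup in question typically looked like the fragment below in Express 2.x/3.x, where `cookieParser` and `session` shipped with the framework (in Express 4 and later they live in separate packages). This is a sketch under that assumption: the wrapper function `addSessionSupport` is hypothetical and exists only so the fragment does not depend on a particular Express version.

```javascript
// Express 2.x/3.x-style session setup, as commonly shown in examples.
// `app` is the Express application and `express` is the Express module.
function addSessionSupport(app, express) {
  app.use(express.cookieParser());
  app.use(express.session({ secret: 'keyboard cat' })); // placeholder secret
}
```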
8. Use binary modules
When available, use binary modules instead of JavaScript modules. For example, when we switched from a SHA module written in JavaScript to the compiled version that comes with Node.js, we saw a big performance bump:
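As an illustration, hashing with the compiled `crypto` module that ships with Node.js can be sketched as follows. The choice of SHA-256 and the helper name `sha256Hex` are assumptions; the post does not say which SHA variant was used.

```javascript
const crypto = require('crypto');

// Hashes a string with the native crypto module bundled with Node.js,
// rather than a SHA implementation written in JavaScript.
function sha256Hex(input) {
  return crypto.createHash('sha256').update(input, 'utf8').digest('hex');
}

console.log(sha256Hex('abc'));
// prints ba7816bf8f01cfea414140de5dae2223b00361a396177a9cb410ff61f20015ad
```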
9. Use standard V8 JavaScript instead of client-side libraries
Most JavaScript libraries are built for use in a web browser, where the JavaScript environment is inconsistent: for example, one browser may support functions like forEach, map and reduce, but other browsers don't. As a result, client-side libraries usually have a lot of inefficient code to overcome browser differences. On the other hand, in Node.js, you know exactly what JavaScript functions are available: the V8 JavaScript engine that powers Node.js implements ECMAScript as specified in ECMA-262, 5th edition. By directly using the standard V8 functions instead of client libraries, you may see significant performance gains.
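For example, common collection helpers from client-side libraries can be replaced with the ES5 built-ins directly. The sample data below is made up for illustration.

```javascript
// Hypothetical sample data.
const connections = [
  { name: 'Ann', shared: 3 },
  { name: 'Bob', shared: 0 },
  { name: 'Cy', shared: 5 }
];

// Built-in ES5 array and object methods; no helper library needed.
const names = connections.map(function (c) { return c.name; });
const withShared = connections.filter(function (c) { return c.shared > 0; });
const totalShared = connections.reduce(function (sum, c) {
  return sum + c.shared;
}, 0);
const fields = Object.keys(connections[0]);

console.log(names, withShared.length, totalShared, fields);
```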
10. Keep your code small and light
Working with mobile, where devices are slower and latencies are higher, teaches you to keep your code small and light. Apply this same idea to your server code as well. Revisit your decisions from time to time and ask yourself questions like: “Do we really need this module?”, “Why are we using this framework? Is it worth the overhead?”, “Can we do this in a simpler way?”. Smaller, lighter code is usually more efficient and faster.
Try it out
We worked hard to make our mobile apps fast. Try them out and let us know how we did: we have an iPhone app, Android app and an HTML5 mobile version.
Comments
Thanks for sharing. Pretty much all of the points are valid for any kind of HTTP-based application, and many are valid for any kind of client-server application too, but good to see these in the context of Node.js
Regarding point 7 (sessions), presumably you mean that for request/response pairs which do not need session data, don't use session. If you need to maintain state (e.g. checkout process) then the best practice is to restrict the data stored in session to simple types (which do not need complex (de)serialization). I cannot see how you're going to avoid server-side session altogether.
Shravya,
I agree with Matthew's point, can you elaborate on point number 7 and your approach to managing session?
You have to be managing session somewhere - so you couldn't possibly mean - "store no state on the server-side at all" ?
@Mathew, the common assumption is that you can only hold sessions on the server side. With LocalStorage for the Web and other storage on client devices, you can push a lot of the "session state" to the client side. This is much better than cookies, which were limited to just a few KB of stored state. The only issue is security of the data, but in general session data does not contain much detailed personal information. The other thing we tend to do is request the objects we need for making decisions from another service that actually holds the session object. That service can then implement its own cache/load systems. We employ a version of both of these to avoid having "session state" on the server. I think the general statement was just: don't keep session objects in the V8 memory of each Node instance as the session store.
Thanks for sharing.
Is it possible to share the number of requests per second the current solution is handling?
For scalability and B2C apps, node.js is the answer. I liked your point on client side rendering. We are doing similar things for generating charts and have mostly replaced Flash charts with Ext-JS charts.
"8. Use binary modules" really depends on the use case, as the bridging between JavaScript and native (C/C++) code comes at a (high) cost.
awesome article, thanks!
A few of these tips are about minimizing I/O volume (using gzip, rendering client-side, keeping static assets out of Node). What's the story there? Were the frontend servers network-bound rather than CPU-bound? Does Node run into particular problems with large amounts of data? Either way, fascinating (and I know that a comment on a months-old post probably won't get a reply, but figured I'd ask).
Hi Shravya
Great article thanks! A wee question concerning parallel processing and the node.js queue.
What would happen if requests were made to the socket that occurred at exactly the same time? Unlikely I know but.... If the timing mattered then does node.js in some way queue the requests so we know which one was the first or last prior to the sync broadcast being made?
This kind of goes against the advice you give, but if all this happens in parallel could there be an unwanted conflict in this instance?
Is there a way to test if something is blocking operation?
For example. In express I am doing request.session.user = userData; (Since I'm storing my sessions in redis, using connect-redis).. I am not sure if this is blocking or not.
Thanks for bringing #9 to light.
Thanks for all those valuable insights!
FYI, the mobile HTML5 version does not work on Windows Phone 7.
Thanks for this write up. That was an eye opener. I'd really like to check out nodejs in a non-trivial app, but at this point, I don't want to use it for everything; just probably for serving up json for the mobile version of an existing application. You seem to be using it for serving up your entire application. Do you have any advice on how I could go about integrating it say for serving up the api for an existing rails app?