Journey to the Promises Land
December 6, 2013
In October, we released our completely redesigned iPad application, culminating months of implementing brand new functionality on both the client and the server. We use a node.js server that aggregates data from multiple back-end services, meaning we deal with complex asynchronous data flows.
Why do we need flow control patterns in node.js?
Asynchronous, callback-based code is prone to what's known as the pyramid of doom.
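A minimal sketch of the shape this takes, assuming three hypothetical node-style functions (getMember, getCompany and getJobs are invented for illustration and are not part of our server):

```javascript
// Hypothetical node-style functions; each calls back asynchronously.
function getMember(id, callback) {
  setImmediate(() => callback(null, { id: id, companyId: 7 }));
}
function getCompany(companyId, callback) {
  setImmediate(() => callback(null, { id: companyId, name: 'Example Co' }));
}
function getJobs(company, callback) {
  setImmediate(() => callback(null, [company.name + ' job']));
}

// Each dependent call nests one level deeper. Errors are deliberately
// ignored here to keep the shape of the pyramid visible.
function loadJobsForMember(id, done) {
  getMember(id, function (err, member) {
    getCompany(member.companyId, function (err, company) {
      getJobs(company, function (err, jobs) {
        done(null, jobs);
      });
    });
  });
}
```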
This would be even more complicated if we handled the errors.
Control flow libraries solve this problem by introducing higher-level constructs that abstract common flows. Step abstracts two flows: sequential flows, like the one above, and parallel flows, in which multiple asynchronous calls are made in parallel with a final callback being called only when all the calls complete.
However, these libraries are only de-facto standards and have no formal specification. This introduces gotchas that can only be understood by reading the library source code. Additionally, error propagation is a manual process in Step, where each step of the above pipeline must pass along errors to the next step in the pipeline.
Alternative control flow libraries have cleaner solutions to these problems, and the cleanest of these, we found, were libraries that used promises.
Promises are an implementation of the Promises/A+ specification, which requires a conforming implementation to:
- Provide a uniform interface to chain sequential steps in a pipeline in the form of the .then() method.
- Always proceed from one step of a pipeline to the next asynchronously.
- Handle and propagate errors in a predictable, well-defined manner.
So, of course, we were excited to use such an implementation! Based on some searching, we found that q was widely used and provided useful convenience functions on top of the functionality in the specification. With a library in hand, we went to work using promises throughout our codebase.
Adding promises incrementally
As with any existing code base, the LinkedIn iPad server consisted of code that used callbacks liberally. Luckily, as is the convention with libraries such as Step, these callbacks were node-style callbacks, which accepted an optional error as the first argument, followed by data as the subsequent arguments in the non-error case.
This made it easy to introduce promises incrementally, using q's ...args). This function translates a callback-accepting function into one that does not require a callback but returns a promise. The promise is resolved or rejected based on the values that would have been passed to the callback, causing the correct handler to be invoked.
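The translation itself is small. The sketch below is not q's implementation; it assumes native Promise and a hypothetical helper named promisifyNodeStyle, with readProfile standing in for any node-style function:

```javascript
// Hypothetical helper: wraps a node-style function so that it returns a
// native Promise instead of accepting a callback.
function promisifyNodeStyle(fn) {
  return function (...args) {
    return new Promise((resolve, reject) => {
      // Append our own node-style callback as the last argument.
      fn.call(this, ...args, (err, value) => {
        if (err) reject(err);
        else resolve(value);
      });
    });
  };
}

// Hypothetical node-style function used only for illustration.
function readProfile(id, callback) {
  setImmediate(() => {
    if (id < 0) callback(new Error('invalid id'));
    else callback(null, { id: id });
  });
}

const readProfileAsync = promisifyNodeStyle(readProfile);
```

Calling readProfileAsync(42) returns a promise for the profile, while readProfileAsync(-1) returns a promise that is rejected with the error the callback would have received.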
Converting back to a callback-accepting function is similarly simple; the two functions passed to .then() are mapped to a callback.
Again, q makes this pattern trivial to implement using its nodeify function:
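To show the idea without depending on q, here is a sketch that assumes native Promise and a hypothetical standalone nodeify helper; getGreeting is invented for illustration:

```javascript
// Hypothetical standalone helper: maps a promise's outcome onto a
// node-style callback.
function nodeify(promise, callback) {
  promise.then(
    // setImmediate moves the callback out of the promise chain, so an
    // exception thrown by the callback is not swallowed as a rejection.
    (value) => setImmediate(callback, null, value),
    (err) => setImmediate(callback, err)
  );
}

// A callback-accepting function implemented internally with promises.
function getGreeting(name, callback) {
  nodeify(Promise.resolve('hello ' + name), callback);
}
```

Callers of getGreeting never see a promise, which is what lets promise-based code sit behind an existing callback-based interface.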
With these techniques, we introduced promises into each new file we wrote, and by the end of it, about 35% of the application (non-library) logic used promises for its asynchronous control flow. We didn't modify existing code, including a substantial portion providing functionality relevant only to older versions of the iPad application.
Modelling complex data flows
Sequential data flows are simple, but at LinkedIn, we see much more complex data flows. We frequently need to gather data from multiple sources in parallel. When doing so, we want to ignore certain data if it cannot be retrieved in a reasonable time, while requiring other data to be available before we continue to render. Additionally, we want to chain operations that depend on each other while not blocking operations that can proceed independently. Promises provide straightforward solutions to these requirements.
In a common use case, we wish to retrieve some data that is not mandatory for rendering a page. We handle this case by logging the error or exception that happened while fetching the data and returning a default value. For example, if we are expecting a list of items and the backend errored out, we may default to an empty list.
From this, we have established a pattern in which we create a function that is able to take an error message and return it. We can then use this function as the error handler attached to a promise.
q provides a convenience method that is shorthand for .then(undefined, errorHandler). We use this method liberally for this use case.
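One way such a handler could look is sketched below. The factory name defaultOnError, its parameters and the fetchRecommendations call are assumptions made for illustration, and the example uses native Promise rather than q:

```javascript
// Hypothetical factory: returns a rejection handler that logs the error
// with a message and resolves the chain with a default value instead.
function defaultOnError(message, defaultValue, log = console.error) {
  return function (err) {
    log(message + ': ' + err.message);
    return defaultValue;
  };
}

// Hypothetical optional back-end call that fails.
function fetchRecommendations() {
  return Promise.reject(new Error('backend unavailable'));
}

// The page still renders: the promise is fulfilled with an empty list.
const recommendations = fetchRecommendations()
  .then(undefined, defaultOnError('recommendations failed', []));
```

Because the handler returns a value rather than rethrowing, every later step in the chain sees a normal fulfilled promise.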
Parallel data calls
Another common use case is to make multiple HTTP requests in parallel, while degrading gracefully if some of those requests fail.
In our updates stream, we blend together your network updates, various types of recommendations (such as people you may know), and news articles to create a richer, more relevant feed. While we would ideally like to have all three sources of content, we can tolerate failures in retrieving the latter two.
This means we fire off requests to retrieve all three pieces of data in parallel. If the people you may know call fails to return in a reasonable time, we can default to an empty list of recommended people, and similarly for the news articles. After all three pieces of data have been resolved, either from the backend or with substituted defaults, we can combine them and send back partial data to the client. Only if we fail to fetch the network updates do we send back an error to the client. The flow looks like this.
While this is possible to accomplish using a library such as Step, q's helper functions make it straightforward to map into code.
q provides a convenience function, .spread(onFulfilled, onRejected), that takes the array generated by
q.all() and spreads the data out into a variadic success
handler. For those familiar with Step, this is similar to
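The updates stream flow described above can be sketched as follows. This assumes native Promise, where Promise.all plus array destructuring plays the role of q.all() with a variadic handler; the three fetch functions and the optional helper are hypothetical, and timeouts are left out for brevity:

```javascript
// Hypothetical back-end calls; the news call fails for demonstration.
function fetchNetworkUpdates() { return Promise.resolve(['update-1']); }
function fetchPeopleYouMayKnow() { return Promise.resolve(['person-1']); }
function fetchNews() { return Promise.reject(new Error('news timed out')); }

// Replace the failure of an optional call with a default value.
function optional(promise, defaultValue) {
  return promise.then(undefined, () => defaultValue);
}

function buildFeed() {
  // All three calls have already started by the time Promise.all runs.
  return Promise.all([
    fetchNetworkUpdates(),                  // required: a rejection fails the feed
    optional(fetchPeopleYouMayKnow(), []),  // optional
    optional(fetchNews(), []),              // optional
  ]).then(([updates, people, news]) => ({ updates, people, news }));
}
```

Only a rejection from the required call reaches the caller of buildFeed; the optional calls can at worst contribute an empty list.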
A complex data dependency graph
Promises excel at representing dependency directed acyclic graphs (DAGs) in a way that mirrors the structure of the graph itself. Consider our job page. On this page, we present information about the job and the company where the position is held. We also present how you're connected to that company, along with a list of similar jobs. Finally, we present whether or not you have saved the job to consider in the near future.
Given the jobID present in the request URL, we can immediately fetch information about the job (along with its corresponding company) and whether you have saved this jobID in your interest list. Once we have the company information, we can extract the company ID, find the connections who work at that company, and match the similar jobs at that company. Only the first of these four calls is strictly required; the latter three are optional.
Conceptually, this is simple, and again, it can be mapped to code using a library such as Step. However, such code will not reveal the inherent DAG shown in the above diagram because ultimately, Step represents a linear flow. With q, the dependencies are stated explicitly.
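A sketch of the job page dependencies, assuming native Promise in place of q. The four fetch functions, the orDefault helper and the returned shapes are hypothetical; the connections call fails here to show an optional branch degrading:

```javascript
// Hypothetical back-end calls returning promises.
function fetchJob(jobId) { return Promise.resolve({ id: jobId, companyId: 'c1' }); }
function fetchSavedStatus(jobId) { return Promise.resolve(true); }
function fetchConnections(companyId) { return Promise.reject(new Error('down')); }
function fetchSimilarJobs(companyId) { return Promise.resolve(['j2', 'j3']); }

// Optional data falls back to a default when its promise is rejected.
const orDefault = (promise, value) => promise.then(undefined, () => value);

function loadJobPage(jobId) {
  const job = fetchJob(jobId);                               // required
  const saved = orDefault(fetchSavedStatus(jobId), false);   // needs only jobId
  const companyId = job.then((j) => j.companyId);            // depends on job
  const connections = orDefault(companyId.then(fetchConnections), []);
  const similarJobs = orDefault(companyId.then(fetchSimilarJobs), []);

  return Promise.all([job, saved, connections, similarJobs])
    .then(([j, s, c, sim]) => ({ job: j, saved: s, connections: c, similarJobs: sim }));
}
```

Each variable is a node of the graph and each .then() is an edge, so the saved-status call never waits on the job call, while the two company-based calls start as soon as the company ID is known.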
What we loved about promises was the strong, logical guarantees made by the Promises/A+ specification. But here are some scenarios that caught us by surprise based on the assumptions we made.
Early on, we liked the idea of explicitly spelling out error handlers with the .fail function instead of simply passing both the success and error handlers to .then(). However, these two constructs are not equivalent. Suppose that the handleData function encounters a runtime exception, for example from an undefined object. In the first flow (both handlers passed to .then()), the error goes on to the next handler in the sequence. In the second flow (a separate .fail handler), the error is passed along to handleBackendError! In that case, handleBackendError is now actually dealing with an error from handleData, although it was written only to deal with the back-end error. This is illustrated below in the example.
The first diagram is the flow we want to have, where even in the case of a runtime error in handleData, we propagate the error to the next step in the pipeline. The second diagram is the incorrect flow, where handleError handles errors both from the back-end call as well as from handleData. In some cases, we do want the latter, but we need to be aware of having to distinguish between the two use cases.
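The difference can be reproduced with native Promise, where a separate rejection handler written as .then(undefined, handler) behaves like the .fail form. In this sketch the back-end call and nextStepError are hypothetical; handleData and handleBackendError stand in for the functions named above:

```javascript
// Hypothetical back-end call that succeeds but returns no data.
function fetchFromBackend() { return Promise.resolve(undefined); }
// Throws a TypeError when data is undefined.
function handleData(data) { return data.items.length; }
function handleBackendError(err) { return 'handled by handleBackendError'; }
function nextStepError(err) { return 'propagated to next step'; }

// Both handlers on one .then(): handleBackendError sees only back-end
// failures, so the exception from handleData moves on to the next step.
const both = fetchFromBackend()
  .then(handleData, handleBackendError)
  .then(undefined, nextStepError);

// Separate rejection handler after handleData: it also receives the
// exception thrown by handleData.
const chained = fetchFromBackend()
  .then(handleData)
  .then(undefined, handleBackendError)
  .then(undefined, nextStepError);
```

Here both is fulfilled with 'propagated to next step', while chained is fulfilled with 'handled by handleBackendError'.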
On our groups page, you'll find two sources of data: groups you have joined and groups you may like to join. The HTTP endpoint for that functionality accepts an optional query parameter in the URL, allowing you to filter to one of the two types of results, defaulting to both if the parameter is absent.
To support this functionality, we created a promise for each of the two types of results, as they come from different back-end services. Based on the query parameter, we either attached a handler to only one of the promises or piped both through q.all. What we failed to realize was that, whether or not a handler is attached to a promise, the back-end service call is invoked immediately for both when the promise is created! What we actually wanted was to invoke one or the other conditionally if the filter was present.
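A sketch of the conditional version, assuming native Promise. The two fetch functions, the filter values and the calls array (used only to record which back-end calls were started) are hypothetical:

```javascript
// Records which back-end calls were actually started.
const calls = [];
function fetchJoinedGroups() { calls.push('joined'); return Promise.resolve(['g1']); }
function fetchSuggestedGroups() { calls.push('suggested'); return Promise.resolve(['g2']); }

// Create a promise, and therefore start a call, only for the data the
// filter asks for. With no filter, both calls are started.
function loadGroups(filter) {
  const wanted = {};
  if (!filter || filter === 'joined') wanted.joined = fetchJoinedGroups();
  if (!filter || filter === 'suggested') wanted.suggested = fetchSuggestedGroups();

  const keys = Object.keys(wanted);
  return Promise.all(keys.map((k) => wanted[k])).then((values) => {
    const result = {};
    keys.forEach((k, i) => { result[k] = values[i]; });
    return result;
  });
}
```

The decision moves from "which promise gets a handler" to "which function gets called", because calling the function is what triggers the request.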
Promises play well with testing frameworks, such as Mocha. When we unit test a function that returns a promise, we can attach handlers to the promise and make our assertions in the handler.
Because Mocha provides a node-style callback as a means of running
asynchronous tests, we can pass in
done right into
A successful experiment
Using promises:
- Made it simple to implement and reason about the complex data flows present in the iPad server.
- Simplified our error handling significantly.
- Allowed us to provide more visibility into our system due to almost all error handling passing through shared functions that log encountered errors.
We hope this results in a better experience for you as you browse the redesigned iPad application.