Ember Timer Leaks: The Bad Apples in Your Test Infrastructure
January 3, 2018
Editor’s Note: This piece talks about test stability and makes mention of Ember in particular. While topics mentioned in this post will apply to all single page applications (SPAs), they’re covered here specifically from an Ember point of view, since that is the framework of choice at LinkedIn.
Background: 3x3 at LinkedIn
At LinkedIn, we pride ourselves on our 3x3 system: the notion that we should be able to ship code to production three times a day, with no more than three hours between releases, so that our members experience the latest and greatest of our platform. With 3x3, we’re shipping more code than ever to our members, which in turn reduces the amount of time we have to manually test our releases. In order for us to confidently ship to production three times a day, we need to be able to rely heavily on the health of our automated tests, as well as on our test infrastructure as a whole.
SPAs and test (in)stability
LinkedIn.com is built as a SPA, and if your site is built using Ember, you’re in the same boat. Building LinkedIn as a SPA offers many benefits to our members, such as faster page loads between routes and fewer round trips between the client and the server. On the other hand, developing SPAs comes with unique challenges, such as managing memory and asynchrony. In a traditional non-SPA, the browser gives us a clean slate on each page reload. We do not have this luxury in a SPA environment. If care is not taken when writing a SPA, it’s easy to end up with asynchronous code, such as XHR requests and setTimeout callbacks, executing after the components that initiated it have been destroyed. This can lead to nasty side effects, a poor user experience, and test instability. More on this later.
Testing asynchronous code in Ember: A crash course
Let’s talk about the wait helper (recently renamed to settled), Ember’s solution to testing asynchronous code. The wait helper, at the most basic level, is a utility function that returns a promise that resolves once all asynchrony in the application has completed. In Ember, an asynchronous timer is created with a call to one of the timer-scheduling Ember.run.* methods (e.g., Ember.run.later). These methods call setTimeout under the hood, hence the name "asynchronous timer." Because the wait helper pauses the execution of the test runner while at least one async timer exists, it’s a great way to ensure that all async code has completed prior to running your assertions.
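To make the idea concrete, here is a minimal sketch of a settled-style helper in plain JavaScript. It is not Ember's implementation: the names later, hasPendingTimers, and settled, along with the polling interval and timeout, are assumptions made up for illustration. It only shows the general shape of "return a promise that resolves once no tracked timers remain."

```javascript
// Hypothetical stand-in for a framework's timer bookkeeping: every timer
// created through later() is tracked until it fires.
const pending = new Set();

function later(callback, delayMs) {
  const id = setTimeout(() => {
    pending.delete(id); // the timer has fired, so it is no longer pending
    callback();
  }, delayMs);
  pending.add(id);
  return id;
}

function hasPendingTimers() {
  return pending.size > 0;
}

// Returns a promise that resolves once no tracked timers remain, and rejects
// if timers are still pending after timeoutMs (the test-timeout case).
function settled({ intervalMs = 10, timeoutMs = 1000 } = {}) {
  const start = Date.now();
  return new Promise((resolve, reject) => {
    (function check() {
      if (!hasPendingTimers()) {
        return resolve();
      }
      if (Date.now() - start > timeoutMs) {
        return reject(new Error('Timed out waiting for timers to settle'));
      }
      setTimeout(check, intervalMs);
    })();
  });
}
```

A test would schedule work through later(), await settled(), and only then run its assertions. Note how a timer that is never cleaned up keeps hasPendingTimers() true, which is exactly what makes a helper like this time out.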
Is using the wait helper causing test timeouts? You may have a leak.
This section refers to “leaking timers”: timers that are set up at some point during the application’s lifecycle but not torn down when the application is destroyed.
At LinkedIn, there was a point in time when a lot of our tests were timing out, and we weren’t sure what the cause was. We eventually noticed that many of the tests timing out were using the wait helper, which pauses the test runner until all asynchrony has finished. Had the same set of tests been timing out consistently, we could have safely assumed that the application code being run by the tests in question was likely leaking async timers. Unfortunately, though, different tests were failing between multiple executions of our test runner, which led us to believe that we likely had timers leaking between tests. From here, we needed to come up with a way to a) prove that we had async timers leaking in our test suite; and b) locate the source of said leaks in order to remove them from our application code.
Identifying leaking timers in Ember
Within the Ember.run namespace, Ember exposes a handy method called hasScheduledTimers, which returns a boolean (true if there are any running async timers). As seen in the code below, combining hasScheduledTimers with QUnit’s testDone method makes for a convenient way to confirm async leaks in tests, as well as to pinpoint the source of the leaks.
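A check along these lines might look like the sketch below. The function name installTimerLeakCheck and the report parameter are hypothetical, and QUnit and run are passed in as arguments so the sketch does not assume how your app imports them (in an Ember app they would be the QUnit global and Ember.run). It also assumes that cancelTimers is available on the run namespace in your Ember version.

```javascript
// Registers a QUnit.testDone callback that flags any timers still scheduled
// once a test has finished.
function installTimerLeakCheck(QUnit, run, report = (msg) => console.warn(msg)) {
  QUnit.testDone(({ module, name }) => {
    if (!run.hasScheduledTimers()) {
      return; // nothing left over, so this test did not leak
    }

    report(`Timer leak detected after "${module}: ${name}"`);

    // Assumption: cancelTimers exists on the run namespace. Cancelling the
    // leftovers keeps this leak from being attributed to the next test too.
    run.cancelTimers();
  });
}
```

Because the check runs after every test, the first test named in the output is a strong hint as to which application code set up the leaking timer.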
Tracking down an async leak
Congrats, you’ve detected a timer leak in your application and you’re now ready to clean it up. Let’s walk through what a leak might look like, and how to track it down. In the example below, you’ll notice that we’re setting up a timer to run after five seconds, but we’re never making a call to cancel the timer when the component is destroyed.
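A minimal sketch of such a leak follows. To keep it self-contained, the component is modeled as a plain class rather than a real Ember component, and run is injected through the constructor (in an Ember app it would be Ember.run). The class name and the message property are made up for illustration; the five-second delay and the myRunLater callback follow the description in this section.

```javascript
// Plain-class stand-in for an Ember component (an assumption for this sketch).
class MyComponent {
  constructor(run) {
    this.run = run;
    this.message = 'waiting';
  }

  // Lifecycle hook: runs once when the component is created.
  init() {
    // Schedules myRunLater to run in five seconds. The handle returned by
    // run.later is discarded, so nothing can cancel this timer later.
    this.run.later(this, this.myRunLater, 5000);
  }

  myRunLater() {
    this.message = 'done';
  }

  // Lifecycle hook: runs when the component is torn down. The timer is never
  // cancelled here, which is the leak.
  willDestroy() {}
}
```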
The above code results in a leak if the component is destroyed before five seconds have passed, since five seconds is the delay we used in the run.later call. Let’s create an integration test around this component, along with a check for leaks in the QUnit.testDone hook. Note: for illustration purposes, we’re hooking into QUnit from within our test file; in a typical Ember app, this would be done from within tests/test-helper.js.
The console.log calls in the above snippet will log out three messages. As seen in the screenshot below, the message labeled "Existing timers" contains two items. The first item in the array represents the time at which Ember will execute the timer’s callback, while the second item in the array is the callback itself.
By drilling down into the [[scopes]].method property, we’re able to get a reference to the function definition of the timer’s callback, which enables us to make the changes necessary for cleaning up the leak. In this case, myRunLater is the name of the callback function that the timer calls when it fires.
Patching a leaky timer
We’ve figured out where the leak is coming from, so now we need to patch the leak. Any timers set up during the lifecycle of a component need to be cleaned up when the component is torn down. To accomplish this, let’s modify our previous example by making a call to Ember.run.cancel within the willDestroy hook.
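As a rough sketch of that fix, consider a component that schedules a five-second timer in init. As before, the component is modeled as a plain class with run injected (Ember.run in a real app), and the class name and the timer and message properties are assumptions for illustration. The key change is that the handle returned by run.later is stored and passed to run.cancel in willDestroy.

```javascript
// Plain-class stand-in for an Ember component (an assumption for this sketch).
class MyComponent {
  constructor(run) {
    this.run = run;
    this.message = 'waiting';
    this.timer = null;
  }

  // Lifecycle hook: runs once when the component is created.
  init() {
    // Keep the handle returned by run.later so the timer can be cancelled.
    this.timer = this.run.later(this, this.myRunLater, 5000);
  }

  myRunLater() {
    this.message = 'done';
  }

  // Lifecycle hook: runs when the component is torn down.
  willDestroy() {
    // Cancelling a timer that has already fired is harmless, so no extra
    // bookkeeping is needed to track whether the callback ran.
    this.run.cancel(this.timer);
  }
}
```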
You can try this out for yourself by opening up my-component.js in the linked twiddle above.
Lots of timers? Enter ember-lifeline.
Remembering to clean up all of your async timers can be a challenging task, especially when working in a large application split amongst lots of engineers. Luckily for us, ember-lifeline takes care of this hassle by providing a wrapper API around run.later, run.debounce, and run.throttle. Ember-lifeline keeps a reference to each running async timer and cancels them from within the willDestroy hook.
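The underlying pattern can be sketched in a few lines. To be clear, this is not ember-lifeline's implementation or API: the TrackedTimers class and its schedule and cancelAll methods are hypothetical names, and run is injected (Ember.run in a real app). It only illustrates the idea of remembering every timer handle so that a single call can cancel whatever is still pending.

```javascript
// Hypothetical helper that remembers every timer it schedules.
class TrackedTimers {
  constructor(run) {
    this.run = run;
    this.handles = new Set();
  }

  // Schedules callback after wait milliseconds and tracks the timer handle.
  schedule(callback, wait) {
    const handle = this.run.later(() => {
      this.handles.delete(handle); // fired timers no longer need cancelling
      callback();
    }, wait);
    this.handles.add(handle);
    return handle;
  }

  // Cancels every timer that has not fired yet. A component would call this
  // from its willDestroy hook.
  cancelAll() {
    for (const handle of this.handles) {
      this.run.cancel(handle);
    }
    this.handles.clear();
  }
}
```

Centralizing the bookkeeping this way means individual components no longer need to store and cancel each handle by hand, which is where leaks tend to creep in.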
SPAs are powerful, but they come with the added responsibility of managing asynchronous requests throughout the lifecycle of the application. As engineers, it’s important that we take this added responsibility seriously so that we can confidently ship our best product to our members.
Big shout out to Steve Calvert, Robert Jackson, Scott Khamphoune, and Kris Selden for their immense help in ridding our codebase of these tricky leaks.