Upgrade Testing on iOS: Keeping up with 3x3
Automating Upgrade Testing to Better Avoid AppCrash Surprises
April 15, 2016
Our newly introduced 3x3 release cadence is an effort to keep up in the fast-paced world of native apps where increasing consumer demands and decreasing release cycle times require us to be more agile with our app releases.
As a corollary, we have more flexibility to experiment (A/B test) with new features, to make more data-driven design decisions, and most importantly, to reduce the turnaround time for bug fixes. And that’s not all. We are now better positioned to keep up with hardware and OS updates.
That said, faster release velocity means shorter release cycles, and shorter cycles mean we have to certify releases faster without compromising product quality. Relying primarily on automated testing has helped us speed up build verifications. However, despite the comprehensive UI and integration automation frameworks in place, we have still been resorting to manual efforts for app upgrade testing.
What has always been done?
When a build is marked as a release candidate (meaning it is ready to be shipped to App Store or Google Play), the on-call engineer runs a series of upgrades from various App Store versions of the LinkedIn Flagship app to the release candidate version and verifies that the updated app launches without crashing. This verification process is repeated across multiple devices and iOS versions.
With faster release cycles, we are rolling out more releases to our members; more releases available means more app versions for our members to upgrade from and to. This spreads our members across a wider range of app versions, making it more necessary than ever for us to support more app versions and upgrade paths. The mix of app version, device, and OS compatibility necessitates a complex and extensive test plan entailing many hours of manual effort. In case of an app crash, the process becomes even more laborious, with added steps to reproduce the crash and capture crash logs. Testing varied combinations of client and platform compatibilities is a tall ask of the engineer, not to mention the scalability concerns that it raises. The manual upgrade testing process has worked for us thus far, but it is far from optimal.
To the rescue...
The in-house iOS upgrade testing tool makes life a little easier for us by automating the upgrade process and parallelizing execution on multiple devices. Agnostic of the iOS and device versions, it requires little-to-no manual intervention to run. Currently, it is set up to run independently on a dedicated MacBook, but it can also be executed from any OS X machine. It can also be configured to best accommodate the iOS app’s upgrade testing needs. By default, it runs hourly and creates a set of five build versions: the latest iOS build artifact plus four popular production/App Store builds.
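As a rough illustration of how such a build set might be assembled, here is a minimal Python sketch. How the tool actually ranks "popular" builds is not described here, so the usage-share dictionary, the version strings, and the function names are all hypothetical.

```python
def version_key(version):
    """Turn a dotted version string such as '9.0.12' into a sortable tuple."""
    return tuple(int(part) for part in version.split("."))


def select_builds(latest_artifact, production_usage, production_count=4):
    """Return the builds to test: popular production builds (oldest first),
    followed by the latest build artifact.

    production_usage maps a version string to its share of active users
    (hypothetical input; the real data source is an assumption).
    """
    popular = sorted(production_usage, key=production_usage.get, reverse=True)
    popular = popular[:production_count]
    popular.sort(key=version_key)
    return popular + [latest_artifact]
```

Sorting the production builds oldest first keeps the list ready for a sequential upgrade run, where each build is installed over the previous one.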
If not already cached, the source code is checked out directly from the repository based on its version tag and then archived as a debug build for the device. In order to mimic the user-facing production builds as closely as possible, it is necessary to build with the device as the target and with modified compiler-level optimization and preprocessor macros:
- GCC_OPTIMIZATION_LEVEL set to -Os; release builds have optimization enabled (-O3 or -Os) while debug builds do not (-O0).
- GCC_PREPROCESSOR_DEFINITIONS set to the non-debug compiler directives.
Note: Release builds do not allow the debugger process to be attached, which is necessary for programmatically launching the app, so instead the debug build is configured with release build settings.
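The build settings above can be passed to xcodebuild as command-line overrides. The sketch below only assembles the argument list. The workspace, scheme, and output path are placeholders, and NDEBUG=1 is an assumed stand-in for the "non-debug compiler directive", which is not spelled out here.

```python
def xcodebuild_command(workspace, scheme, derived_data_path):
    """Build the argv for a Debug device build that uses release-like settings."""
    return [
        "xcodebuild",
        "-workspace", workspace,
        "-scheme", scheme,
        "-configuration", "Debug",        # keeps the build debuggable
        "-sdk", "iphoneos",               # target the device, not the simulator
        "-derivedDataPath", derived_data_path,
        "GCC_OPTIMIZATION_LEVEL=s",       # corresponds to -Os
        "GCC_PREPROCESSOR_DEFINITIONS=NDEBUG=1",  # assumed non-debug macro
        "build",
    ]
```

The resulting list could be handed to subprocess.run on a machine with Xcode installed. Note that a command-line override of GCC_PREPROCESSOR_DEFINITIONS replaces the project's own definitions rather than appending to them.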
The end-to-end upgrade testing process involves periodic upgrade runs kicked off by a scheduler at configurable intervals. The scheduler will be replaced by Hudson in the future. The initialization phase of each run consists of querying the latest app versions to test, archiving builds to generate app binaries, instantiating a list of USB-connected iOS test devices, gathering device information such as iOS and hardware version, and finally, setting up database connections.
Once the initial setup is done, a subprocess is spawned per device that sets up device-level debug and error logging; these are execution logs that show the progression of upgrade tests. Alongside the execution logs, the device logs shown in Xcode’s console are monitored for crashes as well. This is done using idevicesyslog, which relays the device’s syslog to the terminal. From there, the script greps the output for the specific app and device ID and flushes it to a log file, which is parsed for crashes at the end of the run.
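The log filtering step can be pictured with a small Python sketch. The crash markers, the sample log format, and the process name are illustrative assumptions, not the tool's actual patterns; in a live run the lines could come from the output of `idevicesyslog -u <device ID>`.

```python
import re

# Hypothetical markers; the patterns the real tool looks for are not documented here.
CRASH_MARKERS = re.compile(r"crash|EXC_BAD_ACCESS|SIGABRT|SIGSEGV", re.IGNORECASE)


def filter_app_lines(lines, process_name):
    """Keep only the syslog lines emitted by the app under test."""
    return [line for line in lines if process_name in line]


def find_crash_lines(lines, process_name):
    """Return the app's log lines that contain one of the crash markers."""
    return [
        line
        for line in filter_app_lines(lines, process_name)
        if CRASH_MARKERS.search(line)
    ]
```

Filtering by process name first keeps unrelated system noise, such as crashes in other processes, out of the crash report for the run.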
The tool uses ios-deploy to install the app binary on the device. Once deployed, it uses the LLDB debugger to launch the app and keep it running for 60 seconds (or as configured). Building the app in debug mode sets the get-task-allow key in the app’s entitlements file, which allows the debugger to attach to the app process on the device. The upgrade testing tool maintains its own library of routines for performing upgrades and fresh installs. Leveraging these utilities, it performs the following combinations of upgrades and fresh installs on each device in order to give us comprehensive coverage of likely upgrade paths:
- Fresh installs: uninstalls any previously installed app (based on the bundle ID); deploys and launches the latest app.
- Upgrade installs: uninstalls any previously installed app; deploys one (of the four) production apps followed by deploying the release candidate build (the release candidate build replaces the production app simulating app update); verifies app launches without crashing; uninstalls the app and repeats the process for all four production apps.
- Ripple upgrade installs: uninstalls any previously installed app; deploys the oldest production app and then sequentially deploys each of the later production apps and the latest build artifact without uninstalling at any stage; each time an app is installed, it verifies that it launches without crashing.
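The three scenarios above can be expressed as ordered step lists. The following is a minimal sketch, assuming each step is an (action, build) tuple; the function names and step vocabulary are hypothetical, and the tool's own routines may be structured differently.

```python
def fresh_install(candidate):
    """Uninstall whatever is present, then install and launch the candidate."""
    return [("uninstall", None), ("install", candidate), ("launch", candidate)]


def upgrade_installs(production_builds, candidate):
    """One plan per production build: install it, upgrade to the candidate, launch."""
    plans = []
    for old in production_builds:
        plans.append([
            ("uninstall", None),
            ("install", old),
            ("install", candidate),   # installing over the old build simulates an update
            ("launch", candidate),
        ])
    return plans


def ripple_upgrade(production_builds, candidate):
    """Install every build in order, oldest first, launching after each install."""
    steps = [("uninstall", None)]
    for build in list(production_builds) + [candidate]:
        steps.append(("install", build))
        steps.append(("launch", build))
    return steps
```

With four production builds, this yields one fresh-install plan, four upgrade plans, and a single ripple plan with five install-and-launch pairs per device.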
A subprocess running in parallel monitors the device connectivity state. If the device is disconnected during execution, the test terminates and the failure is reported in the logs with an appropriate error message. At the end of the run, once all the subprocesses have returned (or terminated), the error and crash logs, if any, are aggregated along with device information for all devices and stored in the database. Storing execution logs in the database makes it easier to reproduce and triage bugs.
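To make the aggregation step concrete, here is a small sketch that uses SQLite from the Python standard library. The database engine, table name, and columns are assumptions made for illustration; the schema the tool actually uses is not described here.

```python
import sqlite3


def store_results(conn, run_id, results):
    """Persist one row per device for a run (hypothetical schema)."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS upgrade_results "
        "(run_id TEXT, device_id TEXT, ios_version TEXT, status TEXT, log TEXT)"
    )
    conn.executemany(
        "INSERT INTO upgrade_results VALUES (?, ?, ?, ?, ?)",
        [
            (run_id, r["device_id"], r["ios_version"], r["status"], r["log"])
            for r in results
        ],
    )
    conn.commit()


def failed_devices(conn, run_id):
    """Return the IDs of devices whose run did not pass, sorted for stable output."""
    rows = conn.execute(
        "SELECT device_id FROM upgrade_results "
        "WHERE run_id = ? AND status != 'passed' ORDER BY device_id",
        (run_id,),
    )
    return [row[0] for row in rows]
```

Keeping the device ID, OS version, and log together in one row means a failing combination can be looked up later without rerunning the test.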
Time-efficient and independent in execution, the iOS upgrade tool gives us the scalability we need to be more agile in our release process. It's generic enough to run upgrade tests against any of our apps (LinkedIn, Lynda, Job Search App, etc.) if provided with the corresponding source code, build paths, and bundle IDs. As is, the iOS upgrade testing tool caters to most of our current upgrade testing needs. However, we are considering extending its functionality to test some common user flows and converting it into a deployable tool that can also be used for upgrade testing for Android apps.