31 July 2012

A/B testing for mobile apps made easy - how we built Switchboard

TLDR: Implement A/B testing for your app in hours. Switchboard is a lightweight mobile A/B testing framework with consistent user segmentation. It can be used for A/B testing, staged rollout and remote configuration. It’s designed to serve high traffic and be as flexible as possible. Released under the Apache License, you can download it from: http://www.github.com/KeepSafe/Switchboard

Problem: On mobile, you can’t simply roll back

Many mobile devs have been through this: you add a new feature, submit it to the app store, and just after it is finally out there you realize that something is not working, for whatever reason.

If you are building a web-only product, solving this is usually fairly easy. You fix your bugs, deploy your code to your servers - done!

On mobile, it’s different. Once it’s shipped, you can’t take it back. The only thing you can do is fix it, push an update and pray that it gets published quickly. On Android that takes a couple of hours. For iPhone… you know, longer.

And then you still haven’t answered the emails from annoyed users who are getting updates too frequently.

You can avoid all that by plugging Switchboard into your app and using it to react quickly. Our mobile app, KeepSafe, has millions of users on a multitude of devices, form factors, OS versions, language settings etc. We do thorough testing, but we don’t have every device in every configuration available. Our solution: Switchboard.

We built Switchboard for three main use cases:

  1. Staged rollout of new features
  2. A/B testing of features
  3. Remote configuration

Staged Rollout

Because of Android device fragmentation, staged rollout was something we really needed to ensure a good user experience. With Switchboard, we can release a new feature to only a subset of users and see what causes trouble and what does not. This comes in really handy because we also use Crittercism to get real-time crash reports. We can roll back for some or all devices as soon as we see problems with a particular configuration.

A/B testing

Observation beats theorizing when it comes to determining user preferences. There are many optimization problems and design decisions where testing variants against each other would have been really helpful.

In previous projects, we worked on web products and could really leverage the power of A/B testing. On mobile that’s a little harder, especially for parts of the app that just have to be native. We didn’t find a lightweight solution, and it turned out that our staged rollout mechanism works just as well for this case.

Remote configuration

Switchboard allows you to wrap 3rd party libraries. This lets you turn off 3rd party code that you don’t control in case it starts breaking. There is nothing worse than having a 3rd party SDK in your app that crashes it while it’s out there. Be it an analytics package or the customer support API, with Switchboard we can turn it off. Another use case for remote configuration is changing the API endpoints of old app versions that users don’t update.
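As a rough illustration of the wrapping idea, here is a minimal sketch in plain Java. The class name ThirdPartyGuard and the use of a BooleanSupplier as the switch lookup are assumptions made for this example and are not part of Switchboard; in a real app the supplier would be backed by whatever remote switch you use.

```java
import java.util.function.BooleanSupplier;

/**
 * Hypothetical wrapper (not part of Switchboard): runs a third-party call only
 * while a remotely controlled switch is ON, and never lets it crash the app.
 */
final class ThirdPartyGuard {
    private final BooleanSupplier enabled; // assumed to be backed by a remote switch

    ThirdPartyGuard(BooleanSupplier enabled) {
        this.enabled = enabled;
    }

    /** Returns true only if the call ran and completed without throwing. */
    boolean run(Runnable thirdPartyCall) {
        if (!enabled.getAsBoolean()) {
            return false; // switched off remotely: skip the SDK entirely
        }
        try {
            thirdPartyCall.run();
            return true;
        } catch (RuntimeException e) {
            return false; // SDK misbehaved: swallow so the host app keeps running
        }
    }
}
```

Every call into the SDK then goes through guard.run(...), so turning the switch off on the server takes the SDK out of the code path without shipping a new build.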

Simple does it: Switchboard design

For weeks we wished for a tool to do staged rollout with. We shied away from building it because it would distract us and take up a lot of time. Eventually, we came to a point where we changed a major part of KeepSafe, and errors would really impact millions of users. Now we really needed to be able to switch the next release off, should it break. With our next code cut-off 3 days away we said: “Build something we can make work by Tuesday”. And so we built it: quick, simple and not polished, but it does the job. Startup life :)

Switchboard was designed as a super lightweight tool that can handle lots of traffic from a few servers and scale horizontally. It should be as flexible as possible in terms of usage and robust against downtime or connectivity problems.

Client (example code from the Android SDK)

Every part of our app contains one or more multivariate switches. The app ships with a default setting for each of those.

On app start, the app downloads a configuration file, generated on the fly, that determines the latest settings for this specific user. The configuration file is generated based on device, OS version, language etc.

SwitchBoard.loadConfig(getApplicationContext());

If there is a successful response, the Switchboard config is updated. Otherwise, it just uses the last saved version and catches every possible exception so that your app does not crash. Requests are made asynchronously so they don’t block the UI. The generated configuration file can also contain custom variables that are set on the server and parsed in the client code during runtime.

The configuration update should be done at a point in the app where the user won’t be confused by changes. In our case, we update our config on the login screen.
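The fallback behaviour described above can be sketched roughly as follows. This is plain Java for illustration and not Switchboard's actual code: the class FallbackConfig, the raw String config and the Callable used as the fetch step are all assumptions. In an app, refresh would be called from a background thread so the UI is never blocked.

```java
import java.util.concurrent.Callable;

/**
 * Hypothetical sketch (not Switchboard's code): keep the last good configuration
 * and fall back to it whenever a refresh fails.
 */
final class FallbackConfig {
    private volatile String current; // last known-good config; starts as the shipped default

    FallbackConfig(String shippedDefault) {
        this.current = shippedDefault;
    }

    /** Tries to refresh; on any failure the previous config stays in place. */
    boolean refresh(Callable<String> fetch) {
        try {
            String fresh = fetch.call();
            if (fresh == null || fresh.isEmpty()) {
                return false; // treat an empty response as a failed request
            }
            current = fresh;
            return true;
        } catch (Exception e) {
            return false; // network or server trouble: keep the saved config
        }
    }

    String get() {
        return current;
    }
}
```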

All the logic is on the server side, so you have fast access and ongoing control over the executed code.

We implemented a client library for Android and iOS that makes it easy to get started. The client supports production and staging environments and creates its own unique user ID (UUID) if you don’t have one already.
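Creating a user ID once and reusing it afterwards might look like the sketch below. It is an illustration only: the class InstallId, the key name and the Map standing in for persistent storage (on Android this would typically be something like SharedPreferences) are assumptions, not the client library's real implementation.

```java
import java.util.Map;
import java.util.UUID;

/**
 * Hypothetical sketch: create a random UUID on first use and reuse it afterwards.
 * The Map stands in for whatever persistent key-value store the app has.
 */
final class InstallId {
    static final String KEY = "switchboard.uuid"; // hypothetical storage key

    private InstallId() {}

    static String getOrCreate(Map<String, String> store) {
        String id = store.get(KEY);
        if (id == null) {
            id = UUID.randomUUID().toString();
            store.put(KEY, id); // persist so later calls see the same ID
        }
        return id;
    }
}
```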

Embedding an experiment on the client is super easy.

// Get the setting from Switchboard
boolean isSmiling = SwitchBoard.isInExperiment(getApplicationContext(), "experimentName");

// Switch behavior based on the experiment
if (isSmiling) {
    // Code for users who are in the experiment
    showSmileyWelcomeMessage();
} else {
    // Default behavior for everyone else
    showNormalFaces();
}

Have a look at our example application for more details.

Server

All the configuration logic resides on the server. Client requests for configuration are processed here, and the configuration is sent to the client as a simple JSON string. The core parameter is the unique device ID that is passed from the client. Based on that, each user is segmented into buckets. Each user remains in the same bucket over their lifetime. The server is designed to work without a database or any IO operations, for maximum performance on a large user base with as few servers as possible. We don’t want to worry about how to scale our tools.
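One common way to get stable buckets without a database is to hash the device ID and take the result modulo the number of buckets. The sketch below shows that idea in Java. It is an illustration under assumptions: the class Bucketing is hypothetical, CRC32 was picked only for the example, and the half-open range convention is a choice made here. The hash and range semantics that Switchboard's server actually uses are not described in this post.

```java
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

/**
 * Hypothetical sketch of stateless bucketing: hash the device ID and take it
 * modulo the bucket count. CRC32 is an assumption chosen for illustration.
 */
final class Bucketing {
    private Bucketing() {}

    /** Maps a device ID to a bucket in [0, bucketCount). Same ID, same bucket. */
    static int bucketFor(String deviceId, int bucketCount) {
        CRC32 crc = new CRC32();
        crc.update(deviceId.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % bucketCount); // getValue() is never negative
    }

    /** True if the device falls in the half-open bucket range [low, high). */
    static boolean isInBucket(String deviceId, int low, int high, int bucketCount) {
        int bucket = bucketFor(deviceId, bucketCount);
        return bucket >= low && bucket < high;
    }
}
```

Because the bucket is a pure function of the ID, any server instance gives the same answer for the same user without sharing state.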

The device sends parameters to the server. By default these are the UUID (from which user buckets are generated automatically), device OS version, app version, language and other system parameters.

Based on these, the server decides which switches/experiments are set ON or OFF.

$manager->turnOnBucket(0, 50); // an experiment for 50% of all users

Besides a boolean status that indicates whether the switch/experiment is turned on for a specific user, you can pass values to the client for each experiment. Values are passed as a JSON object, so you can put anything in there that JSON can handle. These can then be used dynamically in code.

if($this->manager->isInBucket(0, 50)) {
    $values = array();
    $values['message'] = 'You are not an english user dude. So the message is not displayed';
    $values['messageTitle'] = 'get KeepSafe ver 2';
    return $this->manager->activeExperimentReturnArray($values);
}
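On the client, values like these are most useful when the app also carries a default for each of them. The following is a minimal sketch of that pattern in Java. The class ExperimentValues is hypothetical, and it assumes the JSON object has already been parsed into a Map of strings; it does not show the client library's own accessor.

```java
import java.util.Map;

/**
 * Hypothetical client-side helper: reads server-supplied experiment values,
 * falling back to defaults compiled into the app.
 */
final class ExperimentValues {
    private final Map<String, String> values; // assumed: already parsed from the JSON values

    ExperimentValues(Map<String, String> values) {
        this.values = values;
    }

    /** Returns the server value for the key, or the fallback if the key is absent. */
    String getOrDefault(String key, String fallback) {
        String value = values.get(key);
        return value != null ? value : fallback;
    }
}
```

With the server code above, getOrDefault("messageTitle", "Welcome") would yield the title set on the server, while a key the server did not send falls back to the app's own text.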

Infrastructure

Because we wanted to get something out the door fast that does the job, we couldn’t afford to think much about infrastructure setup. We went with an easy and robust solution: Heroku. Currently, we serve more than 1 million requests per day with two dynos, and we could serve far more, since there is not much heavy lifting involved. Heroku also allows us to scale Switchboard horizontally without any effort.

To make it easy for you to try Switchboard, we have set up a running instance of the server with a sample implementation. All the Switchboard example apps are pointing to that instance so you can play with it.

How we use Switchboard

User grouping - buckets

We divide our users into buckets based on a UUID that the client computes and sends to the server at request time. We split our whole user base into 100 buckets. This allows us to address our user base on a fairly granular level. You can divide your users into 1000 buckets if you have a larger user base.

This bucket grouping is orthogonal to any other parameter by which we segment our user base, be it device, OS, country or language. This makes it easy to keep an overview of how many people see what feature.
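To make the orthogonality concrete, here is a small Java sketch of a rollout rule that combines a bucket range with independent filters. Everything in it is hypothetical: the class RolloutRule, the half-open bucket range and the choice of language and OS version as filters are assumptions made for this example, not Switchboard's server API.

```java
import java.util.Set;

/**
 * Hypothetical rollout rule: a bucket range combined with independent filters.
 * The bucket decides how many users are reached; the filters decide which kind.
 */
final class RolloutRule {
    private final int lowBucket;         // inclusive
    private final int highBucket;        // exclusive
    private final Set<String> languages; // empty set means "any language"
    private final int minOsVersion;

    RolloutRule(int lowBucket, int highBucket, Set<String> languages, int minOsVersion) {
        this.lowBucket = lowBucket;
        this.highBucket = highBucket;
        this.languages = languages;
        this.minOsVersion = minOsVersion;
    }

    boolean matches(int bucket, String language, int osVersion) {
        boolean inBucket = bucket >= lowBucket && bucket < highBucket;
        boolean languageOk = languages.isEmpty() || languages.contains(language);
        return inBucket && languageOk && osVersion >= minOsVersion;
    }
}
```

With 100 buckets, a rule covering buckets 0 to 9 for English only reaches roughly 10% of the English-speaking users, regardless of how large that group is compared to the rest of the user base.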

Keeping track

Switchboard does not take care of tracking. You can plug in your existing logging/analytics solution to track results.

We currently use a combination of Google Spreadsheet, Google Analytics and Crittercism. In a basic spreadsheet, we record when we turn each experiment on or off or change something, alongside our core metrics. This makes it super easy to see changes in the core metrics and to associate them with the experiments we ran or the features we released.

For tracking we use Google Analytics. The main reason is that it’s free. We used Google’s custom variables to build our own segmentation by app version and install date. To track experiments we use standard Google Analytics events. Depending on the experiment, we pass the event label from the server to the client config. This allows us to change the experiments without updating the app. This makes sense in particular when you want to test messaging and click-through rates.

For rolling out new features, we rely on Crittercism, which provides live crash reports at the device level. This is especially powerful on Android when you roll out a new feature that might break on some individual devices. Using Crittercism and the user feedback in our helpdesk, we can see problems post-launch and quickly roll back, or turn on new features when they prove to be stable. Switchboard also allows us to remotely turn off features that are not supported by specific devices.

How you should use it

We don’t know. But please - download it, try it, fork it, improve it.

We have example apps for all clients and the server on the GitHub project page. The example applications work out of the box with the example server code running on our server.

We’d love to hear what you think. Here is the link again: http://www.github.com/KeepSafe/Switchboard

If you start implementing or using Switchboard, or want to help improve it, I would be happy to hear from you. Write me an email: philipp [at] getkeepsafe (dot) com

Comments and Discussion

Please comment on HN: http://news.ycombinator.com/item?id=4319905
