While I was working for an iconic magazine owned by a large corporation, an upper-level executive announced that she had a plan to “help” the magazine regain the greatness it had enjoyed twenty years before. The staff was in shock at the plan, which included “losing the current readership to gain a whole new fan base.”
The plan was not backed by any research and seemed to move forward based on whatever thought the executive had that day. Out of desperation, mostly to keep our jobs, we suggested an A/B test in which a whole new magazine would be created, rather than throwing out one that still had a large readership in the hope of increasing sales, for which there were no guarantees. The plan went ahead anyway, with millions of dollars spent on promotion and advertising, and ultimately it failed miserably.
The immediate cost, aside from the money spent to promote the “new” look of the magazine, was the loss of a third of the readers. The obvious explanation was that the magazine, which had been an American icon for several decades, had lost the nostalgic readers who enjoyed the look and feel they were comfortable with. What to do next was the biggest question the staff had.
Ideally, the magazine should have switched back, taken its lumps in the press and admitted a bad decision, but the executive scrambled to point a finger of blame at the staff while readership shrank with each issue.
One of the problems was the new cover, which, in the pockets of newsstands and magazine racks, was indiscernible from the previous issue. The staff finally convinced the executive and the weaselly little publisher, who hid while this whole mess took place, to try A/B testing the covers to see if that would bring back some sales.
Two different covers were published and sales were tracked across markets. The old, tried-and-true cover won out over the new design. The switch back was approved with the codicil that the interior would stay with the new format. At least the restored cover slowed the death of the magazine, which has jumped the shark several times since in a desperate attempt to keep publishing.
Had the magazine run an A/B test on the entire product, the failure would have been quiet and much less costly. It is examples such as this that have made A/B testing a standard marketing tool for products and services.
What are the Basics of A/B Testing?
A/B testing, also called split testing, is a testing method used in marketing to compare results between two samples with the goal of improving conversion or response rates.
In web design, A/B tests are generally used to test design elements, sometimes against the existing design, to determine which elements get the best response from visitors.
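For readers who want to see what splitting visitors into two samples might look like in practice, here is a minimal sketch in Python. It is only one common approach, not a description of how any particular tool works: it assumes each visitor has some stable identifier (a cookie value, for instance), and the names assign_variant, visitor_id, experiment and share_b are made up for illustration.

```python
import hashlib


def assign_variant(visitor_id: str, experiment: str, share_b: float = 0.5) -> str:
    """Return "A" or "B" for a visitor, stable across repeat visits.

    visitor_id, experiment and share_b are hypothetical names for this sketch.
    """
    # Hash the experiment name together with the visitor ID so the same
    # visitor can land in different groups for different experiments.
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode("utf-8")).hexdigest()
    # Turn the first 8 hex digits into a number in the range [0, 1).
    bucket = int(digest[:8], 16) / 0x100000000
    # Visitors whose bucket falls below share_b see the new version.
    return "B" if bucket < share_b else "A"


if __name__ == "__main__":
    for visitor in ("visitor-1", "visitor-2", "visitor-3"):
        print(visitor, assign_variant(visitor, "cover-test"))
```

Because the group is computed from the identifier rather than picked at random on each page load, a returning visitor keeps seeing the same version for the length of the test.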
In an article in WIRED, Brian Christian writes:
Over the past decade, the power of A/B testing has become an open secret of high-stakes web development. It’s now the standard (but seldom advertised) means through which Silicon Valley improves its online products. Using A/B, new ideas can be essentially focus-group tested in real time: Without being told, a fraction of users are diverted to a slightly different version of a given web page and their behavior compared against the mass of users on the standard site. If the new version proves superior — gaining more clicks, longer visits, more purchases — it will displace the original; if the new version is inferior, it’s quietly phased out without most users ever seeing it. A/B allows seemingly subjective questions of design — color, layout, image selection, text — to become incontrovertible matters of data-driven social science.
Mr. Christian also points out:
Today, A/B is ubiquitous, and one of the strange consequences of that ubiquity is that the way we think about the web has become increasingly outdated. We talk about the Google homepage or the Amazon checkout screen, but it’s now more accurate to say that you visited a Google homepage, an Amazon checkout screen. What percentage of Google users are getting some kind of “experimental” page or results when they initiate a search? Google employees I spoke with wouldn’t give a precise answer — “decent,” chuckles Scott Huffman, who oversees testing on Google Search. Use of a technique called multivariate testing, in which myriad A/B tests essentially run simultaneously in as many combinations as possible, means that the percentage of users getting some kind of tweak may well approach 100 percent, making “the Google search experience” a sort of Platonic ideal: never encountered directly but glimpsed only through imperfect derivations and variations.
He also brings up an interesting aspect of A/B testing, one that killed the aforementioned magazine’s chance to do A/B testing and shaped the outcome when it was finally done:
What this means goes way beyond just a nimbler approach to site design. By subjecting all these decisions to the rule of data, A/B tends to shift the whole operating philosophy — even the power structure — of companies that adopt it. A/B is revolutionizing the way that firms develop websites and, in the process, rewriting some of the fundamental rules of business.
The Rules of Working with A/B Testing
As with any marketing process, decisions have to be ruled by the data itself. That presents problems in a workplace ruled by what Christian calls the HiPPO: the “highest-paid person’s opinion.”
Christian relates a story about an important attempt at A/B testing:
Tech circles are rife with stories of the clueless boss who almost killed a project because of a “mere opinion.” In Amazon’s early days, developer Greg Linden came up with the idea of giving personalized “impulse buy” recommendations to customers as they checked out, based on what was in their shopping cart. He made a demo for the new feature but was shot down. Linden bristled at the thought that the idea might not even be tested. “I was told I was forbidden to work on this any further. It should have stopped there.”
Instead Linden worked up an A/B test. It showed that Amazon stood to gain so much revenue from the feature that all arguments against it were instantly rendered null by the data. “I do know that in some organizations, challenging an SVP would be a fatal mistake, right or wrong,” Linden wrote in a blog post on the subject. But once he’d done an objective test, putting the idea in front of real customers, the higher-ups had to bend. Amazon’s culture wouldn’t allow otherwise.
Even after the A/B test of the magazine’s cover, the executive who pushed the change refused to take the testing any further, hoping to save face by keeping the magazine’s new content. Falling sales figures merely brought more finger-pointing at the hapless staff, who had no control over the changes and had been outspoken opponents of them. Eventually a key staff member was fired, saddled with the blame for the failure. Yet even with further changes after that staff member’s departure, the magazine continued its downward spiral. The lesson of A/B testing was ignored, and the HiPPO continued to cause bad decisions that have cut the readership to dangerously low levels.
Industry rumors claim the magazine is planning to discontinue printing in favor of an online-only presence. That is probably the best decision it could make after a decade of huge mistakes, and one a key competitor took many years ago with great success.
Christian points out an important factor in A/B testing:
A/B increasingly makes meetings irrelevant. Where editors at a news site, for example, might have sat around a table for 15 minutes trying to decide on the best phrasing for an important headline, they can simply run all the proposed headlines and let the testing decide. Consensus, even democracy, has been replaced by pluralism — resolved by data.
While this sounds like the perfect solution to decisions based on facts and not egos, it’s hard to remove the human need to dominate and grab credit for success — or heap blame on others for failure.
Christian adds a note on a negative effect of A/B testing:
One consequence of this data-driven revolution is that the whole attitude toward writing software, or even imagining it, becomes subtly constrained. A number of developers told me that A/B has probably reduced the number of big, dramatic changes to their products. They now think of wholesale revisions as simply too risky—instead, they want to break every idea up into smaller pieces, with each piece tested and then gradually, tentatively phased into the traffic.
While I see his point, I don’t necessarily agree, with the exception that regular traffic can be lost while small steps are taken over a period of time. It’s hard to regain traffic once it’s lost. A middle ground between wholesale change and cautious steps can be achieved. That, unfortunately, depends on bold, assertive human decisions, which replace data-driven decisions with ones that may be watered down or jumped into blindly, depending on the HiPPO.
What Should I Test?
There are several points about A/B testing that experts agree upon: test only one thing at a time, allow ample testing time, and test on new site visitors rather than your regular traffic.
If elements on your site are not getting the response you believe they deserve, a simple A/B test will show you whether a better response can be achieved, and by testing one element at a time you can pinpoint which elements can be improved. Whether the goal is gaining more followers on Twitter or subscribers to your RSS feed, or selling more of your most popular products, the change can be repositioning design elements, reducing the number of pages needed to sell or order merchandise, or adjusting the general page layout for a more responsive web site.
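The “one thing at a time” rule is easy to break by accident when a variation is put together in a hurry. As a rough illustration, the sketch below describes each version of a page as a plain dictionary of design elements and reports which ones differ. The element names and the helper changed_elements are invented for this example and do not come from any testing tool.

```python
_MISSING = object()  # marks an element that one version does not have at all


def changed_elements(control: dict, variant: dict) -> list:
    """List the design elements whose values differ between two versions."""
    keys = sorted(set(control) | set(variant))
    return [
        key
        for key in keys
        if control.get(key, _MISSING) != variant.get(key, _MISSING)
    ]


if __name__ == "__main__":
    # Hypothetical page descriptions; only the button color is changed.
    control = {"headline": "Subscribe today", "button_color": "green"}
    variant = {"headline": "Subscribe today", "button_color": "orange"}

    changes = changed_elements(control, variant)
    if len(changes) == 1:
        print("OK to test, only this element changed:", changes[0])
    else:
        print("Too many (or no) changes to attribute a result:", changes)
```

If the list contains more than one element, any difference in response cannot be pinned on a single change, which is the whole point of testing one element at a time.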
In the example of the published magazine, it took just over a year to get results (months of that for the testing alone), and they came in too late for those who didn’t question the immediate loss of sales and the continued monthly decline. With the web’s responsiveness and the ability to make almost instant changes, A/B testing can save your business and income flow before real trouble takes root.
When designing a web site for the first time, A/B testing can be used as proof that one design choice is superior to another. This is particularly useful when working with a client who wants evidence backing up every decision they make, or a client who has trouble making decisions.
If you can offer them concrete proof that one design works better than another, they’re often much more comfortable making a decision. HiPPO may win out against testing but if data is ignored, it’s not the designer’s fault.
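What counts as “concrete proof” is worth a moment of thought, since a small difference between two designs can be nothing more than chance. One standard way to check is a two-proportion z-test, sketched below in plain Python. This is a common statistical technique that I am bringing in as an illustration, not something the tools mentioned in this article are confirmed to use, and the function name and the visitor and conversion counts are made up.

```python
import math


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Compare two conversion rates; return (z, two-sided p-value)."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled rate under the assumption that both designs perform the same.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if std_err == 0:
        raise ValueError("no variation in the data; the test is undefined")
    z = (rate_b - rate_a) / std_err
    # Two-sided p-value from the normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


if __name__ == "__main__":
    # Made-up numbers: design A converts 100 of 1000 visitors, design B 150 of 1000.
    z, p = two_proportion_z_test(100, 1000, 150, 1000)
    print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value (many people use 0.05 as the cutoff) suggests the difference between the two designs is unlikely to be luck alone. With only a handful of visitors the test will rarely reach that point, which is one more reason to allow ample testing time.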
The most compelling reason to use A/B testing is to gain the most from your site. Why settle for half the response you could receive when a little testing can improve response and push your site’s effectiveness to its top potential?
The old rules of sales and success still apply, but the web makes it easier and faster. The same goes for marketing. The web has the same needs as any sales initiative or marketing effort. Rather than the old test groups of people sitting in a room giving their opinions, A/B testing takes the test to live users and gauges real responses without the subjects knowing they are part of the test. It brings out true responses in a fraction of the time and at less cost.
Determining how long to run a test is often simple. Look at the traffic patterns on the site via analytics. Most sites have cyclical traffic patterns, with some days consistently getting higher traffic than others. Figure out what factors may affect those patterns (adding a new blog post, beginning of the week, end of the week, etc.) and start to plot what is working to bring in traffic.
For some sites, this cycle will run over a one-week period, while for others it might be a month. You should run your A/B test over at least one cycle to get more accurate numbers.
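To make the “at least one cycle” advice concrete, here is a small sketch that rounds a test’s length up to whole traffic cycles. It assumes you already have a target number of visitors in mind and a rough daily figure from your analytics. Both numbers, along with the name planned_duration_days, are placeholders for illustration rather than recommendations.

```python
import math


def planned_duration_days(required_visitors: int,
                          daily_new_visitors: int,
                          cycle_days: int = 7) -> int:
    """Days to run a test, rounded up to complete traffic cycles."""
    if daily_new_visitors <= 0 or cycle_days <= 0:
        raise ValueError("daily_new_visitors and cycle_days must be positive")
    # Days needed just to collect enough visitors.
    days_needed = math.ceil(required_visitors / daily_new_visitors)
    # Never run less than one full cycle, and never stop mid-cycle.
    cycles = max(1, math.ceil(days_needed / cycle_days))
    return cycles * cycle_days


if __name__ == "__main__":
    # Hypothetical site: weekly cycle, 800 new visitors a day, 10,000 wanted.
    print(planned_duration_days(10000, 800, cycle_days=7), "days")
```

Stopping partway through a cycle would give the busy days or the quiet days too much weight, so the sketch always rounds up rather than down.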
Tools For A/B Testing
Paras Chopra, founder of Visual Website Optimizer (a simple A/B split and multivariate testing tool, used by 5000+ companies worldwide, that allows marketers and designers to create A/B tests and make them live on a website in less than 10 minutes), has some great links in his article on Smashing Magazine to help you begin with A/B testing. A number of tools are available for A/B testing, with different focuses, price points and feature sets. Here are some:
A/Bingo and Vanity. Server-side frameworks for Ruby on Rails developers. Both require programming and integration in code.
GET IDEAS FOR YOUR NEXT A/B TEST
Which Test Won? A game in which you guess which variation won in a test.
101 A/B Testing Tips. A comprehensive resource of tips, tricks and ideas.
ABtests.com. A place to share and read A/B test results.