Tagged: Split Testing

  • Ben 8:38 on Wednesday, September 21, 2011
    Tags: Split Testing

    A great write-up on determining sample sizes for, and avoiding common traps in, split testing. Yet another good testing post from the folks at 37signals. R code and discussion of power calculations included. http://37signals.com/svn/posts/3004-ab-testing-tech-note-determining-sample-size
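
    For a feel of what a power calculation involves, here is a minimal Python sketch. It is not the R code from the linked post; it assumes a two-sided test comparing two conversion rates, uses the standard normal-approximation formula, and the function name, default alpha and power, and the example rates are all my own illustrative choices.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_per_group(p_control, p_variant, alpha=0.05, power=0.80):
    """Approximate users needed in EACH group to detect p_control vs p_variant.

    Normal-approximation formula for a two-sided two-proportion test.
    (Hypothetical helper for illustration only.)
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the desired power
    p_bar = (p_control + p_variant) / 2            # pooled rate under the null
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_power * sqrt(p_control * (1 - p_control) + p_variant * (1 - p_variant))
    ) ** 2
    return ceil(numerator / (p_control - p_variant) ** 2)


# Assumed example: 10% baseline conversion, hoping to detect a lift to 12%.
n = sample_size_per_group(0.10, 0.12)
print(f"users needed per group: {n}")
```

    With these assumed rates the answer comes out at a few thousand users per group, and halving the lift you want to detect roughly quadruples it. That is why stopping a test as soon as it 'looks significant' is one of the common traps.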

     
  • Ben 8:51 on Wednesday, August 24, 2011
    Tags: Split Testing, SQL

    Links: Using SQL ‘With’ statements, and a great example of A/B Testing 

    Two links worth keeping:

     
  • Ben 8:08 on Thursday, February 10, 2011
    Tags: Split Testing

    How one extra word in an email subject line improved end point conversions by 279%. This stuff never ceases to amaze me. http://whichtestwon.com/archives/7353

     
  • Ben 20:23 on Tuesday, November 9, 2010
    Tags: Split Testing   

    Just keeping for later: http://blog.minethatdata.com/2010/11/ab-testing-persecution.html

     
  • Ben 11:53 on Saturday, May 8, 2010
    Tags: Split Testing

    Google’s User Interface Design and Decision Process 

    Here is a link worth keeping.  Google recently updated the look and feel of its search user interface.  This article describes the behind-the-scenes process Googlers followed to get to the end point we are all seeing today.  Unsurprisingly, they followed a thorough research process, incorporating extensive qualitative and quantitative feedback before settling on an optimal solution.

    How Google got its New Look.

     
  • Ben 11:15 on Sunday, December 6, 2009
    Tags: Experimental Design, Simpson's Paradox, Split Testing

    Have You Fallen Prey to Simpson’s Paradox? 

    In a previous post on experimentation at Microsoft I linked to a recent presentation by Ron Kohavi (GM of their experimentation platform).  One point he raised was that you can actually get the wrong answers from split tests because of a phenomenon called Simpson’s Paradox.  You read right; your test might tell you that version A is the best bet when in reality the better performing version is B.

    That should send a shiver down the spine of anyone tasked with improving a website’s ROI.

    Simpson’s paradox can occur in any setting where the proportion of people allocated to split groups (e.g., control and test) varies according to some important attribute in the study.  It is easiest to understand the paradox by example.  Thankfully, the Wall Street Journal presented one a couple of days ago in an article on the Flaw of Averages.  Essentially, it showed that although current aggregate unemployment rates in the US (expressed as % jobless) don’t appear as bad as they were during the 80s recession, they are actually consistently worse when the figures are examined by educational subgroup.  This is because the proportion of people in each educational subgroup has shifted between the 1980s and now, and each subgroup has a different susceptibility to unemployment.

    The WSJ article also presents two other examples (gender bias in UC Berkeley admissions and kidney stone treatment efficacy).  If you are still scratching your head after reading through the narrative explanations, try having a look at the data-based explanations of the same examples on this Wikipedia entry.

    Turning to a web-based scenario, in a recent paper outlining pitfalls to avoid in online experimentation, the folks at Microsoft showed how Simpson’s Paradox can occur when a test is ‘ramped up’ over time.  Their example involves a page design test run over two days, with a 1% sample of users assigned to the test group on the first day (Friday) and then a 50% sample assigned to the test group on the second day (Saturday).  Here is the data from the paper:

    (Note: The percentage in the version B ‘total’ cell is different here due to an error in the original)

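    On both test days ‘B’ was the winning version.  However, the result is reversed in the aggregated total, where version A is the winner.  This happens because both the test split allocation and the overall response level varied by day: B’s users came almost entirely from the day with the lower response level, while A’s users were weighted towards the day with the higher one, so pooling the two days penalises B.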
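
    If you want to see the reversal happen, here is a small Python sketch. The counts are hypothetical numbers I made up to mimic a 1% then 50% ramp-up; they are not the table from the Microsoft paper.

```python
# Hypothetical (users, conversions) per day and version.
# Made-up figures for illustration, NOT the data from the paper.
data = {
    "Friday":   {"A": (99_000, 1_980), "B": (1_000, 23)},   # 1% of users in B
    "Saturday": {"A": (50_000, 500),   "B": (50_000, 600)}, # 50% of users in B
}


def rate(users, conversions):
    """Conversion rate as a fraction."""
    return conversions / users


# Winner within each day: compare A and B on that day's own data.
daily_winners = {
    day: max(versions, key=lambda v: rate(*versions[v]))
    for day, versions in data.items()
}

# Winner after naively pooling both days into one total per version.
totals = {}
for version in ("A", "B"):
    users = sum(data[day][version][0] for day in data)
    conversions = sum(data[day][version][1] for day in data)
    totals[version] = rate(users, conversions)
pooled_winner = max(totals, key=totals.get)

print("winner by day:", daily_winners)
print("pooled rates: ", {v: f"{r:.2%}" for v, r in totals.items()})
print("pooled winner:", pooled_winner)
```

    In this made-up data B wins on both days, yet A wins once the days are pooled, because almost all of B’s users come from the lower-converting Saturday.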

    Test ‘ramp ups’ are quite common.  It is good practice to do a pilot of the test on a small sample to make sure everything is working OK before unleashing it on a larger sample. So, the potential for Simpson’s Paradox to occur is very real.  If you are analysing split test results, you can make sure your analysis avoids the problem by re-weighting the results from periods with different allocation procedures or by simply discarding the results from the pilot phase.
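
    Both fixes are easy to sketch in Python. The counts below are hypothetical (a 1% pilot followed by a 50/50 split), and weighting each period by its share of total traffic is just one reasonable re-weighting scheme that I am assuming here, not necessarily the one used in the paper.

```python
# Hypothetical (users, conversions) per period and version; illustrative only.
periods = {
    "pilot": {"A": (99_000, 1_980), "B": (1_000, 23)},   # 1% of users in B
    "full":  {"A": (50_000, 500),   "B": (50_000, 600)}, # 50% of users in B
}


def reweighted_rate(version):
    """Average the per-period rates, weighting each period by its share of
    ALL users (A and B combined), so both versions get the same period mix."""
    total_users = sum(u for p in periods.values() for u, _ in p.values())
    adjusted = 0.0
    for p in periods.values():
        period_users = sum(u for u, _ in p.values())
        users, conversions = p[version]
        adjusted += (period_users / total_users) * (conversions / users)
    return adjusted


def rate_without_pilot(version):
    """Alternative fix: simply ignore the pilot phase."""
    users, conversions = periods["full"][version]
    return conversions / users


for version in ("A", "B"):
    print(
        f"{version}: re-weighted {reweighted_rate(version):.2%}, "
        f"pilot discarded {rate_without_pilot(version):.2%}"
    )
```

    Either way, B comes out ahead of A in this made-up data, which matches what the per-period comparisons show.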

    _____

    ShortURL for this post: http://wp.me/pnqr9-3X

     
  • Ben 18:17 on Saturday, November 28, 2009
    Tags: Split Testing

    Online Experimentation at Microsoft 

    Over the last three years Microsoft has embraced experimentation as a mechanism for testing changes to their various online products.  That they have only recently formally adopted a data-driven approach to their design was a little surprising to me, but it is certainly better late than never!

    As part of the process of making the shift away from simply following the Highest Paid Person’s Opinion (HiPPO) to actually testing the ROI of different ideas, the team in charge of experimentation has been disseminating some of their experiences. You can see a recent talk on the topic, presented at a September meeting of Seattle Tech Startups, at the URL below (sorry, the quality isn’t great and I can’t embed because of WordPress.com restrictions).  Alternatively, go to the Microsoft experimentation portal to see other work from this group.

    http://www.ustream.tv/flash/video/2134721

    The talk presents a number of interesting insights, ranging from the results of some tests (winning versions are often different to what you’d think) through to the cultural hurdles arising from an increased reliance on data for decision making (e.g., people with strong opinions get their egos bruised).

    Amazon.com is also mentioned a couple of times.  I think a few of the current Microsoft team originally cut their teeth there, so those of you interested in this topic might also like to see this eMetrics Summit 2004 presentation (pdf).  It showcases the Amazonian approach to deciding on site changes and resolving bitter political disputes over whose pet area should get highly coveted slots on the home page.  Interesting stuff that more and more organisations are going to have to grapple with as their products and services become increasingly digitized.

     
  • Ben 14:23 on Sunday, October 4, 2009
    Tags: Split Testing

    Music to a Data Geek’s Ears 

    “If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what’s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots of courses about how to manipulate and analyze data: databases, machine learning, econometrics, statistics, visualization, and so on.”  Hal Varian, Chief Economist at Google

    Me suffer from confirmation bias? Never!
    _____

    Short URL for this post: http://wp.me/pnqr9-2K

     
  • Ben 18:32 on Monday, August 17, 2009
    Tags: Conversions, Split Testing

    Not All Conversions are Created Equal 

    Tim Ferriss posted a Google Website Optimizer Case Study the other day showing how data-based design tweaks at Gyminee (now Daily Burn) helped them increase conversions by 20% and then another 16% on top of that.  The post presents a really nice example of how simple it is to use free tools along with good landing page design principles to generate improvements in site goal performance.  That said, I’d add a couple of things to round out the article:

    1. The performance improvement was measured in number of free trial sign-ups.  There is nothing wrong with that if Daily Burn has free sign-ups as a key goal.  However, it is worth noting that the improvements in free sign-ups may have had the opposite effect on conversions to paid accounts.  One reason for this is that by reducing the possible actions on the page to one (sign up for a free trial) in the second set of changes, Daily Burn may be seeing an increase in sign-ups from tire kickers who just want to see what the app looks like.  In the past visitors could click the ‘tour’ button to do this; now they have to go via the free trial route.  If the requirement to sign up also puts some other potential purchasers off before they get a chance to see the product, the net effect of the change may be to decrease the proportion of free trialers that go on to paid subscriptions.  One of the sites I read presented an example of exactly this issue a few weeks back; I think it was Marketing Experiments but now I can’t find the article (doh). [Update: here is a different example with a similar finding]
    2. Here is a link to the Paradox of Choice concept Tim mentioned.  I’m not so sure the original Gyminee page was overwhelming people with choice (causing choice paralysis) as much as providing too much of an opportunity to get distracted before clicking on the sign-up button.  Ultimately it doesn’t matter; the effect of the modification was positive whatever the underlying reason for the change in behaviour!
    3. Tim didn’t specify the ‘conversion marketing best practices’ behind the design changes tested in the second half of the post.  Going by the screenshots presented, these included the use of testimonials (social proof), awards (authority), and specificity (specific facts are more persuasive).  Feel free to post others if you spot them…
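
    To put some numbers on point 1, here is a back-of-the-envelope Python sketch. All of the rates are hypothetical (I have no idea what Daily Burn’s actual figures are); the point is only that a 20% lift in sign-ups can coexist with fewer paid accounts if the trial-to-paid rate drops far enough.

```python
def paid_per_1000_visitors(signup_rate, trial_to_paid_rate):
    """Paid accounts expected from 1,000 landing page visitors."""
    return 1000 * signup_rate * trial_to_paid_rate


# Hypothetical rates for illustration; NOT Daily Burn's real numbers.
old_signup_rate, old_paid_rate = 0.10, 0.20  # original page
new_signup_rate, new_paid_rate = 0.12, 0.15  # 20% more sign-ups, more tire kickers

before = paid_per_1000_visitors(old_signup_rate, old_paid_rate)
after = paid_per_1000_visitors(new_signup_rate, new_paid_rate)

# Trial-to-paid rate the new page needs just to match the old page.
break_even_rate = old_paid_rate * old_signup_rate / new_signup_rate

print(f"paid accounts per 1,000 visitors: before {before:.1f}, after {after:.1f}")
print(f"break-even trial-to-paid rate for the new page: {break_even_rate:.1%}")
```

    With these made-up rates, paid accounts per 1,000 visitors fall from about 20 to about 18 even though sign-ups went up by 20%.  The only way to know which way it goes on a real site is to track the test through to the paid conversion.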

    ____

    Short URL for this post: http://wp.me/pnqr9-13

     