Updates from July, 2011

  • Ben 10:13 on Thursday, February 23, 2012

    Archimedes and Newton: Old school split testers 

    It is not uncommon to see news stories attributing the success of some initiative or individual to a bright idea or moment of inspiration.  This phenomenon is not new; every child is taught that Archimedes had his ‘Eureka moment’ and can recite the story of Newton’s falling apple.  It is these flashes of insight that we remember and strive to emulate.

    However, the focus on creativity is unfortunate because it paints only half the picture.  For instance, the ‘file drawer problem’ means we see those flashes of inspiration that led to success, rather than the countless others that didn’t.  It is also easy to forget that people like Archimedes and Newton were old-school split testers.  They subjected idea after idea to the brutal scientific method and learned from the many failures they no doubt had.  It is their perseverance and commitment to testing, not just their creativity, that we should remember them for.

    Fortunately, more news is starting to bubble to the surface about the interplay between the creative and scientific processes.  For instance, this Wired story shows how the gaming industry (typically considered a bastion of creativity and design) is embracing split testing to drive development decisions.  I also recently saw a talk, shared widely on Twitter, about the testing that went into the success of Obama’s 2008 fundraising campaigns.

    These stories highlight the fact that creative ideas are like the random mutations that drive the evolutionary process.  They are necessary, but certainly not sufficient, for progress to occur.  Another recurring theme is that the mental models underlying our creativity – the source of our ‘gut feelings’ about what will work – are often wrong.  Indeed, testing is essential to updating these models and is an underappreciated input to the creative process.  Together, creativity and testing form an iterative learning cycle.

    The implication for organisational and personal development is that as much effort should go into developing the testing and learning process as goes into supporting the creative process.
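    For anyone who hasn't run a split test, here is a minimal sketch of the arithmetic behind one.  It assumes a simple two-variant test judged on conversion rate, and uses a pooled two-proportion z-test, which is one common choice rather than the method used in any of the stories mentioned above.  The function name and the example counts are made up for illustration.

```python
from math import sqrt, erfc


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_a / conv_b are the numbers of conversions in variants A and B,
    n_a / n_b are the numbers of visitors shown each variant.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that A and B convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: 100 of 1000 visitors convert on A, 150 of 1000 on B.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

    A small p-value suggests the difference between variants is unlikely to be noise alone; it says nothing about why the better variant works, which is where the creative side comes back in.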

     
  • Ben 9:16 on Sunday, January 15, 2012

    Just keeping for later: Public datasets hosted on Amazon AWS. https://aws.amazon.com/datasets

     
  • Ben 13:03 on Thursday, January 12, 2012

    A few select pics from a recent trip.  By a fluke of nature we managed to catch 11 days of sun out of the 13 we were away. The rest of the country wasn’t so lucky.  It was great to get out and see more of the homeland. Like many New Zealanders, prior to this road trip I’d seen more foreign soil than I had of my own.

    The Marina at Picton.

     

    A seal playing.  Royal Albatross centre, Otago Peninsula.

     

    Mark of the seagull. Royal Albatross centre, Otago Peninsula.

     

    New Year Rodeo, Wanaka.

     

    View over Wanaka from Mt. Iron.

     

    An earnest Dork impression. Fox Glacier.

     

    Franz Josef Glacier.

     

    Inside an abandoned Gold Mine. Near Greymouth.

     

    Steps carved into the Pancake Rocks. Punakaiki.

     

    Sea-spray through a blow hole. Punakaiki.

     

    There are no photos from the ferry crossing at the end of the trip, but it was eventful enough to remember without them.  We crossed in 50-55 knot gales, so at least half of the passengers got seasick … myself included.

     
  • Ben 11:54 on Wednesday, December 21, 2011

    Confidence bias in action 

    I’ve dabbled a little with crowdsourcing for my own projects, but never used it as a primary research tool.  It isn’t hard to see how the major crowdsourcing platforms like Mechanical Turk could be used to undertake quick and cost-effective behavioural research (potential for bias notwithstanding!).  So, the following study by crowdsourcing firm Crowdflower on its own worker base was interesting in itself.  That it related to another interest of mine, human bias, made it even more intriguing :)

    Confidence Bias: Evidence from Crowdsourcing

    The key take-out: over 75% of contributors overestimated their ability to answer multiple choice questions correctly.  The Dunning-Kruger effect is alive and well!

     
  • Ben 8:23 on Friday, November 11, 2011
    Tags: Survey Walls

    Looks like Google is getting into the Survey Business 

    From Nieman Journalism Lab:

    Google appears to be experimenting with a new paywall-esque content roadblock for publishers, and it’s not One Pass. For lack of a better name, let’s call it a “survey wall,” because instead of dollars the system asks readers a question before they can move on to continue reading what they like.

    This could get interesting.  Instead of a standard paywall, people may be able to ‘pay’ for content by answering survey questions.  The publisher gets valuable information it can on-sell to advertisers, and Google dulls the old-media knives that are increasingly aimed at its vital organs. A natural extension of this would be for the publisher to become a survey panel provider of sorts.  Survey companies would be able to buy access to the survey-wall to ask their own questions for a fee-per-answer.  There is also no reason why independent panel companies could not attempt to step into the role Google appears to be playing as the third-party technology provider.

    Of course, there are big questions about the quality of data that may come from these distributed surveys.

    • Would people answer honestly?
    • What can reasonably be done with one or two answers from each visitor? (e.g., it would be difficult to examine relationships between more than a couple of variables)
    • Why would we expect people who visit survey-wall sites to be representative of a given population?
    These, and other questions, will keep survey methodologists in business for a while :)
     
    • davidwallacefleming 9:00 on Friday, November 11, 2011

      Valuable information to stay apprised of. Thank you. I hope this does not get implemented.

  • Ben 20:01 on Wednesday, October 19, 2011
    Tags: Brand Loyalty, Double Jeopardy

    Double Jeopardy in Hotel Ratings 

    A well-established, and surprisingly general, empirical pattern in markets is that brands with lower market share have buyers that also exhibit less loyalty toward the brand.  This pattern has a name – Double Jeopardy – and it undermines the logic of niche marketing strategies focussed on appealing to a small group in a larger market in the hope that doing so will garner greater loyalty.

    Another feature of the pattern is that the average small brand buyer tends to be less favourable towards that brand than the average large brand buyer.   Indeed, it appears that Double Jeopardy even applies to hotel rating data presented in a recent post in the Data Miners blog by Michael Berry.  Here is the key quote:

    It is hardly surprising that the Bellagio in Las Vegas has about 250 times more reviews than say, the Cambridge Gateway Inn, an unloved motel in Cambridge, Massachusetts. It may or may not be surprising that these oft-reviewed properties tend to be well-liked by our reviewers. Surprising or not, it’s true: the hotels with the most reviews have a higher average rating than the long tail of hotels, motels, B&Bs, and Inns with only a handful of reviews each.

     
  • Ben 18:46 on Friday, August 26, 2011
    Tags: Document Classification

    KiwiPycon 2011: Document Classification with the Natural Language Toolkit 

    I’m heading to KiwiPycon in Welly this weekend to meet some fellow Python fans and give a presentation on using the Python-based Natural Language Toolkit (NLTK) to classify documents.  I’ll be using the Enron emails as an example document set.

    If you’ve travelled here from the future because you saw the presentation and want the files I referred to, here they are.

    There is a missing link between the two code files: changes I made to the dataset to enable training of the classifier and analysis of the results. If you are interested in getting the final dataset, just get in touch.
    ______
    Update: Here is the slideshare version of the presentation with audio.  And here is a text-to-speech video version, with some extra content.
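
    For readers who haven't seen the talk, here is a rough sketch of the general idea behind bag-of-words document classification.  It is not the code from the presentation and does not use the Enron data: it is a standard-library stand-in for the kind of Naive Bayes classifier that NLTK provides (nltk.NaiveBayesClassifier), and the four toy documents, their labels and the function names are all made up for illustration.

```python
from collections import Counter, defaultdict
from math import log

# Made-up toy documents; these are NOT taken from the Enron emails.
TOY_DOCS = [
    ("meeting agenda for the quarterly budget review", "business"),
    ("please send the revised contract today", "business"),
    ("lunch on friday with the family", "personal"),
    ("happy birthday see you at the party", "personal"),
]


def tokenize(text):
    """Lower-case the text and split it on whitespace."""
    return text.lower().split()


def train(labelled_docs):
    """Count documents per label and words per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in labelled_docs:
        label_counts[label] += 1
        for word in tokenize(text):
            word_counts[label][word] += 1
            vocab.add(word)
    return label_counts, word_counts, vocab


def classify(model, text):
    """Return the label with the highest Naive Bayes log-probability."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label, n_docs in label_counts.items():
        score = log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            if word not in vocab:
                continue  # ignore words never seen in training
            # Add-one (Laplace) smoothing avoids zero probabilities.
            count = word_counts[label][word]
            score += log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_DOCS)
print(classify(model, "budget contract review"))
```

    With real email data the interesting work is in choosing features and labelling the training set, which this sketch skips entirely.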
     
  • Ben 8:51 on Wednesday, August 24, 2011
    Tags: SQL

    Links: Using SQL ‘With’ statements, and a great example of A/B Testing 

    Two links worth keeping:

     
  • Ben 9:24 on Wednesday, August 17, 2011

    Big/Open Data and the Social Sciences 

    Another great interview and article from Audrey Watters.  You can pretty much replace ‘Political Science’ with the social science subject of your choosing.

     
  • Ben 13:17 on Friday, July 29, 2011

    Kaggle: An interesting source of sample data to model 

    Good real-world datasets used to be quite hard to come by for those interested in playing with different modelling approaches.  However, sites like Kaggle, which expand the crowdsourcing approach to model improvement initiated by events such as the Netflix prize and KDD Cup, are opening up more datasets for statistical modellers to use.  Worth keeping an eye on.

     