<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Ben Healey &#187; Analytics</title>
	<atom:link href="http://benhealey.info/tag/analytics/feed/" rel="self" type="application/rss+xml" />
	<link>http://benhealey.info</link>
	<description>Data Aficionado  &#124;  Wellington, New Zealand</description>
	<lastBuildDate>Sat, 14 Jan 2012 20:18:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='benhealey.info' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Ben Healey &#187; Analytics</title>
		<link>http://benhealey.info</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://benhealey.info/osd.xml" title="Ben Healey" />
	<atom:link rel='hub' href='http://benhealey.info/?pushpress=hub'/>
		<item>
		<title>Beware a Statistician with Dating Data</title>
		<link>http://benhealey.info/2011/05/28/beware-a-statistician-with-dating-data/</link>
		<comments>http://benhealey.info/2011/05/28/beware-a-statistician-with-dating-data/#comments</comments>
		<pubDate>Fri, 27 May 2011 22:32:27 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[Funny]]></category>
		<category><![CDATA[Online Surveys]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=464</guid>
		<description><![CDATA[It&#8217;s no secret that as we interact with more web services we are creating a larger and deeper footprint with respect to our digital behaviours. I think we are also volunteering more personal information when asked online.  The result has been an explosion in individual-level data available to data wranglers in organisations with a digital [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s no secret that as we interact with more web services we are creating a larger and deeper footprint with respect to our digital behaviours. I think we are also volunteering more personal information when asked online.  The result has been an explosion in individual-level data available to data wranglers in organisations with a digital presence.  Often, the negative sides of this are reported in the media; the decline of privacy and the risks of data abuse to <em>individuals</em>.  However, it also provides for some fascinating <em>aggregate-level</em> analysis that just hasn&#8217;t been previously possible.</p>
<p>For instance, <a href="http://www.google.org/flutrends/">Google Flu Trends</a> shows how aggregate search behaviours can be used as an early warning signal for potential public health issues.</p>
<p>And then there is a post I recently found which examines correlations across answers to a questionnaire completed by users of a popular dating site&#8230;  The aim: to identify first-date questions that &#8220;(a) most people were comfortable discussing publicly, and (b) were mathematically likely to tell you something you couldn&#8217;t just guess&#8221;.  The analysis isn&#8217;t exactly in the interest of public health, but it <em>is</em> hilarious, well thought through, and accessible.  And no individual&#8217;s data is exposed in the process.</p>
<p>(Note, the content at this link isn&#8217;t really safe for work; if it were a TV show there would be a &#8216;contains explicit language and sexual themes&#8217; disclaimer before it started.)</p>
<p><a href="http://blog.okcupid.com/index.php/the-best-questions-for-first-dates/">OKCupid: The best questions for first dates</a>.</p>
<p>A couple of gems from the post that apply across the sexes (go to the post for the direction and strength of relationship):</p>
<blockquote><p><em>To predict:</em> Will my date have sex on the first date?<br />
<em>Ask:</em> Do you like the taste of beer?</p>
<p><em>To predict:</em> Is my date religious?<br />
<em>Ask:</em> Do spelling and grammar mistakes annoy you?</p></blockquote>
<p>And one that shows just how bad we are at judging our common ground with others:</p>
<blockquote><p>&#8220;<em>Which describes you better, normal or weird?</em> might be fine to ask, but doing so is of little value because almost everyone has the same answer. 79% of people think they are weird.&#8221;</p></blockquote>
<p>Disclaimer: The OKCupid sample is large, but probably doesn&#8217;t reflect the general population of people looking for partners. So, if you attempt to apply these nuggets of wisdom your mileage may vary.  That said, the differences presented are substantial enough that I&#8217;d be surprised if they don&#8217;t hold to at least a small degree outside of OkCupid&#8217;s target market!</p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2011/05/28/beware-a-statistician-with-dating-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
		<item>
		<title>Old School Data Visualisation (Part 2)</title>
		<link>http://benhealey.info/2010/08/29/old-school-data-visualisation-part-2/</link>
		<comments>http://benhealey.info/2010/08/29/old-school-data-visualisation-part-2/#comments</comments>
		<pubDate>Sat, 28 Aug 2010 22:09:57 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Visualisation]]></category>
		<category><![CDATA[Human Biases]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=379</guid>
		<description><![CDATA[A quick follow-up to the previous post on the power of data reduction and presentation&#8230; here is another example showing how rounding, ordering and thoughtful presentation can turn an incomprehensible grid of numbers into something most people can grok. It is from the same article (Ehrenberg, Feb 1992, The Problem of Numeracy, Admap), but this time [...]]]></description>
			<content:encoded><![CDATA[<p>A quick follow-up to the previous post on the power of <a href="http://benhealey.info/2010/08/15/old-school-data-visualisation-part-1/">data reduction and presentation</a>&#8230; here is another example showing how rounding, ordering and thoughtful presentation can turn an incomprehensible grid of numbers into something most people can <a href="http://en.wikipedia.org/wiki/Grok">grok</a>.</p>
<p>It is from the same article (Ehrenberg, Feb 1992, <em>The Problem of Numeracy</em>, <a href="http://www.warc.com/admap">Admap</a>), but this time relates to television programme viewership.  The first table presents detailed correlations for responses to the question &#8216;<em>I really like to watch programme x</em>&#8217; across a range of programmes and two channels (<em>ITV</em> and <em>BBC</em>).</p>
<p><a href="http://benhealey.files.wordpress.com/2010/08/numeracy_table31.gif"><img class="alignnone size-full wp-image-387" title="numeracy_table3" src="http://benhealey.files.wordpress.com/2010/08/numeracy_table31.gif?w=632" alt=""   /></a></p>
<p>Apart from an obvious diagonal line of 1.000 in the table (of course each programme&#8217;s rating correlates perfectly with itself), there isn&#8217;t much else you can take from it.  The next table renders the data a little more readable by introducing rounding to one decimal place, discarding the redundant leading zeros and disposing of the meaningless 1.000 diagonal.</p>
<p><a href="http://benhealey.files.wordpress.com/2010/08/numeracy_table41.gif"><img class="alignnone size-full wp-image-385" title="numeracy_table4" src="http://benhealey.files.wordpress.com/2010/08/numeracy_table41.gif?w=632" alt=""   /></a></p>
<p>And with a little more thought to row order, spacing and the key data for presentation (i.e., do we really need channel?), we get to the following:</p>
<p><a href="http://benhealey.files.wordpress.com/2010/08/numeracy_table5.gif"><img class="alignnone size-full wp-image-383" title="numeracy_table5" src="http://benhealey.files.wordpress.com/2010/08/numeracy_table5.gif?w=632" alt=""   /></a></p>
<p>Those familiar with television in the UK will now see that people who like to watch one sport programme also like to watch other sports programmes, particularly if they are &#8216;round up&#8217; type shows.  They don&#8217;t, however, like news or current events programmes so much.  A similar pattern occurs for current events watchers, but the programmes within that cluster have slightly lower correlations, meaning viewership is less likely to be homogeneous amongst that group.  If you are an advertiser or producer, this is useful stuff to know because it will give you an idea of the reach of, and competition around, a certain programme.  And you are more likely to understand this if the data is presented in a clear and concise way.</p>
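<p>For anyone who wants to try the same tidy-up on their own correlation matrix, here is a minimal Python sketch of the three steps described above: round to one decimal place, drop the leading zero, and blank out the diagonal.  The programme names and correlations are made up for illustration and are not Ehrenberg&#8217;s data; the function names are my own.</p>

```python
# Hypothetical correlations between three invented programmes
# (illustrative only; these are not the figures from the article).
labels = ["Sport A", "Sport B", "News C"]
corr = [
    [1.000, 0.581, 0.092],
    [0.581, 1.000, 0.134],
    [0.092, 0.134, 1.000],
]


def tidy_cell(value, on_diagonal):
    """Round to one decimal, drop the leading zero, blank out the diagonal."""
    if on_diagonal:
        return ""
    text = f"{value:.1f}"  # e.g. 0.581 becomes "0.6"
    if text.startswith("0.") or text.startswith("-0."):
        text = text.replace("0.", ".", 1)  # "0.6" becomes ".6"
    return text


def tidy_table(labels, corr):
    """Return one row per programme: its label followed by the tidied cells."""
    rows = []
    for i, row in enumerate(corr):
        cells = [tidy_cell(value, i == j) for j, value in enumerate(row)]
        rows.append([labels[i]] + cells)
    return rows


for row in tidy_table(labels, corr):
    print("  ".join(f"{cell:8}" for cell in row))
```

<p>Row ordering is left alone here; grouping similar programmes together, as in the final table, still takes a human (or a clustering step) to decide what belongs next to what.</p>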
<p>_____</p>
<p>ShortURL for this post: <a href="http://wp.me/pnqr9-67">http://wp.me/pnqr9-67</a></p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2010/08/29/old-school-data-visualisation-part-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>

		<media:content url="http://benhealey.files.wordpress.com/2010/08/numeracy_table31.gif" medium="image">
			<media:title type="html">numeracy_table3</media:title>
		</media:content>

		<media:content url="http://benhealey.files.wordpress.com/2010/08/numeracy_table41.gif" medium="image">
			<media:title type="html">numeracy_table4</media:title>
		</media:content>

		<media:content url="http://benhealey.files.wordpress.com/2010/08/numeracy_table5.gif" medium="image">
			<media:title type="html">numeracy_table5</media:title>
		</media:content>
	</item>
		<item>
		<title>Old School Data Visualisation (Part 1)</title>
		<link>http://benhealey.info/2010/08/15/old-school-data-visualisation-part-1/</link>
		<comments>http://benhealey.info/2010/08/15/old-school-data-visualisation-part-1/#comments</comments>
		<pubDate>Sun, 15 Aug 2010 03:30:01 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Visualisation]]></category>
		<category><![CDATA[Human Biases]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=369</guid>
		<description><![CDATA[I was talking to a friend last night about data presentation.  We were looking at an iPad app that allows users to thumb through and drill down into their sales data for different geographic regions.  Among other things, the app displayed charts with smoothed trend-lines to help users get a feel for what the future might [...]]]></description>
			<content:encoded><![CDATA[<p>I was talking to a friend last night about data presentation.  We were looking at an iPad app that allows users to thumb through and drill down into their sales data for different geographic regions.  Among other things, the app displayed charts with smoothed trend-lines to help users get a feel for what the future might hold. Yet, in the relatively brief time I spent looking at the data it was hard to get any real sense of what the key take-outs might be.</p>
<p>This will have been partly due to my lack of familiarity with the dataset; the person responsible for sales for the organisation would have brought a wealth of historic knowledge to the data that may have enabled them to quickly see discrepancies or commonalities in the charts.  However, there was also an element of &#8216;too much&#8217; information.  There is only so much we humans can hold in our short-term memory before we become overwhelmed and our ability to do mental calculations or comparisons is compromised.  This is why it is critical for anyone presenting data to consider not only the level of detail required, but also how the information should be delivered for quick and clear consumption.</p>
<p>Marketing scientist Andrew Ehrenberg spent a fair amount of time on these issues and was a strong advocate of <a href="http://en.wikipedia.org/wiki/Andrew_S._C._Ehrenberg#Data_reduction">data reduction</a> (which relates to the idea that much success in research relies on the discovery of patterns in data, and that this process is aided by its presentation in simple tables).  In fact, Ehrenberg wrote a <a href="http://www.empgens.com/ArticlesHome/Volume5/DataReduction.html">book on the subject</a> that is freely downloadable from the EmpGens Journal.</p>
<p>Here is an example of Ehrenberg&#8217;s approach.  I&#8217;ve reproduced the tables from a four-page article of his in <em><a href="http://www.warc.com/admap">Admap</a></em> from 1992 titled &#8216;<em>The Problem of Numeracy</em>&#8217;.  First up is a table <em>not</em> optimised for human consumption.  Try to pick out some noteworthy patterns.</p>
<p><img class="size-full wp-image-373 alignnone" title="numeracy_table1" src="http://benhealey.files.wordpress.com/2010/08/numeracy_table1.gif?w=632" alt=""   /></p>
<p>Now try again, using a modified presentation of the same data:</p>
<p><a href="http://benhealey.files.wordpress.com/2010/08/numeracy_table2.gif"><img class="alignnone size-full wp-image-374" title="numeracy_table2" src="http://benhealey.files.wordpress.com/2010/08/numeracy_table2.gif?w=632" alt=""   /></a></p>
<p>The rounding, averages and different row ordering (population size, rather than alphabet) all make it easier to quickly understand the data.  We can now see, for instance, that most regions saw a dip in Q3, that Leeds and Edinburgh have seen strong growth in Q4, and that Leeds is consistently punching above its weight in per capita sales.  We can also easily answer comparative questions like &#8216;<em>how much larger was Edinburgh than Swansea over the year</em>&#8217; (about 2.5x), which were much harder to answer from the first table.</p>
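<p>The same three moves (order rows by size, round, add averages) are easy to script.  Below is a rough Python sketch under stated assumptions: the sales figures and populations are invented for illustration and are not the numbers in Ehrenberg&#8217;s tables, rounding is to whole numbers rather than any particular rule of his, and <code>summarise</code> is just a name I picked.</p>

```python
# Hypothetical quarterly sales and populations (in thousands); all figures are
# invented for illustration and do not come from the article's tables.
sales = {
    "Swansea": {"population": 230, "quarters": [20.7, 22.4, 16.2, 25.6]},
    "Leeds": {"population": 750, "quarters": [97.3, 92.1, 83.8, 120.6]},
    "Edinburgh": {"population": 480, "quarters": [57.9, 54.2, 48.1, 70.3]},
}


def summarise(sales):
    """Order rows by population (largest first), round, and add a row average."""
    ordered = sorted(
        sales.items(), key=lambda item: item[1]["population"], reverse=True
    )
    table = []
    for region, info in ordered:
        quarters = info["quarters"]
        average = sum(quarters) / len(quarters)
        table.append((region, [round(q) for q in quarters], round(average)))
    return table


for region, quarters, average in summarise(sales):
    cells = "  ".join(f"{q:4d}" for q in quarters)
    print(f"{region:10} {cells}   avg {average:4d}")
```

<p>Sorting by population rather than alphabetically is what puts comparable regions next to each other, so the eye can run down a column instead of hunting around the grid.</p>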
<p>People don&#8217;t often think of treating tables like other design elements in a user interface.  Yet as the example shows, they can fairly easily be tweaked to great effect.  And, when presented clearly, a table can convey more information in a short space of time than a series of charts.</p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2010/08/15/old-school-data-visualisation-part-1/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>

		<media:content url="http://benhealey.files.wordpress.com/2010/08/numeracy_table1.gif" medium="image">
			<media:title type="html">numeracy_table1</media:title>
		</media:content>

		<media:content url="http://benhealey.files.wordpress.com/2010/08/numeracy_table2.gif" medium="image">
			<media:title type="html">numeracy_table2</media:title>
		</media:content>
	</item>
		<item>
		<title>Link Post: Google GPS, Fraud Detection and PolitiScience</title>
		<link>http://benhealey.info/2009/10/31/link-post-google-gps-fraud-detection-and-politiscience/</link>
		<comments>http://benhealey.info/2009/10/31/link-post-google-gps-fraud-detection-and-politiscience/#comments</comments>
		<pubDate>Fri, 30 Oct 2009 20:46:37 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Evidence-Based Policy]]></category>
		<category><![CDATA[Fraud Detection]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[GPS]]></category>
		<category><![CDATA[Mobile]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=202</guid>
		<description><![CDATA[A number of interesting links came through the Twitterverse this morning, so I&#8217;m putting them here to share/remember. Google redefines disruption (via @Valuecruncher) &#8211; Fascinating read on Google&#8217;s mobile and mapping developments. How two banks are detecting fraud (via @alisonbolen)  - How some banks are using predictive modelling and network analysis (and SAS) to detect [...]]]></description>
			<content:encoded><![CDATA[<p>A number of interesting links came through the Twitterverse this morning, so I&#8217;m putting them here to share/remember.</p>
<ul>
<li><a href="http://abovethecrowd.com/2009/10/29/google-redefines-disruption-the-%E2%80%9Cless-than-free%E2%80%9D-business-model/">Google redefines disruption</a> (via <a href="http://twitter.com/Valuecruncher">@Valuecruncher</a>) &#8211; Fascinating read on Google&#8217;s mobile and mapping developments.</li>
<li><a href="http://blogs.sas.com/sascom/index.php?/archives/604-You-become-the-hunter-and-they-become-the-prey.html">How two banks are detecting fraud</a> (via <a href="http://twitter.com/alisonbolen">@alisonbolen</a>) &#8211; How some banks are using predictive modelling and network analysis (and SAS) to detect and prevent exposure to fraud.</li>
<li><a href="http://news.bbc.co.uk/2/hi/uk_news/8334774.stm">Cannabis row drugs adviser sacked</a> (via <a href="http://twitter.com/jonathanbriggs">@jonathanbriggs</a>) &#8211; The UK has a good reputation for trying to foster evidence-based policy-making, but it appears to have taken a bit of a stumble here.</li>
</ul>
<p>Enjoy!</p>
<p>_____</p>
<p>ShortURL for this post: <a href="http://wp.me/pnqr9-3g">http://wp.me/pnqr9-3g</a></p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2009/10/31/link-post-google-gps-fraud-detection-and-politiscience/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
		<item>
		<title>Music to a Data Geek&#8217;s Ears</title>
		<link>http://benhealey.info/2009/10/04/music-to-a-data-geeks-ears/</link>
		<comments>http://benhealey.info/2009/10/04/music-to-a-data-geeks-ears/#comments</comments>
		<pubDate>Sun, 04 Oct 2009 01:23:24 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Transformation]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Metrics]]></category>
		<category><![CDATA[Split Testing]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=170</guid>
		<description><![CDATA[&#8220;If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what&#8217;s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots [...]]]></description>
			<content:encoded><![CDATA[<blockquote><p>&#8220;If you are looking for a career where your services will be in high demand, you should find something where you provide a scarce, complementary service to something that is getting ubiquitous and cheap. So what&#8217;s getting ubiquitous and cheap? Data. And what is complementary to data? Analysis. So my recommendation is to take lots of courses about how to manipulate and analyze data: databases, machine learning, econometrics, statistics, visualization, and so on.&#8221;  <a href="http://freakonomics.blogs.nytimes.com/2008/02/25/hal-varian-answers-your-questions/">Hal Varian, Chief Economist at Google</a></p></blockquote>
<p>Me suffer from <a href="http://en.wikipedia.org/wiki/Confirmation_bias">confirmation bias</a>? Never!<br />
_____</p>
<p>Short URL for this post: <a href="http://wp.me/pnqr9-2K">http://wp.me/pnqr9-2K</a></p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2009/10/04/music-to-a-data-geeks-ears/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
		<item>
		<title>A Nifty Trick for Transforming Categorical Data</title>
		<link>http://benhealey.info/2009/09/20/a-nifty-trick-for-transforming-categorical-data/</link>
		<comments>http://benhealey.info/2009/09/20/a-nifty-trick-for-transforming-categorical-data/#comments</comments>
		<pubDate>Sun, 20 Sep 2009 02:50:56 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Business Intelligence]]></category>
		<category><![CDATA[Data Transformation]]></category>
		<category><![CDATA[ETL]]></category>
		<category><![CDATA[Metrics]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=115</guid>
		<description><![CDATA[Categorical variables with lots of options (e.g., country of origin, occupation, postcodes) can be problematic in regression modelling: they have to be dummy coded, which uses many degrees of freedom and increases the potential for overfitting.  The typical approaches to dealing with this are to: Discard the variable if it doesn&#8217;t appear it will be [...]]]></description>
			<content:encoded><![CDATA[<p>Categorical variables with lots of options (e.g., country of origin, occupation, postcodes) can be problematic in regression modelling: they have to be dummy coded, which uses many degrees of freedom and increases the potential for overfitting.  The typical approaches to dealing with this are to:</p>
<ul>
<li>Discard the variable if it doesn&#8217;t appear it will be a good discriminator. It is sometimes hard to tell this up front when you have loads of categories.</li>
<li>Roll the categories up into larger sets based on conceptual similarity.  This can work for ordinal or geographic data, but is more difficult for purely nominal variables.  There is also the risk that you&#8217;ll &#8216;average away&#8217; some of the predictive value in the variable.</li>
<li>Use a statistical technique (e.g., a decision tree) to work out groupings of categories based on their discriminative power.  This may make for groupings that are hard to explain.</li>
</ul>
<p>Another option I&#8217;ve recently come across is to convert the categorical variable to a metric-level variable using historic response data.  For instance, say you&#8217;ve been collecting your customers&#8217; postcodes for a while and are looking to employ this variable in a predictive model.  Perhaps you are predicting response to a mailing offer (or something similar) which has been running for at least one learning cycle.  A potential way to deal with the &#8216;too many categories&#8217; problem would be to calculate, for each postcode, the proportion of people contacted during prior mailings who responded to the offer.  Voilà!  You&#8217;ve now got a metric-level, continuous variable to play with.  You can apply the historic response values to any new prospects you are looking to score by matching on the postcode.</p>
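<p>As a minimal sketch of the calculation described above: the function name, the postcodes and the responses below are all made up for illustration, and a real data set would more likely live in a database or statistics package than in a Python list.</p>

```python
from collections import defaultdict


def historic_response_rates(contacts):
    """Map each category to the share of past contacts who responded.

    `contacts` is an iterable of (category, responded) pairs, where
    `responded` is True or False. The name and layout are illustrative.
    """
    contacted = defaultdict(int)
    responded = defaultdict(int)
    for category, did_respond in contacts:
        contacted[category] += 1
        responded[category] += int(did_respond)
    return {cat: responded[cat] / contacted[cat] for cat in contacted}


# Hypothetical prior mailing: (postcode, responded?)
history = [
    ("6011", True), ("6011", False), ("6011", False), ("6011", True),
    ("6021", False), ("6021", False), ("6021", True), ("6021", False),
]
rates = historic_response_rates(history)

# Score new prospects by matching on postcode. A postcode with no history
# gets None here; how to treat it is a separate modelling decision.
prospects = ["6011", "6021", "9999"]
scores = [rates.get(postcode) for postcode in prospects]
print(scores)  # [0.5, 0.25, None]
```

<p>The resulting rate can then stand in for the postcode as a single continuous predictor, in place of one dummy variable per postcode.</p>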
<p>There are at least a couple of caveats to consider when attempting this.  One is that the proportion will be less robust for categories with very few historic contacts (e.g., rural postcodes).  In these cases you might have to do some category roll-ups first.  Another potential issue is that the approach assumes historic contacts were made at random, or according to some mechanism that will also be applied in future selection processes, such that you can consider the prior contacts &#8216;representative&#8217; of category membership for the purposes of your modelling.  Violations of this assumption would probably require some statistical adjustment.</p>
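<p>One way the roll-up for sparse categories might look is sketched below.  It pools every category with too few historic contacts into a single bucket, which is a simplification (a roll-up based on geographic or conceptual similarity would usually be preferable).  The threshold, the bucket label and the data are assumptions for illustration only.</p>

```python
from collections import Counter


def roll_up_sparse(contacts, min_contacts, pooled_label="OTHER"):
    """Pool categories with fewer than `min_contacts` historic contacts,
    then return the response rate for each resulting category."""
    contacts = list(contacts)
    counts = Counter(category for category, _ in contacts)
    contacted = Counter()
    responded = Counter()
    for category, did_respond in contacts:
        # Keep well-populated categories; send the rest to the pooled bucket.
        key = category if counts[category] >= min_contacts else pooled_label
        contacted[key] += 1
        responded[key] += int(did_respond)
    return {key: responded[key] / contacted[key] for key in contacted}


# Hypothetical history: two sparse postcodes alongside a well-populated one.
history = [
    ("6011", True), ("6011", False), ("6011", False), ("6011", True),
    ("4971", True),                    # one contact: a 100% rate on n = 1
    ("4972", False), ("4972", False),  # two contacts: a 0% rate on n = 2
]
pooled_rates = roll_up_sparse(history, min_contacts=3)
print(pooled_rates)
```

<p>Here the two sparse postcodes share one pooled rate of one response in three contacts, rather than carrying rates of 100% and 0% that rest on one or two observations.</p>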
<p>If anyone sees other potential issues with this approach, or has other alternatives they use to deal with problematic categorical variables, feel free to comment!</p>
<p>_____</p>
<p>Short URL for this post: <a href="http://wp.me/pnqr9-1R">http://wp.me/pnqr9-1R</a></p><br />Posted in Thoughts Tagged: Analytics, Business Intelligence, Data Transformation, ETL, Metrics]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2009/09/20/a-nifty-trick-for-transforming-categorical-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
		<item>
		<title>Not All Conversions are Created Equal</title>
		<link>http://benhealey.info/2009/08/17/not-all-conversions-are-created-equal/</link>
		<comments>http://benhealey.info/2009/08/17/not-all-conversions-are-created-equal/#comments</comments>
		<pubDate>Mon, 17 Aug 2009 05:32:45 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Analytics]]></category>
		<category><![CDATA[Conversions]]></category>
		<category><![CDATA[Data-Driven Design]]></category>
		<category><![CDATA[Metrics]]></category>
		<category><![CDATA[Persuasive Elements]]></category>
		<category><![CDATA[Split Testing]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=65</guid>
		<description><![CDATA[Tim Ferriss posted a Google Website Optimizer Case Study the other day showing how data-based design tweaks at Gyminee (now Daily Burn) helped them increase conversions by 20% and then another 16% on top of that.  The post presents a really nice example of how simple it is to use free tools along with good [...]]]></description>
			<content:encoded><![CDATA[<p>Tim Ferriss posted a <a title="Google Website Optimizer Case Study" href="http://www.fourhourworkweek.com/blog/2009/08/12/google-website-optimizer-case-study/">Google Website Optimizer Case Study</a> the other day showing how data-based design tweaks at Gyminee (now <a title="Daily Burn" href="http://dailyburn.com/">Daily Burn</a>) helped them increase conversions by 20% and then another 16% on top of that.  The post presents a really nice example of how simple it is to use free tools along with good landing page design principles to generate improvements in site goal performance.  That said, I&#8217;d add a couple of things to round out the article:</p>
<ol>
<li> The performance improvement was measured in number of free trial sign-ups.  There is nothing wrong with that if Daily Burn has free sign-ups as a key goal.  However, it is worth noting that the improvements in free sign-ups may have had the opposite effect on conversions to paid accounts.  One reason for this is that by reducing the possible actions on the page to one (sign up for a free trial) in the second set of changes, Daily Burn may be seeing an increase in sign-ups from tire kickers who just want to see what the app looks like.  In the past, visitors could click the &#8216;tour&#8217; button to do this; now they have to go via the free trial route.  If the requirement to sign up also puts some other potential purchasers off before they get a chance to see the product, the net effect of the change may be to decrease the proportion of free trialers that go on to paid subscriptions.  One of the sites I read presented an example of exactly this issue a few weeks back; I think it was <a title="Marketing Experiments" href="http://www.marketingexperiments.com/">Marketing Experiments</a> but now I can&#8217;t find the article (doh). [Update: here is <a title="Which Test Won" href="http://whichtestwon.com/?page_id=1974&amp;pollid=19">a different example with a similar finding</a>]</li>
<p></p>
<li> Here is a link to the <a href="http://en.wikipedia.org/wiki/index.html?curid=14872453">Paradox of Choice</a> concept Tim mentioned.  I&#8217;m not so sure the original Gyminee page was overwhelming people with choice (causing choice paralysis) as much as providing too much of an opportunity to get distracted before clicking on the sign-up button.  Ultimately it doesn&#8217;t matter; the effect of the modification was positive whatever the underlying reason for the change in behaviour!</li>
<p></p>
<li>Tim didn&#8217;t specify the &#8216;conversion marketing best practices&#8217; behind the design changes tested in the second half of the post.  Going by the screenshots presented, these included the use of testimonials (social proof), awards (authority), and specificity (specific facts are more persuasive).  Feel free to post others if you spot them&#8230;</li>
</ol>
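<p>To make the first point concrete, here is a rough sketch of the arithmetic.  Only the 20% and 16% lifts come from the case study; the visitor count, the sign-up rate and both trial-to-paid rates are hypothetical numbers I have picked purely to illustrate how sign-ups can rise while paid conversions fall.</p>

```python
# Successive lifts compound multiplicatively: 20%, then another 16% on top.
combined_lift = 1.20 * 1.16 - 1
print(f"{combined_lift:.1%}")  # 39.2%


def funnel(visitors, signup_rate, paid_rate):
    """Return (free sign-ups, paid accounts) for a simple two-step funnel."""
    signups = visitors * signup_rate
    paid = signups * paid_rate
    return signups, paid


# Hypothetical figures: more sign-ups after the changes, but a lower
# share of trialers going on to pay because more of them are tire kickers.
before = funnel(10_000, signup_rate=0.05, paid_rate=0.20)
after = funnel(10_000, signup_rate=0.05 * (1 + combined_lift), paid_rate=0.13)

print(f"before: {before[0]:.0f} sign-ups, {before[1]:.0f} paid")
print(f"after:  {after[0]:.0f} sign-ups, {after[1]:.0f} paid")
```

<p>With these made-up rates the sign-up count goes up while the number of paid accounts goes down, which is why it is worth tracking the metric at the end of the funnel as well as the one being tested.</p>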
<p>____</p>
<p>Short URL for this post: <a href="http://wp.me/pnqr9-13" rel="nofollow">http://wp.me/pnqr9-13</a></p><br />Posted in Thoughts Tagged: Analytics, Conversions, Data-Driven Design, Metrics, Persuasive Elements, Split Testing]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2009/08/17/not-all-conversions-are-created-equal/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
	</channel>
</rss>
