<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:media="http://search.yahoo.com/mrss/"
	>

<channel>
	<title>Ben Healey &#187; Django</title>
	<atom:link href="http://benhealey.info/tag/django/feed/" rel="self" type="application/rss+xml" />
	<link>http://benhealey.info</link>
	<description>Data Aficionado  &#124;  Wellington, New Zealand</description>
	<lastBuildDate>Sat, 14 Jan 2012 20:18:17 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.com/</generator>
<cloud domain='benhealey.info' port='80' path='/?rsscloud=notify' registerProcedure='' protocol='http-post' />
<image>
		<url>http://s2.wp.com/i/buttonw-com.png</url>
		<title>Ben Healey &#187; Django</title>
		<link>http://benhealey.info</link>
	</image>
	<atom:link rel="search" type="application/opensearchdescription+xml" href="http://benhealey.info/osd.xml" title="Ben Healey" />
	<atom:link rel='hub' href='http://benhealey.info/?pushpress=hub'/>
		<item>
		<title>Implementing Full Text Search on Google App Engine</title>
		<link>http://benhealey.info/2011/04/16/implementing-full-text-search-on-google-app-engine/</link>
		<comments>http://benhealey.info/2011/04/16/implementing-full-text-search-on-google-app-engine/#comments</comments>
		<pubDate>Sat, 16 Apr 2011 06:39:11 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[post]]></category>
		<category><![CDATA[App Engine]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Full Text Search]]></category>
		<category><![CDATA[Google]]></category>
		<category><![CDATA[PySolr]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[Solr]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=499</guid>
		<description><![CDATA[Despite being a product of search giant Google, App Engine doesn&#8217;t yet provide in-built support for full-text searching of substantial strings in your datastore entities.  There are a few approaches to building your own, which involve using equality filters to search on the start of a string or ListProperties to hold lists of terms garnered [...]]]></description>
			<content:encoded><![CDATA[<p>Despite being a product of search giant Google, App Engine doesn&#8217;t yet provide in-built support for full-text searching of substantial strings in your datastore entities.  There are a few approaches to building your own, which involve using <a href="http://googlecode.blogspot.com/2010/05/google-app-engine-basic-text-search.html">equality filters to search on the start of a string</a> or <a href="http://www.billkatz.com/2009/6/Simple-Full-Text-Search-for-App-Engine">ListProperties to hold lists of terms</a> garnered from your text (as long as you stay within the limits of allowable indexes for a given entity).  However, if you want to index larger documents or support more advanced search features like faceting and scoring, you may find yourself scratching your head.  Unfortunately, the sandbox environment of GAE also restricts your ability to employ third-party open source search solutions like <a href="http://lucene.apache.org/java/docs/index.html">Lucene</a>.</p>
<p>Native full text search functionality will no doubt come to App Engine in due course.  But in the meantime my solution has been to use a remote-hosted Solr instance from <a href="http://websolr.com/">WebSolr</a> and a slightly modified version of <a href="http://code.google.com/p/pysolr/">PySolr</a> to get the job done.  Why PySolr rather than other Python-based interface packages like <a href="http://haystacksearch.org/">Haystack</a> or <a href="https://github.com/tow/sunburnt/">Sunburnt</a>?  The simple reason is that none of these will work out of the box on App Engine, and PySolr was the simplest of them all to modify for my (relatively modest) needs.  You can <a href="http://inquisio.co.nz/resources/pysolrGAE.zip">grab a copy of the PySolr code modified for App Engine</a> if you want it.</p>
<p>Here&#8217;s a quick overview of my setup in case you are looking to do something similar.  I use Django as my framework, so your specifics may vary.</p>
<ol>
<li>Put a copy of PySolrGAE in your app directory so you&#8217;ll be able to import the module into your views as needed.</li>
<li>Add the following variables to your settings (obviously, your values will differ!):
<pre>SOLR_PATH = 'http://index.websolr.com/solr/yourkey/'
SOLR_BATCH_SIZE = 100
MAX_RESULT_SIZE = 100</pre></li>
<li>Set up a schema document (XML) and put it up on your Solr instance so it knows what particular fields you will be passing to it and how it should tokenise, stem and otherwise work its magic on the text within them.  The Solr documentation is pretty good, so it is easy to pick up.</li>
<li>Import the module (e.g. <code>from apps.search.pysolrGAE import Solr</code>) into your views and use it to interface with your Solr instance.  The &#8216;readme&#8217; included with the modified PySolr code gives an overview of the syntax for adding, deleting and modifying entries in your index.  I&#8217;ve set up views to delete the index, re-create it, and return results which are then passed to a template.  You can also set up a hook in the &#8216;save&#8217; method of your models to incrementally add, modify or delete items depending on what you&#8217;ve done to a particular entity.</li>
</ol>
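<p>As a rough illustration of how the settings from step 2 might be put to work, here is a minimal Python sketch.  It uses only the standard library rather than the modified PySolr module, and it is not taken from my actual views: the helper names <code>batches</code> and <code>select_url</code> are hypothetical, and so is the assumption that <code>SOLR_BATCH_SIZE</code> limits how many documents go into one update request while <code>MAX_RESULT_SIZE</code> caps the rows requested from Solr&#8217;s standard <code>select</code> handler.</p>
<pre><code class="language-python">from urllib.parse import urlencode

# Values mirror the settings from step 2; 'yourkey' is a placeholder.
SOLR_PATH = 'http://index.websolr.com/solr/yourkey/'
SOLR_BATCH_SIZE = 100
MAX_RESULT_SIZE = 100


def batches(docs, size=SOLR_BATCH_SIZE):
    """Yield successive lists holding at most `size` documents each."""
    for start in range(0, len(docs), size):
        yield docs[start:start + size]


def select_url(query, rows=MAX_RESULT_SIZE):
    """Build a URL for Solr's select handler, never asking for more
    rows than MAX_RESULT_SIZE allows."""
    params = urlencode({
        'q': query,
        'rows': min(rows, MAX_RESULT_SIZE),
        'wt': 'json',  # ask Solr for a JSON response
    })
    return SOLR_PATH + 'select?' + params
</code></pre>
<p>With something like this in place, a re-indexing view could loop over <code>batches(all_docs)</code> and send each chunk to Solr in its own request, while a search view could fetch <code>select_url('django')</code> and hand the parsed results to a template.</p>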
<p>One of the nice things about Solr is that you can pass it a field which will not be indexed but is stored alongside an entry.  You can get this field returned as part of a query response.  Hence, you can set up an HTML rendered version of the search result snippet for a particular entry and pass it to Solr at the time you add the entry to the index.  Then, when you run a query you can get that field back and simply pass it through to your template.  This saves you a round trip to the datastore to get a copy of the entity for presentation.  Sweet!</p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2011/04/16/implementing-full-text-search-on-google-app-engine/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
		<item>
		<title>Getting a &#8216;No FlatPage matches the given query&#8217; error?</title>
		<link>http://benhealey.info/2009/07/07/getting-a-no-flatpage-matches-the-given-query-error/</link>
		<comments>http://benhealey.info/2009/07/07/getting-a-no-flatpage-matches-the-given-query-error/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 03:53:44 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Thoughts]]></category>
		<category><![CDATA[Django]]></category>
		<category><![CDATA[Flatpages]]></category>

		<guid isPermaLink="false">http://benhealey.info/?p=51</guid>
		<description><![CDATA[This may be useful if you are a Django newb.  All others&#8230; this will probably be gibberish. Are you working through James Bennett&#8217;s &#8216;Practical Django Projects&#8217; (second edition) and getting the above error when trying to view your first flatpage?  This could be because you did not put leading and trailing forward slashes in the [...]]]></description>
			<content:encoded><![CDATA[<p>This may be useful if you are a Django newb.  All others&#8230; this will probably be gibberish.</p>
<p>Are you working through James Bennett&#8217;s &#8216;Practical Django Projects&#8217; (second edition) and getting the above error when trying to view your first flatpage?  This could be because you did not put leading and trailing forward slashes in the flatpage URL field when setting up your flatpage in the admin panel.</p>
<p>The book actually tells you to do this in the example it gives, but in my speed-reading I ignored the slashes and treated the URL field as though I was entering a WordPress page slug.  Four hours, a Django reinstall, and much angst later, I have relearned the lesson that it always pays to read the instructions carefully.</p>
<p>Perhaps you can avoid my mistake.<br />
_______<br />
Short URL for this post: <a href="http://wp.me/pnqr9-P" rel="nofollow">http://wp.me/pnqr9-P</a></p>]]></content:encoded>
			<wfw:commentRss>http://benhealey.info/2009/07/07/getting-a-no-flatpage-matches-the-given-query-error/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
	
		<media:content url="http://1.gravatar.com/avatar/7242c6f38f9056b8d9a96695535fe428?s=96&#38;d=identicon&#38;r=PG" medium="image">
			<media:title type="html">Ben</media:title>
		</media:content>
	</item>
	</channel>
</rss>
