The Spatial Miscellany


A weblog. A website. A geospatial miscellany…

Fight spam with reCAPTCHA

When you submit details to a website via a web form, you are increasingly asked to interpret a picture of a word and type the answer into a text box. This type of puzzle is known as a ‘CAPTCHA’. If you’re not sure what I’m talking about, check out the comments section of this post for an example.

The CAPTCHA was created by Luis von Ahn in an attempt to fight spam. If the CAPTCHA is completed successfully, you are assumed to be human and the web form is submitted; if the CAPTCHA is failed, you are assumed to be a computer and the submission is prevented. The CAPTCHA is a classic example of a Turing test, as proposed by Alan Turing in his 1950 paper ‘Computing Machinery and Intelligence’. Indeed, the name CAPTCHA, coined by Luis von Ahn, is an acronym for “Completely Automated Public Turing test to tell Computers and Humans Apart”.

An example CAPTCHA

Von Ahn was initially proud of his highly effective solution for preventing spam, but he later became frustrated by the cumulative amount of time millions of people across the globe spend filling in CAPTCHAs while producing very little in return. Personally, I found CAPTCHAs mightily offensive and avoided them like the plague, until I discovered them to be the only effective way to stop computers spamming this blog. Fortunately, von Ahn has worked with Ben Maurer to address his frustration, and together they recently released reCAPTCHA. Here’s the deal…

The Internet Archive is attempting to automatically digitize old books using optical character recognition (OCR) software. The effort is largely successful, but the software struggles to recognize ye olde English, meaning that roughly 8% of words are digitized incorrectly or not at all. reCAPTCHA addresses this problem by asking the user to interpret a picture of two words instead of just one: the first is a known word and the second is an unknown word. If you correctly interpret the known word, it is assumed that you have also correctly interpreted the second word, which the text recognition software could not recognize. So when you fill in a reCAPTCHA, not only are you proving that you are human, but you are also helping to digitize old books! For me, applying technology in this way is poetry.
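For the programmers out there, the two-word idea can be sketched in a few lines of Python. This is only an illustration of the logic described above, not reCAPTCHA's actual code: the names `Challenge`, `TwoWordCaptcha` and `votes_needed` are all made up for the example, and requiring several matching answers before trusting a transcription is my own assumption rather than something stated above.

```python
from collections import Counter
from dataclasses import dataclass


def normalise(word: str) -> str:
    """Ignore case and surrounding whitespace when comparing answers."""
    return word.strip().lower()


@dataclass
class Challenge:
    """A two-word puzzle (hypothetical structure, for illustration only)."""
    control_answer: str  # the known word, whose correct text we already have
    unknown_id: str      # made-up identifier for the word the OCR could not read


class TwoWordCaptcha:
    def __init__(self, votes_needed: int = 3):
        # Assumption: accept a transcription only after several people agree.
        self.votes_needed = votes_needed
        self.votes: dict[str, Counter] = {}

    def submit(self, challenge: Challenge, control_guess: str, unknown_guess: str) -> bool:
        """Return True if the user passes, i.e. the known word was typed correctly."""
        if normalise(control_guess) != normalise(challenge.control_answer):
            return False  # assumed to be a computer: block the form, discard the answer
        # The known word was right, so the guess for the unknown word is trusted
        # enough to be recorded as one vote.
        tally = self.votes.setdefault(challenge.unknown_id, Counter())
        tally[normalise(unknown_guess)] += 1
        return True

    def accepted_transcription(self, unknown_id: str):
        """Return the agreed text for an unknown word, or None if undecided."""
        tally = self.votes.get(unknown_id)
        if not tally:
            return None
        word, count = tally.most_common(1)[0]
        return word if count >= self.votes_needed else None
```

A failed known word blocks the form and throws the second answer away, while a correct one both lets the human through and adds a vote towards digitizing the unknown word.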

Obviously Luis von Ahn has applied this technique to digitizing old books, but it got me thinking about the potential to apply the same approach to digitizing maps. Is there scope for a geoCAPTCHA?

The above cartoon is the copyrighted work of David Farley
