Recapturing the Moment
May 25th 2007 06:19
On the Internet things can take off at the drop of a hat – if they capture people’s imagination.
If you’ve used the Internet recently you’ll have come across a ‘captcha’. It’s that box of distorted text that humans can read but computers can’t – at least not at the moment. Usually you find them on sites that won’t let you in unless you tell the system what the word is. I find some of them so distorted that I can’t read them anyway, but that’s neither here nor there. Maybe my computer and I have more in common than we thought.
Now there’s ‘recaptcha’. This goes a step further – a considerable step further. In recaptcha there are two words. You are asked to type both of them in. But this isn’t to make it more difficult for you: it’s to add to an ongoing project that began about a week ago and is already booming.
It’s thought that around 60 million captchas are solved by people around the world each day. That’s a lot of effort going into just letting people into sites. Recaptcha takes that effort and puts it to good use.
The Internet Archive is digitizing a number of books that are out of print or old or archival or whatever. But this is a slow process because the method used – Optical Character Recognition – leaves the computers struggling to identify a number of words. (Sibelius, the music program, uses a similar method to turn scanned music into ‘real’ music – it has similar problems.) What recaptcha does is show two words to the person wanting access. The first is a word the system already knows, so it can check the answer typed in. The second is a word that the OCR couldn’t read. The human does the solving. How simple and effective!
The word that’s unknown to the computer is offered to several humans. If the answer is consistent over a number of offers, then the computer can be reasonably sure it’s got it right. Voila! Humans beating the computer at its own game.
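For the curious, here is a rough sketch in Python of how that two-word check and the "do enough humans agree?" test might work. It is only an illustration of the idea described above, not recaptcha's actual code: the function names, the minimum of three votes and the 75% agreement level are all my own assumptions.

```python
from collections import Counter

# Hypothetical thresholds: assumed for illustration, not taken from recaptcha.
MIN_VOTES = 3         # how many accepted answers we want before deciding
MIN_AGREEMENT = 0.75  # share of answers that must match the top answer


def normalize(text):
    """Lower-case and trim so 'Upon ' and 'upon' count as the same answer."""
    return text.strip().lower()


def check_submission(known_word, typed_known, typed_unknown, votes):
    """Let the person in only if they got the known word right.

    If they did, their guess at the unreadable word is added to `votes`,
    a Counter of guesses for that word.
    """
    if normalize(typed_known) != normalize(known_word):
        return False  # failed the known word, so their other guess is ignored
    votes[normalize(typed_unknown)] += 1
    return True


def agreed_reading(votes):
    """Return the word most people agree on, or None if there is no consensus yet."""
    total = sum(votes.values())
    if total < MIN_VOTES:
        return None
    word, count = votes.most_common(1)[0]
    return word if count / total >= MIN_AGREEMENT else None


# Four people are shown the known word "morning" plus one unreadable word.
votes = Counter()
answers = [("morning", "upon"), ("Morning", "upon"),
           ("mourning", "open"), ("morning ", "Upon")]
for typed_known, typed_unknown in answers:
    check_submission("morning", typed_known, typed_unknown, votes)

print(agreed_reading(votes))  # prints: upon
```

The third person mistypes the known word, so their guess of "open" is thrown away. The other three all agree, and the unreadable word is taken to be "upon".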

















