Internet Archive News

updates about archive.org

Why Computers Can’t Do The Job

As we work towards a re-release of the Full Text Search feature on Open Library, we’ve seen much more of the OCR output of our book scans. Depending on the text, the OCR can range from 99% perfect to covered in gobbledygook. So I was delighted to see oldweather.org from the National Maritime Museum in Greenwich, where we can “help improve reconstructions of past weather and climate across the world by finding and recording historical weather observations in handwritten Royal Navy ship logs.”

Video: “Why computers can’t do the job,” from the National Maritime Museum on Vimeo.

It’s fun to think about ways we might be able to encourage people to help correct bad OCR. We’re definitely looking towards the National Library of Australia’s great Trove Newspapers site for inspiration (and collaboration?).

Originally posted on The Open Library Blog by George Oates.

Written by internetarchive

October 12, 2010 at 3:43 pm

Posted in Uncategorized
