Why Computers Can’t Do The Job
As we work towards a re-release of the Full Text Search feature on Open Library, we’ve seen much more of the OCR output from our book scans. Depending on the text, the OCR can range from 99% perfect to covered in gobbledygook. Hence my delight at seeing oldweather.org from the National Maritime Museum in Greenwich, where we can “help improve reconstructions of past weather and climate across the world by finding and recording historical weather observations in handwritten Royal Navy ship logs.”
It’s fun to think about ways we might be able to encourage people to help correct bad OCR. We’re definitely looking towards the National Library of Australia’s great Trove Newspapers site for inspiration (and collaboration?).