Converted: 207,758,751 characters

Open source OCR software by Google

In April 2007, Google sponsored the development of OCRopus, an open source OCR package, at the IUPR Research Group. It was a document analysis and optical character recognition system. Some of its features included:

Pluggable layout analysis
Pluggable character recognition
Statistical natural language modeling
Multi-lingual capabilities
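
To make "pluggable" concrete, here is a minimal sketch of how layout analysis, character recognition, and a language model could be swapped independently. Every name in it (OcrPipeline, one_line_per_row, noisy_recognizer, vocabulary_model) is hypothetical and chosen for illustration; this is not the OCRopus API, and the "image" is just a list of text rows standing in for pixels.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical types for illustration only; not taken from OCRopus.
Image = List[str]                                  # a page as rows of "pixels"
LayoutAnalyzer = Callable[[Image], List[Image]]    # page -> line images
Recognizer = Callable[[Image], List[str]]          # line image -> candidate readings
LanguageModel = Callable[[List[str]], str]         # candidates -> chosen reading


@dataclass
class OcrPipeline:
    """Three independent stages; any one can be replaced without touching the others."""
    layout: LayoutAnalyzer
    recognizer: Recognizer
    language_model: LanguageModel

    def run(self, page: Image) -> List[str]:
        # Split the page into lines, read each line, keep the likeliest reading.
        return [self.language_model(self.recognizer(line))
                for line in self.layout(page)]


def one_line_per_row(page: Image) -> List[Image]:
    # Toy layout analysis: every non-blank row is its own text line.
    return [[row] for row in page if row.strip()]


def noisy_recognizer(line: Image) -> List[str]:
    # Toy recognizer: offer the raw reading plus one that resolves 0/o confusion.
    text = line[0]
    return [text, text.replace("0", "o")]


VOCAB = {"hello", "world", "open", "source"}


def vocabulary_model(candidates: List[str]) -> str:
    # Toy language model: prefer the candidate with the most known words.
    def score(text: str) -> int:
        return sum(word in VOCAB for word in text.lower().split())
    return max(candidates, key=score)


pipeline = OcrPipeline(one_line_per_row, noisy_recognizer, vocabulary_model)
print(pipeline.run(["hell0 w0rld", "", "open s0urce"]))
```

Because each stage is only a callable with a fixed shape, a different recognizer or a model for another language can be passed to OcrPipeline without changing the other two stages.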

The end goal of the project was to advance the state of OCR and related technologies, and to provide the best optical character recognition system for document conversion, electronic libraries, vision-impaired users, historical document analysis, and general desktop use.

Part of the software is based on Tesseract, one of the best open source OCR engines available today. The project is expected to be released at the end of this year and will be used in Google's book scanning project. The team has a few interesting applications in mind for the software:

Web service interface
Integration with desktop search tools (e.g., Beagle, Spotlight)
PDF, camera and screen OCR

About OCR

Optical Character Recognition is the process of recognising characters from an image. This is especially useful if you want to edit a scanned document.
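
As a rough illustration of what "recognising characters from an image" means, the sketch below matches tiny 3x5 bitmaps against stored templates and picks the closest one. It is a toy under strong assumptions (fixed-width glyphs, one blank column between them, a three-letter alphabet); real OCR engines use far more robust methods, and all names here are invented for the example.

```python
from typing import Dict, List, Tuple

Glyph = Tuple[str, ...]  # rows of '#' (ink) and '.' (background)

# Hypothetical 3x5 templates for a three-letter alphabet.
TEMPLATES: Dict[str, Glyph] = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def distance(a: Glyph, b: Glyph) -> int:
    # Number of pixels that differ between two equally sized glyphs.
    return sum(pa != pb for ra, rb in zip(a, b) for pa, pb in zip(ra, rb))


def recognise_glyph(glyph: Glyph) -> str:
    # Choose the template with the fewest differing pixels.
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))


def recognise_line(rows: List[str], width: int = 3) -> str:
    # Assumes fixed-width glyphs separated by exactly one background column.
    step = width + 1
    count = (len(rows[0]) + 1) // step
    glyphs = [tuple(row[i * step:i * step + width] for row in rows)
              for i in range(count)]
    return "".join(recognise_glyph(g) for g in glyphs)


# A small "scanned" line spelling L, I, T; the last row of the T has lost a pixel.
SCAN = [
    "#...###.###",
    "#....#...#.",
    "#....#...#.",
    "#....#...#.",
    "###.###....",
]
print(recognise_line(SCAN))  # prints LIT
```

The damaged T is still read correctly because it differs from the T template by one pixel and from the I template by three.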
You can use our free online OCR service to convert your scanned documents and download the result as a text file ready for editing. Your document should be a valid image file, for example .jpg, .png, or .gif.
Our free OCR service can handle several languages, including Chinese, English, Portuguese, and Spanish.
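
Since the service expects a valid image, one way to check a file before uploading is to look at its leading bytes instead of trusting the extension. The sketch below does this for the three formats named above, using their standard file signatures. The function name sniff_image_format is made up for this example and is not part of the service.

```python
from typing import Optional

# Standard leading-byte signatures for the three formats mentioned above.
SIGNATURES = {
    "png": (b"\x89PNG\r\n\x1a\n",),
    "jpeg": (b"\xff\xd8\xff",),
    "gif": (b"GIF87a", b"GIF89a"),
}


def sniff_image_format(data: bytes) -> Optional[str]:
    """Return 'png', 'jpeg' or 'gif' if the data starts with a known signature."""
    for name, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):  # startswith accepts a tuple of prefixes
            return name
    return None


def is_uploadable(path: str) -> bool:
    # Hypothetical helper: read only the first few bytes of the file.
    with open(path, "rb") as handle:
        return sniff_image_format(handle.read(8)) is not None
```

A file renamed from report.pdf to report.png would fail this check, because its first bytes would still be those of a PDF.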

More Details


  • Precise image to text conversion
  • Keeps the layout and formatting
  • Supports PDF, GIF, BMP, JPEG, TIFF or PNG as input
  • Supports TXT, DOC, or PDF as output
  • Automatically rotates pages
  • Supports low resolution images
  • Keeps the image layer of a scanned PDF
  • Works online - no installation
  • Keeps your data confidential and secure