• Using OCR to generate searchable text from scanned documents

    Posted on April 25th, 2009 by admin

    Digitizing a magazine article or a printed contract is a common need. We could spend hours retyping the text and then correcting typos, or we could convert the material into digital format in a few minutes using a scanner and Optical Character Recognition (OCR) software. Although scanning a large volume of pages can still be an expensive and time-consuming undertaking, the benefits are huge.


    OCR is the process of converting different types of documents, such as scanned paper documents, PDF files, or images captured by a digital camera, into text or word processing files that can be easily edited, searched, and stored.

    OCR is a field of research in pattern recognition, artificial intelligence, and machine vision. It is used to enter data into a computer automatically for dissemination and processing. Digitized text also takes up far less storage space than the hard-copy originals. OCR technology has had a huge impact on the way information is stored, shared, and edited. Before Optical Character Recognition, anyone who wanted to turn a book into a word processing file had to type each page word for word.
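
To make the pattern-recognition idea concrete, here is a toy sketch of character recognition by template matching. It is only an illustration, not how any particular OCR product works: the 3x5 glyph bitmaps, the `TEMPLATES` table, and the function names are all hypothetical, and real OCR engines use far more sophisticated methods. Each "scanned" character is compared pixel by pixel with every known template, and the closest match wins.

```python
# Toy OCR by template matching. The 3x5 glyphs below are made up for
# illustration; '#' is an ink pixel and '.' is a blank pixel.
TEMPLATES = {
    "I": ["###", ".#.", ".#.", ".#.", "###"],
    "L": ["#..", "#..", "#..", "#..", "###"],
    "O": ["###", "#.#", "#.#", "#.#", "###"],
    "T": ["###", ".#.", ".#.", ".#.", ".#."],
}


def distance(a, b):
    """Count the pixels that differ between two same-sized glyphs."""
    return sum(
        pa != pb
        for row_a, row_b in zip(a, b)
        for pa, pb in zip(row_a, row_b)
    )


def recognize_glyph(glyph):
    """Return the template character with the fewest differing pixels."""
    return min(TEMPLATES, key=lambda ch: distance(glyph, TEMPLATES[ch]))


def split_glyphs(rows, width=3, gap=1):
    """Cut a line image into fixed-width glyphs separated by `gap` columns."""
    step = width + gap
    count = (len(rows[0]) + gap) // step
    return [
        [row[i * step : i * step + width] for row in rows]
        for i in range(count)
    ]


def recognize_line(rows):
    """Recognize every glyph in a line image and join the results."""
    return "".join(recognize_glyph(g) for g in split_glyphs(rows))


# A "scanned" line with one ink pixel missing from the middle letter.
scanned = [
    "#.. ### ###",
    "#.. #.# .#.",
    "#.. #.# .#.",
    "#.. #.. .#.",
    "### ### .#.",
]

print(recognize_line(scanned))
```

Because the match is based on the smallest pixel difference rather than an exact comparison, the damaged middle letter is still recognized, which hints at why OCR can cope with imperfect scans.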