OCR Accuracy
OCR accuracy depends upon the scan quality of the original document.
Points to keep in mind to increase the accuracy of OCR processing and indexing:
Use a Good Quality Scanner
The better the scanner, the better the images it produces. Accurate images lead to fewer recognition errors and therefore faster and more accurate results.
Always check the images for scanning problems
If you're processing a small number of documents, it's always worth having a quick look at them to check for anything that might cause a problem.
Use 300 DPI
This is the optimum resolution for representing a normal-sized character, balancing accuracy against efficiency. If the resolution is too low, the characters are difficult to recognize. If it is too high, processing is slower and the images use more storage.
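As a rough illustration of this trade-off, the following Python sketch computes how many pixels tall a character is at a given resolution and how many pixels a scanned page contains. It is not part of the OCR add-on. The function names, the 10-point text size, and the 8.5 x 11 inch page are assumptions chosen for the example.

```python
POINTS_PER_INCH = 72  # typographic points per inch


def glyph_height_px(point_size, dpi):
    """Approximate pixel height of the font's em box at the given resolution."""
    return point_size / POINTS_PER_INCH * dpi


def page_pixels(width_in, height_in, dpi):
    """Total number of pixels in a scanned page of the given size."""
    return round(width_in * dpi) * round(height_in * dpi)


if __name__ == "__main__":
    # Hypothetical example: 10 pt text on an 8.5 x 11 inch page.
    for dpi in (150, 300, 600):
        height = glyph_height_px(10, dpi)
        pixels = page_pixels(8.5, 11, dpi)
        print(f"{dpi} DPI: 10 pt text is about {height:.0f} px tall, "
              f"page has {pixels:,} pixels")
```

Doubling the resolution doubles the character height in pixels but quadruples the number of pixels per page, which is why scanning above the recommended resolution costs processing time and storage.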
Scan in black and white
Using color or grey scale can increase the image file size by 10 to 50 times. To keep the amount of data processed and stored to a minimum, scan in black and white where possible.
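To see where the size difference comes from, the sketch below compares the uncompressed size of one page in each scanning mode. It is an illustration only: the function name, the page size, and the bit depths (1 bit per pixel for black and white, 8 for grey scale, 24 for color) are assumptions, and the ratios in stored files depend on the compression used, so they can differ from these raw figures.

```python
def raw_size_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Uncompressed image size in bytes, ignoring file headers and row padding."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed bit depths for each scanning mode.
MODES = {"black and white": 1, "grey scale": 8, "color": 24}

if __name__ == "__main__":
    # Hypothetical example: one 8.5 x 11 inch page scanned at 300 DPI.
    base = raw_size_bytes(8.5, 11, 300, MODES["black and white"])
    for name, bpp in MODES.items():
        size = raw_size_bytes(8.5, 11, 300, bpp)
        print(f"{name}: {size / 1_000_000:.1f} MB "
              f"({size // base}x black and white)")
```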
Character Accuracy
Factors that can affect character recognition include creative typefaces, shading, broken or touching characters, skewed or curved baselines, spacing errors, and underlined text. All of these can also slow down OCR processing.
Optimization for poor backgrounds
The quality of a document's background can also affect character recognition. Photocopied, faxed, and crumpled documents can deform and distort character images, rendering them difficult to recognize.