What is OCR?

Optical character recognition (OCR) is the electronic or mechanical conversion of images of typed, handwritten or printed text into machine-encoded text. It is widely used to digitise documents so that they can be edited, stored, searched, and used by software robots (Robotic Process Automation) and other forms of process automation and artificial intelligence.

Most traditional OCR solutions use templates to capture the desired text. The OCR platform searches for the selected text and records it in a digital format (e.g. PDF). One of the biggest players in enterprise-ready OCR software is ABBYY, with its offerings ABBYY FineReader and FlexiCapture.

It is not uncommon for platforms to maintain an accuracy rate of 98%+ with a good-quality scan. Although this rate appears extremely high, small mistakes can often result in the loss of important data. In invoice processing, for example, missing a key data point such as the supplier name can render the exercise useless. Organisations often implement a manual review process or a human verification station to increase the reliability of the scans. Despite this commitment of human labour, the benefits of automation are still strong enough to deliver good value.
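To see why a headline figure of 98% can still lose data, a rough calculation helps. The sketch below assumes the 98% is a per-character accuracy and that character errors are independent of each other. Both are simplifying assumptions made for illustration, not claims about any particular platform, and the field names and lengths are hypothetical.

```python
def field_success_probability(char_accuracy: float, field_length: int) -> float:
    """Probability that every character in a field is read correctly,
    assuming independent per-character errors (a simplification)."""
    return char_accuracy ** field_length


# Hypothetical field lengths (in characters) for an invoice.
fields = {"supplier_name": 25, "invoice_number": 10, "total_amount": 8}

for name, length in fields.items():
    p = field_success_probability(0.98, length)
    print(f"{name}: {p:.1%} chance of an error-free read")
```

Under these assumptions, a 25-character supplier name is read without any error only about 60% of the time, which is one way to understand why a verification step is so often added.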

OCR Solutions Provide Value, But We Can Do Better!

The most publicised drawback of traditional OCR solutions, such as ABBYY FlexiCapture, is that templates need to be updated with each change to the form. If the form deviates from the rules, the output is often scrambled and unusable. Furthermore, handwriting is a common hurdle that is often avoided and placed in the exceptions basket.
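The brittleness of templates can be illustrated with a toy example. The sketch below is not how FlexiCapture or any real product works internally; it simply assumes a hypothetical template that reads each field from a fixed line and column range of an already-recognised page, and shows what happens when the form gains one extra line.

```python
# A hypothetical "template": each field is read from a fixed position.
TEMPLATE = {
    "supplier": (0, 10, 30),  # (line index, start column, end column)
    "total": (2, 10, 30),
}


def extract_with_template(page_lines, template):
    """Read each field from its fixed position, with no understanding of labels."""
    result = {}
    for field, (line, start, end) in template.items():
        text = page_lines[line][start:end] if line < len(page_lines) else ""
        result[field] = text.strip()
    return result


original_form = [
    "Supplier: Acme Pty Ltd",
    "Date:     2020-01-31",
    "Total:    1250.00",
]

# The same form after a title line was added at the top.
revised_form = ["TAX INVOICE"] + original_form

print(extract_with_template(original_form, TEMPLATE))
print(extract_with_template(revised_form, TEMPLATE))
```

On the original layout the fields come out correctly. On the revised layout the same template reads a fragment of the title as the supplier and the date as the total, which is the kind of scrambled output described above.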

New Entrants and Start-Ups Are Driving Change

The market is demanding AI-driven alternatives that boost efficiency and support workplace transformation. Data capture alone is not enough; enterprises want insights as well.

The Tech Giants, such as Amazon and Microsoft, have entered this space, building on their existing offerings and technology stacks. Of particular interest is Amazon’s offering, Textract, which claims to overcome the need for a verification station by using machine learning to instantly “read” virtually any type of document and accurately extract text and data. Textract detects key-value pairs in document images automatically, so that you can retain the inherent context of the document without any manual intervention. A key-value pair is a set of linked data items. For instance, on a document, the field “First Name” would be the key and “Jane” would be the value. This makes it easy to import the extracted data into a database or to pass it as a variable to an application. With traditional OCR solutions, keys and values are extracted as simple text. The relationship between them is lost unless hard-coded rules are written and maintained for each form.
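The difference between plain text and key-value output can be sketched in a few lines of Python. The data structures below are illustrative only and are not Textract's actual response format; the `to_record` helper is a hypothetical name introduced for this example.

```python
# Plain-text output, as a traditional OCR engine might return it:
# the words are present, but nothing links a label to its value.
plain_text = ["First Name", "Jane", "Last Name", "Doe"]

# Key-value output: each key is explicitly linked to its value.
key_value_pairs = [
    {"key": "First Name", "value": "Jane"},
    {"key": "Last Name", "value": "Doe"},
]


def to_record(pairs):
    """Turn extracted pairs into a dict that could map onto a database row."""
    return {pair["key"]: pair["value"] for pair in pairs}


record = to_record(key_value_pairs)
print(record["First Name"])
```

With the plain-text list, working out that “Jane” belongs to “First Name” would need a rule written for that specific form; with linked pairs, the lookup is direct.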

Microsoft, by way of its Power Platform, offers AI Builder Forms Processing. This app allows you to create AI models that use machine learning to identify and extract key-value pairs and table data from documents. Like traditional OCR, it requires training on sample documents so that the model understands what information needs to be extracted. This handy addition to Power Platform not only provides basic OCR to power your AI models but also drives reporting and automation benefits by drawing on other apps in the Power Platform, such as Power Apps and Flow.

Although these offerings are largely new to market, they provide strong value propositions to organisations that already run on the vendor's technology stack. Basic as they are, they may provide your organisation with a cost-effective and convenient solution.

There is also a growing number of start-ups selling new, specialised solutions, such as HyperScience, which specialises in handwritten OCR. These will continue to challenge the Tech Giants or become acquisition targets for them.

ABBYY, realising it must respond, has added AI/ML capabilities, predominantly around the classification of documents. Its new intelligent Image Classifier collects and processes visual information about document images and delivers fast classification results. The advanced Text Classifier extracts and processes information about the documents’ content, which increases classification accuracy. The Image Classifier and the Text Classifier can be used individually or in combination.

Final Comments

If your organisation has a Microsoft or Amazon tech stack, why not look at their offerings? Despite being new to market, their convenience and opportunities may deliver a cost-effective outcome for your organisation.

The traditional players, such as ABBYY, provide an enterprise-ready and comprehensive solution with a proven track record in high-volume OCR. ABBYY will continue to broaden and refine its AI/ML offerings, as this capability is now expected as part of any information capture platform.
