Image page segmentation mode in the OCR of Arigamix CRM

Arigamix CRM with built-in OCR module

The upcoming Arigamix CRM comes with an efficient and user-friendly optical character recognition (OCR) module. Once a document has been read, the user can choose how the recognised data is displayed, which makes filling in metadata easier. Besides the default option, the user can hide elements that are not currently needed and keep a single view: text, paragraph, line, word, table or barcode. Elements recognised in the chosen view are marked in different colours for easier data processing.

What is image page segmentation mode?

The image page segmentation mode defines how an image is split into parts and how the image as a whole is analyzed before further processing. It tells the recognition tool how to analyze and process the input data. Different modes can produce different results, so the better the chosen mode matches the image, the more accurate the recognized text will be.
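As a rough illustration, the modes described in the next section can be pictured as an enumeration. This is only a sketch: the class name, member names and helper properties are hypothetical and are not part of any Arigamix API. The numeric values 0 to 13 are assumed from the way the article refers to modes by number (mode 1, mode 3, mode 4, mode 7, mode 11).

```python
from enum import IntEnum


class PageSegMode(IntEnum):
    """Hypothetical names for the segmentation modes listed in this article."""
    OSD_ONLY = 0                # orientation and script detection only, no OCR
    AUTO_OSD = 1                # automatic segmentation with orientation/script detection
    AUTO_ONLY = 2               # automatic segmentation, no detection and no OCR
    AUTO = 3                    # automatic segmentation, no detection
    SINGLE_COLUMN = 4           # single column of variable-size text
    SINGLE_BLOCK_VERT_TEXT = 5  # single uniform block of vertically aligned text
    SINGLE_BLOCK = 6            # single uniform block of text
    SINGLE_LINE = 7             # single line of text
    SINGLE_WORD = 8             # single word
    CIRCLE_WORD = 9             # single word in a circle
    SINGLE_CHAR = 10            # single character
    SPARSE_TEXT = 11            # sparse text in no particular order
    SPARSE_TEXT_OSD = 12        # sparse text with orientation/script detection
    RAW_LINE = 13               # single line, bypassing internal pre-processing

    @property
    def detects_orientation(self) -> bool:
        # Only these modes are described as determining orientation and script.
        return self in (PageSegMode.OSD_ONLY,
                        PageSegMode.AUTO_OSD,
                        PageSegMode.SPARSE_TEXT_OSD)

    @property
    def runs_ocr(self) -> bool:
        # Modes 0 and 2 are described as not performing OCR.
        return self not in (PageSegMode.OSD_ONLY, PageSegMode.AUTO_ONLY)


for mode in PageSegMode:
    print(int(mode), mode.name, mode.detects_orientation, mode.runs_ocr)
```

Keeping the modes in one typed enumeration makes it harder to pass an out-of-range number to the recognition tool by mistake.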

Types of image page segmentation modes

  0. Orientation and script detection only, without image page segmentation or OCR. This is an auxiliary mode: it does not perform OCR and only helps to determine the page orientation (0, 90, 180 or 270 degrees) and the script, i.e. the writing system, such as Latin or Cyrillic, together with the confidence of that detection.
  1. Automatic image page segmentation with orientation and script detection. This mode automatically detects the page layout and the position of the text on it, performs optical text recognition, and determines the page orientation and script.
  2. Automatic image page segmentation without orientation and script detection and without OCR.
  3. Automatic image page segmentation without orientation and script detection. This mode is the same as mode 1, except that no orientation or script detection is performed during OCR. The recognition tool therefore segments the text as if it were a correctly oriented page containing several words, lines, paragraphs, and so on. To determine the page orientation and script, first run recognition with mode 1 and then with mode 3.
  4. Image page segmentation as a single column of variable-size text, without orientation and script detection. Use this mode when you need to recognize column data and want the text to be concatenated line by line (for example, tabular data or receipts). No orientation or script detection is performed during OCR.
  5. Image page segmentation as a single uniform block of vertically aligned text, without orientation and script detection. This mode is similar to mode 4, but is intended only for images rotated 90 degrees clockwise.

 

  6. Image page segmentation as a single uniform block of text. This mode is best suited to pages such as book pages, which tend to use the same typeface and dense text throughout. The keyword here is uniform: the text is set in one font without any variation.

 

  7. Image page segmentation as a single line of text, without orientation and script detection. Use this mode when working with a single line of uniform text, for example when recognizing license plates or other codes.
  8. Image page segmentation as a single word, without orientation and script detection. Use this mode when working with a single word of uniform text. As with mode 7, typical inputs are license plates or other codes.
  9. Image page segmentation as a single word in a circle, without orientation and script detection. Use this mode when the text in the image is either inside a circle or wraps around an invisible circular or arc-shaped area.

 

  10. Image page segmentation as a single character, without orientation and script detection. Use this mode when a single character in an image has to be recognized. It is typically applied after the image has been split into individual characters (say, a license plate), with each character then recognized separately. This approach gives greater accuracy than recognizing the full license plate at once, but requires more resources.
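The article does not say how an image is split into individual characters. One common approach, shown here only as an assumption, is a vertical projection: columns that contain no dark pixels are treated as gaps between characters. The sketch uses a plain list-of-lists bitmap (1 = dark pixel) and hypothetical function names; each cropped region would then be passed to the recognition tool in single-character mode.

```python
from typing import List, Tuple

Bitmap = List[List[int]]  # rows of pixels, 1 = dark (ink), 0 = background


def split_into_characters(bitmap: Bitmap) -> List[Tuple[int, int]]:
    """Return (start, end) column ranges, end exclusive, one per character."""
    if not bitmap:
        return []
    width = len(bitmap[0])
    # A column counts as "inked" if any row has a dark pixel in it.
    inked = [any(row[x] for row in bitmap) for x in range(width)]

    spans = []
    start = None
    for x, has_ink in enumerate(inked):
        if has_ink and start is None:
            start = x                  # a character begins
        elif not has_ink and start is not None:
            spans.append((start, x))   # a blank column ends the character
            start = None
    if start is not None:
        spans.append((start, width))   # character touches the right edge
    return spans


def crop(bitmap: Bitmap, span: Tuple[int, int]) -> Bitmap:
    """Cut one character region out of the bitmap."""
    start, end = span
    return [row[start:end] for row in bitmap]


# Toy "plate" with three glyphs separated by blank columns.
plate = [
    [1, 1, 0, 1, 0, 0, 1, 1],
    [1, 0, 0, 1, 0, 0, 1, 1],
    [1, 1, 0, 1, 0, 0, 0, 1],
]
spans = split_into_characters(plate)
print(spans)  # [(0, 2), (3, 4), (6, 8)]
characters = [crop(plate, span) for span in spans]
```

This simple projection fails when characters touch or overlap, which is one reason per-character recognition costs more effort than recognizing the whole line at once.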

 

  11. Image page segmentation as a collection of words in no particular order (sparse text), without orientation and script detection. Use this mode when the image contains a lot of sparse text that needs to be extracted. In this case the structure of the document and the order and grouping of the text do not matter; only the text itself does.

 

  12. Image page segmentation as sparse text with orientation and script detection. This mode is similar to mode 11, but also detects the page orientation (0, 90, 180 or 270 degrees) and the script.

  13. Image page segmentation as a single text line, without orientation and script detection. This mode works similarly to mode 7, but should be used when orientation and script detection, segmentation, and other internal OCR pre-processing steps degrade the result, for example by reducing accuracy or detecting no text at all. This usually happens when a piece of text is severely cropped, when the text is computer generated or styled in some way, or when the font is one that the recognition tool might not recognize automatically.
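The guidance above can be condensed into a small lookup. The following is a minimal sketch, not the module's actual selection logic: the function name, the layout labels and both flags are hypothetical, and the returned numbers assume the modes are numbered 0 to 13 in the order they are described.

```python
def suggest_mode(layout: str,
                 detect_orientation: bool = False,
                 skip_preprocessing: bool = False) -> int:
    """Suggest a segmentation mode number for a rough description of the image.

    layout is a hypothetical label: "page", "sparse", "line", "column",
    "rotated_block", "uniform_block", "word", "circle_word" or "character".
    """
    if layout == "page":
        # Full page: mode 1 also detects orientation and script, mode 3 does not.
        return 1 if detect_orientation else 3
    if layout == "sparse":
        # Scattered text: mode 12 adds orientation and script detection to mode 11.
        return 12 if detect_orientation else 11
    if layout == "line":
        # Mode 13 is the fallback when internal pre-processing hurts the result.
        return 13 if skip_preprocessing else 7

    fixed = {
        "column": 4,         # receipts, tabular data
        "rotated_block": 5,  # like mode 4, image rotated 90 degrees clockwise
        "uniform_block": 6,  # book pages with one typeface
        "word": 8,
        "circle_word": 9,
        "character": 10,
    }
    if layout not in fixed:
        raise ValueError(f"unknown layout: {layout!r}")
    return fixed[layout]


print(suggest_mode("page"))                           # 3
print(suggest_mode("sparse", detect_orientation=True))  # 12
print(suggest_mode("line", skip_preprocessing=True))    # 13
```

In practice the choice is often confirmed by trying two or three candidate modes on sample images and comparing the recognized text.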
