With the AnylineOCR Module, you can create your own OCR use case on the fly with almost no effort. It offers a variety of parameters that let you adjust the scanning process to your use case.
This section describes the parameters in detail.
If you are looking for a How-To on loading the Anyline OCR Module on your platform, please refer to the following sections:
- iOS: Anyline OCR Module
- Android: Anyline OCR Module
- Cordova: Set the Scan Mode
- React-Native: Set the Scan Mode
- Xamarin.iOS: Implementing Anyline
- Xamarin.Android: Adding the Module and configure it
Simultaneous Barcode Scanning
Starting from SDK 3.8, Anyline supports simultaneous barcode scanning for any module. Additional information can be found under Simultaneous Barcode Scanning.
scanMode provides the basis for the scanning experience. There are three options: AUTO, LINE and GRID.
As a rule of thumb: If you can place a grid on top of the text that you want to scan, use GRID mode, otherwise use LINE mode.
AUTO automatically detects the valid text within the cutout.
New in version 3.11.
AUTO mode detects the text to be scanned if it is placed within the cutout.
It recognizes whether the text consists of one or multiple lines, upper or lowercase characters and/or numbers, and adjusts the scan parameters accordingly.
In this mode, all parameters are optional. The following parameters can be set in order to improve the scanning process:
- Traineddata file: Set this parameter if the font you are trying to scan differs from a standard sans-serif font. Otherwise there is no need to set it.
- charWhitelist: Helps to filter false positives, like the number 8 being returned instead of the letter B.
- validationRegex: Validates the result against the desired structure. This helps to avoid false scans, especially if the cutout is not placed on top of the text when scanning starts.
As of version 3.12, AUTO mode detects text of up to 30 characters per line in up to 7 lines.
Version 3.11 did not include multiline support, and lowercase character detection was decided by checking the charWhitelist. This was changed in version 3.12.
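To make the version 3.12 limits concrete, the following Python sketch checks whether an expected text fits within them. It is only an illustration: the `fits_auto_mode` helper and the two constants are hypothetical names, not part of the Anyline SDK, and the SDK itself performs no such check on your behalf.

```python
MAX_LINES = 7             # line limit stated for AUTO mode as of version 3.12
MAX_CHARS_PER_LINE = 30   # per-line character limit stated for version 3.12


def fits_auto_mode(text):
    """Return True if `text` stays within the documented AUTO mode limits."""
    lines = text.splitlines()
    return (
        len(lines) <= MAX_LINES
        and all(len(line) <= MAX_CHARS_PER_LINE for line in lines)
    )


print(fits_auto_mode("SERIAL 12345\nBATCH A7"))  # prints True
print(fits_auto_mode("X" * 31))                  # prints False
```

If the text you expect exceeds these limits, one of the other scan modes may be a better starting point.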
LINE mode is the best option for scanning one or more lines of text of arbitrary length.
This could be an IBAN code, or a mail header where the number of lines and their lengths are not known in advance.
Number of characters
LINE mode requires at least 4 characters per line. If your use case has 3 or fewer characters, consider using a different scan mode.
Defines the minimum height that symbols must have to be considered in the scanning process.
If, for example, you know that the text you are going to scan is rather big, setting this to a high value prevents smaller contours in the image from being taken into account.
Defines the maximum height that symbols may have to be considered in the scanning process.
If, for example, you know that the text you are going to scan is rather small, setting this to a low value prevents bigger contours in the image from being taken into account.
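The effect of the two height limits can be pictured as a simple filter over detected contours. The Python sketch below is illustrative only: the `Contour` type and the `filter_by_height` helper are hypothetical and not part of the Anyline SDK, and heights are assumed to be given in pixels.

```python
from dataclasses import dataclass


@dataclass
class Contour:
    """Hypothetical bounding box of a detected symbol candidate."""
    x: int
    y: int
    width: int
    height: int


def filter_by_height(contours, min_height, max_height):
    """Keep only contours whose height lies within [min_height, max_height]."""
    return [c for c in contours if min_height <= c.height <= max_height]


contours = [
    Contour(0, 0, 4, 6),       # small speck, e.g. print noise
    Contour(10, 0, 20, 40),    # a character of the expected size
    Contour(40, 0, 90, 120),   # a large logo next to the text
]
kept = filter_by_height(contours, min_height=30, max_height=60)
print([c.height for c in kept])  # prints [40]
```

A narrow height range rejects both the noise and the logo here, which is the behaviour the two parameters are meant to give you.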
The OCR part of the SDK relies on so-called traineddata files, which are specific to a font and language.
This parameter tells the module which traineddata file to use when performing the OCR.
You can use one of the default traineddata files that come with the SDK bundle.
Load the traineddata file on Android
On Android, the traineddata files must first be copied via copyTrainedData.
If you have a font you want to use, you can head over to trainyourtesseract.com and create a traineddata file for free.
Defines a whitelist of characters that are allowed in a result.
Setting this parameter carefully has two benefits:
First, the accuracy of the results will improve. If you have a code that only contains the number 8, but not the letter B, removing B from the charWhitelist will improve the confidence of the result (as the two symbols can look alike).
Second, this parameter, together with validationRegex, helps prevent you from getting incorrect results.
Missing Characters in charWhitelist
If the scan detects symbols that are not in the charWhitelist, the performance of the scan may suffer.
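When choosing a whitelist, it can help to check your expected codes against it before scanning. The Python sketch below illustrates this; `disallowed_characters` is a hypothetical helper and not an Anyline SDK function, and the whitelist is assumed to be a plain string of allowed characters.

```python
def disallowed_characters(text, char_whitelist):
    """Return the sorted characters of `text` that are not in `char_whitelist`."""
    allowed = set(char_whitelist)
    return sorted({ch for ch in text if ch not in allowed})


# Whitelist for a purely numeric code: the letter B is deliberately left out,
# so it cannot be confused with the number 8.
char_whitelist = "0123456789"

print(disallowed_characters("4818", char_whitelist))  # prints []
print(disallowed_characters("4B1B", char_whitelist))  # prints ['B']
```

An empty list for every sample code suggests the whitelist covers your use case; any reported character is either missing from the whitelist or should not appear in the code.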
Defines a Regular Expression which the detected result is validated against.
If a detected result does not match the validationRegex, it will not be returned.
The Regular Expression is written in ECMAScript regular expression pattern syntax.
As of version 3.12, the Anyline OCR Module provides predefined Regular Expressions. Please see the iOS API Reference and the Android API Reference for further details.
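The following Python sketch shows how such a validation behaves for a made-up code structure. Two things here are assumptions: the pattern (two capital letters followed by six digits) is a hypothetical example, and Python's `re` module is used in place of an ECMAScript engine, which is fine for this simple pattern but not for every ECMAScript feature.

```python
import re

# Hypothetical structure: two capital letters followed by six digits.
validation_regex = r"^[A-Z]{2}[0-9]{6}$"


def is_valid(result, pattern=validation_regex):
    """Return True if the whole result matches the pattern."""
    return re.fullmatch(pattern, result) is not None


print(is_valid("AB123456"))  # prints True
print(is_valid("A8123456"))  # prints False (digit where a letter is required)
print(is_valid("AB12345"))   # prints False (one digit short)
```

A pattern that pins down both the length and the character class at each position rejects partial reads as well as look-alike substitutions.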
Defines the minimum confidence the SDK must have in a result to consider it valid.
The confidence describes how certain the SDK is that the detected result matches the target to scan.
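Conceptually, the threshold acts as a gate on candidate results. The Python sketch below illustrates this under stated assumptions: the candidates are made-up (text, confidence) pairs, the confidence is assumed to be on a 0 to 100 scale, and `first_confident_result` is a hypothetical helper rather than an SDK call.

```python
def first_confident_result(candidates, min_confidence):
    """Return the first (text, confidence) pair meeting the threshold, or None."""
    for text, confidence in candidates:
        if confidence >= min_confidence:
            return text, confidence
    return None


# Made-up candidates from three consecutive camera frames.
frames = [("A8C", 42), ("ABC", 71), ("ABC", 93)]

print(first_confident_result(frames, min_confidence=65))  # prints ('ABC', 71)
print(first_confident_result(frames, min_confidence=95))  # prints None
```

A higher threshold trades scan speed for reliability: fewer frames qualify, so a result may take longer to arrive or may not arrive at all.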
Additional Settings in LINE mode
If set to true, small contours in the text will not be considered during the scanning process.
If your scanning use case only includes Latin capital letters and/or numbers, set this to true.
If you also want to scan lowercase letters or other symbols, set this to false, as it may otherwise remove details like the dot of the lowercase i.
Defines the minimum sharpness an image must have to be processed further in the SDK.
It is used to avoid time-consuming processing of blurry images, which are unlikely to return a result.
This parameter is experimental. It is recommended to start with a sharpness of 50 and gradually increase the value until you get satisfying results.
If set to true, any whitespace in the returned result will be removed.
This can be useful in scenarios where the information is printed with whitespace for better readability, but the whitespace is not part of the information itself. Scanning IBAN codes is one example of such a scenario.
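The effect on an IBAN printed in groups of four can be sketched in a few lines of Python. The `remove_whitespaces` function below is a hypothetical stand-in that mimics the described behaviour; it is not the SDK's implementation.

```python
def remove_whitespaces(text):
    """Remove every whitespace character (spaces, tabs, line breaks) from `text`."""
    return "".join(text.split())


iban_as_printed = "AT61 1904 3002 3457 3201"
print(remove_whitespaces(iban_as_printed))  # prints AT611904300234573201
```

With the whitespace removed, the result can be validated against a pattern that does not need to account for the printed grouping.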
Additional Settings in GRID mode
Defines the number of symbols in horizontal direction in the grid.
For example, if a code to scan consists of 2 rows with 4 symbols each, this would be set to 4.
Defines the number of symbols in vertical direction in the grid.
For example, if a code to scan consists of 2 rows with 4 symbols each, this would be set to 2.
Defines the average horizontal distance between two characters, measured as a percentage of the character width.
Defines the average vertical distance between two characters, measured as a percentage of the character width.
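To see how the symbol counts and the two distance values relate, the Python sketch below estimates the overall size of a grid. Everything in it is an assumption for illustration: `grid_size` is a hypothetical helper, sizes are in pixels, and both distances are taken as a percentage of the character width, as described above.

```python
def grid_size(count_x, count_y, char_width, char_height, padding_x_pct, padding_y_pct):
    """Estimate the (width, height) of a grid of evenly spaced characters."""
    gap_x = char_width * padding_x_pct / 100   # horizontal gap between characters
    gap_y = char_width * padding_y_pct / 100   # vertical gap, also relative to width
    width = count_x * char_width + (count_x - 1) * gap_x
    height = count_y * char_height + (count_y - 1) * gap_y
    return width, height


# A code of 2 rows with 4 symbols each: 4 symbols horizontally, 2 vertically.
print(grid_size(4, 2, char_width=20, char_height=30,
                padding_x_pct=50, padding_y_pct=25))  # prints (110.0, 65.0)
```

Measuring a printed sample this way gives rough starting values that can then be tuned while scanning.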
If set to true, the SDK looks for bright symbols on a dark background. If set to false, the SDK looks for dark symbols on a bright background.
Setting a Custom Command File
If your use case requires special optimisation, you will be provided a Custom Command File (.ale) by Anyline.
In order to load the custom command file, please refer to the platform-specific implementations.
Settings and Custom Command File
Note that the custom script will override all settings made to the Anyline OCR Config, so you don’t have to set the parameters manually, as they are optimized for your use case.