The one and only image recognition library for Delphi.
Description
The general purpose of the algorithm is to detect whether a small image (the “pattern”) is present in a larger image (the “main image”). The algorithm tolerates some variation, so the pattern is still found if it is slightly rotated (at most 20 degrees) or resized (at most 30%).
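To make the idea concrete, here is a minimal sketch of exhaustive pattern matching on grayscale data. It is not the library's actual algorithm: it handles neither rotation nor resizing, and the names `find_pattern` and `max_mean_diff` are made up for illustration. Images are assumed to be plain lists of rows of 8-bit gray values.

```python
def find_pattern(image, pattern, max_mean_diff=10.0):
    """Return (x, y) top-left positions where pattern matches image.

    image, pattern: lists of rows of 8-bit gray values (0..255).
    max_mean_diff: hypothetical tolerance, expressed as the mean absolute
    difference allowed per pixel.
    """
    img_h, img_w = len(image), len(image[0])
    pat_h, pat_w = len(pattern), len(pattern[0])
    limit = max_mean_diff * pat_w * pat_h
    hits = []
    for y in range(img_h - pat_h + 1):
        for x in range(img_w - pat_w + 1):
            total = 0
            for py in range(pat_h):
                img_row = image[y + py]
                pat_row = pattern[py]
                for px in range(pat_w):
                    total += abs(img_row[x + px] - pat_row[px])
                if total > limit:
                    break  # early exit: this position is already too different
            if total <= limit:
                hits.append((x, y))
    return hits


# Tiny demo: a 6x5 white page with one 2x2 dark "checkbox" at x=3, y=1.
page = [[255] * 6 for _ in range(5)]
for yy in (1, 2):
    for xx in (3, 4):
        page[yy][xx] = 0
box = [[0, 0], [0, 0]]
matches = find_pattern(page, box, max_mean_diff=10.0)
print(matches)
```

The sum of absolute differences is only one possible similarity measure. A real implementation would probably use something more robust to brightness changes, such as normalized cross-correlation.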
Applicability
A good usage example is finding checkboxes (checked or unchecked) in a form.
For each checkbox it finds, the algorithm reports its position. A checkbox has two states (checked/unchecked), so the algorithm has to be run once per state, each time with the corresponding pattern.
The Input
The output quality of the algorithm naturally depends on the quality of the input. The algorithm was designed for scanned images stored in lossless formats (such as bmp, png, tiff). It might also work on noisy images: images acquired with photo cameras instead of scanners, images compressed with a lossy algorithm (such as jpeg), or images reduced to a limited color palette (such as gif). The user has to convert the input format (jpg, png, etc.) to the input type accepted by the algorithm (grayscale 8-bit bitmap). For very noisy images, the user is advised to clean the input first (straightening, deblocking, denoising, jpg artifact removal, etc.).
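As an illustration of the conversion step, the sketch below turns RGB pixels into 8-bit gray values using the common Rec. 601 luma weights (0.299, 0.587, 0.114). This is an assumption: the library does not prescribe a particular conversion, and `to_gray8` is a hypothetical helper, not part of its API.

```python
def to_gray8(rgb_rows):
    """Convert rows of (r, g, b) tuples (0..255 each) to rows of 8-bit gray.

    Uses integer arithmetic with the Rec. 601 weights scaled by 1000;
    adding 500 before the division rounds to the nearest integer.
    """
    gray = []
    for row in rgb_rows:
        gray.append(
            [(299 * r + 587 * g + 114 * b + 500) // 1000 for r, g, b in row]
        )
    return gray


# White, black, pure red and pure blue pixels.
rgb = [[(255, 255, 255), (0, 0, 0)], [(255, 0, 0), (0, 0, 255)]]
gray = to_gray8(rgb)
print(gray)
```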
As a concrete example, let’s say we want to count how many checkboxes are checked in this “Are you crazy – Self diagnostic test” form. We grab a screenshot of the form as completed by the patient:
We load the above screenshot into the program:
Of course, we also need to load a small crop of the area where a checkbox is. We need one crop for each state (checked and unchecked):
After we run the program, it will indicate the coordinates where the checkboxes have been detected:
As visual feedback, it will also mark the position of each checkbox with a red rectangle:
The program can process thousands of such forms per hour.
Compilation
Written in Delphi 10.4 – Delphi 11. Should also work under Lazarus.
Speed
The running time of the algorithm grows steeply with the size of the input; therefore, one can scale down the image to improve the speed. The speed can be improved dramatically by several techniques (downsizing the image, pyramid search, multi-threading, caching some of the calculations). I will try to implement these in the next version of the algorithm, which could make it at least 30 times faster.
For the above input image, the running time of the program (in debug mode) is 4.1 seconds (on my 11-year-old hardware).
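To get a feel for what downsizing can buy, the sketch below halves an image by 2x2 block averaging and estimates the work of an exhaustive sliding-window search before and after. The cost model (number of candidate positions times number of pattern pixels) and the image dimensions are assumptions for illustration only; the library's real cost profile may differ.

```python
def halve(image):
    """Downscale a gray image by 2 in each direction using 2x2 averaging."""
    h = len(image) // 2 * 2      # ignore an odd last row
    w = len(image[0]) // 2 * 2   # ignore an odd last column
    return [
        [
            (image[y][x] + image[y][x + 1]
             + image[y + 1][x] + image[y + 1][x + 1] + 2) // 4
            for x in range(0, w, 2)
        ]
        for y in range(0, h, 2)
    ]


def brute_force_cost(img_w, img_h, pat_w, pat_h):
    """Pixel comparisons made by an exhaustive sliding-window search."""
    positions = (img_w - pat_w + 1) * (img_h - pat_h + 1)
    return positions * pat_w * pat_h


small = halve([
    [0, 0, 255, 255],
    [0, 0, 255, 255],
    [100, 100, 50, 50],
    [100, 100, 50, 50],
])

# Hypothetical sizes: a 1000x1400 page and a 20x20 checkbox pattern.
full_cost = brute_force_cost(1000, 1400, 20, 20)
half_cost = brute_force_cost(500, 700, 10, 10)
speedup = full_cost / half_cost
print(small, round(speedup, 1))
```

Under this model, halving both the image and the pattern cuts the work by a factor of a little under 16. A pyramid search builds on the same idea: it searches the downscaled image first and then checks only the neighborhood of each coarse hit at full resolution.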