Overview
Usage examples
1. Natural images: You can search if a specific car is present in a image or not.
2. Generated images: Finding out if a check box is checked or not in an input image. The image can be a screenshot (in this case you get 100.0% accuracy) of a program or a scan of a paper form.
3. OCR – Finding text in an image. You can search a whole string (fast), or you can search each character in the alphabet (slow). This way you can basically build an OCR program.
etc
Is this AI image recognition?
No. This is different than AI image recognition algorithms. An AI will recognize all type of cars in the main image.
This NGC algorithm will recognize only one pattern (car). If you have 100 types of cars, you need to search each one individually (therefore, slower than the AI).
However, compared with the AI where you have to pay for each image you process, this NGC algorithm has no additional costs.
Input
Input source
The output quality of the algorithm is (naturally) directly proportional to the quality of the input image:
1. Screenshots: Provide optimal recognition conditions with 100% accuracy.
2. Images scanned with a scanner: Very very good accuracy when sourced from flatbed scanners.
3. Smart phone: The algorithm works also on noisy images (images that have been acquired with photo cameras instead of scanners) but its accuracy will be lower.
Storage type (recommendation)
1. Non-loosy compression: To prevent further degradation of your input, store your images under non-lossy compression algorithms (such as zipped bmp, bmp, png, tiff).
2. Lossy compression: Lossy compression algorithms (such as jpeg, gif) can dramatically affect the quality of the input image. Of course, once you converted the image to JPG, the damage was done – the quality was lost forever even if you convert your image back to bmp or png.
The recommended storage format is PNG.
Input format
The accepted input type of the algorithm is grayscale 8-bit bitmaps. Of course, you can write a small routine to convert your input from png, jpg, etc to bmp.
The cGraphLoader.pas in my LightSaber Delphi library offers a function called LoadGraph that can load common file formats (GIF, JPG, BMP, PNG, WB1, RainDrop, JPG2K) from disk. Of course, you can also use your own.
Input flexibility
The algorithm allows for some variations, so the pattern image would be found even if it were slightly rotated (maximum 20 degrees) or resized (maximum 30%).
Speed
For the input image below, the speed of the program (in debug mode) is 4.1 seconds (on a 12-years old AMD FX 8350). The time increases (of course) exponentially with the size of the image. If you have really large images, you can of course scale them down.
The speed of the algorithm can be dramatically improved by several techniques (downsizing the image, pyramid search, multi-threading, caching some of the calculations). I will try to implement these in the next version of the algorithm. This could improve the speed at least 30 times making the algorithm more than suitable for real-time image processing.
Note that there are other ways to dramatically speed up the algorithm (by cutting some corners) but they will result in less than 100% accuracy.
The ideal scenario is when you don’t need to process images in real time. In this case, you can simply process the images in background and store the results (the coordinates where the pattern was detected in the main image) somewhere to a file (INI file, etc). This way, you can simulate real-time performance.
Demo program
[NGC demo program.zip – 2MB]
Easy fix: If you have an images that has the pattern right on the border on the Main image, you will have to put a fake border around your Main image.
Techniques to improve the accuracy
If you have noisy or low-quality inputs, employ preprocessing methods to enhance accuracy:
- Straightening misaligned images.
- Deblocking and denoising.
- Removing compression artifacts.
- Adjusting contrast and brightness.
Delphi compatibility
Written and tested in Delphi 10.4 / Delphi 11, but it should work without any modifications on all Delphi editions that support inline variables. If you have an older edition you need to re-declare the variables (5 minutes work). The code should work also under Lazarus/FPC but I have never tested it.
The library has been tested with FastMM4 and has no memory leaks.
This NGC library needs some functions from my LightSaber library, in order/process to open images.
If you want, you can remove this dependency (you can write your own image opening routines – or you can extract them from LightSaber).
Related image processing libraries
- Delphi-SSIM-ImageQuality
- Delphi-LightSaber (General purpose)
- Multithread thumbnail viewer