First of all, what is OCR? OCR stands for Optical Character Recognition: when we need to read the words and sentences in an image, we use OCR technologies and programs. In this article I'll talk about Tesseract. Tesseract is an OCR program built on the Leptonica image processing library and its own OCR libraries. It is open source and is developed by volunteers.
HP and Google formerly supported the technology and Tesseract's algorithm, but it is now fully open source. Tesseract is a widely trusted open source program. Many developers and engineers who work in computer vision or image processing use the Tesseract OCR and OpenCV libraries.
In my opinion, OpenCV and Tesseract together are a great combination for image processing and OCR. So, let's suppose we need to find a word in a picture: we took a photo and started processing the image.
We need to optimize the photo to get the best OCR result. For that, I want to use OpenCV with C++.
The fine details in the photo are what cause the recognition errors. Edit the photo before running OCR to improve Tesseract's results.
Thinning and blurring functions may help. I will cover them in my next posts.
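To make the idea of cleaning up the photo a bit more concrete, here is a minimal sketch in plain C++ with no OpenCV dependency. It applies a 3x3 box blur and then a fixed threshold to a tiny grayscale image. The GrayImage struct, the function names boxBlur3x3 and binarize, and the threshold level are my own illustrative assumptions, not part of OpenCV or Tesseract. In a real project you would more likely reach for OpenCV's cv::blur and cv::threshold, and thinning is not covered here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical minimal grayscale image: row-major, one byte per pixel.
struct GrayImage {
    int width;
    int height;
    std::vector<std::uint8_t> pixels;

    std::uint8_t at(int x, int y) const {
        return pixels[static_cast<std::size_t>(y) * width + x];
    }
};

// 3x3 box blur: each output pixel is the integer mean of its in-bounds neighbours.
GrayImage boxBlur3x3(const GrayImage& src) {
    assert(src.pixels.size() == static_cast<std::size_t>(src.width) * src.height);
    GrayImage dst{src.width, src.height,
                  std::vector<std::uint8_t>(src.pixels.size())};
    for (int y = 0; y < src.height; ++y) {
        for (int x = 0; x < src.width; ++x) {
            int sum = 0;
            int count = 0;
            for (int dy = -1; dy <= 1; ++dy) {
                for (int dx = -1; dx <= 1; ++dx) {
                    const int nx = x + dx;
                    const int ny = y + dy;
                    // Skip neighbours that fall outside the image.
                    if (nx < 0 || ny < 0 || nx >= src.width || ny >= src.height) continue;
                    sum += src.at(nx, ny);
                    ++count;
                }
            }
            dst.pixels[static_cast<std::size_t>(y) * src.width + x] =
                static_cast<std::uint8_t>(sum / count);
        }
    }
    return dst;
}

// Fixed threshold: pixels above `level` become white (255), the rest black (0).
GrayImage binarize(const GrayImage& src, std::uint8_t level) {
    GrayImage dst = src;
    for (auto& p : dst.pixels) {
        p = (p > level) ? 255 : 0;
    }
    return dst;
}
```

With a bright 3x3 image that has a single dark speck in the middle, thresholding alone keeps the speck, while blurring first lets it disappear into the background. That is the kind of small detail that can confuse OCR.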
Now we can use the Tesseract code...
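Once Tesseract has returned the recognized text, the original goal of finding a word in the picture comes down to a string search. The sketch below assumes the OCR output is already in a std::string (for example, copied from the buffer returned by TessBaseAPI::GetUTF8Text). The helper names normalizeToken and containsWord are hypothetical, and the matching is ASCII-only and case-insensitive, which may not be enough for other languages.

```cpp
#include <algorithm>
#include <cctype>
#include <sstream>
#include <string>

// Strip punctuation from both ends of a token and lower-case it (ASCII only).
std::string normalizeToken(std::string token) {
    auto isPunct = [](unsigned char c) { return std::ispunct(c) != 0; };
    while (!token.empty() && isPunct(token.front())) token.erase(token.begin());
    while (!token.empty() && isPunct(token.back())) token.pop_back();
    std::transform(token.begin(), token.end(), token.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return token;
}

// Return true if `word` appears as a whole whitespace-separated token in the OCR text.
bool containsWord(const std::string& ocrText, const std::string& word) {
    const std::string target = normalizeToken(word);
    if (target.empty()) return false;

    std::istringstream stream(ocrText);
    std::string token;
    while (stream >> token) {
        if (normalizeToken(token) == target) return true;
    }
    return false;
}
```

Matching whole tokens instead of substrings is a deliberate choice here: searching for "blur" should not report a hit just because the text contains "blurring".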
How do you add Tesseract to Visual Studio 15? You can look at our article about installing Tesseract in VS.
OpenCV thinning function result and Tesseract OCR result:
Thinning and blurring may work.