Update note: Updated for Xcode 10.2, Swift 5, iOS 12.1 and TesseractOCRiOS (5.0.1).
We at raywenderlich.com have figured out a sure-fire way to fulfill your true heart’s desire — and you’re about to build the app to make it happen with the help of Optical Character Recognition (OCR).
OCR is the process of electronically extracting text from images. You’ve undoubtedly seen it before: it’s widely used to process everything from scanned documents, to the handwritten scribbles on your tablet PC, to the Word Lens technology in the Google Translate app.
In this tutorial, you’ll learn how to use Tesseract, an open-source OCR engine maintained by Google, to grab text from a love poem and make it your own. Get ready to impress!
Getting Started
Download the materials for this tutorial by clicking the Download Materials button at the top or bottom of this page, then extract the folder to a convenient location.
The Love In A Snap directory contains three others: Love In A Snap Starter, Love In A Snap Final and Resources.
Open Love In A Snap Starter/Love In A Snap.xcodeproj in Xcode, then build and run the starter app. Click around a bit to get a feel for the UI.
Back in Xcode, take a look at ViewController.swift. It already contains a few @IBOutlets and empty @IBAction methods that link the view controller to its pre-made Main.storyboard interface. It also contains performImageRecognition(_:), where Tesseract will eventually do its work.
Scroll farther down the page and you’ll see:
Now, it’s your turn to take the reins and bring this app to life!
Tesseract’s Limitations
Tesseract OCR is quite powerful, but it does have limitations: it performs best on clean, evenly lit, well-oriented images, it can only recognize languages it has training data for, and the engine itself is written in C++.
Wait. WHAT?
Uh oh! How are you going to use this in iOS? Nexor Technology has created a compatible Swift wrapper for Tesseract OCR.
Adding the Tesseract Framework
First, you’ll have to install Tesseract OCR iOS via CocoaPods, a widely used dependency manager for iOS projects.
If you haven’t already installed CocoaPods on your computer, open Terminal, then execute the following command:
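The command referenced here is the standard CocoaPods installation via RubyGems:

```shell
sudo gem install cocoapods
```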
Enter your computer’s password when requested to complete the CocoaPods installation.
Next, cd into the Love In A Snap starter project folder. For example, if you’ve added Love In A Snap to your desktop, you can enter:
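The exact path depends on where you extracted the materials; assuming the desktop location described above, something like:

```shell
cd ~/Desktop/"Love In A Snap/Love In A Snap Starter"
```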
Next, enter:
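The command in question is CocoaPods’ standard Podfile generator:

```shell
pod init
```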
This creates a Podfile for your project.
Replace the contents of Podfile with:
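A Podfile consistent with this tutorial would look something like this (the platform line is an assumption based on the iOS 12.1 target mentioned in the update note):

```ruby
platform :ios, '12.1'

target 'Love In A Snap' do
  use_frameworks!

  pod 'TesseractOCRiOS'

end
```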
This tells CocoaPods that you want to include TesseractOCRiOS as a dependency for your project.
Back in Terminal, enter:
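That is:

```shell
pod install
```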
This installs the pod into your project.
As the terminal output instructs, “Please close any current Xcode sessions and use `Love In A Snap.xcworkspace` for this project from now on.” Open Love In A Snap.xcworkspace in Xcode.
How Tesseract OCR Works
Generally speaking, OCR uses artificial intelligence to find and recognize text in images.
Some OCR engines rely on a type of artificial intelligence called machine learning. Machine learning allows a system to learn from and adapt to data by identifying and predicting patterns.
The Tesseract OCR iOS engine uses a specific type of machine-learning model called a neural network.
Neural networks are loosely modeled after those in the human brain. Our brains contain about 86 billion connected neurons grouped into various networks that are capable of learning specific functions through repetition. Similarly, on a much simpler scale, an artificial neural network takes in a diverse set of sample inputs and produces increasingly accurate outputs by learning from both its successes and failures over time. These sample inputs are called “training data.”
While educating a system, this training data is fed into the network’s input nodes and passed along the network’s weighted connections, or edges, to produce an output. Then, that output is compared to the desired output, and the edge weights are adjusted accordingly so that subsequent training data passed into the neural network returns increasingly accurate results.
Tesseract looks for patterns in pixels, letters, words and sentences. Tesseract uses a two-pass approach called adaptive recognition. It takes one pass over the data to recognize characters, then takes a second pass to fill in any letters it was unsure about with letters that most likely fit the given word or sentence context.
Adding Trained Data
In order to better hone its predictions within the limits of a given language, Tesseract requires language-specific training data to perform its OCR.
Navigate to Love In A Snap/Resources in Finder. The tessdata folder contains a bunch of English and French training files. The love poem you’ll process during this tutorial is mainly in English, but also contains a bit of French. Très romantique!
Your poem vil impress vith French! Ze language ov love! *Haugh* *Haugh* *Haugh*
Now, you’ll add tessdata to your project. Tesseract OCR iOS requires you to add tessdata as a referenced folder.
You should now see a blue tessdata folder in the navigator. The blue color indicates that the folder is referenced rather than an Xcode group.
Now that you’ve added the Tesseract framework and language data, it’s time to get started with the fun coding stuff!
Loading the Image
First, you’ll create a way to access images from the device’s camera or photo library.
Open ViewController.swift and insert the following into takePhoto(_:):
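A sketch of that code, built with UIAlertController (the button titles match the "Take Photo" and "Choose Existing" options referenced later in this tutorial; the // TODO comments are filled in over the next steps):

```swift
// Create an action sheet offering camera and photo-library options.
let imagePickerActionSheet = UIAlertController(
  title: "Snap/Upload Image", message: nil, preferredStyle: .actionSheet)

// Only offer the camera when the device actually has one.
if UIImagePickerController.isSourceTypeAvailable(.camera) {
  let cameraButton = UIAlertAction(title: "Take Photo", style: .default) { _ in
    // TODO: present the camera picker
  }
  imagePickerActionSheet.addAction(cameraButton)
}

// Always offer the photo library.
let libraryButton = UIAlertAction(title: "Choose Existing", style: .default) { _ in
  // TODO: present the photo-library picker
}
imagePickerActionSheet.addAction(libraryButton)

// Let the user back out.
imagePickerActionSheet.addAction(UIAlertAction(title: "Cancel", style: .cancel))

// Show the action sheet.
present(imagePickerActionSheet, animated: true)
```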
Here, you create a UIAlertController action sheet that lets the user choose between taking a new photo and selecting an existing one, then present it.
Immediately below import UIKit, add:
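Given that a later step tells you to add code “below import MobileCoreServices”, the line in question is:

```swift
import MobileCoreServices
```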
This gives ViewController access to the kUTTypeImage abstract image identifier, which you’ll use to limit the image picker’s media type.
Now, within the cameraButton UIAlertAction’s closure, replace the // TODO comment with:
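A sketch of that picker code (self is needed because you’re inside the action’s closure):

```swift
// Present a camera-backed image picker, restricted to still images.
let imagePicker = UIImagePickerController()
imagePicker.delegate = self
imagePicker.sourceType = .camera
imagePicker.mediaTypes = [kUTTypeImage as String]
self.present(imagePicker, animated: true)
```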
So when the user taps cameraButton, this code creates an image picker, sets its source to the device’s camera, limits it to still images and presents it.
Similarly, within libraryButton’s closure, add:
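The same sketch with the photo library as the source:

```swift
// Present a photo-library-backed image picker, restricted to still images.
let imagePicker = UIImagePickerController()
imagePicker.delegate = self
imagePicker.sourceType = .photoLibrary
imagePicker.mediaTypes = [kUTTypeImage as String]
self.present(imagePicker, animated: true)
```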
This is identical to the code you just added to cameraButton’s closure, aside from imagePicker.sourceType = .photoLibrary. Here, you set the image picker to present the device’s photo library as opposed to the camera.
Next, to process the captured or selected image, insert the following into imagePickerController(_:didFinishPickingMediaWithInfo:):
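A sketch of the delegate method body, using the Swift 5 / iOS 12-era info-dictionary keys:

```swift
// Pull the original image out of the info dictionary.
guard let selectedPhoto = info[.originalImage] as? UIImage else {
  dismiss(animated: true)
  return
}
// Show progress while Tesseract works.
activityIndicator.startAnimating()
// Dismiss the picker, then kick off OCR.
dismiss(animated: true) {
  self.performImageRecognition(selectedPhoto)
}
```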
Here, you retrieve the selected photo, start the activity indicator, dismiss the picker and pass the photo to performImageRecognition(_:).
You’ll code performImageRecognition(_:) in the next section of the tutorial, but, for now, just open Info.plist. Hover your cursor over the top cell, Information Property List, then click the + button twice when it appears.
In the Key fields of those two new entries, add Privacy – Camera Usage Description to one and Privacy – Photo Library Usage Description to the other. Select type String for each. Then in the Value column, enter whatever text you’d like to display to the user when requesting permission to access their camera and photo library, respectively.
Build and run your project. Tap the Snap/Upload Image button and you should see the UIAlertController you just created.
Test out the action sheet options and grant the app access to your camera and/or library when prompted. Confirm the photo library and camera display as expected.
Note: If you’re running on a simulator, there’s no physical camera available, so you won’t see the “Take Photo” option.
All good? If so, it’s finally time to use Tesseract!
Implementing Tesseract OCR
First, add the following below import MobileCoreServices to make the Tesseract framework available to ViewController:
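Given the framework named in this tutorial, the import is:

```swift
import TesseractOCR
```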
Now, in performImageRecognition(_:), replace the // TODO comment with the following:
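A sketch of that code against the TesseractOCRiOS 5.x API, whose main class is G8Tesseract (the engine-mode and page-segmentation values here are reasonable defaults, not confirmed by this excerpt):

```swift
// 1. Initialize Tesseract with combined English + French training data.
if let tesseract = G8Tesseract(language: "eng+fra") {
  // 2. Use the combined legacy + cube recognition engines for best accuracy.
  tesseract.engineMode = .tesseractCubeCombined
  // 3. Let Tesseract recognize paragraph breaks automatically.
  tesseract.pageSegmentationMode = .auto
  // 4. Hand Tesseract the image and run recognition.
  tesseract.image = image
  tesseract.recognize()
  // 5. Show whatever text it found.
  textView.text = tesseract.recognizedText
}
// 6. Hide the progress indicator either way.
activityIndicator.stopAnimating()
```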
Since this is the meat of this tutorial, here’s a detailed breakdown: you initialize a G8Tesseract object with combined English and French training data, configure its recognition engine and page segmentation modes, hand it the image, run recognition, then display the recognized text and stop the activity indicator.
Now, it’s time to test out this first batch of new code!
Processing Your First Image
In Finder, navigate to Love In A Snap/Resources/Lenore.png to find the sample image.
Lenore.png is an image of a love poem addressed to a “Lenore,” but with a few edits you can turn it into a poem that is sure to get the attention of the one you desire! :]
Although you could print a copy of the image, then snap a picture with the app to perform the OCR, you’ll make it easy on yourself and add the image directly to your device’s camera roll. This eliminates the potential for human error, further lighting inconsistencies, skewed text and flawed printing among other things. After all, the image is already dark and blurry as is.
Note: If you’re using a simulator, simply drag-and-drop the image file onto the simulator to add it to its photo library.
Build and run your app. Tap Snap/Upload Image, tap Choose Existing, then choose the sample image from the photo library to run it through OCR.
Note: You can safely ignore the hundreds of compilation warnings the TesseractOCR library produces.
Uh oh! Nothing appears! That’s because the current image size is too big for Tesseract to handle. Time to change that!
Scaling Images While Preserving Aspect Ratio
The aspect ratio of an image is the proportional relationship between its width and height. Mathematically speaking, to reduce the size of the original image without affecting the aspect ratio, you must keep the width-to-height ratio constant.
When you know both the height and the width of the original image, and you know either the desired height or width of the final image, you can rearrange the aspect ratio equation as follows:
This results in the two formulas.
Formula 1: When the image’s width is greater than its height.
Formula 2: When the image’s height is greater than its width.
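Written out, with (w1, h1) the original size and (w2, h2) the scaled size, the constant-ratio equation and the two cases are:

```latex
\frac{h_1}{w_1} = \frac{h_2}{w_2}

% Formula 1: width > height, so fix the new width w_2 and solve for h_2
h_2 = h_1 \cdot \frac{w_2}{w_1}

% Formula 2: height > width, so fix the new height h_2 and solve for w_2
w_2 = w_1 \cdot \frac{h_2}{h_1}
```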
Now, add the following extension and method to the bottom of ViewController.swift:
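A sketch of that extension; the method name scaledImage(_:) matches later references in this tutorial, and the redraw uses plain UIGraphics image-context calls:

```swift
// MARK: - UIImage extension

extension UIImage {
  // Returns a copy of the image no larger than maxDimension on its longest
  // side, preserving the aspect ratio.
  func scaledImage(_ maxDimension: CGFloat) -> UIImage? {
    var scaledSize = CGSize(width: maxDimension, height: maxDimension)

    // Keep the width-to-height ratio constant (Formulas 1 and 2 above).
    if size.width > size.height {
      scaledSize.height = size.height / size.width * scaledSize.width
    } else {
      scaledSize.width = size.width / size.height * scaledSize.height
    }

    // Redraw the image into the smaller context.
    UIGraphicsBeginImageContext(scaledSize)
    defer { UIGraphicsEndImageContext() }
    draw(in: CGRect(origin: .zero, size: scaledSize))
    return UIGraphicsGetImageFromCurrentImageContext()
  }
}
```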
This code does the following:
Whew!
Now, within the top of performImageRecognition(_:), include:
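Given the description that follows, the line is presumably:

```swift
// Scale to at most 1,000 points on the longest side; fall back to the original.
let scaledImage = image.scaledImage(1000) ?? image
```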
This will attempt to scale the image so that it’s no bigger than 1,000 points wide or long. If scaledImage() fails to return a scaled image, the constant will default to the original image.
Then, replace tesseract.image = image with:
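That is:

```swift
tesseract.image = scaledImage
```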
This assigns the scaled image to the Tesseract object instead.
Build, run and select the poem again from the photo library.
Much better. :]
But chances are that your results aren’t perfect. There’s still room for improvement…
Improving OCR Accuracy
“Garbage In, Garbage Out.” The easiest way to improve the quality of the output is to improve the quality of the input. As Google lists on their Tesseract OCR site, dark or uneven lighting, image noise, skewed text orientation and thick dark image borders can all contribute to less-than-perfect results.
Examples of potentially problematic image inputs that can be corrected for improved results. Source: Google’s Tesseract OCR site.
Next, you’ll improve the image’s quality.
Improving Image Quality
The Tesseract iOS framework used to have built-in methods to improve image quality, but these methods have since been deprecated and the framework’s documentation now recommends using Brad Larson’s GPUImage framework instead.
GPUImage is available via CocoaPods, so immediately below pod 'TesseractOCRiOS' in Podfile, add:
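The added line, unversioned like the Tesseract pod:

```ruby
pod 'GPUImage'
```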
Then, in Terminal, re-run:
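That is:

```shell
pod install
```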
This should now make GPUImage available in your project.
Note: It will also add several hundred more compilation warnings. You can safely ignore these also.
Back in ViewController.swift, add the following below import TesseractOCR to make GPUImage available in the class:
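Given the framework just installed, the import is:

```swift
import GPUImage
```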
Directly below scaledImage(_:), also within the UIImage extension, add:
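A sketch of that method using GPUImage’s adaptive threshold filter (the blur radius of 15 is an assumption; it’s a commonly used starting value for this filter):

```swift
// Returns a high-contrast, black-and-white version of the image.
func preprocessedImage() -> UIImage? {
  let stillImageFilter = GPUImageAdaptiveThresholdFilter()
  // Radius (in pixels) over which the local luminance threshold is computed.
  stillImageFilter.blurRadiusInPixels = 15.0
  return stillImageFilter.image(byFilteringImage: self)
}
```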
Here, you run the image through an adaptive threshold filter, producing a high-contrast, black-and-white version that’s much easier for Tesseract to read.
Back in performImageRecognition(_:), immediately underneath the scaledImage constant instantiation, add:
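Given the fallback behavior described next, the line is presumably:

```swift
// Filter the scaled image; fall back to the unfiltered version on failure.
let preprocessedImage = scaledImage.preprocessedImage() ?? scaledImage
```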
This code attempts to run scaledImage through the GPUImage filter, but defaults to using the non-filtered scaledImage if preprocessedImage()’s filter fails.
Then, replace tesseract.image = scaledImage with:
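That is:

```swift
tesseract.image = preprocessedImage
```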
This asks Tesseract to process the scaled and filtered image instead.
Now that you’ve gotten all of that out of the way, build, run and select the image again.
Voilà! Hopefully, your results are now either perfect or closer-to-perfect than before.
But if the apple of your eye isn’t named “Lenore,” he or she probably won’t appreciate this poem coming from you as it stands… and they’ll likely want to know who this “Lenore” character is! ;]
Replace “Lenore” with the name of your beloved and… presto chango! You’ve created a love poem tailored to your sweetheart and your sweetheart alone.
That’s it! Your Love In A Snap app is complete — and sure to win over the heart of the one you adore.
Or if you’re anything like me, you’ll replace Lenore’s name with your own, send that poem to your inbox through a burner account, stay in for the evening, order in some Bibimbap, have a glass of wine, get a bit bleary-eyed, then pretend that email you received is from the Queen of England for an especially classy and sophisticated evening full of romance, mystery and intrigue. But maybe that’s just me…
Where to Go From Here?
Use the Download Materials button at the top or bottom of this tutorial to download the project if you haven’t already, then check out the project in Love In A Snap Final.
Try out the app with other text to see how the OCR results vary between sources and download more language data as needed.
You can also train Tesseract to further improve its output. After all, if you’re capable of deciphering characters using your eyes or ears or even fingertips, you’re a certifiable expert at character recognition already and are fully capable of teaching your computer so much more than it already knows.
As always, if you have comments or questions on this tutorial, Tesseract or OCR strategies, feel free to join the discussion below!