But it's not exactly turnkey. You need to:
1. download the source
2. compile it for a Mac
3. download a language file
4. copy it to the appropriate directory
5. run it on TIFF files, which must have a .tif extension.
Tesseract won't run unless you copy the language file to /usr/local/share/tessdata. Which is strange, because it seems to use it very inconsistently. Most of the misread results are simple English words: you get "iist" instead of "list", "lf" instead of "if". It makes you wonder how exactly it is applying this language file.
If you use a Mac application like TextEdit, Word, or OpenOffice, the spell-checker can find these and help you fix them in a matter of moments. Still, it's irritating when you have a long document. This software needs to be 'productized'.
So, the actual sequence:
1. Go here, and download tesseract-2.03.tar.gz.
2. In a Terminal window (Applications->Utilities), find your download directory, cd there, and:
: gunzip tesseract-2.03.tar.gz
: tar xvf tesseract-2.03.tar
3. cd to the tesseract-2.03 directory, then:
: ./configure
: sudo make
: sudo make install
4. Go back here, and download tesseract-2.00.eng.tar.gz, then find your download directory, and:
: gunzip tesseract-2.00.eng.tar.gz
: tar xvf tesseract-2.00.eng.tar
: cd tessdata
: sudo bash
: cp * /usr/local/share/tessdata/
Then hit control-d to exit the sudo bash shell.
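If something goes wrong along the way, two quick checks narrow it down: is there a C++ compiler on your PATH (./configure stops with "C++ compiler cannot create executables" when there isn't), and did the language files actually land in the tessdata directory? Here is a minimal sketch. The function names have_any and have_lang are mine, and I'm assuming the English data files all start with "eng.", which is how they looked in my download.

```shell
#!/bin/sh
# Succeeds if at least one of the named commands is found on PATH.
have_any() {
    for c in "$@"; do
        command -v "$c" >/dev/null 2>&1 && return 0
    done
    return 1
}

# Succeeds if directory $1 holds at least one file named $2.<something>
# (assumption: language data files are prefixed with the language code).
have_lang() {
    dir="$1"
    lang="$2"
    for f in "$dir"/"$lang".*; do
        [ -e "$f" ] && return 0
    done
    return 1
}

# Example use (left commented out so nothing runs on its own):
# have_any g++ c++ || echo "No C++ compiler found; install Xcode first."
# have_lang /usr/local/share/tessdata eng || echo "Language files are missing."
```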
Make a TIFF file, be sure it has a .tif extension, and then issue a command like this:
tesseract document-image.tif document-results
... and then you'll have text in document-results.txt
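If you have a whole folder of scans, the rename-then-run routine gets tedious fast. The sketch below wraps both steps in two small functions. rename_tiffs and ocr_dir are names I picked for illustration, and the second one assumes tesseract is already installed and on your PATH as described above.

```shell
#!/bin/sh
# Rename every *.tiff file in directory $1 to *.tif.
rename_tiffs() {
    dir="$1"
    for f in "$dir"/*.tiff; do
        [ -e "$f" ] || continue        # the glob matched nothing
        mv "$f" "${f%.tiff}.tif"
    done
}

# Run tesseract on every *.tif in directory $1.
# The text for NAME.tif ends up in NAME.txt in the same directory.
ocr_dir() {
    dir="$1"
    for f in "$dir"/*.tif; do
        [ -e "$f" ] || continue
        tesseract "$f" "${f%.tif}"
    done
}
```

After loading these into your shell, run rename_tiffs ~/scans and then ocr_dir ~/scans (~/scans is just an example folder). Only the lowercase .tiff spelling is matched, so anything ending in .TIFF still needs renaming by hand.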
Works great. It should come standard with a Mac. With a graphical user interface. And some corrections to the language file use.
3 comments:
Please help me. I'm very eager to learn and am trying my best to research.
I get this error:
eoins-macbook:tesseract-2.03 Eoin$ ./configure
checking build system type... i686-apple-darwin9.6.0
checking host system type... i686-apple-darwin9.6.0
checking for cl.exe... no
checking for g++... no
checking for C++ compiler default output file name... configure: error: C++ compiler cannot create executables
See `config.log' for more details.
I cannot continue with the process.
I get the same error message ...
jafri, try installing Xcode for your Mac:
http://developer.apple.com/TOOLS/Xcode/
Then install all the Mac updates you can get, and try the process again.
Let me know how it goes. It was so good when I got this working.