Socrates

Asked: 2017-05-19 07:54:20 +0800 CST

Extract text from image

I am looking for software that recognizes text in images. I tried all of the tools mentioned here (gocr, fuzzyocr, libhocr0, ocrad, ocrfeeder, ocropus, tesseract-ocr, cuneiform). My input was a photograph of a printed document, so printed letters rather than handwriting. Of all the tools, tesseract-ocr was the most accurate in my tests, but it still produces many errors. As a result, scanning a document to an image file and then indexing it or running NLP on it is sadly not an option: the error rate is too high.
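
To put a number on "error rate" when comparing tools, one common measure is the character error rate: the edit distance between the OCR output and a hand-typed reference transcript, divided by the reference length. The sketch below is only an illustration of that idea, not the exact test I ran; the function names `edit_distance` and `character_error_rate` are made up for this example.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions and substitutions cost 1 each."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            curr.append(min(prev[j] + 1,          # delete ch_a
                            curr[j - 1] + 1,      # insert ch_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_error_rate(reference: str, ocr_output: str) -> float:
    """Edit distance normalized by the length of the reference transcript."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, ocr_output) / len(reference)
```

Running each tool on the same image and comparing the resulting rates against one reference transcript gives a like-for-like ranking.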

So, given the age of the above-mentioned post, are there any better tools for extracting text from images or photographs?

EDIT 1:

By "image containing text" I mean that I have a PNG/JPG/BMP file as the source and want to extract the pixelated text in it, with ASCII/UTF-8 text as the output.
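
For concreteness, here is a minimal sketch of that image-in, text-out workflow using the `tesseract` command-line tool, which writes the recognized text to standard output when the output base is given as `stdout`. It assumes `tesseract` is installed and on the `PATH` with the requested language data available; the helper names `build_tesseract_command` and `extract_text` are hypothetical, chosen only for this example.

```python
import subprocess


def build_tesseract_command(image_path, lang="eng"):
    """Build the argument list for: tesseract <image> stdout -l <lang>."""
    return ["tesseract", str(image_path), "stdout", "-l", lang]


def extract_text(image_path, lang="eng"):
    """Run tesseract on one image file and return the recognized text.

    Assumes the tesseract binary is installed; raises CalledProcessError
    if it exits with a non-zero status.
    """
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        check=True,
    )
    # Decode the raw bytes tesseract printed as UTF-8 text.
    return result.stdout.decode("utf-8")
```

A call such as `extract_text("scan.png")` would then return the text as a Python string, where `scan.png` is a placeholder file name.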

0 Answers