
Question

vladimir pavloski

Asked: 2016-08-17 12:26:52 +0800 CST

How to search text within PDF files with docfetcher?

I'm trying to find some text within PDF files, but the results are not accurate. For example, I have 2 PDF files that both contain the word domiciliado. When I run a search for this word, docfetcher shows only ONE of them. Why doesn't docfetcher show the other PDF file? Is there a difference between PDF files? One PDF contains only text, while the other contains text and images and comes from a scanned page. What is the catch?

P.S.: the 2 PDF files are in the same directory

1 Answer

Anwar · Answer 1 · 2016-08-17T12:37:48+08:00

Best Answer


Is there any difference between PDF files with only text and PDF files with text and images from scanned pages?

Yes, PDF files with text and PDF files with scanned images are different. In an image-based PDF, the computer sees only images, and recognizing text within those images requires extra capabilities, such as Optical Character Recognition (OCR), to be built into the PDF engine. PDFs with text are easier to search because the computer can read the text directly.
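To check which of your PDFs is image-only, you can look at how much text a plain extractor gets out of it. Below is a minimal Python sketch, assuming the `pdftotext` tool from poppler-utils is installed. The function names and the 20-character threshold are my own choices for illustration, not part of any tool.

```python
import shutil
import subprocess


def looks_scanned(extracted_text: str, min_chars: int = 20) -> bool:
    """Heuristic (assumption): fewer than min_chars letters/digits means no usable text layer."""
    visible = [ch for ch in extracted_text if ch.isalnum()]
    return len(visible) < min_chars


def extract_text(pdf_path: str) -> str:
    """Return the text layer of a PDF by calling pdftotext (poppler-utils)."""
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; install poppler-utils")
    # "-" as the output file makes pdftotext write to standard output.
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

Calling `looks_scanned(extract_text("file.pdf"))` on each of the two files should return `True` for the scanned one if it really has no text layer, which would explain why the search misses it.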

Recommendation

One way to search a scanned PDF is to run OCR on it first to extract the text, and then perform the search. For good OCR tools on Ubuntu, have a look at this question: What's the best, simplest OCR solution?
For searching text in text-only PDFs, I recommend the command-line tool pdfgrep. There are other good options too; take a look at this question: How do I search a PDF file from command line?
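The two steps above can be sketched in Python as follows. This is only an illustration: `ocrmypdf` is one possible OCR tool that I am assuming here (the linked question covers others), and the helper names are hypothetical. `pdfgrep` is called with `-i` (ignore case) and `-n` (print page numbers).

```python
import shutil
import subprocess
from typing import List


def ocr_command(scanned_pdf: str, output_pdf: str) -> List[str]:
    """Command that adds a searchable text layer (assumes ocrmypdf is the OCR tool)."""
    return ["ocrmypdf", scanned_pdf, output_pdf]


def search_command(word: str, pdf_paths: List[str]) -> List[str]:
    """Command that searches the given PDFs for word, ignoring case, with page numbers."""
    return ["pdfgrep", "-i", "-n", word, *pdf_paths]


def run(command: List[str]) -> int:
    """Run a command built above and return its exit status."""
    if shutil.which(command[0]) is None:
        raise RuntimeError(command[0] + " is not installed")
    return subprocess.run(command).returncode
```

For example, `run(ocr_command("scan.pdf", "scan_ocr.pdf"))` followed by `run(search_command("domiciliado", ["text.pdf", "scan_ocr.pdf"]))` would search both files once the scanned one has a text layer. The file names here are placeholders.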
