My scanner has broken! I do have a good camera though so I've taken a few photos of the documents I want to scan... However they look like photos of paper, not scanned documents:
- Images aren't flat
- Lighting isn't even (shadows as the page warps, etc)
- Text obviously isn't processed into copy-pasteable PDF text.
They simply aren't suitable for professional use, but they're close.
I'm looking for something (a tool or a method) that can do any or all of the above, so I can go from a number of JPG files to a single [optionally] annotated PDF of the whole thing, in the right format (A4 typically).
Any suggestions (short of going out and buying a new scanner)?
There are several ways to do that, though all the ways I suggest have one problem: they won't really flatten your picture. A more or less good picture is still required.
One easy way is to try the software ScanTailor (scantailor.org).
It takes you through 6 steps to optimize your photos. At the last step you can select the option "Equalize illumination", which will give you a nice clean look!
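To give an idea of what an illumination equalizer does in principle, here is a minimal sketch in plain Python. It is not ScanTailor's actual algorithm (I don't know its internals); it simply assumes the brightest pixel in a small neighbourhood is blank paper and divides by it. The function name, the `radius` parameter and the tiny test page are made up for illustration, and the image is assumed to be a list of rows of 0-255 grey values.

```python
def equalize_illumination(gray, radius=1):
    """Flatten uneven lighting in a greyscale image (rows of 0-255 ints)."""
    height, width = len(gray), len(gray[0])
    result = []
    for y in range(height):
        row = []
        for x in range(width):
            # Estimate the local paper brightness as the brightest pixel nearby.
            background = max(
                gray[ny][nx]
                for ny in range(max(0, y - radius), min(height, y + radius + 1))
                for nx in range(max(0, x - radius), min(width, x + radius + 1))
            )
            # Divide by the background so that blank paper maps to white (255).
            value = 255 if background == 0 else round(255 * gray[y][x] / background)
            row.append(min(255, value))
        result.append(row)
    return result


# Hypothetical page: well lit on the left (200), in shadow on the right (100),
# with one dark "ink" pixel in each half.
page = [
    [200, 200, 200, 100, 100, 100],
    [200,  40, 200, 100,  20, 100],
    [200, 200, 200, 100, 100, 100],
]
flat = equalize_illumination(page)
```

In this toy page both ink pixels end up at the same grey level even though one sat in shadow. On a real photo the radius would have to be larger than the stroke width of the text, and pixels right next to a hard shadow edge will still come out a bit off.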
Personally I usually just use GIMP, but you need some basic skills to reach your aim.
Colors -> Curves
gives you the option to manipulate the color output the way you want it.
Another nice little program is gscan2pdf, where you can also load photos and export them as PDF. There is even a link to GIMP so you can improve the photo with the steps described above.
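If you would rather script the Colors -> Curves step than click through it, the idea behind a curve is just a lookup table from input brightness to output brightness. The sketch below is a simplified piecewise-linear version (GIMP can also draw smooth curves); the function names and the control points are my own assumptions, and the image is again assumed to be rows of 0-255 grey values.

```python
def curve_lookup(points):
    """Build a 256-entry lookup table from (input, output) control points.

    Assumes at least two points with distinct input values.
    """
    points = sorted(points)
    table = []
    for level in range(256):
        if level <= points[0][0]:
            # Below the first control point: clamp to its output.
            table.append(points[0][1])
        elif level >= points[-1][0]:
            # Above the last control point: clamp to its output.
            table.append(points[-1][1])
        else:
            # Interpolate linearly between the two surrounding points.
            for (x0, y0), (x1, y1) in zip(points, points[1:]):
                if x0 <= level <= x1:
                    table.append(round(y0 + (y1 - y0) * (level - x0) / (x1 - x0)))
                    break
    return table


def apply_curve(gray, table):
    """Map every pixel of a greyscale image through the lookup table."""
    return [[table[value] for value in row] for row in gray]


# Hypothetical curve: anything darker than 60 becomes black,
# anything brighter than 180 becomes white.
table = curve_lookup([(60, 0), (180, 255)])
cleaned = apply_curve([[30, 60, 180, 230]], table)
```

The two control points here are only a starting guess; for a real photo you would read the grey level of the paper and of the ink off the histogram and put the points there.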
To generate a printable copy or a PDF from a camera photo of a document, we have to do quite a lot of manual conversion to achieve an image similar to the output from a scanner. Most of these conversions can be done with GIMP.
Try to make the best original source image you can:
Consider desaturation to greyscale for better contrast and removal of coloured pixel artifacts.
Adjust brightness and contrast to make the presumably grey background white, and the black letters pitch black.
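For anyone who wants to see what the greyscale and brightness/contrast steps amount to numerically, here is a small sketch in plain Python. It assumes the photo is available as rows of (R, G, B) tuples; the luminance weights are the common Rec. 601 ones, and the black and white points are hypothetical values you would pick from your own image.

```python
def to_grey(rgb_rows):
    """Desaturate rows of (r, g, b) tuples using Rec. 601 luminance weights."""
    return [
        [round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
        for row in rgb_rows
    ]


def stretch_levels(grey, black_point, white_point):
    """Map black_point to 0 and white_point to 255, clamping everything else."""
    span = white_point - black_point
    return [
        [max(0, min(255, round(255 * (value - black_point) / span))) for value in row]
        for row in grey
    ]


# Hypothetical photo: slightly warm grey paper next to dark, bluish ink.
photo = [[(190, 185, 180), (40, 40, 45)]]
grey = to_grey(photo)
scan_like = stretch_levels(grey, black_point=50, white_point=180)
```

With these assumed points the greyish paper is pushed to pure white and the ink to pure black, which is essentially what the brightness and contrast sliders are being used for here.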
Remove cushion distortion?
Depending on the quality of our lens and the zoom level we used, we may have some cushion artifacts that bend the document's outer borders. There are plugins to remove these artifacts, but we may find it quicker to choose a camera zoom level where they are minimal. After cropping (step 5) we may not even notice them any more. So removing cushion artifacts may only be needed if our source image has a lot of straight lines in the outer parts.
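As a rough illustration of what such a correction does, here is a sketch of the simplest one-parameter radial model. This is an assumption on my part, not what any particular GIMP plugin implements: each point is moved along the line from the image centre by a factor `1 + k * r^2`, where `r` is the distance from the centre normalised so the corners are at 1. The coefficient `k` is hypothetical and would have to be tuned by eye until straight edges look straight.

```python
import math


def correct_radial(x, y, width, height, k):
    """Move a pixel position radially according to a one-parameter model."""
    cx, cy = (width - 1) / 2, (height - 1) / 2
    half_diagonal = math.hypot(cx, cy)
    # Offset from the centre, normalised so that the corners have radius 1.
    dx, dy = (x - cx) / half_diagonal, (y - cy) / half_diagonal
    scale = 1 + k * (dx * dx + dy * dy)
    return cx + dx * scale * half_diagonal, cy + dy * scale * half_diagonal


# With a negative k the corner of a 101 x 101 image is pulled towards the
# centre, while the centre itself does not move.
corner = correct_radial(0, 0, 101, 101, k=-0.1)
centre = correct_radial(50, 50, 101, 101, k=-0.1)
```

Because the shift grows with the square of the radius, the middle of the page is barely touched, which matches the observation that the artifacts mostly matter near the outer borders.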
Rotate and crop or perspective transform the image if needed.
Unlike a scanner, our camera may not capture the source parallel to the image borders. The GIMP Rotate or Perspective tool gives us visual feedback, so we can rotate the image or adjust its perspective until the text lines are parallel to the page.
(Screenshot: the Perspective tool, shown on the right side.)
Now we can select the document source with the rectangle select tool to crop the image inside of the document.
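What the Perspective tool computes under the hood is, in essence, a mapping that sends the four corners of the photographed page onto the four corners of a rectangle. Below is a minimal plain-Python sketch of that calculation; the function names and the corner coordinates are made up for illustration, and it assumes the four corners are in general position (no three on one line).

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    rows = [list(matrix[i]) + [rhs[i]] for i in range(n)]
    for col in range(n):
        # Swap in the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(rows[r][col]))
        rows[col], rows[pivot] = rows[pivot], rows[col]
        for r in range(n):
            if r != col:
                factor = rows[r][col] / rows[col][col]
                rows[r] = [a - factor * b for a, b in zip(rows[r], rows[col])]
    return [rows[i][n] / rows[i][i] for i in range(n)]


def perspective_coefficients(src, dst):
    """Return the 8 coefficients mapping four src corners onto four dst corners."""
    matrix, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        matrix.append([x, y, 1, 0, 0, 0, -x * u, -y * u])
        rhs.append(u)
        matrix.append([0, 0, 0, x, y, 1, -x * v, -y * v])
        rhs.append(v)
    return solve(matrix, rhs)


def warp_point(coeffs, x, y):
    """Apply the perspective mapping to a single point."""
    a, b, c, d, e, f, g, h = coeffs
    w = g * x + h * y + 1
    return (a * x + b * y + c) / w, (d * x + e * y + f) / w


# Hypothetical page corners in the photo (clockwise from top left) and the
# target rectangle, here an A4 page at roughly 150 dpi.
photo_corners = [(120, 80), (980, 140), (1010, 1400), (60, 1330)]
page_corners = [(0, 0), (1240, 0), (1240, 1754), (0, 1754)]
coeffs = perspective_coefficients(photo_corners, page_corners)
```

To actually resample an image you would normally compute the mapping in the other direction (swap the two corner lists), so that for every output pixel you know where to read from in the photo.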
Remove unwanted shadows from bending, folds, or vignetting artifacts from the camera lens.
The quickest method is simply to use the eraser tool to remove all those ugly shadows outside of the text (which we should spare).
Scale image?
Depending on the camera resolution, scaling up the image to a scanner image size will only increase the file size without any benefit to the image quality. Scaling down will remove details. Therefore we should not scale the image but adjust the print size in the printer dialog (or in step 8 below).
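To see why rescaling is unnecessary, it helps to work out the resolution the unscaled photo already has when it is printed to fill an A4 page (210 x 297 mm). A small sketch, with a function name of my own choosing:

```python
A4_MM = (210, 297)
MM_PER_INCH = 25.4


def print_resolution(width_px, height_px, page_mm=A4_MM):
    """Return the dpi at which a portrait image just fits inside the page."""
    page_width_in = page_mm[0] / MM_PER_INCH
    page_height_in = page_mm[1] / MM_PER_INCH
    # The tighter of the two directions decides how small the print must be.
    return max(width_px / page_width_in, height_px / page_height_in)


# A 12 megapixel photo (3000 x 4000 pixels) printed on A4.
dpi = print_resolution(3000, 4000)
```

A 3000 x 4000 pixel photo comes out at roughly 363 dpi on A4, which is already above the 300 dpi commonly used for scanned text, so only the print size needs to be set.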
Generate PDF
We can import our now nicely manually restored image to LibreOffice (Insert > Media) to
If you already have the image of the document, just download the CamScanner app to your phone/tablet. It will let you import the image, then suggest a crop and allow you to flatten the page as well as adjust colours, contrast, etc. It only takes a minute.